CPU-pinning on Compute Engine instances (Linux)

Robin Whittle

Jun 28, 2017, 8:57:30 AM
to gce-discussion
I am running instances with 64 "vCPUs", which means 32 physical cores of the underlying Xeon CPU chips, under 64-bit Debian 8.

The GCE virtual machine is abstracted from the physical machine in some way.  Based on some of the results from cat /proc/cpuinfo:

model name      : Intel(R) Xeon(R) CPU @ 2.20GHz
cpu MHz         : 2200.000
cache size      : 56320 KB

I guess that the physical machine (in us-west1-b) is a dual Xeon E5-2699 v4 server, where each chip has 22 cores.  I am running 20 to 32 instances of a number-crunching program (I guess 20 will be faster than 32), and each instance takes approximately the same time to finish its work.  I want them all to finish ASAP, both to get the results quickly and so I can shut down the server once they are finished.  These runs may last hours or tens of hours.

The Linux kernel typically moves these running processes around the "CPUs" it sees, so I plan to regularly use "CPU" pinning with taskset or numactl to put them back where I want them, to avoid two instances sharing a physical core by running on two "CPUs" which are actually the A and B halves of a single hyperthreaded core.
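
For reference, the repinning I have in mind looks roughly like this (12345 is a placeholder PID, 5 a placeholder "CPU" number, and ./crunch stands in for the actual number-crunching program):

# Show which "CPUs" a running instance is currently allowed on
taskset -cp 12345

# Repin that instance to "CPU" 5 only
taskset -cp 5 12345

# Or start an instance already pinned, using numactl instead
numactl --physcpubind=5 ./crunch &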

numactl -H reports that all the "CPUs" 0 to 63 are in a single node, whereas on a local server (running Debian natively) with dual 6-core Xeons, it reports two nodes, each with 12 "CPUs": 0 1 2 3 4 5   12 13 14 15 16 17 for node 0 and 6 7 8 9 10 11   18 19 20 21 22 23 for node 1.  I don't know of any Intel-specific Linux documentation about which of these are the pairs belonging to physical cores.  I should be able to establish by experimentation how this works on the local machine, and for now I suspect the A halves of the 12 physical cores are numbered 0 to 11 and the B halves 12 to 23.  Assuming this is the case, when running the program on the local server I will pin the 12 instances to "CPUs" 0 to 11, so that each one gets its own physical core.  This requires regular repinning, since the kernel's scheduler can move them around from one minute to the next, though it usually doesn't.
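
Rather than relying on guesswork, one way I intend to check the pairing on both machines is to read the topology the kernel exposes under /sys, and to cross-check it with lscpu:

# For each "CPU", list the sibling "CPUs" sharing its physical core
for c in /sys/devices/system/cpu/cpu[0-9]*; do
    echo "${c##*/}: $(cat $c/topology/thread_siblings_list)"
done

# lscpu summarises the same mapping in one table
lscpu -e=CPU,CORE,SOCKET,NODE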

It is important to pin the instances in this way so that no core ends up with both of its halves, A and B, running an instance of the program.  That would leave another core idle and would slow down both of those instances, which I am very keen to avoid.

Is there any certainty or documentation regarding which of the 64 "CPUs" in these GCE instances correspond to the hyperthreaded pairs of physical cores?  I will be able to find out by experimentation, and for now I am assuming that if I pin 32 instances to "CPUs" 0 to 31, this will achieve my aim.
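
For what it's worth, the launch would then be something like this, assuming "CPUs" 0 to 31 really are 32 distinct physical cores (exactly the assumption I am asking about), with ./crunch again standing in for the actual program:

# Launch 32 instances, one pinned to each assumed physical core
for cpu in $(seq 0 31); do
    taskset -c "$cpu" ./crunch &
done
wait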

  - Robin

Faizan (Google Cloud Support)

Jun 29, 2017, 5:00:13 PM
to gce-discussion
Hello Robin, 

Google Cloud uses the open-source KVM hypervisor[1][2] for GCE VMs. To make the hypervisor more secure, Google does not use QEMU. The host CPUs do use hyperthreading, so for 64 threads you are paying for 32 cores. You will be fine running fewer than 32 processes at 100% CPU; you can let KVM sort that out for you. You can also pin them to vCPUs, and it should still be fine because KVM will still sort it out for you, though you may spend a few ns less in virtual-context-switch territory. You can refer to this[3] post for more information on vCPU pinning with hyperthreading.

Faizan
