--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--Matt
Absolutely. Different PCIe sockets can be closer to different CPU sockets just like memory (NUMA).
Related to this question, do Xeon processors require a "symmetric" memory config? I was thinking of putting only about 4 GB on the junk core, as you call it, and then about 64 GB connected to the second socket.
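For what it's worth, you can see how memory is split across nodes (and which node a PCIe device hangs off) straight from sysfs. A minimal sketch, assuming a Linux box with sysfs mounted; the PCIe address below is a placeholder:

```shell
# Per-node memory totals: an asymmetric config (e.g. 4 GB on one
# socket, 64 GB on the other) shows up here.
for node in /sys/devices/system/node/node*; do
    grep -H MemTotal "$node/meminfo" 2>/dev/null
done

# NUMA node a PCIe device is attached to (-1 means none reported).
# 0000:01:00.0 is a placeholder address; substitute your NIC/HBA.
cat /sys/bus/pci/devices/0000:01:00.0/numa_node 2>/dev/null
```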
There's a number of kernel tasks that are implicitly bound to cpu0. For an example of one have a look at rcu offloading and its restrictions.
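You can check this yourself with nothing but /proc. A quick sketch that prints the CPU each RCU kernel thread last ran on (field 39 of /proc/&lt;pid&gt;/stat is the task's last CPU):

```shell
# List threads whose name contains "rcu" and the CPU they last ran on.
# Field 39 of /proc/<pid>/stat is "processor" (last CPU the task ran on).
# Note: inside a PID namespace (container) kernel threads are not visible.
for d in /proc/[0-9]*; do
    comm=$(cat "$d/comm" 2>/dev/null) || continue
    case "$comm" in
    *rcu*)
        cpu=$(awk '{print $39}' "$d/stat" 2>/dev/null)
        printf 'pid=%s cpu=%s comm=%s\n' "${d#/proc/}" "$cpu" "$comm"
        ;;
    esac
done
```

With CPU 0 isolated you would still expect to see RCU housekeeping threads reporting cpu=0 here.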
There's lots of work being scheduled even on isolated CPUs. If you are not running a tickless kernel, you should see around 1000 local timer interrupts per second (by default). You will also see soft IRQs (if you haven't affinitized them to some housekeeping CPU), non-maskable interrupts/machine-check errors, workqueue tasks, etc.
As for RCU: even with offloading you will see the isolated cores performing the work required to schedule the callbacks on the offloaded CPUs. You can solve that by switching to RCU callback polling, but my point here is that a number of different types of tasks will run on isolated CPUs.
Thanks Wojciech for the quick reply. I read about RCU offloading and, in my testing, can confirm that kernel RCU threads are scheduled on core 0 even if core 0 is isolated.
Another thing I observed was that some kworker threads run on isolated CPUs other than 0. Is this expected behavior? I used to think that isolated CPUs are not touched by the kernel. These kworker threads will definitely lead to context switches and hamper performance a little bit, and I am afraid we can do nothing to get rid of them.
Himanshu Sharma
On Mon, May 22, 2017 at 2:08 PM, Wojciech Kudla <wojciec...@gmail.com> wrote:
There's a number of kernel tasks that are implicitly bound to cpu0. For an example of one have a look at rcu offloading and its restrictions.
On Mon, 22 May 2017, 08:59 Himanshu Sharma, <imhima...@gmail.com> wrote:
Hi Michael,
Did you find a satisfactory reason for not isolating CPU 0, maybe some low-level OS code that is bound to run on core 0? I am also stuck on this question right now and am thinking you might have an answer.
Thanks,
Himanshu
On Wednesday, March 4, 2015 at 8:08:22 PM UTC+5:30, Michael Mattoss wrote:
Hi guys,
I'm in the process of setting up a new dual-socket server for a low-latency workload. The application will run exclusively on one CPU and everything else (i.e. OS, non-critical processes) will run on the other CPU to avoid cache pollution.
I was wondering if it makes any difference which of the 2 CPUs is chosen for the workload. Theoretically, there should be no difference, but I was wondering if there is some low-level stuff (e.g. core OS code, system management interrupt handlers) that is statically allocated to CPU 0, as every system has at least 1 CPU. Of course, if that's the case then CPU 1 is the better choice.
Any thoughts/suggestions?
Thanks,
Michael
> Since Sandy Bridge at least, each CPU has its own PCIe interface. Presumably, if you're doing user-space kernel-bypass IO you want your workload on the same CPU that your IO devices are connected to.
I think you meant the whole socket here. Yes, this is one of the reasons why many shops move away from 4-socket rigs: it sometimes gets really challenging to partition PCIe/CPU/memory resources when running multiple latency-critical processes.
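A sketch of acting on that: look up the NUMA node the NIC is attached to, then run the latency-critical process on that node's CPUs. "eth0" and "./latency_app" are placeholder names; on many setups you'd use numactl, here plain taskset:

```shell
# Which NUMA node is the NIC attached to? (-1 = none reported;
# fall back to node 0.) "eth0" is a placeholder interface name.
nic=eth0
node=$(cat /sys/class/net/"$nic"/device/numa_node 2>/dev/null || echo 0)
[ "$node" -ge 0 ] 2>/dev/null || node=0
# CPUs belonging to that node, e.g. "8-15".
cpus=$(cat /sys/devices/system/node/node"$node"/cpulist 2>/dev/null || echo 0)
# Launch the workload pinned to those CPUs ("./latency_app" is a placeholder,
# so just print the command here).
echo "taskset -c $cpus ./latency_app"
```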
Yes, that's why blacklisting workqueues from critical CPUs should be on the jitter-elimination checklist. They can be affinitized just like IRQs.
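A sketch of both knobs. Writing these files needs root, so the sketch just prints the writes; IRQ number 24 and the 0x3 housekeeping mask are placeholders for your setup:

```shell
# Confine unbound workqueues and a device IRQ to housekeeping CPUs 0-1
# (mask 0x3), keeping them off the isolated CPUs. These writes require
# root, so print the commands rather than running them.
mask=3
for f in /sys/devices/virtual/workqueue/cpumask \
         /proc/irq/24/smp_affinity; do    # IRQ 24 is a placeholder
    echo "echo $mask > $f"
done
```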