scheduling policies in linux


Kirti Teja Rao

Aug 6, 2015, 4:37:53 PM
to mechanica...@googlegroups.com
Hi,

Did anyone do any experiments or measurements with scheduling policies and their impact on micro jitter, especially with SCHED_FIFO/SCHED_RR vs SCHED_OTHER?
If I pin my threads and keep them spinning, is SCHED_FIFO better in theory?

Thanks,
Teja

Matt Godbolt

Aug 7, 2015, 7:42:33 AM
to mechanica...@googlegroups.com
If you're planning on spinning your threads anyway, SCHED_FIFO and SCHED_RR probably aren't a good choice. One of two things will happen, in my experience:

* if you have any other threads that need to run on the CPU you're scheduled on, they may not run at all. If one of the lower-priority kernel threads needs to run there, it'll starve and eventually you'll end up wedging the OS. I've seen this happen with deferred work queue threads, for example the Solarflare deferred work threads (the [onload_wqueue] threads). The OS wedges when it runs out of the hardware resources that were supposed to be released by the work queue. I realise this is quite Solarflare specific, but I'm sure there are other kernel threads that could yield the same results.
* if, on the other hand, you ensure nothing else important can run on your CPU (e.g. you isolate it at boot, steer IRQs away, and ensure there are no work queue threads scheduled on it), then as best I can tell there's no benefit to running FIFO or RR: there's nothing else to run anyway.

If, however, you are going to block in your FIFO or RR thread somewhere, there may be a benefit to using those kernel priorities. But my instinct is the jitter will be unacceptably high even with FIFO or RR.
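For anyone who wants to experiment with this, here is a minimal sketch of querying and changing the scheduling policy, assuming Linux and the scheduling calls in Python's `os` module. The function names and the priority value 10 are illustrative assumptions, not anything from this thread. Without root or CAP_SYS_NICE the switch is expected to fail, and the sketch reports that rather than raising.

```python
import os

# Map the numeric policy constants to names, skipping any this platform lacks.
POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR")
    if hasattr(os, name)
}


def current_policy(pid=0):
    """Return the scheduling policy name for pid (0 means the caller)."""
    if not hasattr(os, "sched_getscheduler"):
        return "unsupported"  # platform without these calls
    return POLICY_NAMES.get(os.sched_getscheduler(pid), "unknown")


def try_fifo(priority=10):
    """Try to move the caller to SCHED_FIFO; return True on success.

    The priority of 10 is an arbitrary illustrative value.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except OSError:  # includes PermissionError when privileges are missing
        return False
```

Calling `current_policy()` before and after `try_fifo()` shows whether the switch took effect. Note the first bullet above before trying this on a thread that spins.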

Hope that gives some help, Matt

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Matt

Gil Tene

Aug 7, 2015, 12:17:19 PM
to mechanical-sympathy
As Matt notes, there is likely no benefit for using scheduling policies on spinning threads.

Think of it this way: When you have a constantly spinning thread, you generally do so to avoid blocking or polling (with variants of "sleep"). When is it ok for that thread to have its CPU randomly taken away for 10msec?

A scheduling policy (and related priority) only controls what happens when the cpu was taken away from you, and there is no other cpu for you to go to. Since your goal (when spinning) is generally to avoid that ever happening, the scheduling policy is generally irrelevant to that goal. The easiest (least config and I-need-special-privs heavy) way to avoid losing the cpu is to make sure the system has plenty of cores and that they are never over-used.

Failing that, your next set of options usually comes in the form of dedicating or isolating sets of cores. numactl and taskset are good easy steps in that direction, but when using them you need to keep in mind that they serve to limit (not guarantee) things: it doesn't really help to limit your thread to a core or set of cores if anyone else (including cron jobs) gets to share that core with you. You only get real benefit from controlling your set of cores if the rest of the system is restricted to a non-overlapping set of cores. There are various ways to make that happen (e.g. task-setting init early in the boot process and having everyone inherit that set), but they usually require some level of I-have-special-privs control...
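The "limit, not guarantee" point can be checked mechanically. A minimal sketch, assuming you have collected each task's allowed-CPU list as a string (for example from the `Cpus_allowed_list` field of `/proc/<pid>/status`). The function names and the sample task names are hypothetical.

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def intruders(my_cpus, allowed_by_name):
    """Return the names of tasks whose allowed CPUs overlap my_cpus."""
    mine = parse_cpu_list(my_cpus)
    return sorted(
        name
        for name, allowed in allowed_by_name.items()
        if mine & parse_cpu_list(allowed)
    )


# Hypothetical system: init and sshd were restricted early, cron was not.
others = {"init": "0-1", "sshd": "0-1,4-7", "cron": "0-7"}
overlapping = intruders("2-3", others)  # tasks still allowed on "my" cores
```

An empty result is the situation you want: the cores are dedicated only when nothing else is allowed to run on them.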

Then there is isolcpus. It's the simplest way to isolate a core and make sure nobody messes with it, and probably the easiest way to make sure a pinned spinning thread won't be bothered. Just remember that isolcpus don't come in "sets", and that each thread needs to be specifically assigned to a separate core (that no other thread has been assigned to) for the trick to be effective. If you assign more than one thread to an isolcpus core, or try to use a set of isolcpus "together" for a process, expect much worse behaviors than if you had done nothing at all... I find it very sad that you can taskset a process to use a group of isolcpus cores (and that nothing screams at you when you do that).
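The one-thread-per-core rule is easy to enforce before anything gets pinned. A minimal sketch, assuming Linux and Python; `assign_one_per_core` and `pin_caller` are hypothetical helper names, and the thread names and CPU numbers are made up for illustration.

```python
import os


def assign_one_per_core(thread_names, isolated_cpus):
    """Map each thread to its own isolated CPU, or raise if that is impossible."""
    cpus = sorted(set(isolated_cpus))
    if len(thread_names) > len(cpus):
        raise ValueError(
            f"{len(thread_names)} threads but only {len(cpus)} isolated CPUs"
        )
    return dict(zip(thread_names, cpus))


def pin_caller(cpu):
    """Pin the caller to exactly one CPU (Linux only).

    With pid 0 the underlying sched_setaffinity call applies to the
    calling thread, so each thread would call this for itself.
    """
    os.sched_setaffinity(0, {cpu})


# Hypothetical plan: two critical threads, three isolated CPUs.
plan = assign_one_per_core(["reader", "writer"], [5, 3, 7])
```

Refusing to build a plan with more threads than cores is the "scream" that taskset does not give you.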

Kirti Teja Rao

Aug 7, 2015, 3:15:27 PM
to mechanica...@googlegroups.com
Thank you for the responses, Matt and Gil; that answers my initial question. The question came from curiosity rather than from a need or an observation.

However, I have a follow-up question on Gil's response. If I use isolcpus to isolate a set of cores and use Peter Lawrey's Java-Thread-Affinity library to pin the app's threads onto them, that still leaves the JVM's own threads (like compiler or GC threads) unpinned. I have tried taskset -c, but noticed all threads are pinned to the first core I give. So how can I distribute JVM threads across my cores if I am using the HotSpot JVM? Also, does the Zing JVM do anything different for these use cases?

Thanks,
Teja

Vitaly Davidovich

Aug 7, 2015, 3:31:17 PM
to mechanical-sympathy
However, I have a follow-up question on Gil's response. If I use isolcpus to isolate a set of cores and use Peter Lawrey's Java-Thread-Affinity library to pin the app's threads onto them, that still leaves the JVM's own threads (like compiler or GC threads) unpinned. I have tried taskset -c, but noticed all threads are pinned to the first core I give. So how can I distribute JVM threads across my cores if I am using the HotSpot JVM? Also, does the Zing JVM do anything different for these use cases?

If you're launching your java app under taskset, you shouldn't have to do anything special to get the non-pinned threads to have the specified cpus in their affinity mask.  I'm assuming you specified just a small set of cpus in the isolcpus setting and the taskset you're using to launch the java app contains non-isol'd cpus, right?

Kirti Teja Rao

Aug 7, 2015, 4:05:45 PM
to mechanica...@googlegroups.com
Hi Vitaly,

Shouldn't I be using isol'd cpus in the taskset to launch the java app?

I have a two-socket machine with 6 cores per socket and hyperthreading enabled. In grub.conf, I isolate the cores using isolcpus=1,3,5,7,9,11,13,15,17,19,21,23.
I launch my app using taskset -c 1,3,5,7,9,11,13,15,17,19,21,23 java Test.

I am trying to run the app on socket 1 and keep the kernel threads on socket 0. I notice all my threads are running on core 1 of socket 1 unless I pick a thread and pin it to another core using taskset -p. I was hoping I could use Java Thread Affinity to do that.
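One way to confirm this kind of pile-up is to look at which CPU each thread last ran on. A minimal sketch, assuming Linux, where field 39 ("processor") of each `/proc/<pid>/task/<tid>/stat` line holds that CPU number. The function names are hypothetical, and reading the files themselves is left out so the parsing can be shown on its own.

```python
def last_cpu_from_stat(stat_line):
    """Return the CPU a task last ran on, given one stat line for that task."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 39 sits at index 39 - 3.
    return int(rest[39 - 3])


def threads_per_cpu(stat_lines):
    """Count how many threads last ran on each CPU."""
    counts = {}
    for line in stat_lines:
        cpu = last_cpu_from_stat(line)
        counts[cpu] = counts.get(cpu, 0) + 1
    return counts
```

Feeding it the stat line of every thread of the java process and getting back a single CPU with all the threads on it would match the behaviour described above.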

Thanks,
Teja


Vitaly Davidovich

Aug 7, 2015, 4:23:45 PM
to mechanical-sympathy
Ok, I think I misunderstood what you are trying to do. I thought you wanted to have your critical app thread(s) isolated to some cpu(s), and then have your JVM/other app threads float around. taskset and isolcpus don't work well together: if you give taskset more than one cpu and the list contains isolated cpus, it'll just pick the first one (try changing the order of the cpus in your list, and you'll see the threads pinned to whichever comes first).

What you can try instead is:
* set isolcpus=<the cpus you want to reserve for your critical threads that will be pinned>
* run your app with taskset -c <socket 1 cpus without the cpus you isolated>

For the critical threads, manually set their affinity in code to the isolated cpus.
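The recipe can be sketched as a small helper that derives both settings from one description of the machine. This assumes the odd-numbered cpus are socket 1, as in Teja's example. The choice of 1,3,5,7 as the isolated subset and the function names are hypothetical.

```python
def boot_parameter(isolated_cpus):
    """Build the isolcpus= kernel parameter for the reserved CPUs."""
    cpus = sorted(set(isolated_cpus))
    return "isolcpus=" + ",".join(str(cpu) for cpu in cpus)


def launch_command(socket_cpus, isolated_cpus, command):
    """Build a taskset command line that avoids the isolated CPUs."""
    floating = sorted(set(socket_cpus) - set(isolated_cpus))
    if not floating:
        raise ValueError("no non-isolated CPUs left for the JVM's own threads")
    cpu_list = ",".join(str(cpu) for cpu in floating)
    return f"taskset -c {cpu_list} {command}"


socket_1 = range(1, 24, 2)   # 1,3,...,23 as in the example above
critical = [1, 3, 5, 7]      # hypothetical subset reserved for pinned threads

grub_setting = boot_parameter(critical)
command_line = launch_command(socket_1, critical, "java Test")
```

The JVM's compiler and GC threads then float over the non-isolated cpus of socket 1, while each critical thread sets its own affinity to one of the reserved cpus from code.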

Kirti Teja Rao

Aug 7, 2015, 8:20:01 PM
to mechanica...@googlegroups.com
That seems to work well. Thank you Vitaly.

Gil Tene

Aug 7, 2015, 8:45:14 PM
to mechanical-sympathy
+1000 (to Vitaly's response below). That's how you should go about it.

This is exactly the problem I was referring to in the last paragraph of my earlier post, especially with the "...I find it very sad that you can taskset a process to use a group of isolcpus cores (and that nothing screams at you when you do that)." statement. Looks like Teja fell into that exact unmarked trap... I've lost count of the number of times I've seen someone taskset a process to a group of isolcpus thinking (understandably) that this means the process will use those CPUs as a set. Unfortunately, it doesn't work that way, but it shows no clear signs of "being wrong" when you do it.

Vitaly Davidovich

Aug 8, 2015, 1:07:08 PM
to mechanical-sympathy

Yeah, I agree - can't think of a good reason one would want this behavior.

sent from my phone

Jean-Philippe BEMPEL

Aug 9, 2015, 4:27:02 PM
to mechanical-sympathy
If I understand correctly, isolcpus tells the kernel not to use those cpus for scheduling at all, so there is no automatic scheduling onto those cpus (cores).
If you put threads on those cpus, you are on your own to schedule them (with taskset/sched_setaffinity), core by core.

Is it correct?

Martin Thompson

Aug 9, 2015, 4:29:08 PM
to mechanica...@googlegroups.com
That is my understanding of isolcpus. You need to take control of the placement yourself.

Vitaly Davidovich

Aug 9, 2015, 6:41:24 PM
to mechanical-sympathy

Yes, that's correct modulo irq handling - you'll need to steer irq handling away from isolated cores (good idea anyway if you're, e.g., ingesting lots of net packets and want to handle them on the cpus closer to the NIC).  Also, I believe there are still some kernel tasks that may get scheduled on the isolated cpus (e.g. RCU, scheduler accounting code) and of course the cpu will handle IPIs coming from other cpus.  IIRC, there was some planned work to further allow minimizing kernel tasks on the isolated cpus.
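On the IRQ steering part: on Linux this usually comes down to writing a hexadecimal CPU bitmask to `/proc/irq/<n>/smp_affinity` for each IRQ. A minimal sketch that only computes such a mask, assuming the even-numbered cpus (socket 0 in Teja's layout) are the ones left to handle interrupts. The function name is hypothetical, and finding the IRQ numbers and writing the files (which needs root) is left out.

```python
def irq_affinity_mask(cpus):
    """Return a hex bitmask string with one bit set per allowed CPU.

    Masks wider than 32 bits are written as comma-separated 32-bit
    groups, most significant group first.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    words = []
    while True:
        words.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break
    words.reverse()
    head = format(words[0], "x")
    tail = [format(word, "08x") for word in words[1:]]
    return ",".join([head] + tail)


# CPUs 0,2,...,22: everything except the isolated odd-numbered ones.
housekeeping_mask = irq_affinity_mask(range(0, 24, 2))
```

Keeping the isolated cpus out of every IRQ's mask covers device interrupts only. The kernel tasks and IPIs mentioned above are a separate matter.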

sent from my phone

Kirti Teja Rao

Aug 10, 2015, 2:45:15 AM
to mechanica...@googlegroups.com
The Red Hat documentation here says isolcpus prevents any user-space threads from being scheduled on the isolated CPUs, and that once CPUs are isolated, processes must be assigned to them explicitly with numactl or the CPU affinity system calls.
Does that mean kernel threads like the pdflush threads can still run on isolated CPUs?

Gil Tene

Aug 10, 2015, 3:05:47 AM
to mechanical-sympathy


On Sunday, August 9, 2015 at 10:27:02 AM UTC-10, Jean-Philippe BEMPEL wrote:
If I understand correctly, isolcpus tells the kernel not to use those cpus for scheduling at all, so there is no automatic scheduling onto those cpus (cores).
If you put threads on those cpus, you are on your own to schedule them (with taskset/sched_setaffinity), core by core.

Is it correct?

To be specific, isolcpus do not participate in cpu balancing of any kind. So no threads that run on any other cores (including other isolcpu cores) will be moved to them, and no thread that runs on them will be moved to any other core (including other isolcpu cores).

Other than that, they run scheduling just like any other core does, so there is plenty of scheduling [potentially] going on on an isolcpus core. In-core scheduling is all about managing the run queues of a specific core. The basic difference is that runnable threads sitting in the run queue will not be stolen by other idle cores, and that this core will not steal threads from other cores' run queues even when it is idle. Thread priorities and scheduling classes still have the same effect within an isolcpus core as they do on any other core.
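A toy model may help make the distinction concrete. This is a deliberately simplified sketch and not the kernel's balancing algorithm: idle cores that take part in balancing steal one thread from the busiest participating core, while isolated cores neither steal nor get stolen from. The function name and the sample run queues are hypothetical.

```python
def balance(run_queues, isolated):
    """Return new run queues after one toy balancing pass.

    run_queues maps a CPU number to a list of runnable thread names.
    CPUs listed in isolated are skipped entirely.
    """
    queues = {cpu: list(threads) for cpu, threads in run_queues.items()}
    participating = [cpu for cpu in sorted(queues) if cpu not in isolated]
    for idle in participating:
        if queues[idle]:
            continue  # not idle, nothing to steal for
        busiest = max(participating, key=lambda cpu: len(queues[cpu]))
        if len(queues[busiest]) > 1:
            queues[idle].append(queues[busiest].pop())
    return queues


before = {0: ["a", "b", "c"], 1: [], 2: ["x", "y"], 3: []}
after = balance(before, isolated={2, 3})
```

In this model core 1 picks up a thread from core 0, but `x` and `y` keep sharing isolated core 2 while isolated core 3 sits idle, which is the trap with assigning several threads to one isolcpus core.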

Jean-Philippe BEMPEL

Aug 10, 2015, 3:08:34 AM
to mechanica...@googlegroups.com
Right, makes sense.
Thanks Gil for the clarification!

--
You received this message because you are subscribed to a topic in the Google Groups "mechanical-sympathy" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/mechanical-sympathy/zJ86pHjvJQU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to mechanical-symp...@googlegroups.com.

Vitaly Davidovich

Aug 10, 2015, 9:54:22 AM
to mechanical-sympathy

Yes.  Here's a good blog post that touches on this aspect (amongst others): http://www.breakage.org/2013/11/15/nohz_fullgodmode/

sent from my phone
