Re: nice value of kResourceEfficient on ChromeOS

436 views
Skip to first unread message

François Doray

unread,
Nov 14, 2022, 10:34:24 AM11/14/22
to Chen, Zheda, Bruce Dawson, Gabriel Charette, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Youssef Esmat, scheduler-dev
+scheduler-dev for archival

Summary of the current state as I understand it:
  • On ChromeOS (other platforms not affected):
    • A thread can be moved to the chrome/non_urgent c-group to force it to execute on little cores. This is a great way to reduce power consumption.
    • A renderer process cannot change the c-group of its threads, due to sandbox restrictions. This needs to be done by the browser process.
      • zheda...@intel.com is working on a CL to register a delegate invoked from PlatformThread::SetCurrentThreadType. In renderer processes, the delegate will send a Mojo message to the browser process to request a c-group change. Namespaced tids [code] prevent a renderer process from requesting a thread type change for a thread that doesn't belong to it.
    • The browser process can request to move a renderer process to the chrome_renderers/background c-group. This changes the c-group of all threads in the process to chrome/non_urgent. This seems like an appropriate thing to do for a background renderer. However, when the renderer is foregrounded, we can't just move it back to chrome_renderers/{per renderer group}, because that would move threads that were previously in the chrome/non_urgent c-group (e.g. kResourceEfficient, kBackground) to the chrome/urgent c-group. 
Next steps, as I understand it:
  • youssefesmat@ is working on ChromeOS system changes to ensure that changing the c-group of a process to: foreground -> background -> foreground brings back all threads to their original c-group. 
    •  youssefesmat@: Is there a design doc and ETA for this work?
  • zheda...@intel.com is working on a CL to register a delegate invoked from PlatformThread::SetCurrentThreadType. In renderer processes, the delegate will send a Mojo message to the browser process to request a c-group change, to overcome sandbox restrictions.
  • zheda...@intel.com would also like to implement a workaround to restore correct thread c-groups from Chrome when the c-group of a process is moved from: foreground -> background -> foreground.
    • To determine if this work is desired, I would really like to get an ETA for youssefesmat@'s work (see above).
Relevant doc: 

zheda.chen@/youssefesmat@: Don't hesitate to correct me if I made any mistake.

On Mon, Nov 7, 2022 at 5:45 AM Chen, Zheda <zheda...@intel.com> wrote:

Could we use system capability to clear dead threads for the map ? Youssef (+ youssefesmat@) said that as for dead process, we might be able to use the filewatcher API in base to watch /proc/{pid}/tasks directory. If so we might also record pid into the map, like [ tid ---> <pid, thread_type>]?

 

Add some replies/comments to Bruce and Francois below.

 

From: Bruce Dawson <bruce...@google.com>
Sent: Saturday, November 5, 2022 8:27 AM
To: François Doray <fdo...@google.com>
Cc: Chen, Zheda <zheda...@intel.com>; Gabriel Charette <g...@google.com>; Chrome Scheduler <chrome-s...@google.com>; Markus Handell <hand...@google.com>; Etienne Pierre-doray <etie...@google.com>; fdoray-...@google.com
Subject: Re: nice value of kResourceEfficient on ChromeOS

 

Maintaining a list of threads owned by other processes or other code seems like a really bad idea. It is fundamentally racy (as soon as a TID is freed it can be reused), and a small delay makes this worse. Plus the implementation sounds very fragile.

 

[Zheda] When we use the [tid->thread_type] map, we traverse all live threads of the foregrounding renderer process via ForEachProcessTask to get their thread types [CL code ] on process launcher task runner of browser process. And renderer thread type is only set on same task runner of browser process [CL code]. So I think there’s no race here, and the browser process won’t read a dead thread.

If tid is reused by another renderer process, SetCurrentThreadType would be invoked after thread creation, and the map get refreshed.

 

 

The only way I can think of to do this which would be safe would be that threads that we own could register themselves at startup and deregister at shutdown. To avoid raciness they should not finish terminating until the browser process has definitely received the message and removed the entry, otherwise the browser process will have a reference to a dead thread which is a use-after-free of a different sort.

 

The raciness could be avoided (on Windows) by duplicating the thread handle and giving a copy to the browser process. However the whole thing still makes me very nervous and I fear a future bug farm. And, I don't know if other platforms could support these semantics.

 

If the goal is to support foregrounding/backgrounding processes then doesn't changing the process priority does that? It doesn't give us full control, but it feels like the right thing to do (work with the platform) and is simpler and less bug prone.

 

[Zheda] We had discussion on scheduler-dev groups [link] in August whether we can define a Chrome OS API, which introduces changes to OS that changing the c-group of a process from X->Y and then back from Y->X brings all threads back to their original c-group. The answer from Chrome OS team (youssefesmat@) is that defining such OS API is a long term solution. We need to fully understand the API customers and how it will map to system level constructs [link]. Since we want to ship background process scheduling feature soon, we decide to do option B user space level solution first. This is the CL [link] for user space solution.

 

 

Apologies if I'm misunderstanding the goals here.

 

On Fri, Nov 4, 2022 at 11:44 AM François Doray <fdo...@google.com> wrote:

Hi!

 

Yes, let's maintain a [thread_id->thread_type] map on the browser side.

 

I can think of two possible solution to detect a thread going away in a child process:

  • Detection code at the end of ThreadFunc() [code]
    • Pro: Arbitrary code execution allowed.
    • Con: Only works for threads created via //base APIs.
  • Detection code in the destructor of an object stored in ThreadLocalOwnedPointer for any thread that sets a thread type.
    • Pro: Works for any thread, even if not created via //base APIs.
    • Con: We need to be careful about code that runs late in the life of a thread.

The thread destruction detection code could:

  • Send a Mojo message to notify the browser process.
    • Pro: Browser process immediately notified of thread destruction.
    • Con: The overhead of a Mojo message per thread destruction may be too high.
  • Add the thread's tid to a process-global locked data structure. A Mojo message with the list of destructed threads is sent to the browser process only when it's big enough.
    • Pro: Minimal overhead if total number of threads created in a process remains below a threshold.
    • Cons:
      • Small delay to clean up the map in the browser process (but keeping a few integers in memory is not too concerning).
      •  We need to think about how this works if a tid is reused?

[Zheda] If a tid is reused by another process (a different pid), we might be able to identify it if we also record pid in the map? Is it possible to be used by same process?

 

What do you think?

 

François

 

On Thu, Nov 3, 2022 at 10:59 PM Chen, Zheda <zheda...@intel.com> wrote:

Okay, let’s try with alternative solution, probably to maintain a [thread_id->thread_type] map on browser side but need to fix the problem of map infinitely grow. Please let me know if any better proposal.

 

From: François Doray <fdo...@google.com>
Sent: Thursday, November 3, 2022 2:53 AM
To: Gabriel Charette <g...@google.com>
Cc: Chen, Zheda <zheda...@intel.com>; Bruce Dawson <bruce...@google.com>; Chrome Scheduler <chrome-s...@google.com>; Markus Handell <hand...@google.com>; Etienne Pierre-doray <etie...@google.com>; fdoray-...@google.com
Subject: Re: nice value of kResourceEfficient on ChromeOS

 

I understand that we want to keep supporting arbitrary mappings from "ThreadType" to "nice value" and "c-groups" (not add a constraint that there must be a 1:1 mapping between "nice value" and "c-group"). In that case, I'll work with Zheda Chen on a solution to restore c-groups when a process is foregrounded that doesn't depend on deriving c-groups from nice value.

 

On Mon, Oct 31, 2022 at 5:24 PM Gabriel Charette <g...@google.com> wrote:

Replied with more thoughts in a longer discussion labeled "Thread priority discuss: kUtility vs kResourceEfficient vs kDisplay".

 

With that in mind, I don't think we want to have a different nice-value on CrOS for kResourceEfficient than we do kDefault..?
Although, that is what Apple does for QOS_CLASS_UTILITY IIUC?

 

CC +Bruce Dawson +Chrome Scheduler (same as other discussion but focused here on how to recall that a thread was kResourceEfficient when re-foregrounding a renderer process; mapping the nice value was a proposal if we can have different nice values per thread types).

 

On Fri, Oct 28, 2022 at 7:43 AM Chen, Zheda <zheda...@intel.com> wrote:

This is the thread nice value to c-group mapping on CrOS below per [code]

 

kBackground (10), kResourceEfficient (0), kUtility (1, proposed in CL #3898731)

Non-urgent c-group

kDefault (0)

Normal c-group

kCompositing (-8), kDisplayCritical (-8), kRealtimeAudio (-10)

Urgent c-group

 

Per this [CL] design, when backgrouding a renderer process, the browser process moves all threads of the renderer to non-urgent c-group. And when foregrounding a process, the browser needs to re-apply original thread c-group previously set via RenderMessageFilter::SetThreadType. It makes process level hybrid scheduling on Chrome OS workable and we might consider more processes in the future [design doc].

 

To make [derive thread c-group from nice value in browser process] workable, except for lowering kResourceEfficient nice value to 1, probably there’s another short-term alternative which is to map kResourceEfficient [0] to normal c-group on CrOS ? Currently kResourceEfficient cannot be move to non-urgent c-group via SetCurrentThreadType on Linux/CrOS due to renderer sandboxed restriction. The result of this short term solution is consistent with current behavior.

 

Re Markus’s question “What’s the ETA for that work”: the [CL] to apply C-Group updates from the browser process is in review.

 

Regards

Zheda

 

From: Markus Handell <hand...@google.com>
Sent: Friday, October 28, 2022 4:31 PM
To: Gabriel Charette <g...@google.com>
Cc: François Doray <fdo...@google.com>; Chen, Zheda <zheda...@intel.com>; Etienne Pierre-doray <etie...@google.com>; fdoray-...@google.com
Subject: Re: nice value of kResourceEfficient on ChromeOS

 

If we don't mind running a thread slower to save resources (kResourceEfficient), I think this implies that its priority is slightly lower than a kDefault thread? And as such, changing the nice value from "0" to "1" is ok?

> It's also my opinion that kResourceEfficient could be lower priority, I forget why @Markus Handell  felt strongly otherwise for WebRTC? This is why we added kUtility FWIW.

 

Because belating scheduling means increasing end-to-end transmission delay by the same amount, and results in increased size of remote jitter buffers.

Priority is orthogonal to energy use, and matters only when there's more than 1 choice of a thread to run for a processor. Currently all WebRTC thread workloads need to be regarded as latency-sensitive.

WebRTC threads are not CPU intensive, but they do need to run timely.

 

Side note: Today, the only use of kResourceEfficient is by WebRTC, from renderers [code]. Since the nice value for kResourceEfficient is the same as for kDefault [code] and changing C-group from a renderer is prevented by the sandbox, the "WebRtcThreadsUseResourceEfficientType" feature is a no-op on ChromeOS. With @Chen, Zheda's work to apply C-Group updates from the browser process, this feature will no longer be a no-op on ChromeOS.

 

Hm interesting... I was not aware. What's the ETA for that work?

The experiment (50% beta since a while) includes very few data points for Intel 12th gen (5k on CrOS, ~650 (!) on Windows) and I was about to roll to stable to get more.

 

 

On Thu, Oct 27, 2022 at 10:54 PM Gabriel Charette <g...@google.com> wrote:

 

Le jeu. 27 oct. 2022, 15 h 05, François Doray <fdo...@google.com> a écrit :

If we don't mind running a thread slower to save resources (kResourceEfficient), I think this implies that its priority is slightly lower than a kDefault thread? And as such, changing the nice value from "0" to "1" is ok?

 

It's also my opinion that kResourceEfficient could be lower priority, I forget why @Markus Handell  felt strongly otherwise for WebRTC? This is why we added kUtility FWIW.

 

Perhaps it's because we're okay delaying scheduling but then once it's running we want a normal priority thread (i.e. equivalent of base::ThreadPolicy::USE_FOREGROUND_THREADS)?

 

 

Side note: Today, the only use of kResourceEfficient is by WebRTC, from renderers [code]. Since the nice value for kResourceEfficient is the same as for kDefault [code] and changing C-group from a renderer is prevented by the sandbox, the "WebRtcThreadsUseResourceEfficientType" feature is a no-op on ChromeOS. With @Chen, Zheda's work to apply C-Group updates from the browser process, this feature will no longer be a no-op on ChromeOS.

 

Nice catch! We do analyze this experiment beyond CrOS though so it's still relevant in today's chrome I think?

 

 

On Mon, Oct 24, 2022 at 1:20 PM Gabriel Charette <g...@google.com> wrote:

Sounds like a reasonable experiment to me but +Markus Handell was mentioning in our ongoing discussion @ https://chromium-review.googlesource.com/c/chromium/src/+/3898731/comment/f63d4e70_136c1e7c/

that Meet would expect kResourceEfficient threads to be running at normal priority (and nice value I assume). We haven't yet settled on this but this information contributes to the set of things to consider.

 

On Mon, Oct 24, 2022 at 10:55 AM François Doray <fdo...@google.com> wrote:

Hi Gab/Etienne,

 

Background:

  • On ChromeOS, the sandbox prevents setting a thread's C-Group from a renderer. On the other hand, setting a thread's nice value works fine.
  • Zheda (Intel) works on a "Thread Type Change Delegate" which will make a Mojo call from the renderer to the browser on Thread Type change, to let the browser change the C-Group.
  • When a process is backgrounded, the C-Group of all threads in the process is set to "non-urgent". When a process is foregrounded, the C-Group of each of its threads must be restored explicitly (otherwise, it stays "non-urgent").

Zheda proposed to maintain a [tid -> ThreadType] map in the browser process and use it to restore correct C-Groups. Problem: There is no simple way to observe all thread deaths, so the map may grow indefinitely.

 

I proposed to read a thread's nice value (not affected by process backgrounding) and derive the C-Group to restore from it. This is non-racy if all Thread C-Group and Nice value manipulations happen from the same sequence in the browser process. Problem: kResourceEfficient and kDefault have the same nice value, but different C-Groups.

 

Question:

  • What do you think of changing the nice value of kResourceEfficient from "0" to "1" to allow deriving C-Groups from Nice values (and eliminate the need to maintain per-thread state in the browser process)?

Thanks,

 

François


 

--

Bruce Dawson, he/him

 

Youssef Esmat

unread,
Nov 15, 2022, 5:54:16 PM11/15/22
to François Doray, Chen, Zheda, Bruce Dawson, Gabriel Charette, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, scheduler-dev
To determine if this work is desired, I would really like to get an ETA for youssefesmat@'s work (see above)

This work is in the process of being staffed, it's hard to get an ETA at the moment. I can try to get a little more clarity next week.

Do we think that the effort of the workaround is too large to implement temporarily?

François Doray

unread,
Nov 16, 2022, 2:02:46 PM11/16/22
to Youssef Esmat, Chen, Zheda, Bruce Dawson, Gabriel Charette, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, scheduler-dev
No, I support trying this if it will take more than 6 months to get the OS change. I'm still interested in an ETA for the OS change.
 

Youssef Esmat

unread,
Nov 21, 2022, 11:59:06 AM11/21/22
to François Doray, Chen, Zheda, Bruce Dawson, Gabriel Charette, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, scheduler-dev
Another approach we can use is to encode the thread type in the name of the thread. Might be hacky but compared to the complexity of the map might be worth it.

Zheda Chen

unread,
May 8, 2023, 7:32:00 AM5/8/23
to scheduler-dev, Youssef Esmat, Zheda Chen, Bruce Dawson, Gabriel Charette, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, scheduler-dev, François Doray
Hi Francois, Youssef, Gabriel, Etienne,

We discussed about the issue of background process to non-urgent cgroup on ChromeOS in November. I'd like to revisit this issue to see if we still need the work below. If yes, I'm okay to try to implement the feature this month.
TODO: implement a workaround to restore correct thread c-groups from Chrome when the c-group of a process is moved from: foreground -> background -> foreground.

To determine if the work is desired, it depends on the status of OS priority API development.
youssefesmat@, could u please share the current status of ChromeOS process & thread priority API ?


Thanks
Zheda

Gabriel Charette

unread,
May 8, 2023, 11:44:57 AM5/8/23
to Zheda Chen, scheduler-dev, Youssef Esmat, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, François Doray
Yes, that work is still desired. CrOS is the only platform which doesn't support this.

Youssef Esmat

unread,
May 11, 2023, 11:47:06 AM5/11/23
to Gabriel Charette, Zheda Chen, scheduler-dev, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, François Doray
Sorry for the late reply.
We actually started working on the resourced API. But the work is planned in stages so far:
1. Get the API setup with correct security restrictions and only implement the Process state change (renderer background<->foreground) using the cpu cgroup as is done today (ie moving a lot of the process_linux.cc to resourced).

2. We wanted to change the cgroup structure of UI and foreground renderers.

3. Start moving the thread level APIs to resourced.

In terms of timeline we are hoping to have number 1 and 2 within a few months. Maybe #3 towards the end of the year. But please take these as very rough estimates.

In summary, to have most of the infrastructure moved to resourced we are talking about the end of the year most likely. I am not sure if that aligns with your plans Zheda?
- We can also work together to move the rest of the thread level API to resourced if you would like. We are trying to have the basic API for the process level in review in a about month time (maybe less). So we can start to work on the threading API after that.
- Or you can implement the changes in chrome and we can move them later. (might have some wasted effort).


Yes, that work is still desired. CrOS is the only platform which doesn't support this.

From another thread discussion, it seems like Windows side might need similar changes when the renderer moves to the background. From the latest thread from Sandeep, it seems like changing the QoS of the process does not change the thread priorities on Windows. Can you set the max priority in Windows for the whole process?

Thanks,
Youssef

François Doray

unread,
May 12, 2023, 3:16:53 PM5/12/23
to Youssef Esmat, Gabriel Charette, Zheda Chen, scheduler-dev, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com
Hi everyone,

Zheda has a workaround solution to manipulate thread settings from user space when process settings change. Given that it will take ~1 year to have the "right" solution in the ChromeOS kernel (thanks Youssef for working on this!), I think that having this workaround solution is worthwhile. I offered to review the code.

Have a nice day,

François

Zheda Chen

unread,
May 15, 2023, 2:09:10 AM5/15/23
to scheduler-dev, François Doray, Gabriel Charette, Zheda Chen, scheduler-dev, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Youssef Esmat
Incorporating all comments from Gabriel, Youssef and Francois, it's desired to have the workaround solution in Chrome browser before CrOS API is ready. Will continue this work.

I'm also interested in the work of stage 3 ---- thread level API to resourced, but I guess it would be several months later? If I have bandwidth at that time, I'm glad to help and contribute.

With regard to priority on Windows, I find SetProcessBackgrounded will also set idle priority for background process on Windows, apart from EcoQoS setting. Looks like it's already done ? 

François Doray

unread,
May 15, 2023, 1:19:03 PM5/15/23
to Zheda Chen, scheduler-dev, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Youssef Esmat
On Mon, May 15, 2023 at 2:09 AM Zheda Chen <zheda...@intel.com> wrote:
Incorporating all comments from Gabriel, Youssef and Francois, it's desired to have the workaround solution in Chrome browser before CrOS API is ready. Will continue this work.

I'm also interested in the work of stage 3 ---- thread level API to resourced, but I guess it would be several months later? If I have bandwidth at that time, I'm glad to help and contribute.

With regard to priority on Windows, I find SetProcessBackgrounded will also set idle priority for background process on Windows, apart from EcoQoS setting. Looks like it's already done ? 
Yes, we lower priority and enable power throttling for "background processes".

Youssef Esmat

unread,
May 15, 2023, 5:07:00 PM5/15/23
to François Doray, Zheda Chen, scheduler-dev, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com
Oh I missed the call to SetPriorityClass. Thanks!

Gabriel Charette

unread,
May 17, 2023, 8:51:21 AM5/17/23
to Zheda Chen, scheduler-dev, Youssef Esmat, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, François Doray
LG. I prefer 2B) because we're evolving which threads map to which ThreadType and hardcoding thread names with 2A would leave CrOS behind, potentially mixing behavioral changes when experimenting with the final solution.

Le mer. 17 mai 2023, 08 h 05, Zheda Chen <zheda...@intel.com> a écrit :
The workaround solution in Chrome browser is proposed below. Please take a look and give any comment.

Target: implement a workaround to restore correct thread c-groups from Chrome when the c-group of a process is moved from: foreground->background->foreground

Currently, the thread type behaviours on Chrome OS are listed below
1) kRealtimeAudio, kDisplayCritical, kCompositing: nice values < 0, placed in the urgent c-group
2) kDefault: nice value = 0, placed in normal c-group
3) kResourceEfficient: nice value = 0, placed in non-urgent c-group
4) kUtility, kBackground: nice values > 0, placed in non-urgent c-group

Since it's a temporary solution before CrOS priority API is fully ready, probably the design don't need to be very elaborate. So the proposal to recall original thread c-groups
when re-foregrounding a renderer process might be:

Step 1) Derive the correct thread c-groups from nice value, if the thread nice value < 0 or > 0.
nice value < 0 -> urgent c-group
nice value > 0 -> non-urgent c-group

Step 2) Restore the thread c-group for kResourceEfficient type threads (nice value is 0)
Option 2.A: hack approach. Encode kResourceEfficient type in the name of the thread.
AFAIK currently only WebRTC threads (webrtc_network, webrtc_signaling, webrtc_worker) use this type. We can identify this type of threads through thread names.

Option 2.B: maintain a data structure in browser process to store kResourceEfficient type, and clear dead threads timely.
storage[pid][tid] = [kRE_thread_type]
When the renderer process exits (RenderProcessHostObserver::RenderProcessExited), remove all values of corresponding pid
When storage[pid][*] elements exceeds the threshold, watch for changes in /proc/{pid}/tasks to remove dead threads

Option A is much simpler. Since it's for temporary solution, probably option A might be worth it ?

Step 3) kNormal type threads (nice value is 0), no action since no need of extra thread c-group manipulation.

SetProcessBackgrounded(bool), thread type setting, thread c-group manipulation are all done in process launcher task runner of browser process. So we can guarantee the sequences.

Youssef Esmat

unread,
May 17, 2023, 12:21:20 PM5/17/23
to Gabriel Charette, Zheda Chen, scheduler-dev, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, François Doray
For 2B. We can open a pidfd (pidfd_open) to the thread and wait on the file handle to know when the thread exits. Might be easier than waiting on /proc/{pid}/tasks.

Youssef Esmat

unread,
May 17, 2023, 3:32:46 PM5/17/23
to Gabriel Charette, Zheda Chen, scheduler-dev, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, François Doray
Actually some devices are on older kernels and do not have that syscall yet.

François Doray

unread,
May 17, 2023, 3:53:34 PM5/17/23
to Youssef Esmat, Gabriel Charette, Zheda Chen, scheduler-dev, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com
I also prefer 2B) since it's the most robust.

Youssef Esmat

unread,
May 30, 2023, 2:31:42 PM5/30/23
to Zheda Chen, Joel Fernandes, scheduler-dev, François Doray, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com

On Thu, May 18, 2023 at 1:58 AM Zheda Chen <zheda...@intel.com> wrote:
Thanks for your comments ! Let's go with 2B and I will implement the solution per this proposal.

Joel Fernandes

unread,
May 30, 2023, 2:48:31 PM5/30/23
to Youssef Esmat, Zheda Chen, scheduler-dev, François Doray, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com
Hello,
Thank you for adding me. I was going through various documents and
looking for more of a background on why this issue happens on Linux.

So is the problem: backgrounding a process (via the *CPU* controller's
cgroup.procs) somehow also overrides the urgent/non-urgent *cpuset*
assignment? Is that happening implicitly in the kernel (which is then
a kernel bug IMO), or is that explicitly done by Chrome?

Thank you,

- Joel

Joel Fernandes

unread,
May 30, 2023, 7:17:56 PM5/30/23
to Youssef Esmat, Zheda Chen, scheduler-dev, François Doray, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai
On Tue, May 30, 2023 at 2:48 PM Joel Fernandes <joe...@google.com> wrote:
>
> Hello,
> Thank you for adding me. I was going through various documents and
> looking for more of a background on why this issue happens on Linux.
>
> So is the problem: backgrounding a process (via the *CPU* controller's
> cgroup.procs) somehow also overrides the urgent/non-urgent *cpuset*
> assignment? Is that happening implicitly in the kernel (which is then
> a kernel bug IMO), or is that explicitly done by Chrome?

Answering my own question, I believe Zheda is wanting to set all
threads to non-urgent cpuset, and on the reverse side he can't
determine whether it's urgent or not without knowing the ThreadType.

Actually we are about to have the same issue as we're looking into
forcing threads in "background" renderers to a different set of
priorities than the foreground ones. So we can piggyback on the same
mechanism when it is available.

Please count me in for any reviews and such. +Vineeth Pillai as well
(full thread below).

Thanks,

Zheda Chen

unread,
May 31, 2023, 1:27:46 AM5/31/23
to scheduler-dev, François Doray, Gabriel Charette, Zheda Chen, scheduler-dev, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Youssef Esmat
Thanks for your comments ! Let's go with 2B and I will implement the solution per this proposal.

Zheda Chen

unread,
May 31, 2023, 1:27:59 AM5/31/23
to scheduler-dev, Youssef Esmat, Zheda Chen, scheduler-dev, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, François Doray
The workaround solution in Chrome browser is proposed below. Please take a look and give any comment.

Target: to implement a workaround to restore correct thread c-groups from Chrome when the c-group of a process is moved from: foreground->background->foreground


Currently, the thread type behaviours on Chrome OS are listed below
1) kRealtimeAudio, kDisplayCritical, kCompositing: nice values < 0, placed in the urgent c-group
2) kDefault: nice value = 0, placed in normal c-group
3) kResourceEfficient: nice value = 0, placed in non-urgent c-group
4) kUtility, kBackground: nice values > 0, placed in non-urgent c-group

Since it's a temporary solution before CrOS priority API is fully ready, probably the design don't need to be very elaborate. So the proposal to recall original thread c-groups
when re-foregrounding a renderer process might be:

Step 1) Derive the correct thread c-groups from nice value, if the thread nice value < 0 or > 0.
nice value < 0 -> urgent c-group
nice value > 0 -> non-urgent c-group

Step 2) Restore the thread c-group for kResourceEfficient type threads (nice value = 0)
Option 2.A: Hack approach. Encode kResourceEfficient type in the name of the thread.
AFAIK currently only WebRTC threads (webrtc_network, webrtc_signaling, webrtc_worker) use this type. Probably we can identify this type of threads through thread names.

Option 2.B: Maintain a data structure in browser process to store kResourceEfficient type, and clear dead threads timely.

storage[pid][tid] = [kRE_thread_type]
When the renderer process exits (RenderProcessHostObserver::RenderProcessExited), remove all values of corresponding pid.
When storage[pid][*] elements exceed the size threshold, watch for changes in /proc/{pid}/tasks to remove dead threads.

Option A is much simpler. Since it's for temporary solution, option A might be worth it?

Step 3) kNormal type threads (nice value = 0), no action since no need of extra thread c-group manipulation.


SetProcessBackgrounded(bool), thread type setting, thread c-group manipulation are all done in process launcher task runner of browser process. So we can guarantee the sequences.

Zheda Chen

unread,
May 31, 2023, 1:28:15 AM5/31/23
to scheduler-dev, Joel Fernandes, Zheda Chen, scheduler-dev, François Doray, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
Yes currently the cpuset manipulation is done by Chrome, not the kernel. 
Okay I will add both of you when the patch is ready for code review.

Joel Fernandes

unread,
May 31, 2023, 8:43:51 AM5/31/23
to Zheda Chen, scheduler-dev, François Doray, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
I am not a big fan of the pid -> thread-type map idea because you
could have 100s of threads and maintaining the map sounds like a lot
of complexity (and bugs in the process). Plus, pid-reuse issues (which
yes pidfd can mitigate but that's not available on all kernels as
Youssef mentioned).

Instead we could encode the BG priority in the nice value itself and
then use getpriority() to determine the type. That makes sense to me
also because the lower the priority, the deeper the thread can be
considered to be in the background when the renderer is backgrounded,
so even from a priority standpoint it makes sense.

So in the background, for kDisplayCritical, set it to nice +10.
Kcompositing, set it to nice +11 and so on.

That is way simpler, solves the problem and does not have any of the
map drawbacks. It is also lower memory footprint since there's no map
and lesser userspace code.

Thoughts?

Joel Fernandes

unread,
May 31, 2023, 2:57:41 PM5/31/23
to Zheda Chen, scheduler-dev, François Doray, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
On Wed, May 31, 2023 at 9:58 AM Zheda Chen <zheda...@intel.com> wrote:
[..]

> As for the idea to encode BG priority in the nice value, do you mean to lower the priority (increase nice value) for all threads of background, and different nice values for different types ? The > current kThreadTypeToNiceValueMap is listed here.

Just for some background on our issue (slightly different from yours).
We are in the process of moving threads >= kCompositing etc to
SCHED_FIFO due to the improvement scheduling and latency behavior seen
(several kernel patches in the works).

Unfortunately when things move to SCHED_FIFO, we can no longer rely on
/cgroup.procs for limiting the CPU of renders in the background. Why?
Because /cgroup.procs does not care about RT. So we have to limit the
CPU in a different way (perhaps by changing the nice value of the
threads of the backgrounded renderer).

If we change the nice values, we run into the same issue you are
running into which is -- on re-foregrounding, we have no idea what the
thread type is so we don't know what to restore the priority to in the
foreground renderer.

So my proposal is pretty simple, on backgrounding a renderer, change
all the thread's nice value to kBackground nice value, but encode the
index of the thread type into the nice value.

enum class ThreadType : int {
kBackground, - nice 15
kUtility, - nice 14
kResourceEfficient, - nice 13
kDefault, - nice 12
kCompositing, - nice 11
kDisplayCritical, - nice 10
kRealtimeAudio,
kMaxValue = kRealtimeAudio,
};

You don't need to change the thread type on backgrounding, just
re-encode the nice value (which we anyway sort of do because of
limiting the renderer by using /cgroup.procs).

Then on foregrounding, use getpriority() to get the ThreadType.

This solves both our problems, and does not need any new map, pidfd or
anything else to lifecycle manage the pids.

Encoding information into values is a pretty common technique in the
Linux kernel for a number of usecases. And in this case, it actually
makes sense from a priority standpoint because it emulates the
/cgroup.procs behavior as well.

Thoughts?

Thanks,

- Joel


Actually we only need to differentiate between kResourceEfficient and
kNormal type, i would advise to increase kResourceEfficient type nice
value (such as +2) for background process if to consider this idea
direction. And when foregrounding the process we restore
kResourceEfficient nice value to 0.
What do you think?

Joel Fernandes

unread,
Jun 1, 2023, 1:11:31 PM6/1/23
to Zheda Chen, scheduler-dev, François Doray, Gabriel Charette, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
On Wed, May 31, 2023 at 11:50 PM Zheda Chen <zheda...@intel.com> wrote:
>
> Thanks for sharing the context. If to encode the index of the thread type into the nice value (from 10 to 15), it would have different CPU time ratio compared with default CPU time ratio in foreground, not sure if we care about this relative ratio in background.

FWIW -- For ChromeOS, if I understand correctly we already put the
backgrounded renderer in a background CGroup with tiny amounts of
shares. On my Brya it is set to 10 shares. So I don't think it matters
much what the nice value is in the background is, AFAICS. It should
have already gotten a lot of testing in the wild.

Thanks.

Gabriel Charette

unread,
Jun 1, 2023, 3:09:23 PM6/1/23
to Joel Fernandes, Zheda Chen, scheduler-dev, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
Using nice-value in bg threads to encode thread type is fine (I don't think we care about the ratio spread in bg processes).

However, I'm not clear how we map thread-type => bg nice-value as we're not currently storing the thread type on foreground threads either (some thread types map to the same nice values). I guess we could merely map fg-nice-values to bg-nice-values? i.e. happens under the hood, ThreadType never changes? I think that makes sense to me.

Youssef Esmat

unread,
Jun 1, 2023, 3:39:06 PM6/1/23
to Gabriel Charette, Joel Fernandes, Zheda Chen, scheduler-dev, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai
That would be fine for our case but would not work for finding the correct cpuset.

Another option would be to find the correct bg nice value by using the fg nice value and cpuset. Based on current settings that would uniquely map to a thread type.

Joel Fernandes

unread,
Jun 1, 2023, 4:10:57 PM6/1/23
to Gabriel Charette, Zheda Chen, scheduler-dev, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
Hi Gab,
On Thu, Jun 1, 2023 at 3:09 PM Gabriel Charette <g...@google.com> wrote:
>
> Using nice-value in bg threads to encode thread type is fine (I don't think we care about the ratio spread in bg processes).
>
> However, I'm not clear how we map thread-type => bg nice-value as we're not currently storing the thread type on foreground threads either (some thread types map to the same nice values). I guess we could merely map fg-nice-values to bg-nice-values? i.e. happens under the hood, ThreadType never changes? I think that makes sense to me.
>

Hmm, but don't we have a map of the thread type to nice values
already? We could just scan that during encoding. Just to show the
idea, below is the patch. The encoding logic is probably not very
clean and could/should be re-written (we could perhaps simplify the
encoding by just rearranging the type->nice table).

Patch link: https://paste.googleplex.com/4592843438948352?raw (Only
build tested, and wrote it in about an hour or so. There could be bugs
in there, but I could work on it more if this is something that seems
a reasonable direction).

Here are some pros and cons:
Pros:
- no pidfd needed
- no map no matter how many threads
- no need tracking process/renderer exits to clean up map.
- relatively simple approach.

Cons:
- In some point of the future if we have too many thread types, we
can't encode all of them in nice values. That does seem far away for
now.
- The type -> nice value mapping has to be unique.
- Assumption that Real-time audio is the last entry in the type->nice map.
- have to read procfs which can be slow (for the thread iterator)

Thanks,

- Joel

Joel Fernandes

unread,
Jun 1, 2023, 4:15:53 PM6/1/23
to Youssef Esmat, Gabriel Charette, Zheda Chen, scheduler-dev, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai
Hi Youssef,

On Thu, Jun 1, 2023 at 3:39 PM Youssef Esmat <yousse...@google.com> wrote:
>
> That would be fine for our case but would not work for finding the correct cpuset.

Once we know the type, we can determine the cpuset? So when
backgrounding, force it to the non-urgent cpuset and on foregrounding,
call SetThreadType() the usual way.

> Another option would be to find the correct bg nice value by using the fg nice value and cpuset. Based on current settings that would uniquely map to a thread type.

If the fg nice values are unique, it becomes a direct nice -> thread
type mapping which can be encoded.

Thanks,

- Joel

Youssef Esmat

unread,
Jun 1, 2023, 5:32:09 PM6/1/23
to Joel Fernandes, Gabriel Charette, Zheda Chen, scheduler-dev, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai
Today foreground nice values do not map to unique thread type. ResourceEfficient and Normal share a nice value but differ in CPU set.
So to uniquely identify thread type we would have to use foreground nice value and cpuset. That then could map to a unique background nice value.

Joel Fernandes

unread,
Jun 2, 2023, 10:11:51 AM6/2/23
to Youssef Esmat, Gabriel Charette, Zheda Chen, scheduler-dev, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai
On Thu, Jun 1, 2023 at 5:32 PM Youssef Esmat <yousse...@google.com> wrote:
>
> Today foreground nice values do not map to unique thread type. ResourceEfficient and Normal share a nice value but differ in CPU set.
> So to uniquely identify thread type we would have to use foreground nice value and cpuset. That then could map to a unique background nice value.

Exactly, that's why I separated the nice values in my patch in the
last email. If you see in my patch, I did:

const ThreadTypeToNiceValuePair kThreadTypeToNiceValueMap[7] = {
- {ThreadType::kBackground, 10}, {ThreadType::kUtility, 1},
- {ThreadType::kResourceEfficient, 0}, {ThreadType::kDefault, 0},
+ {ThreadType::kBackground, 10}, {ThreadType::kUtility, 2},
+ {ThreadType::kResourceEfficient, 1}, {ThreadType::kDefault, 0},
#if BUILDFLAG(IS_CHROMEOS)
{ThreadType::kCompositing, -8},

If something is "resource efficient", it could/should perhaps use less
CPU than the default? Not sure. From what I see, the
features::WebRtcThreadsUseResourceEfficientType uses it which is
default-disabled.

But yes, I prefer making nice unique (and thus keeping it simple) over
making sure "nice + cpuset" is unique as there's no guarantee that
that combination will always be unique anyway. Admittedly my method is
also not without its own set of cons (and pros) as I mentioned in my
other email.

I realize Zheda have access to the gpaste, so I instead put it here
now: https://hastebin.com/share/eperajejor.diff

Thanks,

- Joel

Gabriel Charette

unread,
Jun 2, 2023, 5:17:56 PM6/2/23
to Joel Fernandes, Youssef Esmat, Zheda Chen, scheduler-dev, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai
How much does a +1/-1 nice-value affect things? Could it cause priority inversions or is it really just a small % tweak on CPU disponibility under contention?

Joel Fernandes

unread,
Jun 2, 2023, 6:19:11 PM6/2/23
to Gabriel Charette, Youssef Esmat, Zheda Chen, scheduler-dev, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai
Hi Gabriel,
For nice +1/-1, the difference is about 10% CPU, so where 2 tasks with
nice = 0 will both get 50% each on the same CPU - the same tasks with
nice = 0 and nice = +1 will get 60% and 40% CPU each. It is also
relative, so like 2 nice = +1 tasks will still get 50% each.

I don't think that should be too concerning to cause a priority
inversion issue. I would be slightly more concerned about priority
inversions if the delta in CPU time was too high. I have mostly heard
of priority inversion issues when something is below CFS (like SCHED_IDLE)
and owns a lock, or something is in the background CGroup with tiny
shares and takes a lock. There is ongoing work to fix these with a
general-purpose mechanism for Linux priority inversions in the community
[1] known as proxy execution.

Looking forward to your inputs,

thanks,

- Joel

[1] This is slightly older article and Juri's work is being revived by
John Stultz from Android kernel team: https://lwn.net/Articles/793502/

Youssef Esmat

unread,
Jun 2, 2023, 6:24:33 PM6/2/23
to Gabriel Charette, Joel Fernandes, Zheda Chen, scheduler-dev, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai
If you only have two threads one at 0 and the other at 1 then the one at 0 should get 10% more cpu.

Markus Handell

unread,
Jun 5, 2023, 6:11:29 AM6/5/23
to Zheda Chen, scheduler-dev, Youssef Esmat, Joel Fernandes, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Gabriel Charette
Generally, nice value mappings can work but I have some problems with the mappings and cpuset names.

Re kResourceEfficient and WebRTC. We did start running the experiment (see go/intel-kresource-efficient-rollout-plan) but there were very few capable endpoints at the time and no endpoints available in our performance lab, and the experiment expired so it's open for change.

Given the current proposed values kResourceEfficient for WebRTC threads make sense if it leads to processing on low-frequency or E-cores (because they don't need high performance, typically very short activations), but it does not make sense that they are placed in a "non-urgent" c-group or that priority is lower than default. The ideal way to describe these threads is that they need responsiveness in presence of other activity, but not high performance.

Zheda,
1. Can you clarify the meaning of the "non-urgent"/"normal" and "urgent" c-groups? The names convey priority to me, not power efficiency profile.
2. Maybe let's redo WebRtcThreadsUseResourceEfficientType with a new ThreadType where we can map the desired behavior later, and define?


On Mon, Jun 5, 2023 at 8:53 AM Zheda Chen <zheda...@intel.com> wrote:
I read Joel's patch and make a table below. In October 2022, there were some discussions (Gabriel, Francois, Bruce, Markus, and me) about whether should change nice value of kResourceEfficient type. At that time we wanted to support arbitrary mappings from "ThreadType" to "nice value" and "c-group". Probably also ask Bruce / Markus to take a look at this idea ?

thread-type-nice-values-FG-BG.JPG

Besides, a few comments about the code:
1. There's one small problem that kCompositing and kDisplayCritical have the same nice value (and cpuset) on ChromeOS, so GetThreadType (nice value -> thread type) never returns kDisplayCritical type (type = 5). But it doesn't affect nice value transition between foreground and background.
2. In the patch, background -> foreground transition code doesn't cover kRealtimeAudio yet.

Markus Handell

unread,
Jun 7, 2023, 4:52:44 AM6/7/23
to Zheda Chen, scheduler-dev, Youssef Esmat, Joel Fernandes, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Gabriel Charette
> Both behaviors are desired?

Yes but per the kResourceEfficient description: "normal priority but not latency sensitive" - this is not true for WebRTC network/worker threads (but acceptable for the WebRTC signaling thread). That's why I think we should add a "kRtc" ThreadType to which we can assign them. In the frame of the proposal with unique nice values, they could be at for example FG nice -4 but live in "non-urgent". Wdyt?

On Mon, Jun 5, 2023 at 2:20 PM Zheda Chen <zheda...@intel.com> wrote:
Re Markus's question.
1. Can you clarify the meaning of the "non-urgent"/"normal" and "urgent" c-groups? The names convey priority to me, not power efficiency profile.
Actually the names are a bit confusing here, probably need re-naming. The cpuset means to bind process / thread in a c-group to a specified set of CPUs and NUMA nodes. On ChromeOS, only CPU binding is used. Chrome uses cpuset to control the core placement of urgent / normal / non-urgent threads. On hybrid core systems, non-urgent threads are placed on E-cores, while normal / urgent threads able to run on all logical cores.

2. Maybe let's redo WebRtcThreadsUseResourceEfficientType with a new ThreadType where we can map the desired behavior later, and define?
Per current logic, if WebRTC threads use kResourceEfficient type, it has same ratio of CPU time as normal priority threads, and it runs only on E-cores to save power. Both behaviors are desired?


If we cannot use unique nice value per thread type, probably have to use nice value + cpuset to identify a type during foreground <---> background transition for the patch https://hastebin.com/share/eperajejor.diff.

Youssef Esmat

unread,
Jun 7, 2023, 10:22:49 AM6/7/23
to Zheda Chen, scheduler-dev, Markus Handell, Joel Fernandes, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Gabriel Charette
What would these threads do?
How many threads per meeting typically? 
These threads are latency sensitive but do not take a lot of runtime?

In large meetings on lower end (non-big/little) machines we see significant keypress latencies. Those latencies were significantly reduced by using kCompositing (nice -8) for the renderer main. I worry that if we go with nice -4 those latencies would come back.


On Wed, Jun 7, 2023 at 6:21 AM Zheda Chen <zheda...@intel.com> wrote:
>  That's why I think we should add a "kRtc" ThreadType to which we can assign them. In the frame of the proposal with unique nice values, they could be at for example FG nice -4 but live in "non-urgent". Wdyt?

Make FG nice slightly less than 0 + live in "non-urgent" for kRTC looks good to our web media team. If we add more thread types in the future, we might also need to make their nice unique.
Looking forward to inputs from other experts.

Markus Handell

unread,
Jun 7, 2023, 10:43:48 AM6/7/23
to Youssef Esmat, Zheda Chen, scheduler-dev, Joel Fernandes, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Gabriel Charette
> What would these threads do?

These threads process incoming and outgoing network packets, do transport crypt, and assembles/packetizes encoded frames and orders them for encoding/decoding, which happens on other threads (ThreadPool/custom for SW codecs, offloaded to various GPU threads for HW).

> How many threads per meeting typically? 
It's max 2 per renderer (I actually launched a merge to a single thread now) and shared between all peerconnections in that renderer.

> These threads are latency sensitive but do not take a lot of runtime?

Correct, here's it's nice to be able to lean on data - see microsecond duration distributions at https://uma.googleplex.com/p/chrome/timeline_v2?sid=4f6e67ce45eb6f10499a92372425398d.

> In large meetings on lower end (non-big/little) machines we see significant keypress latencies. Those latencies were significantly reduced by using kCompositing (nice -8) for the renderer main. I worry that if we go with nice -4 those latencies would come back.

Given the very short average execution times and the fact that we control all aspects of what's running there I don't think that will be a problem.
I'm actually more worried of hearing about renderer main as -8 actually.. since we have no control over what JS land might be doing? 

Joel Fernandes

unread,
Jun 9, 2023, 7:12:59 PM6/9/23
to Zheda Chen, scheduler-dev, Youssef Esmat, François Doray, Bruce Dawson, Chrome Scheduler, Markus Handell, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Gabriel Charette
Hello!
I just returned from a holiday and back to work today. Sorry for the delays.

On Mon, Jun 5, 2023 at 2:53 AM Zheda Chen <zheda...@intel.com> wrote:
>
> I read Joel's patch and make a table below. In October 2022, there were some discussions (Gabriel, Francois, Bruce, Markus, and me) about whether should change nice value of kResourceEfficient type. At that time we wanted to support arbitrary mappings from "ThreadType" to "nice value" and "c-group". Probably also ask Bruce / Markus to take a look at this idea ?
>
>
>
> Besides, a few comments about the code:
> 1. There's one small problem that kCompositing and kDisplayCritical have the same nice value (and cpuset) on ChromeOS, so GetThreadType (nice value -> thread type) never returns kDisplayCritical type (type = 5). But it doesn't affect nice value transition between foreground and background.

Thanks for pointing this out. I was wondering if we can solve that by
just eliminating one of the types. I am not a Chrome expert but is
that an option? If both thread types have the same scheduler settings,
then in theory as far as scheduling goes the types are
interchangeable. If they can be merged, that will be best. But are
the 2 types used elsewhere in, say, the display code for other
purposes?

>
> 2. In the patch, background -> foreground transition code doesn't cover kRealtimeAudio yet.

Right, that is intentional. We can cover kRealtimeAudio too with this
but my understanding is we want backgrounded renderers with audio
threads to maintain their foreground settings. Correct me if I'm wrong
though.

Thanks,

- Joel

Joel Fernandes

unread,
Jun 9, 2023, 7:16:38 PM6/9/23
to Youssef Esmat, Zheda Chen, scheduler-dev, Markus Handell, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Gabriel Charette
On Wed, Jun 7, 2023 at 10:22 AM Youssef Esmat <yousse...@google.com> wrote:

In large meetings on lower end (non-big/little) machines we see significant keypress latencies. Those latencies were significantly reduced by using kCompositing (nice -8) for the renderer main. I worry that if we go with nice -4 those latencies would come back.

Can the 2 types (display critical / compositing) be merged? The main thread often has a critical role to play for display purposes when the compositor cannot handle the drawing on its own.

Thanks.

Joel Fernandes

unread,
Jun 12, 2023, 11:11:22 AM6/12/23
to Zheda Chen, scheduler-dev, Markus Handell, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Gabriel Charette, Youssef Esmat
Hello Zheda,

On Mon, Jun 12, 2023 at 8:36 AM Zheda Chen <zheda...@intel.com> wrote:
[..]
> > Can the 2 types (display critical / compositing) be merged? The main thread often has a critical role to play for display purposes when the compositor cannot handle the drawing on its own.
>
> Per ThreadType API doc (link below), currently kCompositing and kDisplayCritical have different nice values and c-group on Linux, experiments are needed to bring IS_LINUX inline with IS_CHROMEOS, I guess the experiments are not started yet?
> Meanwhile, they also have different QoS behaviors on MacOS. I think it would take some time for experiments.
> @Gabriel has more context about 2 types.

Thank you for pointing this out. I see that unlike ChromeOS, the
priorities for those 2 categories are different on Mac/Linux. That
hints to me that perhaps it is OK to make kDisplayCritical a higher
priority than kCompositing for ChromeOS as well and use that as an
opportunity to make the nice values unique.

>
> ThreadType API doc: https://docs.google.com/document/d/15O6EkVtNQz1PYl3LL_uM4_5muylEKk_vPvHUbetf8q0/edit#heading=h.tl8zyg9vbi3z
>
> BTW, looks like the PoC patch link is expired? I cannot access the patch now.
> https://hastebin.com/share/eperajejor.diff

Here is the updated CL (mainly added more error handling and handled
the issue you mentioned):
https://chromium-review.googlesource.com/c/chromium/src/+/4605124

I would like to point out that at high nice value delta, the CFS
scheduler tends to have less of a behavior difference. So in that
sense, I do not see an issue with keeping kDisplayCritical at -9 and
kCompositing at -8. That's inline with the difference in priorities
with the other OS'es you pointed out.

Would you be willing to give this CL a try on your side?

Thank you,

- Joel

Gabriel Charette

unread,
Jun 12, 2023, 3:02:39 PM6/12/23
to Joel Fernandes, Zheda Chen, scheduler-dev, Markus Handell, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
Ultimately kDisplayCricital and kCompositing are the same, the difference is mostly historical, kDisplayCritical was used to flag threads that were unconditionally DISPLAY priority whereas compositing ones had platform-specific priorities.

I could see kDisplayCritical being slightly higher in that it's usually used for communication threads but I also then worry about communication overwhelming critical path...
Mac differs here for aforementioned historical reasons, we intend to experiment with bringing kCompositing inline with kDisplayCritical there, not the other way around.

All that to say, I worry about a proliferation of slightly different priority levels and hope we can align platforms and coalesce thread types over time rather than do the opposite.

Joel Fernandes

unread,
Jun 12, 2023, 3:30:21 PM6/12/23
to Gabriel Charette, Zheda Chen, scheduler-dev, Markus Handell, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
On Mon, Jun 12, 2023 at 3:02 PM Gabriel Charette <g...@google.com> wrote:
>
> Ultimately kDisplayCricital and kCompositing are the same, the difference is mostly historical, kDisplayCritical was used to flag threads that were unconditionally DISPLAY priority whereas compositing ones had platform-specific priorities.
>
> I could see kDisplayCritical being slightly higher in that it's usually used for communication threads but I also then worry about communication overwhelming critical path...
> Mac differs here for aforementioned historical reasons, we intend to experiment with bringing kCompositing inline with kDisplayCritical there, not the other way around.
>
> All that to say, I worry about a proliferation of slightly different priority levels and hope we can align platforms and coalesce thread types over time rather than do the opposite.
>

Thanks for the history on that. I'd think it will also simplify the
Chrome code a little bit more.

One thing annoying to me is all the #ifdef in platform_thread_linux.cc
and I was wondering if we can clean that up a bit using C++
abstraction features and unify ChromeOS / Linux a bit more. I am happy
to give that a shot if it is doable. Another way to get rid of
ifdeffery could be to just define new functions for the different
purposes rather than commonalizing the functions with too much ifdef
in them.

Gabriel Charette

unread,
Jun 12, 2023, 3:46:59 PM6/12/23
to Joel Fernandes, Zheda Chen, scheduler-dev, Markus Handell, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
+1 to aligning CrOS and Linux.

Etienne Pierre-doray

unread,
Jun 12, 2023, 4:16:19 PM6/12/23
to Gabriel Charette, Joel Fernandes, Zheda Chen, scheduler-dev, Markus Handell, François Doray, Bruce Dawson, Chrome Scheduler, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
Ultimately kDisplayCricital and kCompositing are the same, the difference is mostly historical,
Agreed, I'd plan on merging them; and perhaps eventually make a different split that might make more sense. 
 
Meanwhile, they also have different QoS behaviors on MacOS. I think it would take some time for experiments.
I started UserInteractiveCompositingMac for this.
I don't know if it's worth doing an experiment on Linux; population is small and presumably the fact that it was a good choice on ChromeOs means it should be on Linux as well.

Gabriel Charette

unread,
Jun 12, 2023, 5:14:03 PM6/12/23
to Etienne Pierre-doray, Joel Fernandes, Zheda Chen, scheduler-dev, Markus Handell, François Doray, Bruce Dawson, Chrome Scheduler, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
On Mon, Jun 12, 2023 at 4:16 PM Etienne Pierre-doray <etie...@google.com> wrote:
Ultimately kDisplayCricital and kCompositing are the same, the difference is mostly historical,
Agreed, I'd plan on merging them; and perhaps eventually make a different split that might make more sense. 

+1 on an eventual different split, i.e. presumably there's a unique ThreadType for all IO threads (now that they only process mojo and not run tasks, they're getting closer to "real time" dispatch-only threads)
 
 
Meanwhile, they also have different QoS behaviors on MacOS. I think it would take some time for experiments.
I started UserInteractiveCompositingMac for this.

Great :)
 
I don't know if it's worth doing an experiment on Linux; population is small and presumably the fact that it was a good choice on ChromeOs means it should be on Linux as well.

Good question, having Linux reflect CrOS will actually improve our experimentation power (assuming both behave similarly enough). The one risk with merging without experimentation is that we don't want Chrome for Googlers to vary greatly under contention from other platforms. Might be worth a small (dogfood?) experiment?

Markus Handell

unread,
Jun 13, 2023, 11:27:24 AM6/13/23
to Zheda Chen, scheduler-dev, Gabriel Charette, Joel Fernandes, François Doray, Bruce Dawson, Chrome Scheduler, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat, Etienne Pierre-doray
Gabriel,

> All that to say, I worry about a proliferation of slightly different priority levels and hope we can align platforms and coalesce thread types over time rather than do the opposite.

Does that mean you're opposed to a kRtc thread type? Something that fits the case is missing now.

On Tue, Jun 13, 2023 at 3:41 PM Zheda Chen <zheda...@intel.com> wrote:
Thanks Gabriel and Etienne for sharing the context and status of two thread types. 
When thread types evolve (either merge or meaningful split over time), we will also need to update the logic of foreground <----> background nice change.

> Here is the updated CL (mainly added more error handling and handled
> the issue you mentioned):
https://chromium-review.googlesource.com/c/chromium/src/+/4605124
> Would you be willing to give this CL a try on your side?

Yes I'm glad to have a try, and will comment on the CL soon.

Joel Fernandes

unread,
Jun 13, 2023, 10:57:41 PM6/13/23
to Gabriel Charette, Zheda Chen, scheduler-dev, Markus Handell, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
I did some cleanups for platform_thread*. For Linux it removes the
ifdefs from platform_thread_linux.cc as well:
https://chromium-review.googlesource.com/q/topic:%22linux-threading-cleanups%22

This topic contains the cleanups + the background priorities patch.

I am still working on testing it for Windows, Mac, AIX etc. Any
initial reviews, comments, testing are welcomed.

Thanks,

- Joel

Joel Fernandes

unread,
Jun 15, 2023, 1:10:16 AM6/15/23
to Gabriel Charette, Zheda Chen, scheduler-dev, Markus Handell, François Doray, Bruce Dawson, Chrome Scheduler, Etienne Pierre-doray, fdoray-...@google.com, Vineeth Pillai, Youssef Esmat
On Tue, Jun 13, 2023 at 10:57 PM Joel Fernandes <joe...@google.com> wrote:
>
> On Mon, Jun 12, 2023 at 3:30 PM Joel Fernandes <joe...@google.com> wrote:
> >
> > On Mon, Jun 12, 2023 at 3:02 PM Gabriel Charette <g...@google.com> wrote:
> > >
> > > Ultimately kDisplayCricital and kCompositing are the same, the difference is mostly historical, kDisplayCritical was used to flag threads that were unconditionally DISPLAY priority whereas compositing ones had platform-specific priorities.
> > >
> > > I could see kDisplayCritical being slightly higher in that it's usually used for communication threads but I also then worry about communication overwhelming critical path...
> > > Mac differs here for aforementioned historical reasons, we intend to experiment with bringing kCompositing inline with kDisplayCritical there, not the other way around.
> > >
> > > All that to say, I worry about a proliferation of slightly different priority levels and hope we can align platforms and coalesce thread types over time rather than do the opposite.
> > >
> >
> > Thanks for the history on that. I'd think it will also simplify the
> > Chrome code a little bit more.
> >
> > One thing annoying to me is all the #ifdef in platform_thread_linux.cc
> > and I was wondering if we can clean that up a bit using C++
> > abstraction features and unify ChromeOS / Linux a bit more. I am happy
> > to give that a shot if it is doable. Another way to get rid of
> > ifdeffery could be to just define new functions for the different
> > purposes rather than commonalizing the functions with too much ifdef
> > in them.
>
> I did some cleanups for platform_thread*. For Linux it removes the
> ifdefs from platform_thread_linux.cc as well:
> https://chromium-review.googlesource.com/q/topic:%22linux-threading-cleanups%22

I got the build with the cleanups working on Linux, Mac, Windows,
ChromeOS. No idea how to build AIX (let me know if there are any
instructions but I couldn't find any). I will do iOS and Android as
well.

- Joel
Reply all
Reply to author
Forward
0 new messages