Re: [blink-dev] Re: Intent to Implement: Cooperative Scheduling


Alexander Timin

Feb 2, 2018, 1:50:33 PM
to Kentaro Hara, Taiju Tsuiki, Boris Zbarsky, blink-dev, scheduler-dev, Gabriel Charette, Alex Clarke, Sami Kyostila
Cooperative scheduling sounds like a great way to localize performance on mobile!

However, I'm very scared of nested message loops (we've already had a great deal of problems with them in the scheduler) and I'd like to avoid increasing their usage in Chromium by an order of magnitude.

What we want to do here is to implement user-space execution context switching. I wonder if, instead of using message loops, we could implement it with a coroutine-like approach: save the stack state using setjmp/longjmp and yield control back to the scheduler?

The benefits include reduced number of problems with reentrancy and explicit control of the execution flow from the scheduler (e.g. better control when we can resume execution of an interrupted task).
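
For illustration only, here is a minimal sketch of that control flow. It assumes POSIX <ucontext.h> (getcontext, makecontext, swapcontext) in place of raw setjmp/longjmp, because resuming a suspended function needs a stack of its own. The names YieldToScheduler, CrossOriginTask and RunDemo are hypothetical and are not part of Chromium's scheduler.

```cpp
#include <ucontext.h>

#include <cassert>
#include <string>
#include <vector>

namespace {

ucontext_t g_scheduler_context;  // Saved state of the scheduler side.
ucontext_t g_task_context;       // Saved state of the interruptible task.
std::vector<std::string> g_log;  // Records the order in which code ran.
bool g_task_finished = false;

// Hypothetical yield point: saves the task's registers and stack pointer,
// then resumes the scheduler where it last switched away.
void YieldToScheduler() {
  swapcontext(&g_task_context, &g_scheduler_context);
}

// Stands in for a long-running script of a cross-origin frame.
void CrossOriginTask() {
  g_log.push_back("task: first half");
  YieldToScheduler();
  g_log.push_back("task: second half");
  g_task_finished = true;
}

}  // namespace

// Runs the task, lets the scheduler do other work at the yield point,
// then resumes the task until it finishes. Returns the execution order.
std::vector<std::string> RunDemo() {
  alignas(16) static char task_stack[64 * 1024];  // The task's own stack.
  g_log.clear();
  g_task_finished = false;

  int rc = getcontext(&g_task_context);
  assert(rc == 0);
  (void)rc;
  g_task_context.uc_stack.ss_sp = task_stack;
  g_task_context.uc_stack.ss_size = sizeof(task_stack);
  // When the task function returns, continue in the scheduler context.
  g_task_context.uc_link = &g_scheduler_context;
  makecontext(&g_task_context, CrossOriginTask, 0);

  // Run the task until its first yield.
  swapcontext(&g_scheduler_context, &g_task_context);
  g_log.push_back("scheduler: other work");
  // The scheduler decides when the interrupted task resumes.
  while (!g_task_finished) {
    swapcontext(&g_scheduler_context, &g_task_context);
  }
  return g_log;
}
```

Calling RunDemo() records the first half of the task, then the scheduler's work, then the second half. The scheduler chooses the resume point explicitly, and the task's frames live on a separate stack.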
The other issue that this approach addresses is security -- this way we won't end up with frames from both the main frame and a cross-origin iframe on the same stack.
+alexclarke@, skyostil@, gab@ and scheduler-dev@.

P.S. I'd say that this has low-to-medium compat risk -- at the moment web developers do not expect that their scripts can be interrupted midway, and this can break some metrics and analytics. It's likely that we'll have to expose some information about preemption to web devs.


On 2 February 2018 at 16:38, Kentaro Hara <har...@chromium.org> wrote:
Thanks Boris!

And once site isolation happens, the potential wins here need to be reevaluated, of course.

Yeah, our plan is to have both OOPIF and cooperative scheduling. For example, it would be hard to enable OOPIF on low-memory mobile devices. Even on desktops it's not uncommon for one renderer process to be shared by multiple tabs. My assumption is that cooperative scheduling would be useful in those scenarios. In other words, cooperative scheduling is expected to solve jank issues that OOPIF cannot solve :)


On Sat, Feb 3, 2018 at 1:20 AM, Boris Zbarsky <bzba...@mit.edu> wrote:
On 2/2/18 10:15 AM, Kentaro Hara wrote:
I'm just curious but does this mean that you didn't get a clear performance win (if you don't mind sharing it)?

I haven't been following this closely, but my impression is that at least for the moment the ratio of potential wins (that would be the initial measurements) to engineering effort is less than for other things we can work on, so we're focusing on those.

And once site isolation happens, the potential wins here need to be reevaluated, of course.

-Boris



--
Kentaro Hara, Tokyo, Japan

--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CABg10jxfR5TbO7ABwvvk_BaoAsZwB344dq3KJJnuUMNuEnpYNw%40mail.gmail.com.

Daniel Cheng

Feb 2, 2018, 2:20:01 PM
to Alexander Timin, Kentaro Hara, Taiju Tsuiki, Boris Zbarsky, blink-dev, scheduler-dev, Gabriel Charette, Alex Clarke, Sami Kyostila
On Fri, Feb 2, 2018 at 10:50 AM Alexander Timin <alt...@chromium.org> wrote:
Cooperative scheduling sounds like a great way to localize performance on mobile!

However, I'm very scared of nested message loops (we've already had a great deal of problems with them in the scheduler) and I'd like to avoid increasing their usage in Chromium by an order of magnitude.

What we want to do here is to implement user-space execution context switching. I wonder if, instead of using message loops, we could implement it with a coroutine-like approach: save the stack state using setjmp/longjmp and yield control back to the scheduler?

The benefits include reduced number of problems with reentrancy and explicit control of the execution flow from the scheduler (e.g. better control when we can resume execution of an interrupted task).

From my understanding of the proposal, we don't allow unlimited re-entrancy. I'd also be quite nervous about setjmp() and longjmp() and how they interact with things like RAII and destructors.
 
The other issue that this approach addresses is security -- this way we won't end up with frames from both the main frame and a cross-origin iframe on the same stack.
+alexclarke@, skyostil@, gab@ and scheduler-dev@.

I'm not sure how setjmp() and longjmp() would help with this: the state I'd be most concerned about is global state (e.g. what is the current execution context).
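
To make the concern concrete: switching stacks does not touch process-wide state, so something like a "current execution context" pointer has to be swapped and restored explicitly around every yield. A rough sketch follows, where ExecutionContext, g_current_context, ScopedContextSwitch and the example origins are all hypothetical names for illustration, not Blink's actual types.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for per-frame state that code reaches via a global.
struct ExecutionContext {
  std::string origin;
};

// Process-wide "who is running now" pointer. Saving or switching stacks
// leaves this untouched, which is the hazard being pointed out.
ExecutionContext* g_current_context = nullptr;

// RAII helper: installs `next` as the current context and restores the
// previous one on scope exit, so a yield cannot leak the wrong context.
class ScopedContextSwitch {
 public:
  explicit ScopedContextSwitch(ExecutionContext* next)
      : previous_(g_current_context) {
    assert(next != nullptr);
    g_current_context = next;
  }
  ~ScopedContextSwitch() { g_current_context = previous_; }

  ScopedContextSwitch(const ScopedContextSwitch&) = delete;
  ScopedContextSwitch& operator=(const ScopedContextSwitch&) = delete;

 private:
  ExecutionContext* previous_;
};

std::string CurrentOrigin() {
  return g_current_context ? g_current_context->origin : "(none)";
}

// Simulates the main-frame task that runs while the iframe is suspended.
// It must observe the main frame's context, not the iframe's.
std::string RunMainFrameTaskDuringYield(ExecutionContext* main_frame) {
  ScopedContextSwitch scope(main_frame);
  return CurrentOrigin();
}
```

Without the scoped swap, the main-frame task in this sketch would see the suspended iframe's context regardless of how the stacks are managed.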
 

P.S. I'd say that this has low-to-medium compat risk -- at the moment web developers do not expect that their scripts can be interrupted midway, and this can break some metrics and analytics. It's likely that we'll have to expose some information about preemption to web devs.

I don't think this would break things any more than OOPIF would, since a context can only yield to a cross-site context.

Daniel
 



--
You received this message because you are subscribed to the Google Groups "scheduler-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scheduler-de...@chromium.org.
To post to this group, send email to schedu...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/scheduler-dev/CALHg4nmCqhtY9%2BOb%2BojR9JySvVMiikXSw8tKKtWLD-oa7q4zbw%40mail.gmail.com.

Kentaro Hara

Feb 4, 2018, 8:44:45 PM
to Daniel Cheng, Alexander Timin, Taiju Tsuiki, Boris Zbarsky, blink-dev, scheduler-dev, Gabriel Charette, Alex Clarke, Sami Kyostila
However, I'm very scared of nested message loops (we've already had a great deal of problems with them in the scheduler) and I'd like to avoid increasing their usage in Chromium by an order of magnitude.

Thanks Alexander -- this is a valid concern.

To mitigate the concern, in the short term I'm planning to enable cooperative scheduling only for the following stack:

  (no Blink C++ stack) => cross-origin V8 => (yield) => main thread task

From a performance perspective, I think this is already enough to dramatically reduce jank caused by cross-origin frames.

Do you think it will help?
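
As a rough sketch of that restriction (not the actual implementation), one could count Blink C++ frames with an RAII scope and allow a yield only when the count is zero. BlinkStackScope, g_blink_stack_depth, CanYieldFromCrossOriginScript and the two entry points are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Number of Blink C++ frames currently on the stack (hypothetical counter).
int g_blink_stack_depth = 0;

// RAII marker placed in Blink C++ code paths that may call into V8.
class BlinkStackScope {
 public:
  BlinkStackScope() { ++g_blink_stack_depth; }
  ~BlinkStackScope() {
    assert(g_blink_stack_depth > 0);
    --g_blink_stack_depth;
  }

  BlinkStackScope(const BlinkStackScope&) = delete;
  BlinkStackScope& operator=(const BlinkStackScope&) = delete;
};

// Yielding is allowed only for the stack shape
//   (no Blink C++ stack) => cross-origin V8 => (yield)
bool CanYieldFromCrossOriginScript() {
  return g_blink_stack_depth == 0;
}

// V8 entered directly from a top-level task: no Blink frames underneath.
bool RunScriptFromTopLevelTask() {
  return CanYieldFromCrossOriginScript();
}

// V8 entered from inside Blink C++ (e.g. the parser): yielding is refused.
bool RunScriptFromBlinkCode() {
  BlinkStackScope scope;
  return CanYieldFromCrossOriginScript();
}
```

In this sketch a script started from a top-level task may yield, while the same script started underneath Blink C++ code may not, so no Blink frame is ever left suspended beneath a nested main thread task.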

========

If we want to support more cases, we can rewrite Blink so that cross-origin V8 execution starts without a Blink C++ stack beneath it. For example, the proposed cooperative scheduling cannot support the case where Blink runs a parser-blocking script of a cross-origin frame, because that case has the following stack:

  (Blink C++ stack) => cross-origin V8 => (yield) => main thread task

We would then rewrite Blink as follows:

  (Blink C++ stack) => (post a task to run the parser-blocking script)
  (no Blink C++ stack) => cross-origin V8 => (yield) => main thread task

Then the cooperative scheduling can be enabled :)
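
The rewrite above can be sketched as follows, under the assumption of a simple FIFO task queue. MainThreadTaskQueue, ParseDocument and the local depth counter are hypothetical and only model the stack shapes; they are not Blink's parser or scheduler.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for the main thread's task queue.
class MainThreadTaskQueue {
 public:
  void PostTask(std::function<void()> task) {
    assert(task);
    tasks_.push_back(std::move(task));
  }

  // Runs queued tasks in FIFO order from the top level of the loop.
  void RunUntilIdle() {
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
    }
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Models the parser meeting a parser-blocking cross-origin script. Instead
// of running the script synchronously under Blink C++ frames, it posts a
// task and unwinds, so the script later starts with no Blink C++ stack.
std::vector<std::string> ParseDocument() {
  std::vector<std::string> events;
  MainThreadTaskQueue queue;
  int blink_stack_depth = 0;  // Counts simulated Blink C++ frames.

  ++blink_stack_depth;  // Entering parser code (Blink C++ stack).
  events.push_back("parser: found parser-blocking script");
  queue.PostTask([&events, &blink_stack_depth] {
    // Runs later, after the parser's frames have unwound.
    events.push_back(blink_stack_depth == 0 ? "script: may yield"
                                            : "script: must not yield");
  });
  events.push_back("parser: paused, returning to the task loop");
  --blink_stack_depth;  // Parser frames unwind.

  queue.RunUntilIdle();  // Script starts from the top-level loop.
  return events;
}
```

The design choice illustrated here is that the parser only records the work and returns, which trades a synchronous call for one extra trip through the task loop.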




On Sat, Feb 3, 2018 at 4:19 AM, Daniel Cheng <dch...@chromium.org> wrote:
On Fri, Feb 2, 2018 at 10:50 AM Alexander Timin <alt...@chromium.org> wrote:
Cooperative scheduling sounds like a great way to localize performance on mobile!

However, I'm very scared of nested message loops (we've already had a great deal of problems with them in the scheduler) and I'd like to avoid increasing their usage in Chromium by an order of magnitude.

What we want to do here is to implement user-space execution context switching. I wonder if, instead of using message loops, we could implement it with a coroutine-like approach: save the stack state using setjmp/longjmp and yield control back to the scheduler?

The benefits include reduced number of problems with reentrancy and explicit control of the execution flow from the scheduler (e.g. better control when we can resume execution of an interrupted task).

From my understanding of the proposal, we don't allow unlimited re-entrancy. I'd also be quite nervous about setjmp() and longjmp() and how they interact with things like RAII and destructors.

Yeah, at first I was thinking about introducing user-level context switching (i.e. green threads) but realized that it's pretty complex.

 
The other issue that this approach addresses is security -- this way we won't end up with frames from both the main frame and a cross-origin iframe on the same stack.
+alexclarke@, skyostil@, gab@ and scheduler-dev@.

I'm not sure how setjmp() and longjmp() would help with this: the state I'd be most concerned about is global state (e.g. what is the current execution context).

Agreed.


 

P.S. I'd say that this has low-to-medium compat risk -- at the moment web developers do not expect that their scripts can be interrupted midway, and this can break some metrics and analytics. It's likely that we'll have to expose some information about preemption to web devs.

I don't think this would break things any more than OOPIF would, since a context can only yield to a cross-site context.

Daniel
 

