Pledge for help completing migration to TaskScheduler!


Gabriel Charette

Jun 12, 2017, 3:28:31 PM
to Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
Greetings everyone,

tl;dr: please grab ownership of your files in this sheet to help us complete the migration to TaskScheduler for PostTask across the codebase :).

We scripted much of the migration from the old APIs to TaskScheduler, but we are getting to the point where the remaining callers require the eyes of a developer. Each one is fairly simple to do, but there are too many instances for the TaskScheduler team to handle on its own.

And since it requires component specific knowledge, the best owners are the OWNERS themselves :).

Remaining are:
 1) BlockingPool: plumbed into components (can clean all of that plumbing up :)! simple but hard to script...)
 2) BrowserThreads (except UI/IO)

We recently posted brand new extensive documentation about:
 * Threading and Tasks paradigms in Chrome

We built a list of all the files left with things to migrate in this sheet. Please grab ownership of items in your area -- we will be chasing owners for unclaimed items shortly ;).
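For OWNERS picking up items, the change at each call site is typically mechanical. A hedged sketch of the shape (Chromium-style, 2017-era API names; `WriteThingToDisk` and `path` are hypothetical, the right TaskTraits depend on the call site, and this is not standalone-compilable):

```cpp
// Before: task pinned to a named BrowserThread.
content::BrowserThread::PostTask(
    content::BrowserThread::FILE, FROM_HERE,
    base::BindOnce(&WriteThingToDisk, path));

// After: posted via post_task.h with explicit traits; the scheduler
// picks the worker. (Hypothetical call site; adjust traits per task.)
base::PostTaskWithTraits(
    FROM_HERE, {base::MayBlock(), base::TaskPriority::BACKGROUND},
    base::BindOnce(&WriteThingToDisk, path));
```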

This is important for:
 * Code health (1 API is better than 3 [1])
 * Separation of concerns (splitting work on a dedicated thread into independent sequences).
 * Scheduling opportunities
 * Pre-req for ongoing task metrics work (having all tasks go through task_scheduler::internal::TaskTracker).

Thanks for your help,
Gab and the TaskScheduler team

[1] Or 4 actually, but we are almost done getting rid of base::WorkerPool and didn't include it in this list.

Egor Pasko

Jun 14, 2017, 2:17:34 PM
to g...@chromium.org, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
I've got a review for some bits of DB thread migration and I realized that I do not know enough about the DB thread to make the right call.

We can make a sequence per sqlite DB; this would be pretty straightforward. Is it the current recommendation? While this helps with code health and potential metrics, I don't think it would open any new scheduling opportunities.

The pattern I saw in a few places is:

1. read all contents of the DB into RAM on initialization

2. on updates of this working set, throw a task to DB thread to flush the delta to disk

3. (optionally) throttle somewhat to reduce the number of flushes (powering up disks, etc.), but still provide some persistence guarantees (which are specific to the particular sqlite DB)

4. the DB thread is idle most of the time and blocks nothing user-visible

Do you think this needs more advanced scheduling other than one thread for all the sqlite DBs? Do we expect the new metrics to show opportunities here?

Can we do this without measuring? One particular dragon I am seeing: parallelizing the tasks on the DB thread may have the undesired consequence of more parallel disk I/O (example: spinny disks under heavy seek storms that would block critical reads from the filesystem, such as the HTTP disk cache; similar story for cheap eMMC on Android). Does the scheduler API have any primitives to address this potential concern?

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev+unsubscribe@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/chromium-dev/CAJTZ7LJiDrU4i6vyfeK35wf18ntXOXO_b-RXUxKfzZfnAKRF%3Dw%40mail.gmail.com.



--
Egor Pasko

Gabriel Charette

Jun 14, 2017, 2:31:26 PM
to Egor Pasko, g...@chromium.org, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
On Wed, Jun 14, 2017 at 2:16 PM Egor Pasko <pa...@google.com> wrote:
I've got a review for some bits of DB thread migration and I realized that I do not know enough about the DB thread to make the right call.

We can make a sequence per sqlite DB, this would be pretty straightforward. Is it the current recommendation? While this helps with code health and potential metrics, I don't think it would open any new scheduling opportunities.

The pattern I saw in a few places is:

1. read all contents of the DB into RAM on initialization

2. on updates of this working set, throw a task to DB thread to flush the delta to disk

3. (optionally) throttle somewhat to reduce amount of flushes (powering up disks, etc), but still provide some persistency guarantees (which are specific to  particular sqlite DB)

4. the DB thread is mostly idle most of the time, and blocks nothing user-visible

Do you think this needs more advanced scheduling other than one thread for all the sqlite DBs? Do we expect the new metrics to show opportunities here?

If there isn't much contention on that thread already then I think it's fine moving it all to the same SequencedTaskRunner. What matters is that it's provided by post_task.h so that 1) we use a common API everywhere and 2) tasks go through TaskTracker which is where all modern instrumentation/profiling is being added.

Ref. lazy_task_runner.h for an API that allows you to have one TaskRunner accessible statically for all your tasks. Having dedicated objects for each logically associated set of tasks is cleaner (even when ignoring scheduling) -- e.g. this is like when we went from global notifications -> ObserverLists.
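A hedged sketch of what that can look like (Chromium-style, not standalone-compilable; `FlushDeltaToDisk` and the chosen traits are hypothetical):

```cpp
// One statically accessible sequence for all of a component's DB tasks.
base::LazySequencedTaskRunner g_db_task_runner =
    LAZY_SEQUENCED_TASK_RUNNER_INITIALIZER(
        base::TaskTraits(base::MayBlock(), base::TaskPriority::BACKGROUND));

void PostFlush() {
  g_db_task_runner.Get()->PostTask(FROM_HERE,
                                   base::BindOnce(&FlushDeltaToDisk));
}
```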
 

Can we do this without measuring? One particular dragon I am seeing: parallelizing the tasks on the DB thread may have undesired consequences of having more parallel disk I/O (example: spinny disks under heavy seek storms that would block critical reads from the filesystem, such as HTTP disk cache, similar story for cheap eMMC on Android), does the scheduler API have any primitives to address this potential concern?

Yes, that's actually another argument for bringing these tasks explicitly to the TaskScheduler. Once the TaskScheduler is aware of all the work, everything is possible. It can dynamically adjust to the current machine's resources (disk, CPU, etc.). Just because things are split on different sequences and *could* run in parallel doesn't mean they actually will, that's the scheduler's decision and it can make a better decision here if there aren't other threads running wild doing their own thing. 

In practice, TaskScheduler v1 has as many worker threads available for I/O as BlockingPool + CachePool combined previously did, aiming to maintain the status quo initially (or slightly improve it, as task priority still has a chance to help in some scenarios). This is nowhere near optimized though -- we will run experiments once we have a full workload; getting to a full workload without regressions is the current goal.
 

Egor Pasko

Jun 14, 2017, 2:56:06 PM
to Gabriel Charette, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
On Wed, Jun 14, 2017 at 8:30 PM, Gabriel Charette <g...@chromium.org> wrote:


On Wed, Jun 14, 2017 at 2:16 PM Egor Pasko <pa...@google.com> wrote:
I've got a review for some bits of DB thread migration and I realized that I do not know enough about the DB thread to make the right call.

We can make a sequence per sqlite DB, this would be pretty straightforward. Is it the current recommendation? While this helps with code health and potential metrics, I don't think it would open any new scheduling opportunities.

The pattern I saw in a few places is:

1. read all contents of the DB into RAM on initialization

2. on updates of this working set, throw a task to DB thread to flush the delta to disk

3. (optionally) throttle somewhat to reduce amount of flushes (powering up disks, etc), but still provide some persistency guarantees (which are specific to  particular sqlite DB)

4. the DB thread is mostly idle most of the time, and blocks nothing user-visible

Do you think this needs more advanced scheduling other than one thread for all the sqlite DBs? Do we expect the new metrics to show opportunities here?

If there isn't much contention on that thread already then I think it's fine moving it all to the same SequencedTaskRunner.

I don't know if there is much contention on the DB thread. Hopefully someone can chime in and confirm that the above is what actually happens.

If there is contention, it still may make sense to have a single sequence for all. For example we may want to throttle sqlite in a limited threadpool (1 thread?), but at the same time allow disk cache to utilize as many parallel threads as possible for blocking I/O. Without knowing the access pattern I am inclined to just match the current behavior (i.e. one sequence for all DB tasks) and leave the experimentation to whoever is brave (maybe myself, but certainly later). Ideally with Finch.
 
What matters is that it's provided by post_task.h so that 1) we use a common API everywhere and 2) tasks go through TaskTracker which is where all modern instrumentation/profiling is being added.

Ref. lazy_task_runner.h for an API that allows you to have one TaskRunner accessible statically for all your tasks. Having dedicated objects for each logically associated set of tasks is cleaner (even when ignoring scheduling) -- e.g. this is like when we went from global notifications -> ObserverLists.
 
Can we do this without measuring? One particular dragon I am seeing: parallelizing the tasks on the DB thread may have undesired consequences of having more parallel disk I/O (example: spinny disks under heavy seek storms that would block critical reads from the filesystem, such as HTTP disk cache, similar story for cheap eMMC on Android), does the scheduler API have any primitives to address this potential concern?

Yes, that's actually another argument for bringing these tasks explicitly to the TaskScheduler. Once the TaskScheduler is aware of all the work, everything is possible.

At this stage I am worried about adding more occasional and untestable parallelism on disk without measuring. Also, unclear on how much to invest into expressing parallelization constraints between tasks. With more effort, I can express more, but part of this effort may be wasted.
 
It can dynamically adjust to the current machine's resources (disk, CPU, etc.). Just because things are split on different sequences and *could* run in parallel doesn't mean they actually will, that's the scheduler's decision and it can make a better decision here if there aren't other threads running wild doing their own thing. 

In practice, TaskScheduler v1 has as many worker threads available for I/O as BlockingPool + CachePool combined previously did, aiming to maintain the status quo initially (or slightly improve it as task priority still has a chance to help in some scenarios), this is nowhere near optimized though -- we will run experiments once we have a full workload, getting to a full workload without regressions is the current goal.

Please define "full workload". Will it really be status quo? Splitting FILE/DB threads into multiple sequences governed by a single threadpool is a different system that may perform significantly differently, no?


Gabriel Charette

Jun 14, 2017, 3:55:20 PM
to Egor Pasko, Gabriel Charette, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
Top-level comment: note that splitting concerns isn't only about being able to run more things in parallel; it's also (and even more so) about being able to run only what matters under crunch. Currently we have to run everything to get to what matters, because work is queued FIFO on a single sequence (which also creates jank dependencies between unrelated systems).

On Wed, Jun 14, 2017 at 2:55 PM Egor Pasko <pa...@google.com> wrote:
On Wed, Jun 14, 2017 at 8:30 PM, Gabriel Charette <g...@chromium.org> wrote:


On Wed, Jun 14, 2017 at 2:16 PM Egor Pasko <pa...@google.com> wrote:
I've got a review for some bits of DB thread migration and I realized that I do not know enough about the DB thread to make the right call.

We can make a sequence per sqlite DB, this would be pretty straightforward. Is it the current recommendation? While this helps with code health and potential metrics, I don't think it would open any new scheduling opportunities.

The pattern I saw in a few places is:

1. read all contents of the DB into RAM on initialization

2. on updates of this working set, throw a task to DB thread to flush the delta to disk

3. (optionally) throttle somewhat to reduce amount of flushes (powering up disks, etc), but still provide some persistency guarantees (which are specific to  particular sqlite DB)

4. the DB thread is mostly idle most of the time, and blocks nothing user-visible

Do you think this needs more advanced scheduling other than one thread for all the sqlite DBs? Do we expect the new metrics to show opportunities here?

If there isn't much contention on that thread already then I think it's fine moving it all to the same SequencedTaskRunner.

I don't know if there is much contention on DB thread. Hopefully someone can chime in and confirm that the above is what actually happens.

If there is contention, it still may make sense to have a single sequence for all. For example we may want to throttle sqlite in a limited threadpool (1 thread?), but at the same time allow disk cache to utilize as many parallel threads as possible for blocking I/O. Without knowing the access pattern I am inclined to just match the current behavior (i.e. one sequence for all DB tasks) and leave the experimentation to whoever is brave (maybe myself, but certainly later). Ideally with Finch.

Disk cache is currently running at TaskPriority::USER_BLOCKING so it will get ahead of anything marked with a lower priority (even up to using all the available workers under load). I'm fine leaving it as-is with a static task runner, at this point we mostly care about getting rid of the static BrowserThread::IDs.
 
 
What matters is that it's provided by post_task.h so that 1) we use a common API everywhere and 2) tasks go through TaskTracker which is where all modern instrumentation/profiling is being added.

Ref. lazy_task_runner.h for an API that allows you to have one TaskRunner accessible statically for all your tasks. Having dedicated objects for each logically associated set of tasks is cleaner (even when ignoring scheduling) -- e.g. this is like when we went from global notifications -> ObserverLists.
 
Can we do this without measuring? One particular dragon I am seeing: parallelizing the tasks on the DB thread may have undesired consequences of having more parallel disk I/O (example: spinny disks under heavy seek storms that would block critical reads from the filesystem, such as HTTP disk cache, similar story for cheap eMMC on Android), does the scheduler API have any primitives to address this potential concern?

Yes, that's actually another argument for bringing these tasks explicitly to the TaskScheduler. Once the TaskScheduler is aware of all the work, everything is possible.

At this stage I am worried about adding more occasional and untestable parallelism on disk without measuring. Also, unclear on how much to invest into expressing parallelization constraints between tasks. With more effort, I can express more, but part of this effort may be wasted.
 
It can dynamically adjust to the current machine's resources (disk, CPU, etc.). Just because things are split on different sequences and *could* run in parallel doesn't mean they actually will, that's the scheduler's decision and it can make a better decision here if there aren't other threads running wild doing their own thing. 

In practice, TaskScheduler v1 has as many worker threads available for I/O as BlockingPool + CachePool combined previously did, aiming to maintain the status quo initially (or slightly improve it as task priority still has a chance to help in some scenarios), this is nowhere near optimized though -- we will run experiments once we have a full workload, getting to a full workload without regressions is the current goal.

please define "full workload". Will it be really status quo? Splitting FILE/DB threads into multiple sequences governed by a single threadpool is a different system that may perform significantly differently, no?

That's a good point; we could restrict work that isn't TaskPriority::USER_BLOCKING to a max of 3 threads to simulate the current system if that feels more "status quo". AFAIK, though, disks perform better when the kernel is aware of all the pending I/O requests, so parallelization shouldn't be a bad thing (especially when coupled with I/O priority at the kernel level, but I think that's a Windows-only thing).
 


Egor Pasko

Jun 15, 2017, 5:53:33 AM
to Gabriel Charette, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
On Wed, Jun 14, 2017 at 9:54 PM, Gabriel Charette <g...@chromium.org> wrote:
Top-level comment: note that splitting concerns isn't only about being able to run more things in parallel; it's also (and even more so) about being able to run only what matters under crunch. Currently we have to run everything to get to what matters  because work is queued in FIFO on a single sequence (which also causes jank dependencies between unrelated systems).

On Wed, Jun 14, 2017 at 2:55 PM Egor Pasko <pa...@google.com> wrote:
On Wed, Jun 14, 2017 at 8:30 PM, Gabriel Charette <g...@chromium.org> wrote:


On Wed, Jun 14, 2017 at 2:16 PM Egor Pasko <pa...@google.com> wrote:
I've got a review for some bits of DB thread migration and I realized that I do not know enough about the DB thread to make the right call.

We can make a sequence per sqlite DB, this would be pretty straightforward. Is it the current recommendation? While this helps with code health and potential metrics, I don't think it would open any new scheduling opportunities.

The pattern I saw in a few places is:

1. read all contents of the DB into RAM on initialization

2. on updates of this working set, throw a task to DB thread to flush the delta to disk

3. (optionally) throttle somewhat to reduce amount of flushes (powering up disks, etc), but still provide some persistency guarantees (which are specific to  particular sqlite DB)

4. the DB thread is mostly idle most of the time, and blocks nothing user-visible

Do you think this needs more advanced scheduling other than one thread for all the sqlite DBs? Do we expect the new metrics to show opportunities here?

If there isn't much contention on that thread already then I think it's fine moving it all to the same SequencedTaskRunner.

I don't know if there is much contention on DB thread. Hopefully someone can chime in and confirm that the above is what actually happens.

If there is contention, it still may make sense to have a single sequence for all. For example we may want to throttle sqlite in a limited threadpool (1 thread?), but at the same time allow disk cache to utilize as many parallel threads as possible for blocking I/O. Without knowing the access pattern I am inclined to just match the current behavior (i.e. one sequence for all DB tasks) and leave the experimentation to whoever is brave (maybe myself, but certainly later). Ideally with Finch.

Disk cache is currently running at TaskPriority::USER_BLOCKING so it will get ahead of anything marked with a lower priority (even up to using all the available workers under load). I'm fine leaving it as-is with a static task runner, at this point we mostly care about getting rid of the static BrowserThread::IDs. 

I think this is not true. I believe the disk subsystem in the kernel ignores this priority setting. So in effect, even though the cache tasks will be scheduled, their access to disk would be slowed down if a lot of concurrent disk I/O is happening, regardless of how low the priorities of those other threads are.
 
 
What matters is that it's provided by post_task.h so that 1) we use a common API everywhere and 2) tasks go through TaskTracker which is where all modern instrumentation/profiling is being added.

Ref. lazy_task_runner.h for an API that allows you to have one TaskRunner accessible statically for all your tasks. Having dedicated objects for each logically associated set of tasks is cleaner (even when ignoring scheduling) -- e.g. this is like when we went from global notifications -> ObserverLists.
 
Can we do this without measuring? One particular dragon I am seeing: parallelizing the tasks on the DB thread may have undesired consequences of having more parallel disk I/O (example: spinny disks under heavy seek storms that would block critical reads from the filesystem, such as HTTP disk cache, similar story for cheap eMMC on Android), does the scheduler API have any primitives to address this potential concern?

Yes, that's actually another argument for bringing these tasks explicitly to the TaskScheduler. Once the TaskScheduler is aware of all the work, everything is possible.

At this stage I am worried about adding more occasional and untestable parallelism on disk without measuring. Also, unclear on how much to invest into expressing parallelization constraints between tasks. With more effort, I can express more, but part of this effort may be wasted.
 
It can dynamically adjust to the current machine's resources (disk, CPU, etc.). Just because things are split on different sequences and *could* run in parallel doesn't mean they actually will, that's the scheduler's decision and it can make a better decision here if there aren't other threads running wild doing their own thing. 

In practice, TaskScheduler v1 has as many worker threads available for I/O as BlockingPool + CachePool combined previously did, aiming to maintain the status quo initially (or slightly improve it as task priority still has a chance to help in some scenarios), this is nowhere near optimized though -- we will run experiments once we have a full workload, getting to a full workload without regressions is the current goal.

please define "full workload". Will it be really status quo? Splitting FILE/DB threads into multiple sequences governed by a single threadpool is a different system that may perform significantly differently, no?

That's a good point, we could restrict work that isn't TaskPriority::USER_BLOCKING to a max of 3 threads to simulate the current system if that feels more "status quo".

Strictly speaking it's not 100% the same as the current setup, but it's super close and is a good recommendation, hence it answers my original question. I think we should add it to the migration docs about FILE/DB migration (unless someone else on this list shares better insights).
 
AFAIK though disks perform better when the kernel is aware of all the pending I/O requests

This is certainly not true :) The kernel does not know what the FTL is doing and makes poor choices in a lot of cases. I should, of course, be equally skeptical about Chrome's ability to schedule disk I/O jobs in an FTL-friendly way, and in userspace, oh oh. The only thing we can do better is to know what is user-blocking.
 
so parallelization shouldn't be a bad thing (especially when coupled with I/O priority at kernel level but I think that's a Windows-only thing).

On Linux/Android the information about which thread requested/dirtied the OS page cache is mostly ignored at I/O scheduling level. CFQ has some knobs for per-cgroup scheduling, but it's not used on Android, I believe.
 

Gabriel Charette

Jun 15, 2017, 10:10:05 AM
to Egor Pasko, Gabriel Charette, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
Thanks Egor, let's continue the platform-specific discussion on another thread. Looks like there are discrepancies between what's best on each platform. Overall, I think bringing all tasks under a single component is still the right step forward: that way, each platform's scheduling impl can be specialized for what's best there (even if it turns out that having only a single thread doing I/O is best, we can easily do that under the hood -- and change our minds later without having to rejig the entire codebase :)). This migration is about labeling every task with TaskTraits and sequencing requirements; that doesn't necessarily mean more parallelism, it merely means more ability to control.

On Thu, Jun 15, 2017 at 5:52 AM Egor Pasko <pa...@google.com> wrote:
On Wed, Jun 14, 2017 at 9:54 PM, Gabriel Charette <g...@chromium.org> wrote:
Top-level comment: note that splitting concerns isn't only about being able to run more things in parallel; it's also (and even more so) about being able to run only what matters under crunch. Currently we have to run everything to get to what matters  because work is queued in FIFO on a single sequence (which also causes jank dependencies between unrelated systems).

On Wed, Jun 14, 2017 at 2:55 PM Egor Pasko <pa...@google.com> wrote:
On Wed, Jun 14, 2017 at 8:30 PM, Gabriel Charette <g...@chromium.org> wrote:


On Wed, Jun 14, 2017 at 2:16 PM Egor Pasko <pa...@google.com> wrote:
I've got a review for some bits of DB thread migration and I realized that I do not know enough about the DB thread to make the right call.

We can make a sequence per sqlite DB, this would be pretty straightforward. Is it the current recommendation? While this helps with code health and potential metrics, I don't think it would open any new scheduling opportunities.

The pattern I saw in a few places is:

1. read all contents of the DB into RAM on initialization

2. on updates of this working set, throw a task to DB thread to flush the delta to disk

3. (optionally) throttle somewhat to reduce amount of flushes (powering up disks, etc), but still provide some persistency guarantees (which are specific to  particular sqlite DB)

4. the DB thread is mostly idle most of the time, and blocks nothing user-visible

Do you think this needs more advanced scheduling other than one thread for all the sqlite DBs? Do we expect the new metrics to show opportunities here?

If there isn't much contention on that thread already then I think it's fine moving it all to the same SequencedTaskRunner.

I don't know if there is much contention on DB thread. Hopefully someone can chime in and confirm that the above is what actually happens.

If there is contention, it still may make sense to have a single sequence for all. For example we may want to throttle sqlite in a limited threadpool (1 thread?), but at the same time allow disk cache to utilize as many parallel threads as possible for blocking I/O. Without knowing the access pattern I am inclined to just match the current behavior (i.e. one sequence for all DB tasks) and leave the experimentation to whoever is brave (maybe myself, but certainly later). Ideally with Finch.

Disk cache is currently running at TaskPriority::USER_BLOCKING so it will get ahead of anything marked with a lower priority (even up to using all the available workers under load). I'm fine leaving it as-is with a static task runner, at this point we mostly care about getting rid of the static BrowserThread::IDs. 

I think this is not true. I believe the disk subsystem in the kernel ignores this priority setting. So in effect, even though the cache tasks will be scheduled, their access to disk would be slowed down if a lot of concurrent disk I/O is happening, regardless of how low the priorities of those other threads are.

This last paragraph was referring to Chrome's TaskScheduler so I can vouch that it is true on all platforms.
 
 
 
What matters is that it's provided by post_task.h so that 1) we use a common API everywhere and 2) tasks go through TaskTracker which is where all modern instrumentation/profiling is being added.

Ref. lazy_task_runner.h for an API that allows you to have one TaskRunner accessible statically for all your tasks. Having dedicated objects for each logically associated set of tasks is cleaner (even when ignoring scheduling) -- e.g. this is like when we went from global notifications -> ObserverLists.
 
Can we do this without measuring? One particular dragon I am seeing: parallelizing the tasks on the DB thread may have undesired consequences of having more parallel disk I/O (example: spinny disks under heavy seek storms that would block critical reads from the filesystem, such as HTTP disk cache, similar story for cheap eMMC on Android), does the scheduler API have any primitives to address this potential concern?

Yes, that's actually another argument for bringing these tasks explicitly to the TaskScheduler. Once the TaskScheduler is aware of all the work, everything is possible.

At this stage I am worried about adding more occasional and untestable parallelism on disk without measuring. Also, unclear on how much to invest into expressing parallelization constraints between tasks. With more effort, I can express more, but part of this effort may be wasted.
 
It can dynamically adjust to the current machine's resources (disk, CPU, etc.). Just because things are split on different sequences and *could* run in parallel doesn't mean they actually will, that's the scheduler's decision and it can make a better decision here if there aren't other threads running wild doing their own thing. 

--
Egor Pasko

Gabriel Charette

Jun 15, 2017, 10:22:09 AM
to Egor Pasko, Gabriel Charette, scheduler-dev, fdo...@chromium.org, Robert Liao, benh...@chromium.org
+scheduler-dev , chromium-dev to bcc

Hello scheduler peeps, Egor raised some interesting points about Android's I/O scheduling properties, which seem to differ wildly from what we see on Windows (and which we based the original TaskScheduler tuning on).

The initial goal is merely to maintain the status quo as we migrate all SequencedWorkerPool/BrowserThreads to TaskScheduler, but Egor is worried that we won't maintain the status quo if we shard the FILE/DB threads from 2 threads into many more sequences when breaking down the work.

We can easily control the number of physical threads via InitParams (can add more params if need be).

One thing is clear to me: platforms differing wildly is another reason to migrate to TaskScheduler (not one to slow down), as it will allow us to actually have platform-specific approaches (which is not possible in the current system, since there is no central coordination point).

We need to be careful with this tuning for M61 as more and more things are sharded off the FILE thread.

+Francois Pierre Doray : something to keep in mind as you look into tuning the scheduler internals to better handle background tasks on platforms where !base::Lock::HandlesMultipleThreadPriorities() (which prevents us from using thread priority on scheduler workers).

Cheers,
Gab


Egor Pasko

Jun 15, 2017, 11:06:15 AM
to Gabriel Charette, Chromium-dev, Robert Liao, Francois Pierre Doray, Ben Henry
On Thu, Jun 15, 2017 at 4:09 PM, Gabriel Charette <g...@chromium.org> wrote:
Thanks Egor, let's continue the platform-specific discussion on another thread. Looks like there are discrepancies between what's best on each platform.

Overall I think bringing all tasks under a single component is still the right step forward: that way each platform's scheduling impl can be specialized for what's best there (even if it turns out that having only a single thread doing I/O is best, we can easily do that under the hood -- and change our minds later without having to rejig the entire codebase :)). This migration is about labeling every task with TaskTraits and sequencing requirements; that doesn't necessarily mean more parallelism, it merely means more ability to control.

On Thu, Jun 15, 2017 at 5:52 AM Egor Pasko <pa...@google.com> wrote:
On Wed, Jun 14, 2017 at 9:54 PM, Gabriel Charette <g...@chromium.org> wrote:
Top-level comment: note that splitting concerns isn't only about being able to run more things in parallel; it's also (and even more so) about being able to run only what matters under crunch. Currently we have to run everything to get to what matters, because work is queued FIFO on a single sequence (which also causes jank dependencies between unrelated systems).

On Wed, Jun 14, 2017 at 2:55 PM Egor Pasko <pa...@google.com> wrote:
On Wed, Jun 14, 2017 at 8:30 PM, Gabriel Charette <g...@chromium.org> wrote:


On Wed, Jun 14, 2017 at 2:16 PM Egor Pasko <pa...@google.com> wrote:
I've got a review for some bits of DB thread migration and I realized that I do not know enough about the DB thread to make the right call.

We can make a sequence per sqlite DB, this would be pretty straightforward. Is it the current recommendation? While this helps with code health and potential metrics, I don't think it would open any new scheduling opportunities.

The pattern I saw in a few places is:

1. read all contents of the DB into RAM on initialization

2. on updates of this working set, throw a task to DB thread to flush the delta to disk

3. (optionally) throttle somewhat to reduce the number of flushes (powering up disks, etc.), but still provide some persistence guarantees (which are specific to the particular sqlite DB)

4. the DB thread is idle most of the time, and blocks nothing user-visible

Do you think this needs more advanced scheduling other than one thread for all the sqlite DBs? Do we expect the new metrics to show opportunities here?

If there isn't much contention on that thread already then I think it's fine moving it all to the same SequencedTaskRunner.

I don't know if there is much contention on DB thread. Hopefully someone can chime in and confirm that the above is what actually happens.

If there is contention, it still may make sense to have a single sequence for all. For example we may want to throttle sqlite in a limited threadpool (1 thread?), but at the same time allow disk cache to utilize as many parallel threads as possible for blocking I/O. Without knowing the access pattern I am inclined to just match the current behavior (i.e. one sequence for all DB tasks) and leave the experimentation to whoever is brave (maybe myself, but certainly later). Ideally with Finch.

Disk cache is currently running at TaskPriority::USER_BLOCKING so it will get ahead of anything marked with a lower priority (even up to using all the available workers under load). I'm fine leaving it as-is with a static task runner, at this point we mostly care about getting rid of the static BrowserThread::IDs. 

I think this is not true. I believe the disk subsystem in the kernel ignores this priority setting. So in effect, even though the cache tasks will be scheduled, their access to disk would be slowed down if a lot of concurrent disk I/O is happening, regardless of how low the priorities of those other threads are.

This last paragraph was referring to Chrome's TaskScheduler so I can vouch that it is true on all platforms.

It is probably our different understanding of "get ahead". You seem to mean that the task scheduler would prioritize _starting_ tasks with TaskPriority::USER_BLOCKING in presence of other tasks doing DB work. OK, this is true. I interpreted it as saying that those tasks would be _fast_ regardless of DB tasks thrashing disk in the background. The latter seems not true.

In general, we should prefer to focus more on task speed and responsiveness over optimizing the scheduling for when the task _starts_. Starting earlier does not mean finishing earlier. Etc etc.
 
 
What matters is that it's provided by post_task.h so that 1) we use a common API everywhere and 2) tasks go through TaskTracker which is where all modern instrumentation/profiling is being added.

Ref. lazy_task_runner.h for an API that allows you to have one TaskRunner accessible statically for all your tasks. Having dedicated objects for each logically associated set of tasks is cleaner (even when ignoring scheduling) -- e.g. this is like when we went from global notifications -> ObserverLists.
 
Can we do this without measuring? One particular dragon I am seeing: parallelizing the tasks on the DB thread may have undesired consequences of having more parallel disk I/O (example: spinny disks under heavy seek storms that would block critical reads from the filesystem, such as HTTP disk cache, similar story for cheap eMMC on Android), does the scheduler API have any primitives to address this potential concern?

Yes, that's actually another argument for bringing these tasks explicitly to the TaskScheduler. Once the TaskScheduler is aware of all the work, everything is possible.

At this stage I am worried about adding more occasional and untestable parallelism on disk without measuring. Also, unclear on how much to invest into expressing parallelization constraints between tasks. With more effort, I can express more, but part of this effort may be wasted.
 
It can dynamically adjust to the current machine's resources (disk, CPU, etc.). Just because things are split on different sequences and *could* run in parallel doesn't mean they actually will, that's the scheduler's decision and it can make a better decision here if there aren't other threads running wild doing their own thing. 

In practice, TaskScheduler v1 has as many worker threads available for I/O as BlockingPool + CachePool combined previously did, aiming to maintain the status quo initially (or slightly improve it, as task priority still has a chance to help in some scenarios). This is nowhere near optimized though -- we will run experiments once we have a full workload; getting to a full workload without regressions is the current goal.

please define "full workload". Will it be really status quo? Splitting FILE/DB threads into multiple sequences governed by a single threadpool is a different system that may perform significantly differently, no?

That's a good point, we could restrict work that isn't TaskPriority::USER_BLOCKING to a max of 3 threads to simulate the current system if that feels more "status quo".

Strictly speaking it's not 100% the same as the current setup, but it's super close and is a good recommendation, hence it answers my original question. I think we should add it to the migration docs about FILE/DB migration (unless someone else on this list shares better insights).

How to best optimize I/O is a platform-specific decision, I'm not sure why we'd need to document it in the migration docs (which are about bringing all tasks to the API). 

Isn't documenting important to avoid unintentionally making the wrong choice for an important platform? Do we have all the code in place to limit non-USER_BLOCKING tasks to 3 threads on Android? I am not familiar with the scheduling code, sorry.

AFAIK though disks perform better when the kernel is aware of all the pending I/O requests

This is certainly not true :) The kernel does not know what the FTL is doing and makes poor choices in a lot of cases. I should, of course, be equally skeptical about Chrome's ability to schedule disk I/O jobs in FTL-friendly way, and in userspace, oh oh. The only thing we can do better is to know what is user-blocking.
 
so parallelization shouldn't be a bad thing (especially when coupled with I/O priority at kernel level but I think that's a Windows-only thing).

On Linux/Android the information about which thread requested/dirtied the OS page cache is mostly ignored at I/O scheduling level. CFQ has some knobs for per-cgroup scheduling, but it's not used on Android, I believe.

Interesting; it's certainly true on Windows (I have seen it many times in ETW (low-level) traces) -- the kernel is amazing at juggling I/O priority (which is different from thread priority on Windows).

It may be doing great stuff with juggling priorities, but let's _not_ go into discussions on how "amazing" disk I/O performance is on Windows.

--
Egor Pasko

Greg Thompson

Jun 22, 2017, 4:29:25 AM
to g...@chromium.org, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
Hi schedulers. I have a question about some code I'm migrating away from the blocking pool. This particular code currently uses two SequencedTaskRunners from the pool that differ only by shutdown behavior: reads are CONTINUE_ON_SHUTDOWN, while writes are SKIP_ON_SHUTDOWN. The same sequence token is used when obtaining the two runners so that the tasks retain their proper ordering. How is this accomplished with the scheduler? I see how I can create two SequencedTaskRunners with different traits, but I don't see how to make them use the same sequence token (or something similar). Alternatively, I see that I can pick the shutdown behavior when posting via PostTaskWithTraits, but this doesn't let me put all tasks in the same sequence, does it?

Thanks for your help.

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.

Gabriel Charette

Jun 22, 2017, 12:25:07 PM
to Greg Thompson, g...@chromium.org, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
@grt: Interesting, this is a feature we've been contemplating for a while in TaskScheduler but haven't added to date. We call it "sequence-funneling", the idea being that you could have multiple "posting sequences" funneling into a single "execution sequence". We think this would be nice to do what you said, i.e. mix shutdown behaviors on the same execution sequence or mix task priorities on the same execution sequence (e.g. you can have writes as {TaskPriority::BACKGROUND, TaskShutdownBehavior::BLOCK_SHUTDOWN} and reads as {TaskPriority::USER_BLOCKING, TaskShutdownBehavior::CONTINUE_ON_SHUTDOWN}) while still keeping their execution sequenced for thread-safety.

Grokking the SequencedWorkerPool impl a while back I was aware that it theoretically allowed mixing shutdown behaviors on the same sequence but I was hoping this was never happening in practice. Doing so can break the SequencedTaskRunner contract which promises that tasks are run in posting order (if you have BLOCK_SHUTDOWN work mixed with SKIP_ON_SHUTDOWN work the SWP will run the blocking ones on shutdown without having run the ones that are skippable -- not an issue in the specific mix you encountered but still..).

So yeah... tl;dr; not an option today.

Back to your specific problem: How are the writes performed? I'd assume they have to be friendly to a crash halfway so CONTINUE_ON_SHUTDOWN might work for writes too? If not, so long as the reads aren't at a risk of causing shutdown hangs you can likely switch everything to SKIP_ON_SHUTDOWN.

Cheers,
Gab

Greg Thompson

Jun 23, 2017, 3:59:38 AM
to Gabriel Charette, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
If the philosophy is "you may crash at any moment", then it seems like just about any task that doesn't touch state that will be deleted during orderly shutdown should be CONTINUE_ON_SHUTDOWN. Does this seem like good guidance? Would you go so far as to put this in a "best practices" doc?

Another question: I just reviewed some code that posts a MayBlock() task and handles the reply back on the main thread. I think this is fairly typical. In this case, the blocking function is not implemented in the same module as the code that posts the task. It seems like a layering violation for the code calling PostTask to decide whether to CONTINUE or SKIP on shutdown, since it really depends on how it's implemented. Have you scheduler folks thought about graceful ways to handle this?

Gabriel Charette

Jun 26, 2017, 4:55:51 PM
to Greg Thompson, Gabriel Charette, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
On Fri, Jun 23, 2017 at 3:58 AM Greg Thompson <g...@chromium.org> wrote:
If the philosophy is "you may crash at any moment", then it seems like just about any task that doesn't touch state that will be deleted during orderly shutdown should be CONTINUE_ON_SHUTDOWN. Does this seem like good guidance? Would you go so far as to put this in a "best practices" doc?

The philosophy is not quite "you may crash at any moment", it's "you should be able to crash at any moment but should do your best to preserve state and coalesce things that will make the next startup smoother". 

On desktop, shutdown still goes all the way until main() unfolds, static un-initialization occurs, etc. As such it's dangerous to recommend CONTINUE_ON_SHUTDOWN as a default as it can blow up for any task that touches static state or non-leaky LazyInstance/Singleton from a worker thread that outlives the main thread. I have a strong opinion that we should terminate process earlier instead of unwinding everything (don't clean the carpets before burning down the building), discussed a few times on chromium-dev already. However this is another topic and not one we were willing to mix with the task scheduler migration (big enough on its own!).

Until then, SKIP_ON_SHUTDOWN is preferred for anything that doesn't persist critical state.

Ultimately once everything is in the TaskScheduler our vision is to drive towards what we call "atomic shutdown": since there are no named threads to be joined in order, the TaskScheduler can execute all BLOCK_SHUTDOWN work and then terminate the process; no service outlives another. Shutdown is then merely a shutdown broadcast signal to registered services, followed by TaskScheduler::Shutdown() which blocks until BLOCK_SHUTDOWN work is done (allowing communication amongst services to flush things), followed by TerminateProcess().

PS: Yes some ordering requirements will need to stick to make sure we don't flush the same thing multiple times on last minute updates.


Another question: I just reviewed some code that posts a MayBlock() task and handles the reply back on the main thread. I think this is fairly typical. In this case, the blocking function is not implemented in the same module as the code that posts the task. It seems like a layering violation for the code calling PostTask to decide whether to CONTINUE or SKIP on shutdown, since it really depends on how it's implemented. Have you scheduler folks thought about graceful ways to handle this?

We believe this is no different than having to request that the task run in a context that allows I/O (i.e. base::MayBlock() permits base::AssertIOAllowed()). Whether it should block shutdown is more of a decision for the caller IMO; the task itself should verify safety requirements (i.e., again, AssertIOAllowed()). As such we never believed a base::AssertShutdownBehavior(TaskShutdownBehavior behavior) method was necessary, but maybe a base::AssertTaskCantOutliveMain() or something is useful (i.e. assert !CONTINUE_ON_SHUTDOWN), as that's the only dangerous thing a task may have a dependency on regarding its environment's TaskShutdownBehavior. We haven't added that to date as CONTINUE_ON_SHUTDOWN == SKIP_ON_SHUTDOWN in the aforementioned "atomic shutdown" world, but it might be a good thing in the meantime, WDYT?

Greg Thompson

Jun 27, 2017, 5:39:56 PM
to Gabriel Charette, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org
On Mon, Jun 26, 2017 at 10:55 PM Gabriel Charette <g...@chromium.org> wrote:
On Fri, Jun 23, 2017 at 3:58 AM Greg Thompson <g...@chromium.org> wrote:

Another question: I just reviewed some code that posts a MayBlock() task and handles the reply back on the main thread. I think this is fairly typical. In this case, the blocking function is not implemented in the same module as the code that posts the task. It seems like a layering violation for the code calling PostTask to decide whether to CONTINUE or SKIP on shutdown, since it really depends on how it's implemented. Have you scheduler folks thought about graceful ways to handle this?

We believe this is no different than having to request that the task run in a context that allows I/O (i.e. base::MayBlock() permits base::AssertIOAllowed()). Whether it should block shutdown is more of a decision for the caller IMO; the task itself should verify safety requirements (i.e., again, AssertIOAllowed()). As such we never believed a base::AssertShutdownBehavior(TaskShutdownBehavior behavior) method was necessary, but maybe a base::AssertTaskCantOutliveMain() or something is useful (i.e. assert !CONTINUE_ON_SHUTDOWN), as that's the only dangerous thing a task may have a dependency on regarding its environment's TaskShutdownBehavior. We haven't added that to date as CONTINUE_ON_SHUTDOWN == SKIP_ON_SHUTDOWN in the aforementioned "atomic shutdown" world, but it might be a good thing in the meantime, WDYT?

It still seems like a layering violation to me. Let's say I have a DoFoo method that touches something that may disappear during shutdown, so I figure it must be SKIP_ON_SHUTDOWN. I put that in a doc comment and add an AssertShutdownBehavior(SKIP_ON_SHUTDOWN) or something like that to be sure that no callers invoke it on a CONTINUE_ON_SHUTDOWN task. All is well until I "optimize" DoFoo so that it can freely operate on a C_O_S task. Now I have to trawl through the code to find all callers and, where possible, flip them from S_O_S to C_O_S. Is there a better way?

I think the shutdown behavior is a combination of implementation details of the callee (e.g., does it touch something that is destroyed during shutdown) and of the caller (e.g., is the work being posted so important that we really hope it will complete before process death). Can these two factors be expressed independently somehow?

Gabriel Charette

Jun 28, 2017, 12:38:51 PM
to Greg Thompson, Gabriel Charette, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org

What you express is identical to the pre-existing AssertIOAllowed() situation though (e.g. we decided long ago that registry access on Windows didn't involve I/O and removed the restriction there, but some old code still hops to the FILE thread today to do registry operations).

I'm not saying this is great, but it's a pre-existing paradigm in chromium to request an execution context from the caller and have assertions in the tasks to confirm it's what they got. We hope to make this more local with the new API (by encouraging leaf components to create a task runner as fits, instead of requesting a plumbed runner with specific traits), but we aren't completely removing this paradigm either (and regarding shutdown behaviors, this was also true before with SequencedWorkerPool).

Re. WDYT of an assert for shutdown safety?

François Doray

Jun 28, 2017, 1:23:22 PM
to Gabriel Charette, Greg Thompson, Chromium-dev, Robert Liao, benh...@chromium.org
I believe that the code that posts a task should decide whether the task has to be scheduled before shutdown completes while the task itself should decide whether shutdown can complete while it is running.

What do you think of this API change?

1. Replace the TaskShutdownBehavior trait with a MustBeScheduledBeforeShutdown() trait.

E.g.

// TaskA will be scheduled before shutdown completes.
base::PostTaskWithTraits(FROM_HERE, {base::MustBeScheduledBeforeShutdown()}, base::BindOnce(&TaskA));

// Shutdown can complete before TaskB is scheduled. No task is scheduled once shutdown is complete.
base::PostTaskWithTraits(FROM_HERE, {}, base::BindOnce(&TaskB));

2. Add base::AllowShutdownToComplete().

E.g.

void TaskA() {
  PersistStateOnDisk();
  base::AllowShutdownToComplete();
  DoMoreWorkThatIsNotSoImportant();
}

void TaskB() {
  DoSomeWorkThatAssumesThatShutdownIsNotComplete();
  base::AllowShutdownToComplete();
  LongOperationThatShouldNotBlockShutdown();
}

Unless someone can think of a reason to have a task that must be scheduled before shutdown other than to persist state, base::MustBeScheduledBeforeShutdown() could be base::PersistsState().

Gabriel Charette

Jun 28, 2017, 7:01:31 PM
to Chromium-dev, Francois Pierre Doray, Gabriel Charette, Greg Thompson, asvi...@google.com, Ben Henry, Robert Liao
I like half of it but don't see how to make the other half work.. see below.

On Wed, Jun 28, 2017 at 11:21 AM, François Doray <fdo...@chromium.org> wrote:
I believe that the code that posts a task should decide whether the task has to be scheduled before shutdown completes while the task itself should decide whether shutdown can complete while it is running.

I agree with this statement 100%.


What do you think of this API change?

1. Replace the TaskShutdownBehavior trait with a MustBeScheduledBeforeShutdown() trait.

That'd work (I'd just call it BlocksShutdown()).


E.g.

// TaskA will be scheduled before shutdown completes.
base::PostTaskWithTraits(FROM_HERE, {base::MustBeScheduledBeforeShutdown()}, base::BindOnce(&TaskA));

// Shutdown can complete before TaskB is scheduled. No task is scheduled once shutdown is complete.
base::PostTaskWithTraits(FROM_HERE, {}, base::BindOnce(&TaskB));

2. Add base::AllowShutdownToComplete().

That's not feasible IMO as it can't be the default (or it's as dangerous as before), but we also can't require every method that doesn't touch shutdown-unsafe state (i.e. most of them) to explicitly call this.


E.g.

void TaskA() {
  PersistStateOnDisk();
  base::AllowShutdownToComplete();
  DoMoreWorkThatIsNotSoImportant();
}

void TaskB() {
  DoSomeWorkThatAssumesThatShutdownIsNotComplete();
  base::AllowShutdownToComplete();
  LongOperationThatShouldNotBlockShutdown();
}

Unless someone can think of a reason to have a task that must be scheduled before shutdown other than to persist state, base::MustBeScheduledBeforeShutdown() could be base::PersistsState().

Persisting state is a reason to block shutdown (a decision for the caller); it's not the reason a method should label itself shutdown-unsafe. A method is shutdown-unsafe if it accesses state uninitialized by the main thread on shutdown when main() unfolds, e.g. via AtExitManager or static uninitialization. The simplest solution is to just get rid of that possibility by terminating the process instead of winding down. I/we simply have not had time to figure out the side effects of that and whether it's safe (I'm pretty sure it's okay). Once nothing can be shutdown-unsafe, we merely need BlocksShutdown() (i.e. skip/continue become the same).

Gabriel Charette

Jun 28, 2017, 7:10:21 PM
to Gabriel Charette, Chromium-dev, Francois Pierre Doray, Greg Thompson, asvi...@google.com, Ben Henry, Robert Liao
PS: I can see why it's tempting to say that being in the middle of persisting state should hold shutdown but the reason I'm saying it doesn't is that 1) all writes should be resilient to crashes and 2) important writes should already be block shutdown. That leaves unimportant writes which should already be resilient to being terminated midway and I don't see why we'd want to hold shutdown iff they're in flight (SKIP_ON_SHUTDOWN semantics).

Matt Falkenhagen

Jul 23, 2017, 11:29:21 PM
to g...@chromium.org, Greg Thompson, Chromium-dev, Robert Liao, Francois Pierre Doray, benh...@chromium.org, Hiroki Nakagawa
I have the same issue as Greg. The service worker database task manager allows for mixed shutdown behaviors in order to support the "Clear data on exit" browsing setting, which requires deleting service workers when the browser/profile exits. We settled on this design in the code review: https://codereview.chromium.org/633873002

Are you saying this design doesn't work today, and even though we use the same sequence token, the tasks are not necessarily run in the order they are posted?

Do you plan on adding the feature someday? If so, it seems like for this particular issue, we might just make all tasks BLOCK_SHUTDOWN until we can reliably make only the "Clear on exit" task block shutdown.


François Doray

Jul 24, 2017, 9:49:11 AM
to Matt Falkenhagen, g...@chromium.org, Greg Thompson, Chromium-dev, Robert Liao, benh...@chromium.org, Hiroki Nakagawa
On Sun, Jul 23, 2017 at 11:27 PM Matt Falkenhagen <fal...@chromium.org> wrote:
I have the same issue as Greg. The service worker database task manager allows for mixed shutdown behaviors in order to support the "Clear data on exit" browsing setting, which requires deleting service workers when the browser/profile exits. We settled on this design in the code review: https://codereview.chromium.org/633873002
 


Are you saying this design doesn't work today, and even though we use the same sequence token, the tasks are not necessarily run in the order they are posted?

Tasks posted with the same sequence token are guaranteed to run one at a time in posting order.

However, creating SequencedTaskRunners with the same sequence token but different shutdown behaviors is not supported. All SequencedTaskRunners created with the same sequence token will have the shutdown behavior of the first SequencedTaskRunner created with that sequence token.
 

Do you plan on adding the feature someday? If so, it seems like for this particular issue, we might just make all tasks BLOCK_SHUTDOWN until we can reliably make only the "Clear on exit" task block shutdown.

Yes, we plan to add this feature someday.

In the meantime, making all tasks BLOCK_SHUTDOWN is a good solution. (FYI, GetSequencedTaskRunner creates a BLOCK_SHUTDOWN runner, so all tasks have always been BLOCK_SHUTDOWN anyway; source.)
 

Matt Falkenhagen

Jul 24, 2017, 9:59:56 AM
to François Doray, g...@chromium.org, Greg Thompson, Chromium-dev, Robert Liao, benh...@chromium.org, Hiroki Nakagawa
On Mon, Jul 24, 2017 at 10:47 PM, François Doray <fdo...@chromium.org> wrote:

On Sun, Jul 23, 2017 at 11:27 PM Matt Falkenhagen <fal...@chromium.org> wrote:
I have the same issue as Greg. The service worker database task manager allows for mixed shutdown behaviors in order to support the "Clear data on exit" browsing setting, which requires deleting service workers when the browser/profile exits. We settled on this design in the code review: https://codereview.chromium.org/633873002
 

Ah yes, that was a totally broken link above. Yes, that's the code.
 

Are you saying this design doesn't work today, and even though we use the same sequence token, the tasks are not necessarily run in the order they are posted?

Tasks posted with the same sequence token are guaranteed to run one at a time in posting order.

However, creating SequencedTaskRunners with the same sequence token but different shutdown behaviors is not supported. All SequencedTaskRunners created with the same sequence token will have the shutdown behavior of the first SequencedTaskRunner created with that sequence token.
 

Do you plan on adding the feature someday? If so, it seems like for this particular issue, we might just make all tasks BLOCK_SHUTDOWN until we can reliably make only the "Clear on exit" task block shutdown.

Yes, we plan to add this feature someday.

In the meantime, making all tasks BLOCK_SHUTDOWN is a good solution (FYI, GetSequencedTaskRunner creates a BLOCK_SHUTDOWN runner, so all tasks have always been BLOCK_SHUTDOWN anyway; see source).

Thanks, makes sense. I'll make everything BLOCK_SHUTDOWN for now.
 
 

Gabriel Charette

unread,
Jul 24, 2017, 10:03:49 AM7/24/17
to Matt Falkenhagen, g...@chromium.org, Chromium-dev, Francois Pierre Doray, Greg Thompson, Hiroki Nakagawa, Robert Liao, benh...@chromium.org
(I was writing this at the moment of fdoray's reply; posting anyway as extra information)

Tasks posted to a SequencedTaskRunner should run in posting order (that's what the API has stated since its inception). If a scheduler allowed mixing shutdown behaviors on the same sequence, then technically a BLOCK_SHUTDOWN task at the end of the queue would have to promote the pending SKIP_ON_SHUTDOWN tasks ahead of it to also block shutdown, since if those were skipped the ordering contract wouldn't be respected. FWIW: SequencedWorkerPool was breaking that contract, unintentionally I think, by skipping tasks on shutdown without looking at sequence membership. TaskScheduler has Sequence objects as first-class citizens, and the shutdown behavior is associated with the Sequence (i.e. with an entire SequencedTaskRunner) instead of with each task.

We could provide an API to toggle the TaskShutdownBehavior of an entire sequence at runtime (we are already considering one to toggle TaskPriority: https://crbug.com/747495).

Ultimately we would like to allow sequence-funneling (multiple SequencedTaskRunners funneling into a single sequence, so that you get mutual exclusion without having to force strict ordering of all mutually exclusive tasks). Then you could have two mutually exclusive SequencedTaskRunners with different shutdown behaviors and the contracts would be respected.

For now, the only thing that will work is "sequence bumping": you can "promote" a sequence by replacing it, so long as you control mutual exclusion yourself. Here's how this can be implemented.

Say you post your work through a member which is SKIP_ON_SHUTDOWN by default:

scoped_refptr<SequencedTaskRunner> background_runner_;

You can manually do this as a hack if you never used your sequence for delayed tasks:

// Note: this is a hack and only works if no delayed tasks were ever posted to
// |background_runner_|.
scoped_refptr<base::DeferredSequencedTaskRunner> new_runner(
    base::MakeRefCounted<base::DeferredSequencedTaskRunner>(
        base::CreateSequencedTaskRunnerWithTraits({ ..... })));

// Only Start() |new_runner| after all pending tasks on |background_runner_|
// have completed. Note: you will also need to DetachFrom(Sequence|Thread)() any
// associated (Sequence|Thread)Checker right before Start(), so you may want to
// bind a helper here instead of a direct call to Start().
background_runner_->PostTask(
    FROM_HERE,
    base::Bind(&base::DeferredSequencedTaskRunner::Start, new_runner));

// Post all new tasks to |new_runner|.
background_runner_ = new_runner;




