TL;DR
TestingProfile may destroy the profile directory while some background task still works with files inside of the directory. This leads to obscure flaky disk I/O errors in sqlite code: https://crbug.com/839886, https://crbug.com/839371.
Many tests use TestingProfile with the TestBrowserThreadBundle, for example, all subclasses of ChromeRenderViewHostTestHarness. After a test finishes, TestingProfile is destroyed on the UI thread together with the temporary profile directory, before all background tasks are executed.
This creates a problem because the code dealing with databases that are put into the profile directory usually offloads all disk I/O operations to a background task sequence. Some of such code may be executed in unit tests, often unintentionally. If a disk I/O operation executes after the profile directory is deleted, it will fail. It wouldn't be such a problem if sqlite doesn't treat disk I/O errors as DCHECK errors. I observed a similar problem with LevelDB databases but LevelDB only produces a warning in this case.
I've prepared a CL that deflakes this error and many unit tests start failing on all platforms.
I see a couple of different solutions to this problem:
Leave this race condition as is but don't call DCHECK on disk I/O failures in sqlite. This work is tracked here: https://crbug.com/839186.
Destroy the testing profile directory only after all background tasks are executed. It can be done for ChromeRenderViewHostTestHarness but it's not clear how to achieve this across all tests because the TestingProfile and the TestBrowserThreadBundle are separate abstractions.
Do not execute the DB code in tests that don't expect that. It is hard to do given that tests can pull a lot of dependencies.
Inject main thread task runners instead of background task runners for the DB code for testing.
Replace on-disk DBs with in-memory DBs for testing.
Any ideas?
--
You received this message because you are subscribed to the Google Groups "scheduler-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scheduler-de...@chromium.org.
To post to this group, send email to schedu...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/scheduler-dev/CAH542OQ1bCq%2BeVimcDvL4nC22krHitGT%2BH%3DsA1bFfwEqMx3hwg%40mail.gmail.com.
TL;DR
TestingProfile may destroy the profile directory while some background task still works with files inside of the directory. This leads to obscure flaky disk I/O errors in sqlite code: https://crbug.com/839886, https://crbug.com/839371.
Many tests use TestingProfile with the TestBrowserThreadBundle, for example, all subclasses of ChromeRenderViewHostTestHarness. After a test finishes, TestingProfile is destroyed on the UI thread together with the temporary profile directory, before all background tasks are executed.
This creates a problem because the code dealing with databases that are put into the profile directory usually offloads all disk I/O operations to a background task sequence. Some of such code may be executed in unit tests, often unintentionally. If a disk I/O operation executes after the profile directory is deleted, it will fail. It wouldn't be such a problem if sqlite doesn't treat disk I/O errors as DCHECK errors. I observed a similar problem with LevelDB databases but LevelDB only produces a warning in this case.
I've prepared a CL that deflakes this error and many unit tests start failing on all platforms.
I see a couple of different solutions to this problem:
Leave this race condition as is but don't call DCHECK on disk I/O failures in sqlite. This work is tracked here: https://crbug.com/839186.
Destroy the testing profile directory only after all background tasks are executed. It can be done for ChromeRenderViewHostTestHarness but it's not clear how to achieve this across all tests because the TestingProfile and the TestBrowserThreadBundle are separate abstractions.
On Thu, Jun 7, 2018 at 7:31 PM Alexandr Ilin <alex...@chromium.org> wrote:TL;DR
TestingProfile may destroy the profile directory while some background task still works with files inside of the directory. This leads to obscure flaky disk I/O errors in sqlite code: https://crbug.com/839886, https://crbug.com/839371.
Many tests use TestingProfile with the TestBrowserThreadBundle, for example, all subclasses of ChromeRenderViewHostTestHarness. After a test finishes, TestingProfile is destroyed on the UI thread together with the temporary profile directory, before all background tasks are executed.
This creates a problem because the code dealing with databases that are put into the profile directory usually offloads all disk I/O operations to a background task sequence. Some of such code may be executed in unit tests, often unintentionally. If a disk I/O operation executes after the profile directory is deleted, it will fail. It wouldn't be such a problem if sqlite doesn't treat disk I/O errors as DCHECK errors. I observed a similar problem with LevelDB databases but LevelDB only produces a warning in this case.
I've prepared a CL that deflakes this error and many unit tests start failing on all platforms.
Oh, that's a really nice way to show about 300 tests flaky for a known reason!PS: I thought the definition of 'deflake' implies making tests PASS, but this one makes flaky tests FAIL more reproducibly, is that correct?I see a couple of different solutions to this problem:
Leave this race condition as is but don't call DCHECK on disk I/O failures in sqlite. This work is tracked here: https://crbug.com/839186.
I think this may cause other errors with these databases, just more subtly.
Destroy the testing profile directory only after all background tasks are executed. It can be done for ChromeRenderViewHostTestHarness but it's not clear how to achieve this across all tests because the TestingProfile and the TestBrowserThreadBundle are separate abstractions.
Isn't it already done for ChromeRenderViewHostTestHarness by calling thread_bundle_.reset() in RenderViewHostTestHarness::TearDown()? If it works this way, then probably the recommended solution is to use TestBrowserThreadBundle in more places? .. though my understanding of the latter is quite limited.
I see thread_bundle_.RunUntilIdle() in various other teardowns supposedly for the same reason.
Do not execute the DB code in tests that don't expect that. It is hard to do given that tests can pull a lot of dependencies.
Inject main thread task runners instead of background task runners for the DB code for testing.
Replace on-disk DBs with in-memory DBs for testing.
Any ideas?
--
You received this message because you are subscribed to the Google Groups "scheduler-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scheduler-de...@chromium.org.
To post to this group, send email to schedu...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/scheduler-dev/CAH3q7_kEtXhH_4fqUDkXP0do%3DV5Nuvcv9%2Bsbdnn%2B1kKV6H4jVg%40mail.gmail.com.