Re: RunLoops and mocked timesource

82 views
Skip to first unread message

Gabriel Charette

unread,
May 28, 2024, 10:55:17 AMMay 28
to Georg Neis, Jeroen Dhollander, scheduler-dev, Wez, Eliot Courtney
+scheduler-dev (public) as FYI to the wider group.

On Tue, May 28, 2024 at 10:52 AM Gabriel Charette <g...@google.com> wrote:
tl;dr; I've convinced myself that MOCK_TIME + RunUntil is okay and have written https://chromium-review.googlesource.com/c/chromium/src/+/5576315 to make it enable it :).
================================

Interesting, I hadn't fully grokked base::RunUntil (it's a ~recent addition to Run methods). @Jeroen Dhollander added it a year ago. Let's see...

If we drop that CHECK (allow MOCK_TIME as-is), here's what I think should happen:

MOCK_TIME causes `TaskEnvironment::MockTimeDomain` to be hooked @

which will fast-forward to the next delayed task on idle, before the CurrentThread::RegisterOnNextIdleCallback() registrations used by base::RunUntil kick in @

As intended (MOCK_TIME makes the time move forward instead of going idle).

Because RunLoopTimeouts use PostDelayedTask @

This would result in fast-forwarding to the timeout before the base::RunUntil predicate is even checked once... If it wasn't for MOCK_TIME nullifying the RunLoopTimeout @

Therefore, the RunLoop won't timeout, hence the claim that it could also never return. This doesn't seem worse than RunLoop::Run() so maybe we could loosen that requirement?

Now, why doesn't it immediately return per your deterministic predicate? Seems your predicate is backwards (returning false causes it to never be satisfied as opposed to always)?
If I make it `base::test::RunUntil([]() { return true; })` locally, it works (with both CHECKs disabled).

I've considered updating the base::RunUntil CHECKs to:
  // We expect a RunLoop timeout except under MOCK_TIME where
  // TaskEnvironment::mock_time_domain_ disables to after fast-forwarding into
  // the timeout...
  CHECK(test::ScopedRunLoopTimeout::ExistsForCurrentThread() ||
        subtle::ScopedTimeClockOverrides::overrides_active())
      << "No RunLoop timeout set, meaning `RunUntil` will hang forever on "
         "failure.";

(and it passes your modified test)

But the problem with that is that it will only work if the main thread is woken up after the condition is updated. Come to think of it though, that's also true outside of MOCK_TIME and is a limitation of that UI.

Given all of this, I've now convinced myself that MOCK_TIME is compatible with RunUntil (failure modes are the same). I've written https://chromium-review.googlesource.com/c/chromium/src/+/5576315 to convince myself and make it so.

Cheers!
Gab

PS: I'm OOO until July starting tomorrow so this is a bad timing but you piqued my curiosity and I had to get to the bottom of it before heading out. Feel free to do whatever you all please with this CL while I'm away.

On Sun, May 26, 2024 at 10:19 PM Georg Neis <ne...@google.com> wrote:
The "hang forever" comment is the CHECK failure message I posted:

2024-05-17T08:08:17.127139Z FATAL browser_tests[85646:85646]:
[run_until.cc(36)] Check failed:
!subtle::ScopedTimeClockOverrides::overrides_active(). Mocked
timesource detected, which would cause `RunUntil` to hang forever on
failure.

> It'd indeed CHECK as soon as the process is idle under mock time.

Like I said, that's not what I'm observing. Maybe I'm testing wrong,
i.e. the process doesn't become idle in the test below?

class MyUnitTest : public testing::Test {
 public:
   MyUnitTest() :
     task_environment_(base::test::TaskEnvironment::TimeSource::MOCK_TIME)
{}
 private:
    base::test::SingleThreadTaskEnvironment task_environment_;
};

TEST_F(MyUnitTest, Bla) {
  ASSERT_TRUE(base::test::RunUntil([]() { return false; }));
// after removing the CHECK there
  // Or: base::RunLoop().Run();
}

In either case I only reach the 45s test launcher timeout, no other
failure / timeout.

Georg
On Fri, May 24, 2024 at 10:58 PM Gabriel Charette <g...@google.com> wrote:
>
> Not sure which "hang forever" comment you're referring to.
>
> It'd indeed CHECK as soon as the process is idle under mock time.
>
> We can work around that by handling the timeout differently : by checking real-time and sleeping+reposting if real-time isn't yet too late.
>
> On Fri, May 24, 2024 at 4:34 AM Georg Neis <ne...@google.com> wrote:
>>
>> Hi,
>>
>> thanks for the comments. I'd be happy to work on this but I'm sure I'd
>> need help from you guys.
>>
>> > Like any other delayed task under mock time, it would be fast-forwarded to in the event this process has no more work to do (i.e. it wouldn't act as a real-time timeout).
>>
>> This sounds like the test would therefore time out (i.e. the timeout
>> task get executed) as soon as the process becomes idle. That's
>> different from "hang forever" that the CHECK comment mentions. In my
>> experiments I'm indeed observing that the runloop timeout isn't hit
>> (instead, the 45s test launcher timeout gets hit eventually). I've
>> tried with both a browser test using ScopedTimeClockOverrides and a
>> unit test using TaskEnvironment with the MOCK_TIME trait. Or did I
>> misunderstand your comment?
>>
>> Georg
>>
>>
>> On Thu, May 23, 2024 at 9:02 PM Gabriel Charette <g...@google.com> wrote:
>> >
>> > Hi Georg,
>> >
>> > These code TODOs are indeed the right place to address the problem. Oops that the progress tracking bugs on the overall features were closed without updating these to an open bug.
>> >
>> > The issue is that the timeout is currently posted as a delayed task. Like any other delayed task under mock time, it would be fast-forwarded to in the event this process has no more work to do (i.e. it wouldn't act as a real-time timeout). While there is a way to ask for the real time under mock time (subtle::TimeTicksNowIgnoringOverride()), there's currently no way to post a delayed task which ignores the override. We would want to be super careful if we added support for this as one of the design principles of mock time is that ~everything is mocked.
>> >
>> > One way we could potentially do this without modifying the underlying task system would be to check subtle::TimeTicksNowIgnoringOverride() when it fires, PlatformThread::Sleep(Milliseconds(1)) if the real-time timeout hasn't expired, and post-self again. This would infinite loop until the real-time timeout is hit (but the sleep would prevent a busy loop thrashing the CPU).
>> >
>> > Wanna take a stab at that?
>> >
>> > - Gab
>> >
>> > On Thu, May 23, 2024 at 4:12 AM Georg Neis <ne...@google.com> wrote:
>> >>
>> >> Hi Wez and Gabriel,
>> >>
>> >> I'd like to call some test utility function in (most) Lacros browser
>> >> tests. That function currently uses RunUntil internally, and my change
>> >> is failing some tests:
>> >>
>> >> 2024-05-17T08:08:17.127139Z FATAL browser_tests[85646:85646]:
>> >> [run_until.cc(36)] Check failed:
>> >> !subtle::ScopedTimeClockOverrides::overrides_active(). Mocked
>> >> timesource detected, which would cause `RunUntil` to hang forever on
>> >> failure.
>> >>
>> >> Do I understand correctly that the same problem (hanging forever on
>> >> failure) would happen in these tests if we were to replace the
>> >> RunUntil use with a regular RunLoop + observer?
>> >>
>> >> What does it take to get rid of this limitation?
>> >>
>> >> In the code, I've found these two TODOs:
>> >> https://source.chromium.org/chromium/chromium/src/+/main:base/run_loop.cc;drc=c0265133106c7647e90f9aaa4377d28190b1a6a9;l=119
>> >> https://source.chromium.org/chromium/chromium/src/+/main:base/test/task_environment.cc;drc=8c29f4a8930c3ccccdf1b66c28fe484cee7c7362;l=454
>> >>
>> >> Are they about the same thing? Both associated bugs are marked as
>> >> Fixed but I guess the issue still exists?
>> >>
>> >> Happy to do a call if it helps.
>> >>
>> >> Thanks,
>> >>  Georg
Reply all
Reply to author
Forward
0 new messages