hewro
Mar 26, 2026, 11:17:14 AM
to Chromium-dev
Hi Chromium folks,
I’m working on responsiveness monitoring for a Chromium-based application. We report a UI stall when the browser main (UI) thread produces no heartbeat for more than 6 seconds. The heartbeat is application-defined: the UI thread periodically posts to a watchdog thread.
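For concreteness, here is a simplified sketch of this kind of heartbeat check, using only the C++ standard library. The names `Heartbeat`, `Beat`, and `IsStalled` are hypothetical and are not from our codebase or from Chromium; the 6-second default is the threshold mentioned above. Time is passed in explicitly so the logic can be exercised without real threads.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical heartbeat state shared between the UI thread and a watchdog
// thread. Illustrative only; not Chromium code.
class Heartbeat {
 public:
  using Clock = std::chrono::steady_clock;

  explicit Heartbeat(Clock::time_point now) : last_beat_ns_(ToNs(now)) {}

  // Called from a task that the UI thread runs periodically.
  void Beat(Clock::time_point now) {
    last_beat_ns_.store(ToNs(now), std::memory_order_relaxed);
  }

  // Called from the watchdog thread. Returns true when the last heartbeat
  // is older than `threshold`.
  bool IsStalled(Clock::time_point now,
                 std::chrono::seconds threshold = std::chrono::seconds(6)) const {
    const std::int64_t last = last_beat_ns_.load(std::memory_order_relaxed);
    const std::int64_t limit =
        std::chrono::duration_cast<std::chrono::nanoseconds>(threshold).count();
    return ToNs(now) - last > limit;
  }

 private:
  static std::int64_t ToNs(Clock::time_point t) {
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
               t.time_since_epoch())
        .count();
  }

  // Timestamp of the most recent heartbeat, in nanoseconds on Clock.
  std::atomic<std::int64_t> last_beat_ns_;
};
```

In this sketch the watchdog only learns that the heartbeat is late; it has no information about why, which is where the wait-path stacks below come in.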
A noticeable portion of these reports capture a UI-thread stack that shows only the normal message-loop wait path, with no product/business logic frames above it.
On Windows, stacks often look like:
- `ZwUserMsgWaitForMultipleObjectsEx`
- `user32!RealMsgWaitForMultipleObjectsEx`
- `base::MessagePumpForUI::WaitForWork`
- `base::MessagePumpForUI::DoRunLoop`
On macOS, they often look like:
- `_mach_msg`
- `CFRunLoopRunSpecific`
- `-[NSApplication run]`
- `base::MessagePumpNSApplication::DoRun`
Our current interpretation is mainly:
1. the UI thread/process was not scheduled for a long time due to system-level reasons (high load, suspension, etc.), or
2. the sampled stack is not the actual hang site, and the thread had already returned to the message loop by the time we captured it.
We also account for possible watchdog-thread delay on our side, with separate telemetry to detect that.
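To make the triage above concrete, here is a rough sketch of how a report could be tagged along these lines. Everything in it is hypothetical: the `StallReport` fields, the `StallTag` names, and the 1-second lateness tolerance are illustrative assumptions, not our actual telemetry schema. The last tag is only a fallback; it does not prove that the thread was not scheduled.

```cpp
#include <chrono>

// Hypothetical tags mirroring the interpretations listed above.
enum class StallTag {
  kWatchdogDelayed,       // the watchdog thread itself woke up late
  kStackLikelyStale,      // interpretation 2: thread was back in the loop
  kPossiblyNotScheduled,  // interpretation 1: fallback, unverified
};

// Hypothetical per-report signals collected alongside the stack.
struct StallReport {
  // How late the watchdog woke up relative to its own timer.
  std::chrono::milliseconds watchdog_wakeup_lateness;
  // Whether a heartbeat arrived between stall detection and stack capture.
  bool heartbeat_seen_before_capture;
};

// Tags a report. The tolerance default is an assumed value.
inline StallTag TagStall(const StallReport& r,
                         std::chrono::milliseconds lateness_tolerance =
                             std::chrono::milliseconds(1000)) {
  if (r.watchdog_wakeup_lateness > lateness_tolerance) {
    return StallTag::kWatchdogDelayed;
  }
  if (r.heartbeat_seen_before_capture) {
    return StallTag::kStackLikelyStale;
  }
  return StallTag::kPossiblyNotScheduled;
}
```

Checking watchdog lateness first keeps reports caused by our own monitoring from being counted against the UI thread.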
For this kind of pure wait-path stack, is that the right way to think about it from the Chromium team’s perspective? Are there other common explanations we should consider?
Also, are there any recommended diagnostics, heuristics, or practical mitigation tricks for investigating and reducing this kind of report?
Thanks!