Re: Chrome Mac OOM + Reporting API


Avi Drissman

Jul 3, 2024, 11:26:01 AM
to Ian Clelland, Keishi Hattori, content-owners, Mark Mentovai, Chrome Mac Dev, Nicholas Chen, chrome-speed-metrics-core
+Keishi Hattori from PartitionAlloc
+content-owners 

This may well not be wired up. We'd need to be able to distinguish an OOM from other assertion failures (thus PA), and we'd need to work out how to get that information to the reporting app (how does Linux do this? return code?).

Mark has relevant expertise so we may need to wait for him. 

Content owners: does anyone have knowledge about how process termination is communicated?

Avi

On Wed, Jul 3, 2024 at 11:13 AM Ian Clelland <icle...@google.com> wrote:
+some helpful mailing lists (thanks to Mark's OOO message for reminding me)

On Wed, Jul 3, 2024 at 11:09 AM Ian Clelland <icle...@google.com> wrote:
Hey Folks!

I'm working with the Workspace team to try to improve Chrome's implementation of Crash Reporting (the web-exposed reports, not go/crash), and they have noticed some issues with the reports that are sent from Mac clients.

Specifically, there are almost *no* OOM reports sent. We send three kinds of reports to reporting endpoints -- "OOM", "tab killed for being unresponsive", and a generic "something else happened" (usually a renderer CHECK crash). On macOS, we see almost exclusively the last two. (On Windows, by contrast, approximately 56% of crashes are OOM.)

My suspicion is that we're just not distinguishing between a process killed by a CHECK and a process killed because of an allocation failure, and everything is being reported as a generic renderer crash. In code, it looks like we just issue a PA_IMMEDIATE_CRASH(), which, to the observing browser, does not appear any different from any other crash.

The crash reporting code starts in RenderFrameHostImpl::RenderProcessGone - at which point it looks like any distinction between crash types has been lost.
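
To illustrate why the lost distinction matters, here is a minimal sketch (not Chromium's code) of choosing a report reason from a termination kind. The enum and function names are hypothetical, and the reason strings "oom" and "unresponsive" are assumed to correspond to the first two report types described above. If an allocation failure arrives labelled as a generic crash, no OOM reason can be attached.

```cpp
#include <optional>
#include <string>

// Hypothetical termination categories for illustration only; these are not
// Chromium's actual enum values.
enum class Termination { kOom, kUnresponsiveKill, kOtherCrash };

// Returns the reason string to attach to a crash report, or std::nullopt for
// the generic "something else happened" case.
std::optional<std::string> CrashReportReason(Termination termination) {
  switch (termination) {
    case Termination::kOom:
      return "oom";
    case Termination::kUnresponsiveKill:
      return "unresponsive";
    case Termination::kOtherCrash:
      return std::nullopt;
  }
  return std::nullopt;
}
```

Under this sketch, a platform that can only ever produce `Termination::kOtherCrash` for an allocation failure would never emit an "oom" reason, which matches what we see from Mac clients.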

I don't know enough about macOS internals to know whether this is something that we can fix, but Workspace is very interested in having this addressed, and it seems like a bug that we *should* fix if we can.

I'm happy to jump on chat or VC if this needs more explanation. Right now, I'm wondering - is this likely to be a simple fix, or something that takes significant engineering from the Mac team, and if it's the second, what it would take to prioritize it?

Thanks!
Ian

danakj

Jul 3, 2024, 11:48:55 AM
to Avi Drissman, Ian Clelland, Keishi Hattori, content-owners, Mark Mentovai, Chrome Mac Dev, Nicholas Chen, chrome-speed-metrics-core
Can the location of the crash be used to determine if it's due to OOM? I would think that's how it is done on other platforms where (it sounds like) this is working.


danakj

Jul 3, 2024, 11:50:12 AM
to Avi Drissman, Ian Clelland, Keishi Hattori, content-owners, Mark Mentovai, Nicholas Chen
Removing the @google groups from the thread - please don't mix chromium and google groups :)

Marijn Kruisselbrink

Jul 3, 2024, 12:04:50 PM
to Avi Drissman, Ian Clelland, Keishi Hattori, content-owners, Mark Mentovai, Chrome Mac Dev, Nicholas Chen, chrome-speed-metrics-core
I remember this question having come up previously. I think the conclusion was that on anything other than Windows we can't distinguish OOM crashes from other crashes locally. That is, in base/allocator/partition_allocator/src/partition_alloc/oom.cc, Windows has special code to report OOM, while on other platforms it's just a regular crash.
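
As a rough POSIX illustration of the "return code" idea raised earlier in the thread, the sketch below has a child process exit with a reserved code on a simulated allocation failure, so the parent can tell it apart from a CHECK-style abort. This is an assumption-laden sketch, not how Chromium or PartitionAlloc behaves: `kHypotheticalOomExitCode`, `RunChild`, and `Classify` are made-up names, and a real renderer would also have to contend with the sandbox and the crash handler.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstdlib>
#include <optional>

// Hypothetical exit code reserved for allocation failure. Not a real
// Chromium or PartitionAlloc value.
constexpr int kHypotheticalOomExitCode = 77;

enum class ChildFate { kCleanExit, kOom, kOtherCrash };

// Forks a child that either exits with the reserved OOM code or aborts like a
// failed CHECK. Returns the raw wait status seen by the parent, or
// std::nullopt if fork() or waitpid() failed.
std::optional<int> RunChild(bool simulate_oom) {
  pid_t pid = fork();
  if (pid < 0) {
    return std::nullopt;
  }
  if (pid == 0) {
    if (simulate_oom) {
      _exit(kHypotheticalOomExitCode);  // Distinct, recognizable exit code.
    }
    std::abort();  // Generic crash: the parent sees death by SIGABRT.
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) {
    return std::nullopt;
  }
  return status;
}

// Interprets a wait status from the parent's point of view.
ChildFate Classify(int status) {
  if (WIFEXITED(status)) {
    const int code = WEXITSTATUS(status);
    if (code == 0) {
      return ChildFate::kCleanExit;
    }
    if (code == kHypotheticalOomExitCode) {
      return ChildFate::kOom;
    }
    return ChildFate::kOtherCrash;
  }
  // Killed by a signal (SIGABRT, SIGSEGV, SIGTRAP, ...): nothing here says
  // whether an allocation failure was the cause.
  return ChildFate::kOtherCrash;
}
```

The signal branch is the situation described above for non-Windows platforms: once the process dies by a generic crash signal, the parent has no local way to recover the OOM cause.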

On Wed, Jul 3, 2024 at 8:26 AM Avi Drissman <a...@chromium.org> wrote:
I thought we generally try to avoid mixing public groups and internal groups on the same email thread?

Avi Drissman

Jul 3, 2024, 12:06:40 PM
to Marijn Kruisselbrink, Ian Clelland, Keishi Hattori, content-owners, Mark Mentovai, Chrome Mac Dev, Nicholas Chen, chrome-speed-metrics-core
My apologies. We do avoid mixing them; I got confused as to their statuses.