Crash Ingestion Percentage


Tom Ritter

Apr 16, 2025, 9:32:29 AM
to crash-reporting-wg
A colleague and I were discussing - what percentage of crash reports wind up on crash-stats for each channel?  I had it in my head that it was 100% for Nightly (and Beta?) but only a percentage of the release ones?  But if you manually submitted one from release then it definitely got processed.

Is that right?  It seems like it would be useful to have a sentence to that effect on the crash-stats homepage just for reference, I can try to do a PR if you agree.

-tom

Gian-Carlo Pascutto

Apr 16, 2025, 12:50:48 PM
to crash-rep...@mozilla.com
The code is here, the rules have various exceptions.

https://github.com/mozilla-services/antenna/blob/main/antenna/throttler.py#L399
https://github.com/mozilla-services/antenna/blob/main/antenna/throttler.py#L505
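
For readers who have not opened those files, here is a rough sketch of what a first-match throttling rule list can look like. It is not the actual Antenna code: the `Rule` class, the `throttle` function, the field names (`submitted_manually`, `channel`), and the 10% figure are all hypothetical, chosen only to mirror the guess in the original question.

```python
import random
from dataclasses import dataclass
from typing import Callable

ACCEPT, REJECT = "accept", "reject"


@dataclass
class Rule:
    """Hypothetical rule: a predicate plus the percentage of matches to accept."""
    name: str
    matches: Callable[[dict], bool]
    accept_percent: float  # 0 to 100


def throttle(report, rules, rng=random.random):
    """Return (decision, rule_name) from the first rule that matches the report."""
    for rule in rules:
        if rule.matches(report):
            # rng() is in [0, 1), so accept_percent=100 always accepts.
            decision = ACCEPT if rng() * 100 < rule.accept_percent else REJECT
            return decision, rule.name
    return REJECT, "no_match"


# Illustrative rules only; the real rules have various exceptions.
RULES = [
    Rule("user_submitted", lambda r: r.get("submitted_manually", False), 100),
    Rule("nightly", lambda r: r.get("channel") == "nightly", 100),
    Rule("release_sample", lambda r: r.get("channel") == "release", 10),
]
```

Because the first matching rule wins, ordering matters: in this sketch a manually submitted release report is accepted before the release sampling rule is ever consulted.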

As for expressing this on the crash-stats page: crash reports are
already a biased and very incomplete view of the actual crashes in the
field, so at best this would give the user a false sense of nonexistent
accuracy or coverage.

If you're trying to do any analysis like that, please use *Crash
Telemetry* instead.

--
GCP

Tom Ritter

Apr 18, 2025, 8:36:31 AM
to Gian-Carlo Pascutto, crash-rep...@mozilla.com
On Wed, Apr 16, 2025 at 12:50 PM Gian-Carlo Pascutto <gpas...@mozilla.com> wrote:
> On 16/04/2025 15:32, Tom Ritter wrote:
> > A colleague and I were discussing - what percentage of crash reports
> > wind up on crash-stats for each channel?  I had it in my head that it
> > was 100% for Nightly (and Beta?) but only a percentage of the release
> > ones?  But if you manually submitted one from release then it definitely
> > got processed.
> >
> > Is that right?  It seems like it would be useful to have a sentence to
> > that effect on the crash-stats homepage just for reference, I can try to
> > do a PR if you agree.
>
> The code is here, the rules have various exceptions.
>
> https://github.com/mozilla-services/antenna/blob/main/antenna/throttler.py#L399
> https://github.com/mozilla-services/antenna/blob/main/antenna/throttler.py#L505

Love it, thanks!

> As for expressing this on the crash-stats page: crash reports are
> already a biased and very incomplete view of the actual crashes in the
> field, so at best this would give the user a false sense of nonexistent
> accuracy or coverage.
>
> If you're trying to do any analysis like that, please use *Crash
> Telemetry* instead.

Really what I was trying to figure out is "Am I not seeing any crashes in my code because I'm the best programmer ever, or is it possible they're getting throttled out".

And... I guess I still don't know?  I know that we're getting all the reports from Nightly, and that's something, but it sounds like I should really be looking at Crash Telemetry to see if there are crashes in my code?  Is there an interface to Crash Telemetry (besides STMO)?  Does it have stacks?

-tom

Gian-Carlo Pascutto

Apr 18, 2025, 9:58:09 AM
to Tom Ritter, crash-rep...@mozilla.com
On 18/04/2025 14:36, Tom Ritter wrote:
> And... I guess I still don't know? I know that we're getting all
> the reports from Nightly, and that's something,

You're getting all reports that _were_ _submitted_, which is 70% for
main process crashes, 30% for content crashes, and <0.1% for everything
else.

https://youtu.be/7gnkzdBSJtQ?t=328

So, depending on where your code actually runs, you might be missing
most or all of the reports, even in Nightly.
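
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes each crash is submitted independently at the rates quoted above, and it treats "<0.1%" as exactly 0.1%, which is an upper bound I picked for illustration. The function name and the `SUBMIT_RATE` table are hypothetical.

```python
def prob_no_reports(crashes: int, submit_rate: float) -> float:
    """Chance that none of `crashes` independent crashes results in a submitted report."""
    if crashes < 0:
        raise ValueError("crashes must be non-negative")
    if not 0.0 <= submit_rate <= 1.0:
        raise ValueError("submit_rate must be between 0 and 1")
    return (1.0 - submit_rate) ** crashes


# Submission rates from the message above; 0.001 stands in for "<0.1%".
SUBMIT_RATE = {"main": 0.70, "content": 0.30, "other": 0.001}

if __name__ == "__main__":
    for process, rate in SUBMIT_RATE.items():
        # Probability of seeing zero reports even though 10 crashes happened.
        print(process, prob_no_reports(10, rate))
```

Under these assumptions, ten real crashes go entirely unreported about 3% of the time in content processes and about 99% of the time in the "everything else" category, so an empty crash-stats search says very little there.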

> but it sounds like I should really be looking at Crash Telemetry to
> see if there are crashes in my code? Is there an interface to
> Crash Telemetry (besides STMO)? Does it have stacks?

Yes and yes.

There is a GUI (literally rewritten a week ago, so improving rapidly):
https://crash-pings.mozilla.org/

But for your kind of investigation, you can trawl through the stacks too
using STMO.

Here are some example queries that look for certain signatures:
https://sql.telemetry.mozilla.org/queries/106631
https://sql.telemetry.mozilla.org/queries/106438?p_channel=%5B%22release%22%5D&p_os=%5B%22Windows%22%2C%22Linux%22%2C%22Android%22%2C%22macOS%22%5D

That said, we do limit to 5000 crashes per OS per channel per process
type per day (or something...) and randomly sample, because the
symbolication of crashes is quite resource intensive.

Thus, if your crash only happens once in a blue moon and only on Windows
release in content, you might need to trawl a longer period if you're
unlucky with the sampling (and realize you might not see all crashes
every day).
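
A rough way to estimate how long a period you would need is sketched below. It assumes the cap really is 5000 per bucket per day and approximates the random sampling as keeping each ping independently with probability cap / bucket_total. The function name and the example volumes are hypothetical, not measured values.

```python
def prob_seen_in_sample(bucket_total: int, rare_per_day: int, days: int,
                        cap: int = 5000) -> float:
    """Probability that at least one rare crash ping is symbolicated within `days` days.

    bucket_total: crash pings per day in one OS/channel/process-type bucket (assumed constant).
    rare_per_day: how many of those pings per day are the rare crash of interest.
    """
    if bucket_total <= 0 or rare_per_day < 0 or days < 0:
        raise ValueError("bucket_total must be positive; counts must be non-negative")
    keep = min(1.0, cap / bucket_total)          # per-ping chance of being sampled
    miss_one_day = (1.0 - keep) ** rare_per_day  # all rare pings dropped that day
    return 1.0 - miss_one_day ** days


if __name__ == "__main__":
    # Hypothetical bucket of 50,000 pings/day with one rare crash per day.
    for days in (1, 7, 30):
        print(days, prob_seen_in_sample(50_000, 1, days))
```

With those made-up volumes the sampling fraction is 10%, so a single day's sample catches the crash only about one time in ten, while a month of data makes a miss unlikely.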

If you need a full crash report for a crash where you only have
telemetry, we are about to deploy the capability to prompt for this to
Nightly: https://bugzilla.mozilla.org/show_bug.cgi?id=1853108

If you need to consider all crash telemetry and be sure a certain crash
never ever happened, that would be a one-off at this point (but possible
with spending some compute resources). The automated tooling is
certainly biased towards identifying the most common rather than very
uncommon crashes :-)

--
GCP