Enabling DCHECK's in canary or beta build

546 views
Skip to first unread message

Stephen Cheng

unread,
Jun 3, 2014, 5:25:46 PM6/3/14
to chromi...@chromium.org
I've recently moved from the m31 release code base to m35 release code base for my little local customization project. I ran into many easily-reproducible debug assertions on m31. After spending only a couple of days on m35, I've run into quite a few debug assertions as well. I have a strong feeling that the debug build is not adequately tested by real-world actions from real users. 

Can we at least enable the DCHECK's in the canary build, assuming it is not already enabled? Exposing the dchecks to the outside world can help reveal a lot of potential problems in the code. I am not exactly sure how many people are downloading and playing with the canary build. If it's only limited to a couple of thousands of developers, I'd suggesting putting into the beta build instead. 



Joao da Silva

unread,
Jun 4, 2014, 4:54:06 PM6/4/14
to sfc...@gmail.com, laf...@chromium.org, kar...@chromium.org, Chromium-dev
I support this idea based on recent bugs I've looked at, that would have been easily spotted by a DCHECK in the dev channel on ChromeOS.

Has a conscious decision been made about disabling DCHECKs in the canary and dev channels? Enabling them would trigger more crashes but would also generate useful reports.


On Tue, Jun 3, 2014 at 11:25 PM, Stephen Cheng <sfc...@gmail.com> wrote:
I've recently moved from the m31 release code base to m35 release code base for my little local customization project. I ran into many easily-reproducible debug assertions on m31. After spending only a couple of days on m35, I've run into quite a few debug assertions as well. I have a strong feeling that the debug build is not adequately tested by real-world actions from real users. 

Can we at least enable the DCHECK's in the canary build, assuming it is not already enabled? Exposing the dchecks to the outside world can help reveal a lot of potential problems in the code. I am not exactly sure how many people are downloading and playing with the canary build. If it's only limited to a couple of thousands of developers, I'd suggesting putting into the beta build instead. 



--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.

Ryan Sleevi

unread,
Jun 4, 2014, 4:57:29 PM6/4/14
to João da Silva, Shanfeng Cheng, laf...@chromium.org, kar...@chromium.org, Chromium-dev
I have little confidence in the correctness of a number of DCHECKs. That is, the code is correct, but the DCHECK is wrong. While "good" to fix - I feel like that will only make for an extremely crashy, unreliable experience for users - even more so than the signals we use for Canary.

See https://code.google.com/p/chromium/issues/detail?id=378079 for a real world example of good code, bad DCHECK.

Joao da Silva

unread,
Jun 4, 2014, 5:04:52 PM6/4/14
to rsl...@chromium.org, Shanfeng Cheng, laf...@chromium.org, kar...@chromium.org, Chromium-dev
I didn't anticipate that :-) So if the code is lying it will lead us to work with wrong assumptions, and it'd be great to fix those misleading checks. Sounds like yet another reason to enable dchecking, at least in the dev channel?

Shanfeng Cheng

unread,
Jun 4, 2014, 5:12:53 PM6/4/14
to rsleevi, João da Silva, laforge, kareng, chromium-dev
I agree that some of dchecks might be wrong. But the problem is that we can't tell good dchecks from bad ones until we investigate the assertion. 

On the other side, the users who have chosen to play with canary build should be more tolerant of bugs and crashes. If they can't live with the crashes, they should simply try beta channel instead. I can imagine that when we first turn on dchecks on canary, we will get a large number of crash reports. But after we spend the time and efforts to resolve the real problems and eliminate false dchecks, the number of crashes will flatten out and we will have a much cleaner project. 

A compromise we can do is to throw some of nonmodal warning instead of killing the application upon a dcheck in canary build. The user can choose to ignore the warning or click a button to report the crash. In this way, users can continue to help with the testing without too much frustration. 


Shanfeng Cheng

Scott Hess

unread,
Jun 4, 2014, 5:23:58 PM6/4/14
to Shanfeng Cheng, rsleevi, João da Silva, laforge, kareng, chromium-dev
The sql/ subsystem will throw on a bunch of these.  The reason is that almost all errors on developer machines are due to problems with the CL being checked in, while in the field similar errors can be caused by things like filesystem corruption.  My experience is that people writing files in Chromium mostly don't spend any time considering the failure cases, and certainly don't have any level of test coverage for them, so I'm not willing to commit to any level of cleanliness there.  On the other hand, that case is isolated enough to be worked around.

Note that this kind of thing will expose many failure cases which have been latent for months or years, so it wouldn't be the standard "Oh, you expect Canary to be unstable, it'll improve in the next release" experience.  Some users will simply stop working, and it might persist for a long time.  It might be reasonable to flip some sort of startup coin when enabling this to spread the pain, tightening the screws over time as the rate of crashes is managed.

-scott

Chris Hopman

unread,
Jun 4, 2014, 6:35:52 PM6/4/14
to Scott Hess, Shanfeng Cheng, rsleevi, João da Silva, laforge, kareng, chromium-dev
All this time I've been operating under the assumption that I could treat DCHECKs as assertions that the condition would never be false... I guess you are saying that's not the case. In which case, as a note to anyone who ever has to maintain my code, should I also comment my DCHECKs with a message like, "hey, if this DCHECK ever fires, then either there is a bug somewhere else, or I didn't understand what was happening here and you actually should check to make sure that what we are doing makes sense."

I guess I don't understand what value DCHECKs serve if not as assertions that that case will never happen... I guess they help clients of your api from using it in bad ways, but that doesn't actually work when half the team is building in Release just to avoid all the flaky DCHECKs.

Scott Hess

unread,
Jun 4, 2014, 6:50:51 PM6/4/14
to Chris Hopman, Shanfeng Cheng, rsleevi, João da Silva, laforge, kareng, chromium-dev
If people are building in release mode to avoid the DCHECKs I mentioned, then they are being lazy.  If they are seeing those failures, they need to fix them not paper over them.  That's why they are there.  So I'm not really sure where you're going with this.

The basic problem in the sql/ area is that developers don't write code with decent test coverage, and they don't fully grok the error cases for things injected by the runtime environment rather than their code.  It's a little handling NULL return from malloc, where nobody actually gets it right, except that when it's stored on disk it will happen every time forever until someone fixes it.  I fix things periodically as I uncover them, but only as a low-priority project because I don't take particular joy in figuring out people's broken code.  I have no doubt that this isn't isolated to the sql/ subsystem, I'm just saying that's one that I'm immediately aware of that will have issues.  There are others.

In any case, my point is that simply flipping on DCHECKs for canary isn't the same quality of unstable as normal canary, at least until it's baked for awhile, so it would be reasonable to do it incrementally.

-scott

Chris Hopman

unread,
Jun 4, 2014, 7:06:07 PM6/4/14
to Scott Hess, Shanfeng Cheng, rsleevi, João da Silva, laforge, kareng, chromium-dev
On Wed, Jun 4, 2014 at 3:50 PM, Scott Hess <sh...@chromium.org> wrote:
If people are building in release mode to avoid the DCHECKs I mentioned, then they are being lazy.  If they are seeing those failures, they need to fix them not paper over them.  That's why they are there.  So I'm not really sure where you're going with this.
 
If some completely unrelated area of the codebase has a DCHECK that fires when I do simple things like navigate to google.com, then I, too, would be tempted to just build in Release so that I can do work. What more can I do about that DCHECK than file a bug? I was not referring to the sql/ DCHECKs in particular (and I don't know that I've ever hit one of those), but I know that there are people building in Release because of failing DCHECKs.

That being said, I agree that they we can't really just turn them on in canary because it would be too crashy. Here's some ideas for getting the same benefit in a more user friendly way:

Upload a crashdump without actually crashing
Only crash/upload with some small probability
Enable DCHECKs with some small probability on process start (this would be better than enabling it for some % of installs so that no one user is stuck with the crashy version)

Joao da Silva

unread,
Jun 4, 2014, 7:09:58 PM6/4/14
to Chris Hopman, Scott Hess, Shanfeng Cheng, rsleevi, laforge, kareng, chromium-dev
There used to be a --enable-dcheck flag that I can't find right now. We could have a --disable-dcheck flag to unblock developers, and a knob in about:flags to turn dchecks off persistently for canary users.

Rachel Blum

unread,
Jun 4, 2014, 7:12:59 PM6/4/14
to joaod...@chromium.org, Chris Hopman, Scott Hess, Shanfeng Cheng, rsleevi, laforge, kareng, chromium-dev
Doesn't exist any more, see https://codereview.chromium.org/189603007/

You need to compile with dcheck_always_on to get DCHECK in release. (So, theoretically, devs who need it can just turn it off there?)

Joao da Silva

unread,
Jun 4, 2014, 7:16:39 PM6/4/14
to Rachel Blum, Chris Hopman, Scott Hess, Shanfeng Cheng, rsleevi, laforge, kareng, chromium-dev
Yep, I do work in Release most of the time and have dcheck_always_on in GYP_DEFINES.

If we do try to enable DCHECKs by default in the canary then a --disable-dchecks flag would probably come in handy (flipping the build-time define will likely require a full rebuild; and we do want people to hit DCHECKs and uncover bugs, but have a quick escape hatch to continue working)

Scott Hess

unread,
Jun 4, 2014, 7:17:01 PM6/4/14
to Chris Hopman, Shanfeng Cheng, rsleevi, João da Silva, laforge, kareng, chromium-dev
On Wed, Jun 4, 2014 at 4:05 PM, Chris Hopman <cjho...@chromium.org> wrote:
On Wed, Jun 4, 2014 at 3:50 PM, Scott Hess <sh...@chromium.org> wrote:
If people are building in release mode to avoid the DCHECKs I mentioned, then they are being lazy.  If they are seeing those failures, they need to fix them not paper over them.  That's why they are there.  So I'm not really sure where you're going with this.
 
If some completely unrelated area of the codebase has a DCHECK that fires when I do simple things like navigate to google.com, then I, too, would be tempted to just build in Release so that I can do work. What more can I do about that DCHECK than file a bug? I was not referring to the sql/ DCHECKs in particular (and I don't know that I've ever hit one of those), but I know that there are people building in Release because of failing DCHECKs.

Don't know what you want to hear, then.  You kind of made a comment about how you thought that DCHECKs were things that needed to be fixed not just ignored, but apparently you were wrong, and I responded that if developers hit these DCHECKs, they need to fix them.  If developers are routinely working around these things by running without DCHECKs, then I'm not sure how that's at all related to the specific case I mentioned as being problematic if we turn on DCHECKs for canaries.

In any case, I don't think turning on DCHECKs for canaries should be gated on some subset of Chromium developers who aren't interested in maintaining the commons for whatever reason.

That being said, I agree that they we can't really just turn them on in canary because it would be too crashy. Here's some ideas for getting the same benefit in a more user friendly way:

Upload a crashdump without actually crashing
Only crash/upload with some small probability
Enable DCHECKs with some small probability on process start (this would be better than enabling it for some % of installs so that no one user is stuck with the crashy version)

The latter is exactly what I'm suggesting.  We can start with a 1% chance and ramp it up as we get ahead of the crashes.

Only crashing with some small probability won't work because DCHECKs are not randomly distributed.  You'll probably hit the same DCHECKs repeatedly.

Uploading a crashdump without actually crashing is dicey.  It works kind of alright as a diagnostic tool, but wiring it into DCHECK is going to be annoying (it's up in chrome/, I believe, so it would involve some more plumbing).  It also can't stand alone because of crash-upload restrictions - spurious DCHECK uploads might mask real release-mode crashes.

Chris Hopman

unread,
Jun 4, 2014, 7:38:28 PM6/4/14
to Scott Hess, Shanfeng Cheng, rsleevi, João da Silva, laforge, kareng, chromium-dev
Scott, I agree with everything you've been saying. 

Maybe my first message would have been more clear as a reply to rsleevi@ because his were the comments that I was replying to, not yours. He had pointed out that there are probably many incorrect DCHECKs in the code, and I just wish that that weren't the case.

Ryan Sleevi

unread,
Jun 4, 2014, 7:41:14 PM6/4/14
to Chris Hopman, Scott Hess, Shanfeng Cheng, rsleevi, João da Silva, laforge, kareng, chromium-dev
So it's clear - I agree, quite unfortunate to have bad DCHECKs. In the bug I referenced, it was caught by bots, but only when specific conditions happen that (surprise) weren't covered already by unit tests. This, of course, is a bad state for both implementation and testing.

I just highlight it because of the number of interactions that aren't covered by tests. Having DCHECKs in user-facing releases as a much, much greater chance of crashing - even for 'benign' (overzealous) DCHECKs. I don't think Canary should necessarily let us run wild with making things unusable for users - and so agree with Scott, that if it is enabled, it should be for a suitably small population of users.

Shanfeng Cheng

unread,
Jun 4, 2014, 8:46:02 PM6/4/14
to rsleevi, Chris Hopman, Scott Hess, rsleevi, João da Silva, laforge, kareng, chromium-dev
I think we basically have two options:

1. Enable dchecks in the current canary build but make it more user friendly. Such as poping a nonmodal notification at the lower right corner of the window with one-click access to submit the assertion report. We can even add an option to permanently surpress all dchecks upon the user's request. Or, if one assertion is displayed, there won't be any other assertion until three days later. I believe similar other tricks can be used so that we can get enough notifications of the dchecks without pissing off too many users.  

2. Create another build (name it Bleeding build?) and enable dchecks in it and let it crash. Put out explicit warnings about it when users try to download this build so that it doesn't get used by too many people. 

I am voting for the first option if my opinion counts. 

Peter Kasting

unread,
Jun 4, 2014, 9:08:31 PM6/4/14
to Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kar...@chromium.org, Chromium-dev
On Wed, Jun 4, 2014 at 1:57 PM, Ryan Sleevi <rsl...@chromium.org> wrote:
I have little confidence in the correctness of a number of DCHECKs. That is, the code is correct, but the DCHECK is wrong. While "good" to fix - I feel like that will only make for an extremely crashy, unreliable experience for users - even more so than the signals we use for Canary.

See https://code.google.com/p/chromium/issues/detail?id=378079 for a real world example of good code, bad DCHECK.

I strongly disagree with this argument.  I also disagree with how the network team seems to be handling the bug you link (i.e. not treating this as important to fix).

The history of our policy toward DCHECKs is that in the Firefox days, there were all kinds of assertions that failed and dumped output to the console, but because there were hundreds, no one worried about them, or noticed when more happened.  This meant that any real bugs they highlighted didn't get fixed, and code that asserted couldn't be trusted to actually be run with the assertions holding.

We didn't want any of those things to be true of Chrome, so we created a very clear policy.  DCHECK failure must not be handled except in the case where not handling it would lead to an exploitable security bug.  All DCHECKs must pass, and should be read as strong assertions that they will _never_ fail, even in error cases -- things like full memory or disk corruption should never lead to failing DCHECKs.  Encountering a DCHECK failure should be treated exactly like encountering a browser crash: it is a P1 bug and should be fixed immediately.

If there are DCHECKs in the code today for which this is not true, it undermines all the advantages this sort of strict policy gives.  We don't want that to happen.  Therefore, we SHOULD run things like Canary with DCHECKs enabled, it should not cause excessive crashing, and if it does, we should fix those crashes with high priority.  We should never say that because there might be incorrect DCHECKs, we should avoid running Canary with DCHECKs on.

If there are subteams not handling DCHECKs according to the policy above, please correct things.  Make sure people take DCHECK failure as seriously as crashing.  Disallow most instances of the following constructions in code you write or review:

DCHECK(foo);
if (!foo) ...

or

if (foo) {
  NOTREACHED();
  return;
}

The benefit of this will be safer, more readable code.

PK

Luke Nihlen

unread,
Jun 4, 2014, 9:11:32 PM6/4/14
to pkas...@chromium.org, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kar...@chromium.org, Chromium-dev
I would volunteer to be part of a DCHECK cleanup squad should we decide to turn this on in Canary. Which I think we should.

\L


--

Anthony LaForge

unread,
Jun 4, 2014, 9:31:14 PM6/4/14
to Luke Nihlen, ana...@chromium.org, Peter Kasting, Ryan Sleevi, João da Silva, Shanfeng Cheng, kar...@chromium.org, Chromium-dev
The premise is interesting, although let's perhaps re-direct that energy a little bit.  Rather than inflict this pain (i.e. we have a high confidence that this will be unstable), alternatively, perhaps a better initial target would be a chromebot run of a DCHECK enabled build.  I'm guessing that would shake out a fair amount of low hanging fruit, w/ out directly inconveniencing our user base (who are much less tolerant of instability than most would suspect).

Kind Regards,

Anthony Laforge
Technical Program Manager
Mountain View, CA

Shanfeng Cheng

unread,
Jun 4, 2014, 9:55:51 PM6/4/14
to laforge, Luke Nihlen, anantha, Peter Kasting, rsleevi, João da Silva, kareng, chromium-dev
A chromebot run wouldn't come close to real test by real users. We are already running a lot of unit tests on the debug build already, aren't we? But we are still missing out on so many assertions. 
One very obvious bug I would like to mention as an example is the one I just found. I download any file in a non-English UI such as Chinese, korean and Japanese and I ran into a debug assertion immediately. It is not a weird test case by any standar. However, the bug has been there since m34, bypassing all the automatic tests. It would have been reported by users on day 1 if dcheck is enabled on canary. 

I strongly support enabling dchecks in canary. And there are many ways we can work out to do that without making the canary build totally unusable. The most straightforward way is run a random number. If the random number is within the range, we will throw a warning and let users to choose to submit the bug report. We can choose 1% or 0.1% or any other percentage number suitable for the team and the people outside. However, we MUST expose the dchecks to real users so that we can eliminate most dchecks in frequently-walked code path or caused by popular sites. 


Shanfeng Cheng
 
Date: 2014-06-04 20:30
Subject: Re: [chromium-dev] Enabling DCHECK's in canary or beta build

Daniel Cheng

unread,
Jun 4, 2014, 11:29:28 PM6/4/14
to laf...@chromium.org, Shanfeng Cheng, João da Silva, Luke Nihlen, Ryan Sleevi, Peter Kasting, kar...@chromium.org, ana...@chromium.org, Chromium-dev

Sorry for the ignorance: what's a chromebot run?

Scott Hess

unread,
Jun 5, 2014, 12:47:24 AM6/5/14
to Shanfeng Cheng, laforge, Luke Nihlen, anantha, Peter Kasting, rsleevi, João da Silva, kareng, chromium-dev
What is the goal, though?  If the solution to a DCHECK firing on a canary of this sort is to remove the DCHECK, then we really haven't gained anything.  If the solution is to analyze the reason for the DCHECK and fix the code, that would be great.  But ... do you not have enough known bugs to fix currently?

Like I said earlier, if developers are running with DCHECKs disabled because they cannot be bothered to log and/or fix bugs, then those developers are frankly letting the team down.  Enabling DCHECKs on canary won't fix that, it will just generate a bunch of un-triaged reports to overwhelm the relatively small number of people who spend their time triaging crash reports.

-scott

dak...@gmail.com

unread,
Jun 5, 2014, 2:15:20 AM6/5/14
to sh...@chromium.org, Shanfeng Cheng, laforge, Luke Nihlen, anantha, Peter Kasting, rsleevi, João da Silva, kareng, chromium-dev
Unsubscribe

Sent from my HTC on T-Mobile 4G LTE
> wrote:

> A chromebot run wouldn't come close to real test by real users. We are
> already running a lot of unit tests on the debug build already, aren't we?
> But we are still missing out on so many assertions.
> One very obvious bug I would like to mention as an example is the one I
> just found. I download any file in a non-English UI such as Chinese, korean
> and Japanese and I ran into a debug assertion immediately. It is not a
> weird test case by any standar. However, the bug has been there since
> m34, bypassing all the automatic tests. It would have been reported by
> users on day 1 if dcheck is enabled on canary.
>
> I strongly support enabling dchecks in canary. And there are many ways we
> can work out to do that without making the canary build totally unusable.
> The most straightforward way is run a random number. If the random number
> is within the range, we will throw a warning and let users to choose to
> submit the bug report. We can choose 1% or 0.1% or any other percentage
> number suitable for the team and the people outside. However, we MUST
> expose the dchecks to real users so that we can eliminate most dchecks in
> frequently-walked code path or caused by popular sites.
>
> ------------------------------
>  Shanfeng Cheng
>
>
> *From:* Anthony LaForge <laf...@chromium.org>
> *Date:* 2014-06-04 20:30
> *To:* Luke Nihlen <lu...@chromium.org>; anantha <ana...@chromium.org>
> *CC:* Peter Kasting <pkas...@chromium.org>; Ryan Sleevi
> <rsl...@chromium.org>; João da Silva <joaod...@chromium.org>; Shanfeng
> Cheng <sfc...@gmail.com>; kareng <kar...@chromium.org>; Chromium-dev
> <chromi...@chromium.org>
> *Subject:* Re: [chromium-dev] Enabling DCHECK's in canary or beta build
> The premise is interesting, although let's perhaps re-direct that energy a
> little bit.  Rather than inflict this pain (i.e. we have a high confidence
> that this will be unstable), alternatively, perhaps a better initial target
> would be a chromebot run of a DCHECK enabled build.  I'm guessing that
> would shake out a fair amount of low hanging fruit, w/ out directly
> inconveniencing our user base (who are much less tolerant of instability
> than most would suspect).
>
> Kind Regards,
>
> Anthony Laforge
> Technical Program Manager
> Mountain View, CA
>
>
> On Wed, Jun 4, 2014 at 6:11 PM, Luke Nihlen <lu...@chromium.org> wrote:
>
>> I would volunteer to be part of a DCHECK cleanup squad should we decide
>> to turn this on in Canary. Which I think we should.
>>
>> \L
>>
>>
>> On Wed, Jun 4, 2014 at 6:07 PM, Peter Kasting <pkas...@chromium.org>
>> wrote:
>>
>>>  On Wed, Jun 4, 2014 at 1:57 PM, Ryan Sleevi <rsl...@chromium.org>
>>> wrote:
>>>
>>>> I have little confidence in the correctness of a number of DCHECKs.
>>>> That is, the code is correct, but the DCHECK is wrong. While "good" to fix
>>>> - I feel like that will only make for an extremely crashy, unreliable
>>>> experience for users - even more so than the signals we use for Canary.
>>>>
>>>> See https://code.google.com/p/chromium/issues/detail?id=378079 for a
>>>> real world example of good code, bad DCHECK.
>>>>
>>>
>>> I strongly disagree with this argument.  I also disagree with how the
>>> network team seems to be handling the bug you link (i.e. not treating this
>>> as important to fix).
>>>
>>> The history of our policy toward DCHECKs is that in the Firefox days,
>>> there were all kinds of assertions that failed and dumped output to the
>>> console, but because there were hundreds, no one worried about them, or
>>> noticed when more happened.  This meant that any real bugs they highlighted
>>> didn't get fixed, and code that asserted couldn't be trusted to actually be
>>> run with the assertions holding.
>>>
>>> We didn't want any of those things to be true of Chrome, so we created a
>>> very clear policy.  DCHECK failure must not be handled except in the case
>>> where not handling it would lead to an exploitable security bug.  All
>>> DCHECKs must pass, and should be read as strong assertions that they will
>>> _never_ fail, even in error cases -- things like full memory or disk
>>> corruption should never lead to failing DCHECKs.  Encountering a DCHECK
>>> failure should be treated exactly like encountering a browser crash: it is
>>> a P1 bug and should be fixed immediately.
>>>
>>> If there are DCHECKs in the code today for which this is not true, it
>>> undermines all the advantages this sort of strict policy gives.  We don't
>>> want that to happen.  Therefore, we SHOULD run things like Canary with
>>> DCHECKs enabled, it should not cause excessive crashing, and if it does, we
>>> should fix those crashes with high priority.  We should never say that
>>> because there might be incorrect DCHECKs, we should avoid running Canary
>>> with DCHECKs on.
>>>
>>> If there are subteams not handling DCHECKs according to the policy
>>> above, please correct things.  Make sure people take DCHECK failure as
>>> seriously as crashing.  Disallow most instances of the following
>>> constructions in code you write or review:
>>>
>>> DCHECK(foo);
>>> if (!foo) ...
>>>
>>> or
>>>
>>> if (foo) {
>>>   NOTREACHED();
>>>   return;
>>> }
>>>
>>> The benefit of this will be safer, more readable code.
>>>
>>> PK
>>>
>>> --
>>> --
>>> Chromium Developers mailing list: chromi...@chromium.org

To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.

Shanfeng Cheng

unread,
Jun 5, 2014, 2:17:33 AM6/5/14
to Scott Hess, laforge, Luke Nihlen, anantha, Peter Kasting, rsleevi, João da Silva, kareng, chromium-dev
Let's say there are 100 known bugs and 100 people working on them with their hands full. There are a simple switch we can flip so that we can know another 900 bugs. I'd still rather flip the switch to at least know all the bugs.  At least, I can pick the most important 100 bugs out of the 1000 to fix with the people we have. We still get a higher quality project by doing that. 

Unless the assumption is that dchecks are unimportant and ignorable to begin with, that'd be a different story. 


Shanfeng Cheng

Ryan Sleevi

unread,
Jun 5, 2014, 2:31:34 AM6/5/14
to Shanfeng Cheng, Scott Hess, Joao da Silva, Luke Nihlen, chromium-dev, Peter Kasting, anantha, laforge, kareng


On Jun 4, 2014 11:16 PM, "Shanfeng Cheng" <sfc...@gmail.com> wrote:
>
> Let's say there are 100 known bugs and 100 people working on them with their hands full. There are a simple switch we can flip so that we can know another 900 bugs. I'd still rather flip the switch to at least know all the bugs.  At least, I can pick the most important 100 bugs out of the 1000 to fix with the people we have. We still get a higher quality project by doing that. 
>

Wouldn't the 100 known bugs - the ones being encountered by real users in the real world, as evidenced by the crash data - by definition be the most important?

Shanfeng Cheng

unread,
Jun 5, 2014, 2:44:43 AM6/5/14
to rsleevi, Scott Hess, João da Silva, Luke Nihlen, chromium-dev, Peter Kasting, anantha, laforge, kareng
Not necessarily. I assuming only a small percentage of the known bugs are crashing bugs? Many of them would be usability issues. 

And bugs doesn't just compete with bugs. Bugs compete with new features as well. If we simply have too many bugs to fix, we might postpone on some of the new features and focus on resolving existing issues. 


Shanfeng Cheng
 
Date: 2014-06-05 01:31
Subject: Re: Re: [chromium-dev] Enabling DCHECK's in canary or beta build

On Jun 4, 2014 11:16 PM, "Shanfeng Cheng" <sfc...@gmail.com> wrote:
>
> Let's say there are 100 known bugs and 100 people working on them with their hands full. There are a simple switch we can flip so that we can know another 900 bugs. I'd still rather flip the switch to at least know all the bugs.  At least, I can pick the most important 100 bugs out of the 1000 to fix with the people we have. We still get a higher quality project by doing that. 
>

Peter Kasting

unread,
Jun 5, 2014, 1:49:46 PM6/5/14
to Scott Hess, Shanfeng Cheng, laforge, Luke Nihlen, anantha, rsleevi, João da Silva, kareng, chromium-dev
On Wed, Jun 4, 2014 at 9:46 PM, Scott Hess <sh...@chromium.org> wrote:
What is the goal, though?  If the solution to a DCHECK firing on a canary of this sort is to remove the DCHECK, then we really haven't gained anything.  If the solution is to analyze the reason for the DCHECK and fix the code, that would be great.  But ... do you not have enough known bugs to fix currently?

In my experience, while DCHECK failures are not always easy to fix correctly (it's tempting to bandaid them by just adding NULL-checks or the like, which often isn't right), they're usually easier to fix than many other kinds of bugs, yet often represent real issues that in a non-DCHECK build eventually cause harder-to-track-down crashes.

So preferentially spending energy fixing DCHECK failures is not necessarily pointless.

PK

Matt Menke

unread,
Jun 5, 2014, 1:50:23 PM6/5/14
to chromi...@chromium.org, pkas...@chromium.org, rsl...@chromium.org, joaod...@chromium.org, sfc...@gmail.com, laf...@chromium.org, kar...@chromium.org
I think enable DCHECKs on Canary is just a recipe for further reducing Canary usage.  I've certainly been using Canary less than I used to because with a lot of tabs on the same site, it becomes painfully sluggish in a way that stable and dev don't.

Could we create "crash" dumps without actually crashing the browser?  If so, seems like we could just create at most a couple per session, and use those, instead of actually crashing.

Peter Kasting

unread,
Jun 5, 2014, 1:53:37 PM6/5/14
to Matt Menke, Chromium-dev, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kareng
On Thu, Jun 5, 2014 at 10:50 AM, Matt Menke <mme...@chromium.org> wrote:
I think enable DCHECKs on Canary is just a recipe for further reducing Canary usage.

Doesn't this assume that tons of such DCHECKs will fire?  I think that's not a guaranteed assumption.  Personally I suspect this will have relatively little overall impact.

But instead of speculating about what will happen, why don't we actually turn it on for a day and see what the results are?  That will let us reason from actual data.

PK

Scott Hess

unread,
Jun 5, 2014, 1:57:07 PM6/5/14
to Peter Kasting, Shanfeng Cheng, laforge, Luke Nihlen, anantha, rsleevi, João da Silva, kareng, chromium-dev
I bet if we enabled DCHECK on canary for a limited time (like one release) or a limited subset of users (like rand(100)==0), we'd generate a ton of actionable crashes.  But I bet that if we set the goal of enabling DCHECK on canary 100% of the time at some point, we'll end up suppressing a bunch of DCHECKs rather than fixing them, either because they are too hard to fix or because nobody has the time to really understand what is going on in the surrounding code to figure out if the checked item is real.  As a for-instance, I've run across DCHECKs which seem to be threading-related and are hard to repro, so they _probably_ indicate a problem, but it's hard to figure out what they're telling me.

-scott

Peter Kasting

unread,
Jun 5, 2014, 1:59:32 PM6/5/14
to Scott Hess, Shanfeng Cheng, laforge, Luke Nihlen, anantha, rsleevi, João da Silva, kareng, chromium-dev
On Thu, Jun 5, 2014 at 10:56 AM, Scott Hess <sh...@chromium.org> wrote:
As a for-instance, I've run across DCHECKs which seem to be threading-related and are hard to repro, so they _probably_ indicate a problem, but it's hard to figure out what they're telling me.

That is when I file a P1 bug against the person who added the DCHECK.  Sure, they may bandaid it instead of fix it correctly (though I try to urge people not to do so), but at least they're in a better position to make that decision than I am.

However, even the world you describe is actually better than the current world.  A DCHECK tells a reader that some assertion must hold.  If that assertion doesn't actually hold in the real world, then we shouldn't be continuing to assert that it does, because that will mislead people who build on that assumption.  So while just removing DCHECKs is not ideal, it's better than doing nothing at all.

PK

Victor Miura

unread,
Jun 5, 2014, 2:18:58 PM6/5/14
to rsl...@chromium.org, João da Silva, Shanfeng Cheng, laf...@chromium.org, kar...@chromium.org, Chromium-dev
Instead of ignoring DCHECKs we should enable them and weed them out.

We test with Release + DCHECKs enabled on the build waterfalls, so if we are not getting these bad DCHECKs in testing it means we don't have good test coverage.


On Wed, Jun 4, 2014 at 1:57 PM, Ryan Sleevi <rsl...@chromium.org> wrote:
I have little confidence in the correctness of a number of DCHECKs. That is, the code is correct, but the DCHECK is wrong. While "good" to fix - I feel like that will only make for an extremely crashy, unreliable experience for users - even more so than the signals we use for Canary.

See https://code.google.com/p/chromium/issues/detail?id=378079 for a real world example of good code, bad DCHECK.
On Wed, Jun 4, 2014 at 1:53 PM, Joao da Silva <joaod...@chromium.org> wrote:
I support this idea based on recent bugs I've looked at, that would have been easily spotted by a DCHECK in the dev channel on ChromeOS.

Has a conscious decision been made about disabling DCHECKs in the canary and dev channels? Enabling them would trigger more crashes but would also generate useful reports.
On Tue, Jun 3, 2014 at 11:25 PM, Stephen Cheng <sfc...@gmail.com> wrote:
I've recently moved from the m31 release code base to m35 release code base for my little local customization project. I ran into many easily-reproducible debug assertions on m31. After spending only a couple of days on m35, I've run into quite a few debug assertions as well. I have a strong feeling that the debug build is not adequately tested by real-world actions from real users. 

Can we at least enable the DCHECK's in the canary build, assuming it is not already enabled? Exposing the dchecks to the outside world can help reveal a lot of potential problems in the code. I am not exactly sure how many people are downloading and playing with the canary build. If it's only limited to a couple of thousands of developers, I'd suggesting putting into the beta build instead. 



--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.

Matt Menke

unread,
Jun 5, 2014, 3:11:45 PM6/5/14
to Peter Kasting, Chromium-dev, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kareng
I'm a bit paranoid about a potential long term impact on the size of Canary channel, as users potentially get fed up with it.

Alexei Svitkine

unread,
Jun 5, 2014, 3:38:58 PM6/5/14
to mme...@chromium.org, Peter Kasting, Chromium-dev, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kareng
If we enable dchecks for 1% or (even 0.1%) of the runs (randomized on startup, so a user can just restart and get put in a different group if they get a bad experience), I don't think it should make canary a noticeably worse experience for most users while still feeding us useful data about issues found in the wild. This can be done with a session-randomized field trial.


Matt Menke

unread,
Jun 5, 2014, 3:39:54 PM6/5/14
to Alexei Svitkine, Peter Kasting, Chromium-dev, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kareng
That sounds pretty reasonable to me.

Chris Hopman

unread,
Jun 5, 2014, 4:41:05 PM6/5/14
to asvi...@chromium.org, mme...@chromium.org, Peter Kasting, Chromium-dev, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kareng
It would be better for this to be randomized per-process, which should hide the extra crashiness for renderers (and if the browser crashes then you get to flip the coin again anyway). This way a user should never be stuck with a particularly crashy experience.

Thiemo Nagel

unread,
Jun 7, 2014, 5:23:30 PM6/7/14
to pkas...@chromium.org, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kar...@chromium.org, Chromium-dev
(this time with the correct sender address, sorry for the spam)

On Thu, Jun 5, 2014 at 3:07 AM, Peter Kasting <pkas...@chromium.org> wrote:
If there are subteams not handling DCHECKs according to the policy above, please correct things.  Make sure people take DCHECK failure as seriously as crashing.  Disallow most instances of the following constructions in code you write or review:

DCHECK(foo);
if (!foo) ...

or

if (foo) {
  NOTREACHED();
  return;
}


I've updated the Coding Style to include these anti-patterns and added a few sentences to try to make the section clearer for casual readers. (I hope that's ok.)


Other than that, +1 for low-probabilty enabling of DCHECK() for canary and dev (lower probability) and beta (even lower probability).

Cheers,
Thiemo

Thiemo Nagel

unread,
Jun 9, 2014, 9:30:47 PM6/9/14
to Thiemo Nagel, pkas...@chromium.org, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kar...@chromium.org, Chromium-dev
One more thing: Could somebody please offer guidance as to when to use CHECK() and when DCHECK()? Is it just a matter of avoiding CHECK() in hotspots or do we generally prefer DCHECK() because of code size considerations?

Peter Kasting

unread,
Jun 9, 2014, 9:38:17 PM6/9/14
to Thiemo Nagel, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kareng, Chromium-dev
On Mon, Jun 9, 2014 at 6:30 PM, Thiemo Nagel <tna...@chromium.org> wrote:
One more thing: Could somebody please offer guidance as to when to use CHECK() and when DCHECK()? Is it just a matter of avoiding CHECK() in hotspots or do we generally prefer DCHECK() because of code size considerations?

The Chrome coding style guide at http://dev.chromium.org/developers/coding-style documents this, but in short, use CHECK in two scenarios:

(1) The consequence of failing this assertion would be an immediate, exploitable security vulnerability.  (Usually this is handled some other way -- I don't recall having ever written such a CHECK.)
(2) You're upgrading a DCHECK to a CHECK temporarily (or temporarily adding CHECKs) to try and get crash stacks to illuminate some bug.

Otherwise, use DCHECK, for both performance and codesize reasons, and so that readers don't think one of the above two cases is in effect.  Remember that, because DCHECK and CHECK mean "this condition always holds", there's normally no reason to pay any penalty, even a small one, to evaluate the condition in a release build.

PK

Thiemo Nagel

unread,
Jun 10, 2014, 8:52:06 PM6/10/14
to Peter Kasting, Thiemo Nagel, Ryan Sleevi, João da Silva, Shanfeng Cheng, Anthony LaForge, kareng, Chromium-dev
(this time with correct email sender, sorry again :( )

Thank you very much! I agree that implicitly, the preference of DCHECK() over CHECK() was already contained in the Coding Style, I've now added another half-sentence to make it explicit:

Chris Hopman

unread,
Jul 16, 2014, 4:02:54 PM7/16/14
to Scott Hess, Shanfeng Cheng, laforge, Luke Nihlen, anantha, Peter Kasting, rsleevi, João da Silva, kareng, chromium-dev
On Wed, Jun 4, 2014 at 9:46 PM, Scott Hess <sh...@chromium.org> wrote:
What is the goal, though?  If the solution to a DCHECK firing on a canary of this sort is to remove the DCHECK, then we really haven't gained anything.  If the solution is to analyze the reason for the DCHECK and fix the code, that would be great.  But ... do you not have enough known bugs to fix currently?

Like I said earlier, if developers are running with DCHECKs disabled because they cannot be bothered to log and/or fix bugs, then those developers are frankly letting the team down.  Enabling DCHECKs on canary won't fix that, it will just generate a bunch of un-triaged reports to overwhelm the relatively small number of people who spend their time triaging crash reports.

From a recent thread (on a different mailing list):

"Based on "fault addr 0xfbadbeef", this looks like an assertion in a debug build.  We aren't as disciplined as we should be in tracking those down.  Most people end up using release builds to avoid them (as well as the faster compile)."

I don't think that just saying people shouldn't disable DCHECKs is a good solution. If going to wikipedia.org consistently crashes, a bug is filed for the crash, and two months later it is still crashing, developers are going to start disabling DCHECKs. And as developers disable DCHECKs, it's just going to get worse and worse. I think we are currently far into that "worse and worse" stage.

What can we do to get out of that state? I don't know. Maybe we don't incentivize code health enough. Maybe we don't care enough about fixing crashes. It was asked internally recently if we care too much about adding features and not enough about maintenance and code health and other stuff. Is this a symptom of that?

So, will enabling DCHECKs in canary/beta help? Maybe. I think we would get crash reports about more issues. Maybe even better would be that we would get a better idea about the relative frequency of different crashes so that we could better prioritize fixing them.

-Chris
Reply all
Reply to author
Forward
0 new messages