CHECKing pointers before derefs

282 views
Skip to first unread message

Wojciech Dzierżanowski

unread,
Dec 13, 2024, 11:06:54 AM12/13/24
to c...@chromium.org
Hi,

Every now and again a discussion pops up about constructs like

void DoSomethingWithT(T* t) {
  CHECK(t);  // To CHECK or not to CHECK?
  t->f();
}

There used to be guidance in Chromium that explicitly called this out as redundant CHECKing. But then it was removed and the reason is unclear to me.

I can think of arguments for and against CHECKing. For example:
+ Dereferencing a null pointer is undefined behavior per the standard.
+ Even if we're expecting a crash, it may happen several levels above DoSomethingWithT() in the frame stack (upon attempting to access data members of T).
- Accessing memory at the address 0x0 or close to it is guaranteed to crash on the platforms we target.
- Assuming we're guaranteed to crash anyway, preceding every such dereference with a CHECK is a lot of CHECKing that is not strictly necessary.

Is either choice generally recommended in Chromium or should people weigh the pros and cons on a case by case basis?

Hopefully, the quandary is going to be less and less common. If DoSomethingWithT() is written according to the current Google C++ Style Guide, it becomes:
 
  void DoSomethingWithT(T& t) {
    t.f();
  }

and the question of CHECKing becomes moot. However, given that most of the existing code in Chromium predates the guidance to take never-null arguments by reference (and return never-null objects by reference), the new signature of DoSomethingWithT() can easily only move the question lower in the frame stack: The t reference can still potentially be formed by dereferencing a null pointer.

--
Wojtek

Peter Boström

unread,
Dec 13, 2024, 3:37:36 PM12/13/24
to Wojciech Dzierżanowski, c...@chromium.org
In general if you're working in code that's not a hotspot then there's not much harm in CHECKing your expectations, it may save you debugging time down the line. In developer / debug builds you also get better error messages if that's ever violated rather than segfault and on official builds where it's cheap unless used in an extremely hotspot (and you probably wanna find out that it's a problem rather than proactively avoid it).

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAFDkGJPZogOm8Esko_7hZUc_YWja1%3DFRHoeezf2cyPTdEW685g%40mail.gmail.com.

Matt Denton

unread,
Dec 13, 2024, 5:40:17 PM12/13/24
to Peter Boström, Wojciech Dzierżanowski, c...@chromium.org
On Fri, Dec 13, 2024 at 12:37 PM Peter Boström <pb...@chromium.org> wrote:
In general if you're working in code that's not a hotspot then there's not much harm in CHECKing your expectations, it may save you debugging time down the line. In developer / debug builds you also get better error messages if that's ever violated rather than segfault and on official builds where it's cheap unless used in an extremely hotspot (and you probably wanna find out that it's a problem rather than proactively avoid it).

On Sat, Dec 14, 2024 at 3:06 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
Hi,

Every now and again a discussion pops up about constructs like

void DoSomethingWithT(T* t) {
  CHECK(t);  // To CHECK or not to CHECK?
  t->f();
}

There used to be guidance in Chromium that explicitly called this out as redundant CHECKing. But then it was removed and the reason is unclear to me.

I can think of arguments for and against CHECKing. For example:
+ Dereferencing a null pointer is undefined behavior per the standard.

Because we [compile with (the badly named) -fno-delete-null-pointer-checks flag](https://source.chromium.org/chromium/chromium/src/+/main:build/config/compiler/BUILD.gn;drc=419187ec8ae8782d1306398977251460a4883243;l=328), clang treats null-pointer deref as defined behavior. (GCC unfortunately has a bug and we don't enable it on gcc builds)
 
+ Even if we're expecting a crash, it may happen several levels above DoSomethingWithT() in the frame stack (upon attempting to access data members of T).

Yes, I agree with pbos@ that CHECK() isn't harmful in most cases and can make the crashes more obvious. Sometimes the crash is hidden on a long line and is due to a null `this` pointer and happens when accessing this->member_, many frames up the stack.
 
- Accessing memory at the address 0x0 or close to it is guaranteed to crash on the platforms we target.
- Assuming we're guaranteed to crash anyway, preceding every such dereference with a CHECK is a lot of CHECKing that is not strictly necessary.

Is either choice generally recommended in Chromium or should people weigh the pros and cons on a case by case basis?

Hopefully, the quandary is going to be less and less common. If DoSomethingWithT() is written according to the current Google C++ Style Guide, it becomes:
 
  void DoSomethingWithT(T& t) {
    t.f();
  }

and the question of CHECKing becomes moot. However, given that most of the existing code in Chromium predates the guidance to take never-null arguments by reference (and return never-null objects by reference), the new signature of DoSomethingWithT() can easily only move the question lower in the frame stack: The t reference can still potentially be formed by dereferencing a null pointer.

--
Wojtek

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAFDkGJPZogOm8Esko_7hZUc_YWja1%3DFRHoeezf2cyPTdEW685g%40mail.gmail.com.

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.

Peter Kasting

unread,
Dec 13, 2024, 5:58:23 PM12/13/24
to Wojciech Dzierżanowski, cxx
CHECK is best used to express preconditions, IMO. If a function takes a pointer, a CHECK makes it obvious that any null deref is the caller's fault.

Outside that scenario, waters get muddier. I would hate to see us CHECK before every single deref due to the noise. 

CHECK_DEREF is also a thing, IIRC.

PK

--

Wojciech Dzierżanowski

unread,
Dec 16, 2024, 7:18:36 AM12/16/24
to Peter Kasting, mpde...@google.com, pb...@chromium.org, cxx
Thank you all for replying!

Because we compile with (the badly named) -fno-delete-null-pointer-checks flag, clang treats null-pointer deref as defined behavior.

Maybe it's also badly documented :), but looking at the docs and related Chromium tasks (as no expert on this by any means), I get the impression it's the other way around (please correct me if I'm confused):

We assume a null pointer dereference crashes deterministically on the platforms we target. Furthermore, we want to be able to rely on this behavior so we tell the compiler not to optimize around a null pointer dereference being undefined behavior. In particular, with a base::WeakPtr referencing a destroyed object we really prefer a reliable crash over a UaF. The latter is possible (or at least was possible before hardening?) if the compiler removes a code block guarded by a null pointer check that it considers garbage due to it only being entered when the pointer is null and it sees we're dereferencing it later. Without -fno-delete-null-pointer-checks the compiler is free to remove it -- just as it's free to remove any code that the standard says would execute within the realm of undefined behavior.

Most importantly, whether it's this or the other way around, the bottom line doesn't change: We assume a null pointer dereference is a sure crash -- when it results in an attempt to access memory at 0x0 (+ a small offset).

The responses people have provided seem to converge on the following reasoning wrt CHECKing before a dereference:

Since a null pointer dereference a sure crash, it's not strictly necessary to CHECK as there is no security risk. At the same time, CHECKing is typically not harmful and can be helpful for debugging, so feel free to do it (modulo hotspots). Try to place any such CHECKs strategically (at API boundaries) to avoid spamming the code with CHECKs.

Does that sound like a fair summary?

Peter Kasting

unread,
Dec 16, 2024, 11:27:15 AM12/16/24
to Wojciech Dzierżanowski, Matt Denton, Peter Boström, cxx
I read your summary as correct, but also not "the other way around" from "we consider null derefs defined behavior".

As for where to use checks, sure, your guidance is reasonable. It's not how I'd phrase it (I would think more of checks as tied to function preconditions, irrespective of any derefs in the function body, which should rarely if ever have checks), but I suspect it amounts to the same thing. 

PK

Matt Denton

unread,
Dec 16, 2024, 7:03:07 PM12/16/24
to Wojciech Dzierżanowski, Peter Kasting, pb...@chromium.org, cxx
On Mon, Dec 16, 2024 at 4:18 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
Thank you all for replying!

Because we compile with (the badly named) -fno-delete-null-pointer-checks flag, clang treats null-pointer deref as defined behavior.

Maybe it's also badly documented :), but looking at the docs and related Chromium tasks (as no expert on this by any means), I get the impression it's the other way around (please correct me if I'm confused):

This is the documentation for -fdelete-null-pointer-checks whereas we pass -fno-delete-null-pointer-checks. Our flags make null pointer deref into *defined* behavior, i.e. the compiler assumes that data can live at null.

We assume a null pointer dereference crashes deterministically on the platforms we target. Furthermore, we want to be able to rely on this behavior so we tell the compiler not to optimize around a null pointer dereference being undefined behavior. In particular, with a base::WeakPtr referencing a destroyed object we really prefer a reliable crash over a UaF. The latter is possible (or at least was possible before hardening?) if the compiler removes a code block guarded by a null pointer check

This is where it is badly named. :) There is no null check. In this WeakPtr case (pre-hardening), there is a branch that stores null to a pointer, and later we unconditionally dereference that pointer. Without  -fno-delete-null-pointer-checks the compiler would assume that because the null store leads to undefined behavior (derefing null), it can just assume that branch is never taken and delete it (this is critical to lots of compiler optimizations when inlining code). But then we never overwrote our dangling pointer with 0, and so a UAF results when we deref the dangling pointer.
 
that it considers garbage due to it only being entered when the pointer is null and it sees we're dereferencing it later. Without -fno-delete-null-pointer-checks the compiler is free to remove it -- just as it's free to remove any code that the standard says would execute within the realm of undefined behavior.

Most importantly, whether it's this or the other way around, the bottom line doesn't change: We assume a null pointer dereference is a sure crash -- when it results in an attempt to access memory at 0x0 (+ a small offset).

The responses people have provided seem to converge on the following reasoning wrt CHECKing before a dereference:

Since a null pointer dereference a sure crash, it's not strictly necessary to CHECK as there is no security risk. At the same time, CHECKing is typically not harmful and can be helpful for debugging, so feel free to do it (modulo hotspots). Try to place any such CHECKs strategically (at API boundaries) to avoid spamming the code with CHECKs.

Does that sound like a fair summary?

Yep this sounds good to me!

Wojciech Dzierżanowski

unread,
Dec 17, 2024, 6:47:48 AM12/17/24
to Matt Denton, Peter Kasting, pb...@chromium.org, cxx
On Tue, Dec 17, 2024 at 1:03 AM Matt Denton <mpde...@google.com> wrote:
On Mon, Dec 16, 2024 at 4:18 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
Thank you all for replying!

Because we compile with (the badly named) -fno-delete-null-pointer-checks flag, clang treats null-pointer deref as defined behavior.

Maybe it's also badly documented :), but looking at the docs and related Chromium tasks (as no expert on this by any means), I get the impression it's the other way around (please correct me if I'm confused):

This is the documentation for -fdelete-null-pointer-checks whereas we pass -fno-delete-null-pointer-checks. Our flags make null pointer deref into *defined* behavior, i.e. the compiler assumes that data can live at null.

Yes, I realize that we pass -fno-delete-null-pointer-checks :). The point that I was trying to make is that the assumption that a null pointer dereference crashes deterministically (on our target platforms), i.e., that it's defined behavior for us, comes first. It's how our target platforms work. -fno-delete-null-pointer-checks has no effect on whether a null pointer dereference crashes predictably or not; our platforms have. (It does have an effect on optimizations that would otherwise be made on the compiler's default, standard-compliant assumption that a null pointer dereference is undefined behavior.) So if we're looking for a reason to be able to say "CHECKing before a deref is not strictly necessary", the reason is because our platforms are such that attempting to access memory close to the address 0x0 triggers a fault. The fact that we feel confident passing -fno-delete-null-pointer-checks is another consequence of this behavior.

Does that make sense at all?
 
We assume a null pointer dereference crashes deterministically on the platforms we target. Furthermore, we want to be able to rely on this behavior so we tell the compiler not to optimize around a null pointer dereference being undefined behavior. In particular, with a base::WeakPtr referencing a destroyed object we really prefer a reliable crash over a UaF. The latter is possible (or at least was possible before hardening?) if the compiler removes a code block guarded by a null pointer check

This is where it is badly named. :) There is no null check. In this WeakPtr case (pre-hardening), there is a branch that stores null to a pointer, and later we unconditionally dereference that pointer. Without  -fno-delete-null-pointer-checks the compiler would assume that because the null store leads to undefined behavior (derefing null), it can just assume that branch is never taken and delete it (this is critical to lots of compiler optimizations when inlining code). But then we never overwrote our dangling pointer with 0, and so a UAF results when we deref the dangling pointer.

Right. I tend to think about this in terms of actual null checks, but in this case we just overwrite a pointer upon destruction. Thanks for clearing this up!

 
Does that sound like a fair summary?

Yep this sounds good to me!

Awesome!
Do people think it would be worthwhile to bring back some guidance on this to styleguide/c++/checks.md?

--
Wojtek

Peter Kasting

unread,
Dec 17, 2024, 8:25:31 AM12/17/24
to Wojciech Dzierżanowski, Matt Denton, pb...@chromium.org, cxx
On Tue, Dec 17, 2024 at 3:47 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
Do people think it would be worthwhile to bring back some guidance on this to styleguide/c++/checks.md?

Are you worried people won't write enough CHECKs today, or will write too many, or write them in the wrong places, or ...?

We have a lot of guidance around CHECKs already and some evidence most people don't read and remember what we do have, so I'd like to understand what problem we want to solve before I know whether I agree with adding more guidance.

PK

Wojciech Dzierżanowski

unread,
Dec 17, 2024, 9:20:48 AM12/17/24
to Peter Kasting, Matt Denton, pb...@chromium.org, cxx
The thing that prompted this thread was observing recurrent internal discussions on the use of CHECKs with pointers before dereferencing. It doesn't seem like everyone is automatically on the same page here (at least internally, I can't speak of the wider Chromium community). While researching the topic someone found the previous guidance in styleguide/c++/c++.md, which made it explicit that a null pointer dereference is defined behavior in our case. The fact that this guidance was removed sowed some doubt about this assumption. So probably the only bit I'm finding to be potentially worth adding to styleguide/c++/checks.md is the clarification that undefined behavior cannot be evoked as reason to CHECK before a pointer deref in Chromium.

Just a suggestion. I agree that there's a limit to the level of detail that makes sense in guidelines.

--
Wojtek

Peter Kasting

unread,
Dec 17, 2024, 10:33:16 AM12/17/24
to Wojciech Dzierżanowski, Matt Denton, pb...@chromium.org, cxx
On Tue, Dec 17, 2024 at 6:20 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
The thing that prompted this thread was observing recurrent internal discussions on the use of CHECKs with pointers before dereferencing.

In codereviews? Do we have reviewers who are asking for things not in the styleguide, and not saying it's optional personal preference?

probably the only bit I'm finding to be potentially worth adding to styleguide/c++/checks.md is the clarification that undefined behavior cannot be evoked as reason to CHECK before a pointer deref in Chromium.

I wonder if we should actually shorten the guidance further, and then link to a longer explainer doc.

PK 

Peter Boström

unread,
Dec 17, 2024, 3:00:39 PM12/17/24
to Peter Kasting, Wojciech Dzierżanowski, Matt Denton, cxx
I think it's acceptable to let local author/reviewer preferences dictate here and not document anything imo. If you wanna CHECK it just CHECK it (you're allowed to CHECK any preconditions even if you don't strictly go off the rails if they're violated), if you feel like it's noisy, omit it. It's neither necessary nor useless.

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.

Matt Denton

unread,
Dec 17, 2024, 4:19:43 PM12/17/24
to Wojciech Dzierżanowski, Peter Kasting, pb...@chromium.org, cxx
On Tue, Dec 17, 2024 at 3:47 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
On Tue, Dec 17, 2024 at 1:03 AM Matt Denton <mpde...@google.com> wrote:
On Mon, Dec 16, 2024 at 4:18 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
Thank you all for replying!

Because we compile with (the badly named) -fno-delete-null-pointer-checks flag, clang treats null-pointer deref as defined behavior.

Maybe it's also badly documented :), but looking at the docs and related Chromium tasks (as no expert on this by any means), I get the impression it's the other way around (please correct me if I'm confused):

This is the documentation for -fdelete-null-pointer-checks whereas we pass -fno-delete-null-pointer-checks. Our flags make null pointer deref into *defined* behavior, i.e. the compiler assumes that data can live at null.

Yes, I realize that we pass -fno-delete-null-pointer-checks :). The point that I was trying to make is that the assumption that a null pointer dereference crashes deterministically (on our target platforms), i.e., that it's defined behavior for us, comes first. It's how our target platforms work. -fno-delete-null-pointer-checks has no effect on whether a null pointer dereference crashes predictably or not; our platforms have. (It does have an effect on optimizations that would otherwise be made on the compiler's default, standard-compliant assumption that a null pointer dereference is undefined behavior.) So if we're looking for a reason to be able to say "CHECKing before a deref is not strictly necessary", the reason is because our platforms are such that attempting to access memory close to the address 0x0 triggers a fault. The fact that we feel confident passing -fno-delete-null-pointer-checks is another consequence of this behavior.

Does that make sense at all?

Good point, yes this makes sense to me.

David Benjamin

unread,
Dec 17, 2024, 6:19:54 PM12/17/24
to Matt Denton, Wojciech Dzierżanowski, Peter Kasting, pb...@chromium.org, cxx
On Tue, Dec 17, 2024 at 4:19 PM 'Matt Denton' via cxx <c...@chromium.org> wrote:
On Tue, Dec 17, 2024 at 3:47 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
On Tue, Dec 17, 2024 at 1:03 AM Matt Denton <mpde...@google.com> wrote:
On Mon, Dec 16, 2024 at 4:18 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
Thank you all for replying!

Because we compile with (the badly named) -fno-delete-null-pointer-checks flag, clang treats null-pointer deref as defined behavior.

Maybe it's also badly documented :), but looking at the docs and related Chromium tasks (as no expert on this by any means), I get the impression it's the other way around (please correct me if I'm confused):

This is the documentation for -fdelete-null-pointer-checks whereas we pass -fno-delete-null-pointer-checks. Our flags make null pointer deref into *defined* behavior, i.e. the compiler assumes that data can live at null.

Yes, I realize that we pass -fno-delete-null-pointer-checks :). The point that I was trying to make is that the assumption that a null pointer dereference crashes deterministically (on our target platforms), i.e., that it's defined behavior for us, comes first. It's how our target platforms work. -fno-delete-null-pointer-checks has no effect on whether a null pointer dereference crashes predictably or not; our platforms have. (It does have an effect on optimizations that would otherwise be made on the compiler's default, standard-compliant assumption that a null pointer dereference is undefined behavior.) So if we're looking for a reason to be able to say "CHECKing before a deref is not strictly necessary", the reason is because our platforms are such that attempting to access memory close to the address 0x0 triggers a fault. The fact that we feel confident passing -fno-delete-null-pointer-checks is another consequence of this behavior.

Does that make sense at all?

Good point, yes this makes sense to me.
 
I don't think this is quite right. It is not true that dereferencing null at the language level will necessarily crash as a null dereference when lowered to hardware. Sometimes the compiler does not actually emit code to dereference the pointer, e.g. if you bind null to a reference and happen not to use it, or call a non-virtual method that happens not to dereference this. (Or maybe the compiler decided to delete the memory read and reorder stuff.) When getting the UBSan bot green, fixing instances of this was probably one of the bigger time sinks, and frequent regressions.

That means the CHECK is not actually redundant. NB: I'm just saying that it's not redundant. I'm not arguing that we should therefore aggressively CHECK everything. That sounds kinda tedious and will probably result in a bunch of extra runtime checks. I suspect null-safety is better handled at the type system, though we don't have great tools for that in C++.

>  The fact that we feel confident passing -fno-delete-null-pointer-checks is another consequence of this behavior.

I think it's the exact opposite. :-) We pass it because this behavior is not true.

If we knew that all language-level null derefs reliably lowered to hardware-level null derefs and therefore had defined trapping behavior, we would be perfectly comfortable letting the compiler exploit this fact to optimize code. However, we are not confident of this and, in fact, we know that language null derefs are sometimes left unchecked when the compiler lowers to hardware. See the examples listed above.

Now, because there are unchecked null derefs, that means these compiler optimizations might arbitrarily mangle our code when we get it wrong, risking security vulnerabilities for our users. And so we tell the compiler that it cannot reason about the language's null deref preconditions, because we would rather have safety here over performance. (Also I suspect we're not really losing out on that many optimizations here. Although, ironically, the optimizations it would probably help with the most are deleting defensive null CHECKs. :-D)
 
We assume a null pointer dereference crashes deterministically on the platforms we target. Furthermore, we want to be able to rely on this behavior so we tell the compiler not to optimize around a null pointer dereference being undefined behavior. In particular, with a base::WeakPtr referencing a destroyed object we really prefer a reliable crash over a UaF. The latter is possible (or at least was possible before hardening?) if the compiler removes a code block guarded by a null pointer check

This is where it is badly named. :) There is no null check. In this WeakPtr case (pre-hardening), there is a branch that stores null to a pointer, and later we unconditionally dereference that pointer. Without  -fno-delete-null-pointer-checks the compiler would assume that because the null store leads to undefined behavior (derefing null), it can just assume that branch is never taken and delete it (this is critical to lots of compiler optimizations when inlining code). But then we never overwrote our dangling pointer with 0, and so a UAF results when we deref the dangling pointer.

Right. I tend to think about this in terms of actual null checks, but in this case we just overwrite a pointer upon destruction. Thanks for clearing this up!

 
Does that sound like a fair summary?

Yep this sounds good to me!

Awesome!
Do people think it would be worthwhile to bring back some guidance on this to styleguide/c++/checks.md?

--
Wojtek

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.

Wojciech Dzierżanowski

unread,
Dec 18, 2024, 6:38:34 AM12/18/24
to David Benjamin, Matt Denton, Peter Kasting, pb...@chromium.org, cxx
On Wed, Dec 18, 2024 at 12:19 AM David Benjamin <davi...@chromium.org> wrote:
On Tue, Dec 17, 2024 at 4:19 PM 'Matt Denton' via cxx <c...@chromium.org> wrote:
On Tue, Dec 17, 2024 at 3:47 AM Wojciech Dzierżanowski <wdzierz...@opera.com> wrote:
Yes, I realize that we pass -fno-delete-null-pointer-checks :). The point that I was trying to make is that the assumption that a null pointer dereference crashes deterministically (on our target platforms), i.e., that it's defined behavior for us, comes first. It's how our target platforms work. -fno-delete-null-pointer-checks has no effect on whether a null pointer dereference crashes predictably or not; our platforms have. (It does have an effect on optimizations that would otherwise be made on the compiler's default, standard-compliant assumption that a null pointer dereference is undefined behavior.) So if we're looking for a reason to be able to say "CHECKing before a deref is not strictly necessary", the reason is because our platforms are such that attempting to access memory close to the address 0x0 triggers a fault. The fact that we feel confident passing -fno-delete-null-pointer-checks is another consequence of this behavior.

Does that make sense at all?

Good point, yes this makes sense to me.
 
I don't think this is quite right. It is not true that dereferencing null at the language level will necessarily crash as a null dereference when lowered to hardware. Sometimes the compiler does not actually emit code to dereference the pointer, e.g. if you bind null to a reference and happen not to use it, or call a non-virtual method that happens not to dereference this. (Or maybe the compiler decided to delete the memory read and reorder stuff.) When getting the UBSan bot green, fixing instances of this was probably one of the bigger time sinks, and frequent regressions.

Yes, "null pointer dereference equals crash for us" was an oversimplification. Thanks for pointing that out.
I skipped the non-crashing cases out of fear of further complicating an already complicated train of thought. In my mind, I also probably downplayed the significance of those cases as it seemed that no real harm can be done (not directly at least). That may have been a mistake.
 
That means the CHECK is not actually redundant. NB: I'm just saying that it's not redundant. I'm not arguing that we should therefore aggressively CHECK everything. That sounds kinda tedious and will probably result in a bunch of extra runtime checks.
 
Earlier in this thread we already agreed that CHECKing can be a great documentation and debugging tool when it specifies preconditions and catches unintended nulls early. So definitely not redundant :)

But then the answer to the question of whether it's necessary to CHECK before every dereference seemed to be no it's not, because there is no security risk; we will crash before any harm can be done. If you're saying that the non-crashing cases do pose a security risk then I'm not sure how to square that with "not arguing that we should therefore aggressively CHECK everything".

I suspect null-safety is better handled at the type system, though we don't have great tools for that in C++.

IIUC, raw_ptr at least has the ability to crash on an attempted null dereference. So things should get better as we replace more raw pointers with raw_ptr :)
 
>  The fact that we feel confident passing -fno-delete-null-pointer-checks is another consequence of this behavior.

I think it's the exact opposite. :-) We pass it because this behavior is not true.

If we knew that all language-level null derefs reliably lowered to hardware-level null derefs and therefore had defined trapping behavior, we would be perfectly comfortable letting the compiler exploit this fact to optimize code. However, we are not confident of this and, in fact, we know that language null derefs are sometimes left unchecked when the compiler lowers to hardware. See the examples listed above.

Now, because there are unchecked null derefs, that means these compiler optimizations might arbitrarily mangle our code when we get it wrong, risking security vulnerabilities for our users. And so we tell the compiler that it cannot reason about the language's null deref preconditions, because we would rather have safety here over performance. (Also I suspect we're not really losing out on that many optimizations here. Although, ironically, the optimizations it would probably help with the most are deleting defensive null CHECKs. :-D)

It's often easier to discuss things with concrete examples in mind. In the context of -fno-delete-null-pointer-checks the only example I'm aware of is base::WeakPtr. The compiler took the opportunity to remove code (setting an "invalid" flag in base::WeakPtr) when it saw that executing that path would lead to undefined behavior (unconditional dereference of a null pointer). The only reason -fno-delete-null-pointer-checks removes a vulnerability in this case is because we know that an unconditional dereference of a null pointer is trapped at the hardware level. So we get a trap instead of a use-after-free. This is what I meant by saying that passing -fno-delete-null-pointer-checks is a consequence of us assuming the trapping behavior. Passing -fno-delete-null-pointer-checks without the conviction that a null deref is trapped would mean we're actually OK with undefined behavior when a dangling base::WeakPtr is dereferenced. The trapping behavior is what makes it defined.

--
Wojtek

David Benjamin

unread,
Dec 18, 2024, 3:47:52 PM12/18/24
to Wojciech Dzierżanowski, Matt Denton, Peter Kasting, pb...@chromium.org, cxx
I think that is still backwards. The flag is about reducing the impacts of undefined behavior. Null derefs are not consistently trapped, and that is why we need to pass that pointer. Here are a pair of concrete examples of where the compiler would, by default, delete null pointer checks even though the compiler does not trap. The flag stops the compiler from performing this transform.

If it were the case that every time we wrote `*ptr`, `ptr->Foo()`, etc., the compiler reliably emitted code that trapped (either by way of the hardware's 0 page handling or otherwise), we would not need this flag. But that is not reality, so we pass the flag to bound the impact of this UB.

David

Wojciech Dzierżanowski

unread,
Dec 18, 2024, 3:49:27 PM12/18/24
to Peter Boström, Peter Kasting, Matt Denton, cxx
Fair enough.
I was wondering if people think this question comes up often enough to have the styleguide say something about it (and evidently people did think so at one point).
But I totally understand your unwillingness to overspecify the CHECKing rules.

Thank you for your comments!
--
Wojtek

Wojciech Dzierżanowski

unread,
Dec 18, 2024, 4:28:05 PM12/18/24
to David Benjamin, Matt Denton, Peter Kasting, pb...@chromium.org, cxx
Those are great (and somewhat perversely entertaining) examples, thanks! :)

Is it possible that this is not an either/or thing?
I mean I still think that we need to rely on the dereference to crash (if it leads to accessing memory close to 0x0) in order for -fno-delete-null-pointer-checks to be a good trade-off in the base::WeakPtr case. This allows us to exchange a UAF for a "crash in many cases". Otherwise, we would be trading a UAF for completely undefined behavior.

I think I understand the idea of bounding the impact of UB with -fno-delete-null-pointer-checks better now, thank you. Though I am a bit doubtful about being able to judge that a program containing UB, i.e., an ill-formed program, is "safer" than a transformed ill-formed program. The compiler authors seem to disagree, seeing how these optimizations are enabled by default and neither the clang nor GCC documentation mentions safety.

--
Wojtek
Reply all
Reply to author
Forward
0 new messages