Covering raw_ptr<T> in Chromium C++ Style Guide

945 views
Skip to first unread message

Łukasz Anforowicz

unread,
Nov 15, 2021, 2:03:37 PM11/15/21
to cxx, Keishi Hattori, Bartek Nowierski
Hello c...@chromium.org,

Based on recent experiments, the go/miracleptr project is quite close to being ready for the Big Rewrite, where we will rewrite `SomeClass* field_` fields into `raw_ptr<SomeClass> field_`.  While we prepare for this rewrite (including drafting a broader PSA + scheduling a tree closure), it seems useful to start discussing how the Chromium C++ Style Guide can accommodate `raw_ptr<T>`.

Before I dive deeper into details, let me summarize what raw_ptr<T> is (copy&pasting from the "raw_ptr<T> Usage Rules" doc):

raw_ptr<T> is a drop-in replacement for raw pointers. It behaves just like a raw pointer with an exception that it is zero-initialized and cleared on destruction and move. Unlike std::unique_ptr<T>, base::scoped_refptr<T>, etc., it doesn’t manage ownership or lifetime of an allocated object - you are still responsible for freeing the object when no longer used, just as you would with a raw pointer.

raw_ptr<T> prevents some Use-after-Free (UaF) bugs from being exploitable (by poisoning the freed memory and quarantining it as long as a dangling raw_ptr<T> exists), but dereferencing a raw_ptr<T> to freed memory may still crash.

Initialization, assignment, and destruction of a raw_ptr<T> incur additional performance overhead compared to a raw pointer. There is no overhead when dereferencing a pointer, however.

We plan to move most of the contents of the "raw_ptr<T> Usage Rules" doc into //base/memory/raw_ptr.md, but we think that the following high-level rules should become part of the Chromium C++ Style Guide:

In general, raw pointers should be avoided and a smart pointer with the right ownership semantics should be used instead.  Google C++ Style Guide further clarifies to prefer unique ownership (i.e. std::unique_ptr) over shared ownership (i.e. scoped_refptr or base::WeakPtr).  For non-owning pointers, Chromium provides raw_ptr<T> as a security-beneficial alternative to raw T* pointers. When to use raw_ptr<T> is described in the high level guidelines below.


Use raw_ptr<T> instead of a raw T* pointer in every class/struct field, unless:

  1. The pointer has type |const char*|, is a pointer to an Objective-C object, or is a function pointer.

  2. The field is defined in a source file listed in .../raw_ptr_exception_paths.txt. Acceptable exclusions cover:

    1. Renderer-only code.

    2. Code that cannot depend on //base.

    3. Code in //ppapi.

  3. The field is annotated with the RAW_PTR_PLUGIN_IGNORE attribute. Acceptable exclusions cover:

    1. All of its possible values point outside of PartitionAlloc (e.g. literals, stack allocated memory, shared memory, mmap'ed memory, V8/Oilpan/Java heaps, TLS, etc.).

    2. It doesn’t compile (when it happens, consult Recoverable compile-time problems and You likely won’t be able to make it compile first).

    3. [very rare] It causes regression on perf bots.

    4. [very rare] It leads to runtime errors (i.e. when following the Extra pointer rules is not possible).


The rules above are *not* enforced yet - before we confirm raw_ptr’s performance characteristics through Stable channel experiments (in the longer term violating the rules will lead to compile errors through Chromium Clang Plugin).  For more details, please refer to //base/memory/raw_ptr.md.


Do you have any feedback or comments on the proposal above?  Do you think the raw_ptr<T> rules above can be covered in a subsection of //styleguide/c++/c++.md (maybe next to the "Object ownership and calling conventions" section)?  Or maybe you'd suggest covering the rules above in a separate file/document under //styleguide/c++?

Thanks,

Lukasz

dan...@chromium.org

unread,
Nov 16, 2021, 4:51:11 PM11/16/21
to Łukasz Anforowicz, cxx, Keishi Hattori, Bartek Nowierski
On Mon, Nov 15, 2021 at 2:03 PM Łukasz Anforowicz <luk...@chromium.org> wrote:
Hello c...@chromium.org,

Based on recent experiments, the go/miracleptr project is quite close to being ready for the Big Rewrite, where we will rewrite `SomeClass* field_` fields into `raw_ptr<SomeClass> field_`.  While we prepare for this rewrite (including drafting a broader PSA + scheduling a tree closure), it seems useful to start discussing how the Chromium C++ Style Guide can accommodate `raw_ptr<T>`.

Before I dive deeper into details, let me summarize what raw_ptr<T> is (copy&pasting from the "raw_ptr<T> Usage Rules" doc):

raw_ptr<T> is a drop-in replacement for raw pointers. It behaves just like a raw pointer with an exception that it is zero-initialized and cleared on destruction and move. Unlike std::unique_ptr<T>, base::scoped_refptr<T>, etc., it doesn’t manage ownership or lifetime of an allocated object - you are still responsible for freeing the object when no longer used, just as you would with a raw pointer.

raw_ptr<T> prevents some Use-after-Free (UaF) bugs from being exploitable (by poisoning the freed memory and quarantining it as long as a dangling raw_ptr<T> exists), but dereferencing a raw_ptr<T> to freed memory may still crash.

Initialization, assignment, and destruction of a raw_ptr<T> incur additional performance overhead compared to a raw pointer. There is no overhead when dereferencing a pointer, however.

We plan to move most of the contents of the "raw_ptr<T> Usage Rules" doc into //base/memory/raw_ptr.md, but we think that the following high-level rules should become part of the Chromium C++ Style Guide:

In general, raw pointers should be avoided and a smart pointer with the right ownership semantics should be used instead.  Google C++ Style Guide further clarifies to prefer unique ownership (i.e. std::unique_ptr) over shared ownership (i.e. scoped_refptr or base::WeakPtr).


WeakPtr isn't shared ownership, it's visibility and runtime detection of/decisions based on lifetime.
 

For non-owning pointers, Chromium provides raw_ptr<T> as a security-beneficial alternative to raw T* pointers. When to use raw_ptr<T> is described in the high level guidelines below.


I think it would be useful to talk about when to use raw_ptr vs WeakPtr.
 

Use raw_ptr<T> instead of a raw T* pointer in every class/struct field, unless:

  1. The pointer has type |const char*|


Do you mean, the pointer is to a static string? The rewriter can't tell but humans can.
 
  1. , is a pointer to an Objective-C object, or is a function pointer.

  2. The field is defined in a source file listed in .../raw_ptr_exception_paths.txt. Acceptable exclusions cover:

    1. Renderer-only code.

    2. Code that cannot depend on //base.


Should we move raw_ptr out of base then? What can't dep on base?

    1. Code in //ppapi. 

  1. The field is annotated with the RAW_PTR_PLUGIN_IGNORE attribute. Acceptable exclusions cover:


I would say that a clang plugin will enforce this in the future, but that the RAW_PTR_PLUGIN_IGNORE attribute can be used to override it.
 
    1. All of its possible values point outside of PartitionAlloc (e.g. literals, stack allocated memory, shared memory, mmap'ed memory, V8/Oilpan/Java heaps, TLS, etc.).


This is hard to verify, would we really care to annotate it? Then if it changes we wouldn't know. Why not leave it as raw_ptr?
 
    1. It doesn’t compile (when it happens, consult Recoverable compile-time problems and You likely won’t be able to make it compile first).

    2. [very rare] It causes regression on perf bots.

    3. [very rare] It leads to runtime errors (i.e. when following the Extra pointer rules is not possible).


The rules above are *not* enforced yet - before we confirm raw_ptr’s performance characteristics through Stable channel experiments (in the longer term violating the rules will lead to compile errors through Chromium Clang Plugin).  For more details, please refer to //base/memory/raw_ptr.md.


Do you have any feedback or comments on the proposal above?  Do you think the raw_ptr<T> rules above can be covered in a subsection of //styleguide/c++/c++.md (maybe next to the "Object ownership and calling conventions" section)?

This section is about function parameters only. And the Google style guide only talks about ownership.

Normally I'd say this should be in https://source.chromium.org/chromium/chromium/src/+/main:styleguide/c++/c++-dos-and-donts.md but I think "how to write a non-owning field" is of pretty high consequence and broad reaching, and is something a person should know before starting to write Chrome code. So I think adding a "Non-owning pointers in class fields" section to the styleguide would make sense.
 
  Or maybe you'd suggest covering the rules above in a separate file/document under //styleguide/c++?

Thanks,

Lukasz

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAA_NCUEaAEqtWs5D0a_cPL%2Be3dgcFJ-4wLTMfEHifTpybzGJVw%40mail.gmail.com.

Bartek Nowierski

unread,
Nov 16, 2021, 7:04:31 PM11/16/21
to dan...@chromium.org, Łukasz Anforowicz, cxx, Keishi Hattori
Thanks Dana!

Below my responses to your suggestions so far.

I think it would be useful to talk about when to use raw_ptr vs WeakPtr.
We want to make the rule extremely simple, and prefer not to go beyond "use raw_ptr<> where you'd use raw pointer" here.

Do you mean, the pointer is to a static string? The rewriter can't tell but humans can.
We decided to not deal with |const char*| in v1. If we can't rewrite, we can't enforce, so decided not to burden devs with this decision for now. Not to mention it'd be easy to get it wrong. We'll tackle this once we have StringLiteralPtr, which will make it harder to make a mistake.

Should we move raw_ptr out of base then? What can't dep on base?

I would say that a clang plugin will enforce this in the future, but that the RAW_PTR_PLUGIN_IGNORE attribute can be used to override it.
It'll kinda divert from the "use raw_ptr<T> unless" flow. 

All of its possible values point outside of PartitionAlloc (e.g. literals, stack allocated memory, shared memory, mmap'ed memory, V8/Oilpan/Java heaps, TLS, etc.).
This is hard to verify, would we really care to annotate it? Then if it changes we wouldn't know. Why not leave it as raw_ptr?
For perf reasons.


Cheers,
Bartek

dan...@chromium.org

unread,
Nov 17, 2021, 11:11:44 AM11/17/21
to Bartek Nowierski, Łukasz Anforowicz, cxx, Keishi Hattori
On Tue, Nov 16, 2021 at 7:04 PM Bartek Nowierski <bar...@chromium.org> wrote:
Thanks Dana!

Below my responses to your suggestions so far.

I think it would be useful to talk about when to use raw_ptr vs WeakPtr.
We want to make the rule extremely simple, and prefer not to go beyond "use raw_ptr<> where you'd use raw pointer" here.

Do you mean, the pointer is to a static string? The rewriter can't tell but humans can.
We decided to not deal with |const char*| in v1. If we can't rewrite, we can't enforce, so decided not to burden devs with this decision for now. Not to mention it'd be easy to get it wrong. We'll tackle this once we have StringLiteralPtr, which will make it harder to make a mistake.

Should we move raw_ptr out of base then? What can't dep on base?

I would say that a clang plugin will enforce this in the future, but that the RAW_PTR_PLUGIN_IGNORE attribute can be used to override it.
It'll kinda divert from the "use raw_ptr<T> unless" flow. 

All of its possible values point outside of PartitionAlloc (e.g. literals, stack allocated memory, shared memory, mmap'ed memory, V8/Oilpan/Java heaps, TLS, etc.).
This is hard to verify, would we really care to annotate it? Then if it changes we wouldn't know. Why not leave it as raw_ptr?
For perf reasons.

I think high level what we're not agreeing on is:

- These rules seem to match what the rewriter did. And that makes sense - it's a mass rewrite of all our existing code, without any human thought into each case.
- What I believe the style guide is for is for people writing or editing code by hand, where there is a lot more context and understanding, and it's a single pointer at a time.

So I don't really believe these need to be the same, and we should error toward simpler/fewer rules in the style guide. People will and can always make exceptions when they must for some perf critical case or something. But we should give the guidance for what we expect.

So for example storing a pointer T* on a class, that happens to be to stack memory - AutoReset perhaps is an example of this. Sometimes it may be on the stack, in which case raw_ptr<T> is useless. But over time that may change and it may hold a non-stack pointer too. So it's much safer and simpler to say all raw pointer fields you write are raw_ptr<T> and then maybe the runtime effect is nothing, and maybe in new usage it does something, or maybe raw_ptr evolves and it does something. But the rule is simple.

Do you agree on this higher level direction for advice?
 

Łukasz Anforowicz

unread,
Nov 17, 2021, 11:38:18 AM11/17/21
to dan...@chromium.org, Bartek Nowierski, cxx, Keishi Hattori
On Wed, Nov 17, 2021 at 8:11 AM <dan...@chromium.org> wrote:
On Tue, Nov 16, 2021 at 7:04 PM Bartek Nowierski <bar...@chromium.org> wrote:
Thanks Dana!

Below my responses to your suggestions so far.

I think it would be useful to talk about when to use raw_ptr vs WeakPtr.
We want to make the rule extremely simple, and prefer not to go beyond "use raw_ptr<> where you'd use raw pointer" here.

Do you mean, the pointer is to a static string? The rewriter can't tell but humans can.
We decided to not deal with |const char*| in v1. If we can't rewrite, we can't enforce, so decided not to burden devs with this decision for now. Not to mention it'd be easy to get it wrong. We'll tackle this once we have StringLiteralPtr, which will make it harder to make a mistake.

Should we move raw_ptr out of base then? What can't dep on base?

I would say that a clang plugin will enforce this in the future, but that the RAW_PTR_PLUGIN_IGNORE attribute can be used to override it.
It'll kinda divert from the "use raw_ptr<T> unless" flow. 

All of its possible values point outside of PartitionAlloc (e.g. literals, stack allocated memory, shared memory, mmap'ed memory, V8/Oilpan/Java heaps, TLS, etc.).
This is hard to verify, would we really care to annotate it? Then if it changes we wouldn't know. Why not leave it as raw_ptr?
For perf reasons.

I think high level what we're not agreeing on is:

- These rules seem to match what the rewriter did. And that makes sense - it's a mass rewrite of all our existing code, without any human thought into each case.
FWIW the rule being commented on here/above (i.e. excluding [for performance reasons] cases where the pointee is outside of PartitionAlloc) is very much a human-guidance/decision rule (and not something that the rewriter is aware of).
 
- What I believe the style guide is for is for people writing or editing code by hand, where there is a lot more context and understanding, and it's a single pointer at a time.

So I don't really believe these need to be the same, and we should error toward simpler/fewer rules in the style guide. People will and can always make exceptions when they must for some perf critical case or something. But we should give the guidance for what we expect.

So for example storing a pointer T* on a class, that happens to be to stack memory - AutoReset perhaps is an example of this. Sometimes it may be on the stack, in which case raw_ptr<T> is useless. But over time that may change and it may hold a non-stack pointer too. So it's much safer and simpler to say all raw pointer fields you write are raw_ptr<T> and then maybe the runtime effect is nothing, and maybe in new usage it does something, or maybe raw_ptr evolves and it does something. But the rule is simple.

Do you agree on this higher level direction for advice?
I agree that the guidelines are for humans.  Maybe the guidelines are currently mixing 2-3 kinds of advice/rules:
  • Advice/rules on situations that *cannot* be automatically analyzed (where a human is required to decide whether to use raw_ptr or not).  Examples:
    • Deciding whether a source/directory path can be added to raw_ptr_exception_paths.txt labeled with the (proposed/hypothetical) RAW_PTR_EXCLUSION attribute:
      • Path-based exclusions: Renderer-only code.  Code that cannot depend on //base.  Code in //ppapi.
      • Per-field exclusions: Pointees outside of PartitionAlloc.  "You likely won’t be able to make it compile" section (unclear how many of these can be encoded in the plugin;  some require 2 passes in the rewriter).  Perf regressions.  Runtime errors.
  • Rules for situations where we can automatically (e.g. in the rewriter or the Chromium Clang Plugin in the future) decide whether to use raw_ptr or not.
    • Humans might still want to know what situations will trip the Chromium Clang Plugin.  Examples:
      • const char*, ObjC, function pointers
      • Saying that raw_ptr<T> is only needed in fields of structs/classes (implicitly excluding parameters+localvariables and unions)
    • Humans might still want to know how to refactor code to support raw_ptr<T> when naive s/T*/raw_ptr<T>/g replacement doesn't work.  Examples:
      • Recoverable compile-time problems
It seems to me that humans care about all of the above, but I am open to arguments otherwise and/or suggestions on how to reword the guidelines.
Ack.  Adding a "Non-owning pointers in class fields" section to the styleguide SGTM!

dan...@chromium.org

unread,
Nov 17, 2021, 12:35:46 PM11/17/21
to Łukasz Anforowicz, Bartek Nowierski, cxx, Keishi Hattori
I agree that people may need or want this information, but this is how to use and implementation details of raw_ptr. I think this level of detail makes sense in the base/memory/raw_ptr.md file.

I think the level of direction that makes sense in a style guide is more like:

- Use raw_ptr<T> for class fields in place of a raw pointer T* to a C/C++ type whenever possible, except in Blink.
- Explain what raw_ptr does in a sentence.
- Since raw_ptr<T> has runtime overhead, this may not be possible in rare cases for performance reasons.
- A raw_ptr<T> can't point to objC types or functions.
- Blink uses Oilpan to manage memory and should use GC-aware smart pointers or raw pointers (T*) instead.
- Link to more details in raw_ptr.md

Then write a PRESUBMIT in //ppapi that bans raw_ptr so we avoid it there.

Does that make sense?

Łukasz Anforowicz

unread,
Nov 17, 2021, 1:26:33 PM11/17/21
to dan...@chromium.org, Kentaro Hara, Bartek Nowierski, cxx, Keishi Hattori
Makes sense.  We wanted to be very transparent and comprehensive in the high-level rule when presenting to the EngReview and discussing the ergonomics overhead of raw_ptr<T>.  I agree that the rules can be further distilled and simplified for inclusion in the Chromium C++ Styleguide.  (+Kentaro Hara as FYI.)

I think the initial iterations might be easiest in a Google Doc (where it is slightly easier/faster to proposed suggestions and add comments than in git/Gerrit).  Therefore, I took your bullet items above and put them into the "High-level rules (distilled/simplified - for C++ Styleguide)" section in the doc here: https://docs.google.com/document/d/1fdYDDf5kSRAFKJF5Omp1xCKzlhycV1jKwaCkJPGGK7s/edit#heading=h.jubgcrks1vch   (PTAL).

I'll work on auditing the more comprehensive/full rules and making sure they are covered in raw_ptr.md.  I'll move the C++ Styleguide changes into a CL tomorrow/Thursday (by then the reviews/edits in the doc will hopefully be more or less settled).

Kentaro Hara

unread,
Nov 18, 2021, 1:46:37 AM11/18/21
to Łukasz Anforowicz, dan...@chromium.org, Bartek Nowierski, cxx, Keishi Hattori
I like having Dana's distilled version at the top of raw_ptr.md :)

--
Kentaro Hara, Tokyo

Bartek Nowierski

unread,
Nov 18, 2021, 10:05:30 AM11/18/21
to Kentaro Hara, Łukasz Anforowicz, dan...@chromium.org, cxx, Keishi Hattori
I think high level what we're not agreeing on is:
- These rules seem to match what the rewriter did. And that makes sense - it's a mass rewrite of all our existing code, without any human thought into each case.
- What I believe the style guide is for is for people writing or editing code by hand, where there is a lot more context and understanding, and it's a single pointer at a time.

I appreciate your desire to increase the scope of protection. We have the same goal. Please keep in mind a couple things:
1. If we can't rewrite something automatically, we can't enforce it (with exception of a small number of classes of exceptions that are well contained and we can handle manually after Big Rewrite).
2. Certain classes of exceptions (e.g. Renderer-only code, const char*, pointers in templated collections/wrappers) will be evaluated as a follow-up project and better to not touch them yet.

dan...@chromium.org

unread,
Nov 18, 2021, 10:22:53 AM11/18/21
to Bartek Nowierski, Kentaro Hara, Łukasz Anforowicz, cxx, Keishi Hattori
On Thu, Nov 18, 2021 at 10:05 AM Bartek Nowierski <bar...@chromium.org> wrote:
I think high level what we're not agreeing on is:
- These rules seem to match what the rewriter did. And that makes sense - it's a mass rewrite of all our existing code, without any human thought into each case.
- What I believe the style guide is for is for people writing or editing code by hand, where there is a lot more context and understanding, and it's a single pointer at a time.

I appreciate your desire to increase the scope of protection. We have the same goal. Please keep in mind a couple things:
1. If we can't rewrite something automatically, we can't enforce it (with exception of a small number of classes of exceptions that are well contained and we can handle manually after Big Rewrite).

This is fine. Generally if it's fully enforced by the compiler there's no reason to talk about it in the style guide. They don't need to say the same things. Simpler rules are better than explaining what errors the compiler generates or doesn't.
 
2. Certain classes of exceptions (e.g. Renderer-only code, const char*, pointers in templated collections/wrappers) will be evaluated as a follow-up project and better to not touch them yet.

Evaluating if code is "renderer-only" is a lot harder than if code is "in blink" which is why I think the latter is a much better rule for humans. This doesn't have to match what is a better rule for the rewriter tool.
Reply all
Reply to author
Forward
0 new messages