constexpr char kMyLovelyStringLiteral[] = "Yay"


Wez

Nov 3, 2022, 1:40:23 PM
to chromi...@chromium.org
Hallo Chromium-Dev,

In the past there have been threads recommending use of "static" in conjunction with "const" on string literal arrays, to avoid some nasty copying behaviour in some compilers.

Time has moved on and we now use constexpr for most of these - do we still have a use for static in that case?  What even is the distinction between "constexpr char[]" and "static constexpr char[]"...?

kthxbai,

Wez

Peter Kasting

Nov 3, 2022, 2:00:58 PM
to w...@chromium.org, chromi...@chromium.org
On Thu, Nov 3, 2022 at 10:39 AM Wez <w...@chromium.org> wrote:
Time has moved on and we now use constexpr for most of these - do we still have a use for static in that case?  What even is the distinction between "constexpr char[]" and "static constexpr char[]"...?

Are you in a header file or a .cc file?  I believe in a header what you want is "inline constexpr const char[]".  I believe in a .cc file what you want is "constexpr const char[]" inside an anonymous namespace (or, better, the function which uses the constant).

But I am not certain and haven't tested!

PK

Jeremy Roman

Nov 3, 2022, 2:01:57 PM
to w...@chromium.org, chromi...@chromium.org
Local constexpr variables still get distinct addresses and have local lifetime; they're just guaranteed to have initializers which could be evaluated at compile time.

For example, the following is still wrong:

const char* GetFoo() {
  constexpr char foo[] = "foo";
  return foo;
}
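
A minimal sketch of one way to make the function above safe, assuming the caller only needs read access to the characters: giving the array static storage duration keeps it alive after the function returns. The helper name CheckGetFoo is hypothetical and only exercises the result.

```cpp
#include <cassert>
#include <cstring>

// `static` gives the array static storage duration, so the returned pointer
// stays valid after GetFoo() returns.
const char* GetFoo() {
  static constexpr char foo[] = "foo";
  return foo;
}

// Hypothetical helper: reads the string after the call has returned.
void CheckGetFoo() {
  assert(std::strcmp(GetFoo(), "foo") == 0);
}
```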

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/chromium-dev/CALekkJd6b1r1DZvpz1cJbwxEFbGnBd5xO%3DKoZtee3Y2ANmdfJQ%40mail.gmail.com.

Peter Kasting

Nov 3, 2022, 11:34:51 PM
to Wez, Chromium-dev
Oh also don't forget that you can do constexpr StringPiece also, and it's probably better in every way than char[].

PK

Daniel Cheng

Nov 4, 2022, 2:35:01 AM
to pkas...@chromium.org, Wez, Chromium-dev
Using a constexpr StringPiece probably requires a relocation. I would stick to constexpr char[] over StringPiece.

Daniel


Peter Kasting

Nov 4, 2022, 11:56:17 AM
to Daniel Cheng, Wez, Chromium-dev
On Thu, Nov 3, 2022 at 11:33 PM Daniel Cheng <dch...@chromium.org> wrote:
Using a constexpr StringPiece probably requires a relocation. I would stick to constexpr char[] over StringPiece.

Is that really a big deal?  Note that we have over a hundred of these already: https://source.chromium.org/search?q=%22constexpr%20base::StringPiece%22&ss=chromium

Also, using StringPiece gives you a nicer API, and helps avoid the risk of accidentally including the trailing '\0' when you didn't mean to. std::end() on a char[] and on a StringPiece point to different places, so algorithms behave differently too (see e.g. https://chromium-review.googlesource.com/c/chromium/src/+/3988851/3/net/websockets/websocket_frame_perftest.cc ).

PK
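
To illustrate the trailing '\0' point, here is a small sketch that uses std::string_view as a stand-in for base::StringPiece (an assumption for self-containment; the names kArray and kView are hypothetical). The array's extent includes the terminator, while the view's does not.

```cpp
#include <cstddef>
#include <iterator>
#include <string_view>

constexpr char kArray[] = "Yay";
constexpr std::string_view kView = "Yay";

// The array has four elements: 'Y', 'a', 'y' and the terminating '\0'.
static_assert(std::size(kArray) == 4);
static_assert(std::end(kArray) - std::begin(kArray) == 4);

// The view covers only the three visible characters.
static_assert(kView.size() == 3);
static_assert(kView.end() - kView.begin() == 3);
```

An algorithm run over [std::begin(kArray), std::end(kArray)) therefore also visits the '\0', which is easy to overlook.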

Andrew Grieve

Nov 4, 2022, 12:06:16 PM
to pkas...@chromium.org, Daniel Cheng, Wez, Chromium-dev
Relocations are bad for binary size, memory, and start-up. While one of them isn't bad, if we encourage code patterns that lead to 1000s of them, that's bad.
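
As a rough illustration of which patterns store addresses in data (and so may need relocating in position-independent builds), compare a table of pointers with a table of fixed-size arrays. This is only a sketch: the names are hypothetical, and whether a relocation is actually emitted depends on the toolchain and build flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Each element is a pointer, so the table stores addresses. In
// position-independent code those addresses usually have to be patched
// (relocated) when the binary is loaded.
const char* const kNamePointers[] = {"alpha", "beta", "gamma"};

// Each element is a six-byte array holding the characters themselves, so the
// table contains no stored addresses.
constexpr char kNameArrays[][6] = {"alpha", "beta", "gamma"};

static_assert(sizeof(kNamePointers) == 3 * sizeof(const char*));
static_assert(sizeof(kNameArrays) == 3 * 6);

// Hypothetical helper: both tables yield the same strings.
bool TablesAgree(std::size_t i) {
  assert(i < 3);
  return std::strcmp(kNamePointers[i], kNameArrays[i]) == 0;
}
```

The array form trades padding (every row is as wide as the longest name) for the absence of stored pointers.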



Wez

Nov 4, 2022, 12:07:01 PM
to Peter Kasting, Daniel Cheng, Chromium-dev
Yes, relocations are a huge problem, due to the runtime memory overhead, which is paid per-process in some systems.

Regarding the various responses: This seems a very subtle area in which there is a lot of opinion, and cargo-culting, but for which we don't provide clear guidance in the Chromium documentation - perhaps we should fix that? :)

Peter Kasting

Nov 4, 2022, 12:08:20 PM
to Wez, Daniel Cheng, Chromium-dev
On Fri, Nov 4, 2022 at 9:05 AM Wez <w...@chromium.org> wrote:
Yes, relocations are a huge problem, due to the runtime memory overhead, which is paid per-process in some systems.

Hmm.  Is there a way to track these and try and reduce them, as we do with static initializers?

PK 

Wez

Nov 4, 2022, 12:18:11 PM
to Peter Kasting, Daniel Cheng, Chromium-dev
We can certainly track the number of relocations, and presumably also the number of CoW pages needed to hold them, but I don't think we could realistically "limit" them since there are common patterns & language features (e.g. vtables) that require them - I think we'd need to focus on specific patterns/use-cases, e.g. requiring "static constexpr const kMyLovelyLiteral[] = ...." for literals, rather than the various other forms that currently exist

Torne (Richard Coles)

Nov 4, 2022, 1:25:50 PM
to w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
On Fri, 4 Nov 2022 at 12:17, Wez <w...@chromium.org> wrote:
We can certainly track the number of relocations, and presumably also the number of CoW pages needed to hold them, but I don't think we could realistically "limit" them since there are common patterns & language features (e.g. vtables) that require them - I think we'd need to focus on specific patterns/use-cases, e.g. requiring "static constexpr const kMyLovelyLiteral[] = ...." for literals, rather than the various other forms that currently exist

Agreed - I looked into our relocations years ago to try to improve the situation on Android (where we originally could not share these pages at all) and at the time >50% of them were for vtables, which is probably still true. Avoiding them for vtables is very difficult, but IMO that just means we *should* avoid specific patterns that could easily be replaced with something that doesn't need relocating, so that we only have to pay for the ones that are harder to avoid.

(Chrome and WebView on Android now *usually* share the relocated pages between processes through Android-specific linker trickery but the mechanisms are complex and not 100% effective, and even when shared these pages are still dirty).
 

Wez

Nov 4, 2022, 1:35:09 PM
to Torne (Richard Coles), Peter Kasting, Daniel Cheng, Chromium-dev
On Fri, 4 Nov 2022 at 18:24, Torne (Richard Coles) <to...@chromium.org> wrote:
On Fri, 4 Nov 2022 at 12:17, Wez <w...@chromium.org> wrote:
We can certainly track the number of relocations, and presumably also the number of CoW pages needed to hold them, but I don't think we could realistically "limit" them since there are common patterns & language features (e.g. vtables) that require them - I think we'd need to focus on specific patterns/use-cases, e.g. requiring "static constexpr const kMyLovelyLiteral[] = ...." for literals, rather than the various other forms that currently exist

Agreed - I looked into our relocations years ago to try to improve the situation on Android (where we originally could not share these pages at all) and at the time >50% of them were for vtables, which is probably still true. Avoiding them for vtables is very difficult, but IMO that just means we *should* avoid specific patterns that could easily be replaced with something that doesn't need relocating, so that we only have to pay for the ones that are harder to avoid.

The toolchain does now support relative-vtables, which should work so long as you're able to build all of the C++ you're linking with them enabled, of course. :)

String literals seem a prime example where a little work can get us both no relocations and no strlen, while also clearing up ambiguity around static, const, constexpr etc.

Torne (Richard Coles)

Nov 4, 2022, 1:59:42 PM
to Wez, Peter Kasting, Daniel Cheng, Chromium-dev
On Fri, 4 Nov 2022 at 13:34, Wez <w...@chromium.org> wrote:
On Fri, 4 Nov 2022 at 18:24, Torne (Richard Coles) <to...@chromium.org> wrote:
On Fri, 4 Nov 2022 at 12:17, Wez <w...@chromium.org> wrote:
We can certainly track the number of relocations, and presumably also the number of CoW pages needed to hold them, but I don't think we could realistically "limit" them since there are common patterns & language features (e.g. vtables) that require them - I think we'd need to focus on specific patterns/use-cases, e.g. requiring "static constexpr const kMyLovelyLiteral[] = ...." for literals, rather than the various other forms that currently exist

Agreed - I looked into our relocations years ago to try to improve the situation on Android (where we originally could not share these pages at all) and at the time >50% of them were for vtables, which is probably still true. Avoiding them for vtables is very difficult, but IMO that just means we *should* avoid specific patterns that could easily be replaced with something that doesn't need relocating, so that we only have to pay for the ones that are harder to avoid.

The toolchain does now support relative-vtables, which should work so long as you're able to build all of the C++ you're linking with them enabled, of course. :)

That's *super* interesting and I'd love to know if anyone has already started looking at trying this on Android. The Android NDK libraries should all be a pure C ABI for ABI stability, so there's no *obvious* reason why we couldn't use a relative-vtable ABI for our C++ and see what the performance/memory tradeoffs look like.

(Also, since the dynamic linker tricks we use to try to share the relocated pages still *apply* all of the relocations separately in each process first, and only *afterward* map over the pages that "match" to free up the memory, reducing the number of relocations would have some benefit to startup time too.)

Hans Wennborg

Nov 4, 2022, 4:35:03 PM
to to...@chromium.org, Wez, Peter Kasting, Daniel Cheng, Chromium-dev
On Fri, Nov 4, 2022 at 10:58 AM Torne (Richard Coles) <to...@chromium.org> wrote:
On Fri, 4 Nov 2022 at 13:34, Wez <w...@chromium.org> wrote:
On Fri, 4 Nov 2022 at 18:24, Torne (Richard Coles) <to...@chromium.org> wrote:
On Fri, 4 Nov 2022 at 12:17, Wez <w...@chromium.org> wrote:
We can certainly track the number of relocations, and presumably also the number of CoW pages needed to hold them, but I don't think we could realistically "limit" them since there are common patterns & language features (e.g. vtables) that require them - I think we'd need to focus on specific patterns/use-cases, e.g. requiring "static constexpr const kMyLovelyLiteral[] = ...." for literals, rather than the various other forms that currently exist

Agreed - I looked into our relocations years ago to try to improve the situation on Android (where we originally could not share these pages at all) and at the time >50% of them were for vtables, which is probably still true. Avoiding them for vtables is very difficult, but IMO that just means we *should* avoid specific patterns that could easily be replaced with something that doesn't need relocating, so that we only have to pay for the ones that are harder to avoid.

The toolchain does now support relative-vtables, which should work so long as you're able to build all of the C++ you're linking with them enabled, of course. :)

That's *super* interesting and I'd love to know if anyone has already started looking at trying this on Android. The Android NDK libraries should all be a pure C ABI for ABI stability, so there's no *obvious* reason why we couldn't use a relative-vtable ABI for our C++ and see what the performance/memory tradeoffs look like.

It's happening in crbug.com/1375035
 

Jean-Philippe Gravel

Nov 4, 2022, 6:45:18 PM
to w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
Note that according to https://abseil.io/tips/140:
"absl::string_view is a good way to declare a string constant. The type has a constexpr constructor and a trivial destructor, so it is safe to declare them as global variables. Because a string view knows its length, using them does not require a runtime call to strlen()."

Are relocations a greater problem than having to compute string length at runtime everywhere?


Andrew Grieve

Nov 7, 2022, 11:16:44 AM
to jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
I think the answer to whether we should add checks/guidance for strings depends on whether there's a compelling alternative. 

For strings < N bytes (maybe 16?), I think const char[] is a compelling alternative. Computing its length over and over will be hard to notice, and we'd be better off saving the space.

For larger strings, we could probably write our own global_string<int size> template that could use char[] under-the-hood, but I'm not sure if we could do it without a macro.

E.g.:

template<int SIZE>
struct GlobalString {
  char data[SIZE];
  operator const char*() const { return data; }
};
GlobalString<7> global_str{ "global"};


Can we get rid of the explicit <7> without a macro?
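
One possible answer, sketched under the assumption that C++17 is available: give the struct a constexpr constructor that takes the literal by reference to array, and let class template argument deduction work out the size. This is a variation on the struct above rather than anything proposed in the thread, and kGlobalStr is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

template <std::size_t N>
struct GlobalString {
  char data[N] = {};

  // Taking the literal by reference to array lets the compiler deduce N.
  constexpr explicit GlobalString(const char (&str)[N]) {
    for (std::size_t i = 0; i < N; ++i) data[i] = str[i];
  }

  constexpr operator const char*() const { return data; }
  constexpr std::size_t size() const { return N - 1; }  // excludes '\0'
};

// C++17 class template argument deduction picks GlobalString<7> here.
constexpr GlobalString kGlobalStr("global");

static_assert(kGlobalStr.size() == 6);
static_assert(sizeof(kGlobalStr.data) == 7);
```

Because the object is constexpr and holds the characters inline, it is constant-initialized and stores no pointer to the literal.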



dan...@chromium.org

Nov 7, 2022, 11:35:03 AM
to agr...@chromium.org, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
On Mon, Nov 7, 2022 at 11:16 AM Andrew Grieve <agr...@chromium.org> wrote:
I think the answer to whether we should add checks/guidance for strings depends on whether there's a compelling alternative. 

For strings < N bytes (maybe 16?), I think const char[] is a compelling alternative. Computing its length over and over will be hard to notice, and we'd be better off saving the space.

For larger strings, we could probably write our own global_string<int size> template that could use char[] under-the-hood, but I'm not sure if we could do it without a macro.

E.g.:

template<int SIZE>
struct GlobalString {
  char data[SIZE];
  operator const char*() const { return data; }
};
GlobalString<7> global_str{ "global"};


Can we get rid of the explicit <7> without a macro?

Wez

Nov 7, 2022, 11:47:58 AM
to dan...@chromium.org, agr...@chromium.org, jpgr...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
User literals require that there is an operator defined that returns the desired type for the literal, though, so I think we'd still end up with a template argument that is either a pointer or a pointer+length pair?

K. Moon

Nov 7, 2022, 11:48:32 AM
to dan...@chromium.org, agr...@chromium.org, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
The style guide bans the creation and use of UDLs.

dan...@chromium.org

Nov 7, 2022, 11:57:09 AM
to K. Moon, agr...@chromium.org, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
Oh, I think we'd need C++20 changes to use them for this, but being banned is unlucky.

Here's a char array example with the size in the template:

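#include <algorithm>
#include <cstddef>

// Holds two back-to-back copies of a string literal, with one trailing '\0'.
template<std::size_t N>
struct DoubleString
{
    char p[N*2-1]{};

    constexpr DoubleString(char const(&pp)[N])
    {
        std::ranges::copy(pp, p);
        // Start at the first copy's '\0' so that it gets overwritten.
        std::ranges::copy(pp, p + N - 1);
    }
};

// C++20 string literal operator template: A is built from the literal.
template<DoubleString A>
constexpr auto operator""_x2()
{
    return A.p;
}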

Mark Mentovai

Nov 7, 2022, 2:18:13 PM
to agr...@chromium.org, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
Andrew Grieve wrote:
I think the answer to whether we should add checks/guidance for strings depends on whether there's a compelling alternative. 

For strings < N bytes (maybe 16?), I think const char[] is a compelling alternative. Computing its length over and over will be hard to notice, and we'd be better off saving the space.

The strlen is typically optimized away for a (potentially static) const char[].

mark@arm-and-hammer zsh% cat t.c                                              
#include <string.h>

int F() {
  static const char s[] = "nineteen characters";
  return strlen(s);
}
mark@arm-and-hammer zsh% crclang -arch x86_64 -masm=intel -Os -S t.c -o- | grep -Ev '^[[:space:]]*[.#]'
_F:                                     ## @F
        push rbp
        mov rbp, rsp
        mov eax, 19
        pop rbp
        ret
mark@arm-and-hammer zsh% crclang -arch arm64 -Os -S t.c -o- | grep -Ev '^[[:space:]]*[.;]'
_F:                                     ; @F
        mov w0, #19
        ret

For larger strings, we could probably write our own global_string<int size> template that could use char[] under-the-hood, but I'm not sure if we could do it without a macro.

I don’t know if this is really worthwhile given the above.
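
For cases where relying on the optimizer feels too implicit, the length can also be fixed at compile time. A minimal sketch, assuming C++17; the constant names are hypothetical:

```cpp
#include <cstddef>
#include <iterator>
#include <string_view>

static constexpr char kNineteen[] = "nineteen characters";

// std::size() counts the terminating '\0', so subtract one.
constexpr std::size_t kLengthFromArray = std::size(kNineteen) - 1;

// A string_view built in a constant expression records the length as well.
constexpr std::string_view kView(kNineteen);

static_assert(kLengthFromArray == 19);
static_assert(kView.size() == 19);
```

Both static_asserts are checked by the compiler, so no strlen call can remain at runtime for these two values.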

Andrew Grieve

Nov 8, 2022, 4:49:43 PM
to Mark Mentovai, agr...@chromium.org, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
Speaking of compiler optimizations...

I tried to verify that global StringPiece symbols result in a relocation and came up almost empty.

After spot checking a bunch from the link @Peter Kasting provided above, I could find only a single instance where relocations appear after optimization:
cookie_monster_change_dispatcher.cc's net::kGlobalDomainKey and net::kGlobalNameKey
I used this supersize report (googlers only), which has symbol sizes swapped out for relocation count.

Unless my findings here are incorrect, I think we're fine to allow StringPiece, const char * const, or const char[], at least in terms of relocations.

Mark Mentovai

Nov 8, 2022, 5:02:28 PM
to Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev
Did the char data have static storage duration? That’s necessary to prevent the (typically unnecessary) automatic storage duration copies from being made. But the data segment (where objects of static storage duration live) is also where you’ll see the relocations we’re discussing.

Your two cookie_monster_change_dispatcher.cc examples both have static storage duration. A spot-check of the first page of results in Peter’s query shows that almost all, but not all, have static storage duration. Your Super Size report isn’t loading for me, despite being Googley.

Gabriel Charette

Nov 9, 2022, 11:35:29 AM
to ma...@chromium.org, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev, Etienne Bergeron
Clang no longer seems to care: https://godbolt.org/z/Kqh3h68xY
[Attached image: image.png]

PS: constexpr implies const so "constexpr const" (which was mentioned above) is redundant I think?

Wez

Nov 9, 2022, 11:43:10 AM
to Gabriel Charette, ma...@chromium.org, Andrew Grieve, jpgr...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev, Etienne Bergeron
Is this one of those things that varies subtly depending on precisely what you do with the string?

e.g. in your examples, if your function were to return a string_view then would they still end up identical..?

Mark Mentovai

Nov 9, 2022, 11:43:50 AM
to Gabriel Charette, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev, Etienne Bergeron
Gabriel Charette wrote:
Clang no longer seems to care: https://godbolt.org/z/Kqh3h68xY
image.png

That’s not where the difference would be apparent. Try https://godbolt.org/z/5TME3EKoa.

[Attached image: static_const_char.png]

Gabriel Charette

Nov 9, 2022, 12:58:30 PM
to Mark Mentovai, Gabriel Charette, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev, Etienne Bergeron
Ah! Thanks Mark. FWIW, I took that from this old thread where return A[x] was sufficient to trigger a difference before.

This confirms what I thought: constexpr and const are equivalent for POD types (constexpr just says "please pre-compute this" but a POD is always pre-computed).

So we still need static (and const is fine, constexpr is overkill).

K. Moon

Nov 9, 2022, 1:39:08 PM
to g...@chromium.org, Mark Mentovai, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev, Etienne Bergeron
Re: constexpr vs. const: https://en.cppreference.com/w/cpp/language/constexpr
"A constexpr specifier used in an object declaration (or non-static member function (until C++14)) implies const. A constexpr specifier used in a function (or static data member (since C++17)) declaration implies inline."

I was thinking maybe people kept meaning to type constexpr char, though.

Mark Mentovai

Nov 9, 2022, 2:26:49 PM
to Gabriel Charette, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev, Etienne Bergeron
Gabriel Charette wrote:
Ah! Thanks Mark. FWIW, I took that from this old thread where return A[x] was sufficient to trigger a difference before.

Unfortunately that wasn’t a very solid test because the pointer never left the function. It was legal to optimize dummy2 even in 2013. The conclusion back then that static avoided the copy was correct, we’re just looking at better optimizers this time around, so we needed a stronger experiment.

This confirms what I thought: constexpr and const are equivalent for POD types (constexpr just says "please pre-compute this" but a POD is always pre-computed).

So we still need static (and const is fine, constexpr is overkill).

I don’t agree that constexpr is overkill. A constexpr variable must be initialized at compile-time, and compilation will fail if this is not possible.

If by “pre-computed” you mean determinate at compile time, it’s not true that these are always “pre-computed”. It’s easy to initialize a char[] or other POD type, even a const one, with something that’s not compile-time determinate. It’s even easy to do it accidentally! Using constexpr affords protection against unexpected runtime code, and against (cue spooky music) module initializers.

For example:

void G(char const *);
char randomChar();

void F() {
  static char const kCharConstArray[] = {'A', randomChar(), '\0'};
  G(kCharConstArray);

  static constexpr char kConstexprCharArray[] = {'A', randomChar(), '\0'};
  G(kConstexprCharArray);
}
 

The kConstexprCharArray initializer isn’t a constant expression, so compilation fails: “error: constexpr variable 'kConstexprCharArray' must be initialized by a constant expression”. The kCharConstArray variant compiles but needs to run code to initialize the variable. If you were at namespace (global) scope instead, like this:

char randomChar();
char const kCharConstArray[] = {'A', randomChar(), '\0'};


you’d get a module initializer. Oops!

For char[] in particular, an initializer list isn’t exactly the most common pattern because string literals are so much easier, but there’s a time and a place for them, and we do use them to some extent. For types other than char[], braced initializers are very common.

In other cases, constexpr is required, and char const won’t work at all. For example, at class scope:

class Cl {
 public:
  static char const kCharConstArray[] = "char const array";
  static constexpr char kConstexprCharArray[] = "constexpr char array";
};

The kCharConstArray variant won’t compile at all: “error: in-class initializer for static data member of type 'const char[17]' requires 'constexpr' specifier”.
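
For comparison, here is a sketch of the two class-scope forms that do compile, assuming C++17. The constexpr member is initialized in the class; the plain const member has to be declared in the class and defined outside it. GetConstexprName and GetConstName are hypothetical helpers.

```cpp
#include <cassert>
#include <cstring>

class Cl {
 public:
  // Declaration only: a non-constexpr array member cannot be initialized here.
  static char const kCharConstArray[];

  // constexpr static data members are implicitly inline since C++17, so no
  // out-of-class definition is needed even when the address is taken.
  static constexpr char kConstexprCharArray[] = "constexpr char array";
};

// The out-of-class definition would normally live in exactly one .cc file.
char const Cl::kCharConstArray[] = "char const array";

const char* GetConstexprName() { return Cl::kConstexprCharArray; }
const char* GetConstName() { return Cl::kCharConstArray; }
```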

Since we have the constexpr tool to enforce compile-time initialization, why wouldn’t we use it? And if we’re in the habit of writing constexpr char for cases where it doesn’t make a difference, it’ll be harder to goof in cases where it does. It’s also easier to remember one relatively simple rule (“use static constexpr”) than it is to follow a decision tree.

(and now I’ve outed myself as what they’re calling “east const”.)

dan...@chromium.org

Nov 9, 2022, 2:31:17 PM
to ma...@chromium.org, Gabriel Charette, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Peter Kasting, Daniel Cheng, Chromium-dev, Etienne Bergeron
Just want to +1 using constexpr for things we want to init at compile time. :) And thanks for the testing of codegen.

Peter Kasting

Nov 9, 2022, 2:39:12 PM
to Mark Mentovai, Gabriel Charette, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Daniel Cheng, Chromium-dev, Etienne Bergeron
So... what's the TLDR of this whole thread?

* Use constexpr over const because it's a nice practice to be in and occasionally enforces that you didn't actually do something dumb
* No need for "constexpr const char[]" because in this case constexpr implies const (note, that is not necessarily true for other uses of constexpr)
* No need for "inline constexpr char[]" because in this case constexpr implies inline (note, that is not necessarily true for other uses of constexpr)
* Use "static" in classes and inside functions?
* If you did use "static", it is also OK to use a StringPiece, because we won't generate a relocation? What about for file-/global-scope objects?

Did I get all this right, especially the last two? Did I miss anything?

PK
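
The forms discussed so far can be sketched side by side. This is an illustration rather than agreed guidance, and all names are hypothetical. One caveat, based on the cppreference text quoted earlier: constexpr implies inline only for functions and static data members, so a namespace-scope constant in a header still needs an explicit inline if every translation unit should share one object.

```cpp
#include <cassert>
#include <cstring>

// Header file: `inline` makes every translation unit refer to one object.
// Without it, a namespace-scope constexpr variable has internal linkage and
// each including .cc file gets its own copy.
inline constexpr char kHeaderString[] = "header";

// .cc file: the anonymous namespace keeps the constant file-local.
namespace {
constexpr char kFileString[] = "file";
}  // namespace

// Class scope: `static` is required; since C++17 the member is implicitly
// inline.
class Widget {
 public:
  static constexpr char kClassString[] = "class";
};

// Function scope: `static` gives the array static storage duration, so it is
// not rebuilt on the stack at each call and may outlive the call.
const char* FunctionString() {
  static constexpr char kFunctionString[] = "function";
  return kFunctionString;
}

const char* FileString() { return kFileString; }
```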

Gabriel Charette

Nov 9, 2022, 5:06:27 PM
to Peter Kasting, Mark Mentovai, Gabriel Charette, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Daniel Cheng, Chromium-dev, Etienne Bergeron
Don't get me wrong, I'm a big fan of constexpr. But I think something like `constexpr int kFoo = 1234;` is overkill and I view `constexpr char kBar[] = "bar";` the same 🤷‍♂️. I'll never ask for it to be changed in a review but I don't see a need for it. Do constexpr any complex initialization for sure.

On Wed, Nov 9, 2022 at 2:37 PM Peter Kasting <pkas...@chromium.org> wrote:
So... what's the TLDR of this whole thread?

* Use constexpr over const because it's a nice practice to be in and occasionally enforces that you didn't actually do something dumb
Meh (per above). 

* No need for "constexpr const char[]" because in this case constexpr implies const (note, that is not necessarily true for other uses of constexpr)
 Agreed (when would `constexpr const` ever mean something other than `constexpr`?)

* No need for "inline constexpr char[]" because in this case constexpr implies inline (note, that is not necessarily true for other uses of constexpr)
For class scope: `static constexpr char kFoo[] = "foo"`
So yes, no inline needed I think?
 
* Use "static" in classes and inside functions?
"static" does better reflect the intent but since clang seems to optimize all the same, we should stop caring?

* If you did use "static", it is also OK to use a StringPiece, because we won't generate a relocation? What about for file-/global-scope objects?
That's interesting, I'm not sure. Wanna test in godbolt how clang resolves that one?

Wez

Nov 10, 2022, 7:17:50 AM
to Gabriel Charette, Peter Kasting, Mark Mentovai, Andrew Grieve, jpgr...@chromium.org, Daniel Cheng, Chromium-dev, Etienne Bergeron
Gabriel: This thread is specifically about string literals, since C/C++ strings are a little different to other integral types - while parts of the guidance are relevant more generally, I think it's fair to discuss any such guidance separately.

Mark: Thanks for the Olympian effort to realise my suspicions as Godbolt(s).  ;)

Peter: Thanks for your summary, most of which sounds reasonable to me based on the discussion.  My understanding is that we still can't use StringPiece in this way, even with constexpr, because part of the (effectively) compile-time-static information in a StringPiece is the _pointer_ to the start of the data - but as has been demonstrated, I'm not up to speed on how smart our toolchain(s) are, so I'll need to Godbolt a bit more to explore that.

In the absence of further challenges to this crowd-sourced wisdom, I'll send out a change to add a brief best-practice note to the Chromium C++ style guide, for string literals.

Mark Mentovai

Nov 10, 2022, 10:15:09 AM
to Gabriel Charette, Peter Kasting, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Daniel Cheng, Chromium-dev, Etienne Bergeron
Gabriel Charette wrote:
Don't get me wrong, I'm a big fan of constexpr. But I think something like `constexpr int kFoo = 1234;` is overkill and I view `constexpr char kBar[] = "bar";` the same 🤷‍♂️. I'll never ask for it to be changed in a review but I don't see a need for it. Do constexpr any complex initialization for sure.

On Wed, Nov 9, 2022 at 2:37 PM Peter Kasting <pkas...@chromium.org> wrote:
So... what's the TLDR of this whole thread?

* Use constexpr over const because it's a nice practice to be in and occasionally enforces that you didn't actually do something dumb
Meh (per above). 

* No need for "constexpr const char[]" because in this case constexpr implies const (note, that is not necessarily true for other uses of constexpr)
 Agreed (when would `constexpr const` ever mean something other than `constexpr`?)

* No need for "inline constexpr char[]" because in this case constexpr implies inline (note, that is not necessarily true for other uses of constexpr)
For class scope: `static constexpr char kFoo[] = "foo"`
So yes, no inline needed I think?
 
* Use "static" in classes and inside functions?
"static" does better reflect the intent but since clang seems to optimize all the same, we should stop caring?

Careful! In general, this is not true. It’s not really about intent at all, there’s a real (and desirable) reason to use static, and we should care.

What most people want most of the time for their string literals is static storage duration. If you’re in function scope, that means you need to use static.

https://godbolt.org/z/bGa6eaP7x shows a wider variety of options. You may have been led to the conclusion that static doesn’t matter at function scope because there was no difference between static char const * const and char const * const. But those char * pointer-oriented variants are different from the array-oriented char[] ones, and for string literals the pointer style is inferior to the array style in just about every way.

What’s the point of static in static constexpr char kString[] at function scope, anyway? This is probably an almost-FAQ, but it’s so subtle that it’s easy to miss that there even is a distinction. Without the static at function scope, the object has automatic storage duration. Objects with automatic storage duration are required to have a unique address at each activation. So for each call to a function with a constexpr char kString[], the kString object needs a unique address. That’s achieved by putting it onto the stack. But now that it’s on the stack, it’s got to be initialized at each pass—that’s where all of that extra code comes from. It seems like busywork and in most cases for string literals, it is: pointer equality is rarely a concern for string literals.

The compiler is free to “as-if”-optimize that sometimes-busywork away to the point that the static might appear irrelevant, but that only works in very specific cases where the compiler is able to see that the uniqueness of the address is not considered. That’s why the older examples that didn’t allow the pointer to escape the test functions were flawed: the generated code was being improved by optimizer action, but not under realistic circumstances. In the real world, if you have a string literal at function scope, a pointer to it is probably going to escape the function—and if it escapes the translation unit (roughly “file”), without something like LTO, the compiler won’t have any way to assure itself that address uniqueness is irrelevant, so that optimization will be off limits.

Want to know more about where this requirement comes from? C++17 [intro.object] §4.5/8:
Unless an object is a bit-field or a base class subobject of zero size, the address of that object is the address of the first byte it occupies. Two objects a and b with overlapping lifetimes that are not bit-fields may have the same address if one is nested within the other, or if at least one is a base class subobject of zero size and they are of different types; otherwise, they have distinct addresses.⁵
5) Under the “as-if” rule an implementation is allowed to store two objects at the same machine address or not store an object at all if the program cannot observe the difference (4.6).

And [intro.execution] §4.6/6:
An instance of each object with automatic storage duration (6.7.3) is associated with each entry into its block. Such an object exists and retains its last-stored value during the execution of the block and while the block is suspended (by a call of a function or receipt of a signal).

Similar in C18 §6.2.4/6, §6.5.9/6.

The different behavior of char const * is rooted in the fact that the chars don’t comprise the object as they do in the array case; the object declaration is a pointer and its storage duration is irrelevant for the purpose of where the pointed-to data lives, and thus what value it takes. The char data is just in the string pool, and your pointer points there. C++17 [lex.string] §5.13.5/8 controls in that case:
Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration (6.7).

Similar in C18 §6.4.5/6.

This is all very subtle. clang got it wrong for many years, and if you dig hard enough, you’ll find old threads where I recommended against the static at function scope as irrelevant noise. Turns out that was wrong. Special thanks to ex-Chromie Leonard Mosescu for challenging me to dig deeper.

Alexei Svitkine

Nov 10, 2022, 11:29:34 AM11/10/22
to ma...@chromium.org, Gabriel Charette, Peter Kasting, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Daniel Cheng, Chromium-dev, Etienne Bergeron
Not to detract from the main point (that static still has a benefit), but I don't think there's a uniqueness guarantee without static? If you call the same function twice in a row from the same callstack and thread, wouldn't the local variable get allocated at the same address on the stack since it's relative to the stack pointer? And more generally, I don't think uniqueness would be guaranteed in some way (as that would require some machinery to ensure an address from before isn't re-used, which doesn't make sense to have.)


Mark Mentovai

Nov 10, 2022, 12:46:01 PM11/10/22
to Alexei Svitkine, Gabriel Charette, Peter Kasting, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Daniel Cheng, Chromium-dev, Etienne Bergeron
Alexei Svitkine wrote:
Not to detract from the main point (that static still has a benefit), but I don't think there's a uniqueness guarantee without static? If you call the same function twice in a row from the same callstack and thread, wouldn't the local variable get allocated at the same address on the stack since it's relative to the stack pointer?

If you’re calling a function twice from exactly the same call stack, the first call would necessarily have returned by the time you made the second call. The objects with automatic storage duration that existed during the first call will have ceased to exist upon its return. That renders the remainder of the question invalid—the C++17 [intro.object] §4.5/8 reference I provided earlier limits its application to “two objects a and b with overlapping lifetimes”, and in your example, their lifetimes do not overlap.

Furthermore, there’s no guarantee that the two calls to the same function “from the same callstack and thread” would wind up with their own stack frames at the same address. I can cook up ways to violate your assumption, but I’ll hold back as it’s a distraction. Fun, but definitely tangential.

At function scope, the typical way to tangibly experience the difference between static char const kString[] and plain char const kString[] is reentrancy. Often that means a recursive call, but don’t discount gremlins like signal handlers. The old (and now fixed) clang bug I referenced in my last message demonstrated clang’s former noncompliance with recursion.
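
A rough sketch of that reentrancy difference, using recursion. The helper names are hypothetical; the behavior relies on the [intro.object] rule quoted earlier in the thread and assumes a conforming compiler.

```cpp
// Non-static: each activation owns its own kString object. During the
// recursive call the outer and inner arrays are alive at the same time, so
// they must have distinct addresses.
bool AutomaticCopiesAreDistinct(const char* outer = nullptr) {
  constexpr char kString[] = "Yay";
  if (outer == nullptr) {
    return AutomaticCopiesAreDistinct(kString);
  }
  return outer != kString;
}

// Static: there is exactly one kString object, shared by every activation,
// so the outer and inner calls see the same address.
bool StaticCopyIsShared(const char* outer = nullptr) {
  static constexpr char kString[] = "Yay";
  if (outer == nullptr) {
    return StaticCopyIsShared(kString);
  }
  return outer == kString;
}
```

Both functions are expected to return true. Because the address is observed in the non-static version, the compiler cannot fold the two activations' arrays into a single read-only object.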

And more generally, I don't think uniqueness would be guaranteed in some way (as that would require some machinery to ensure an address from before isn't re-used, which doesn't make sense to have.)

Of course it’s valid for memory to be repurposed. The requirement for unique addresses only applies to objects that are alive simultaneously. But for those that are alive simultaneously, the requirement absolutely exists, and this much should be intuitive: two things can’t exist simultaneously at the same address.

static in this context does alter the lifetime, but that’s not the main point in this discussion. Here, static means that there’s only one object, no matter how many times the function is called, without regard to reentrancy (including recursion). That alone means that the compiler’s able to produce better code.

If you want a reason to get behind function-scoped static that’s more directly related to lifetime, consider this: it’s totally valid to return a pointer to static data, because its life doesn’t end when its enclosing block does.

char const * Valid() {
  static constexpr char kString[] = "nineteen characters";
  return kString;
}

char const * Invalid() {
  constexpr char kString[] = "nineteen characters";
  return kString;
}


Fortunately, clang will prevent you from writing Invalid, advising: warning: address of stack memory associated with local variable 'kString' returned [-Wreturn-stack-address].

Reilly Grant

Nov 10, 2022, 1:59:04 PM11/10/22
to ma...@chromium.org, Alexei Svitkine, Gabriel Charette, Peter Kasting, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Daniel Cheng, Chromium-dev, Etienne Bergeron
I'm looking forward to the conclusions of this thread being documented in a Markdown file somewhere in the source tree. *wink*
Reilly Grant | Software Engineer | rei...@chromium.org | Google Chrome


Jean-Philippe Gravel

Nov 10, 2022, 2:44:01 PM11/10/22
to Reilly Grant, ma...@chromium.org, Alexei Svitkine, Gabriel Charette, Peter Kasting, Andrew Grieve, jpgr...@chromium.org, w...@chromium.org, Daniel Cheng, Chromium-dev, Etienne Bergeron
> I'm looking forward to the conclusions of this thread being documented in a Markdown file somewhere in the source tree.
Documenting Chromium's stance on these questions would certainly be helpful. Note, however, that these questions have already been studied in depth at Google. Of course, it's worth questioning whether google3 advice applies to Chromium, but perhaps some knowledge sharing could save some trouble.

Now interestingly, a similar question was just asked on Google's C++ readability mentor mailing list, around best practice for defining string constants. The consensus is still around the recommendations in https://abseil.io/tips/140, that is, prefer `constexpr string_view`, especially at namespace scope (or rather base::StringPiece in Chromium, I'm assuming they are equivalent?). In headers they have to be `inline` or `extern` (never `static`!) or else every compilation unit would hold its own copy. At namespace scope in .cc files, `constexpr` variables have internal linkage by default (as if declared `static`), so there is no need to spell that explicitly. And as pointed out above, `static` is needed in a function if it's important for all invocations to use the same string (as in, if the function returns a pointer/view on the string).
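
As a rough illustration of those three placements, here is a sketch that uses std::string_view (C++17) as a stand-in for base::StringPiece. All constant and function names are hypothetical, and the three declarations are shown in one file only for brevity.

```cpp
#include <string_view>

// Header, namespace scope: inline gives every translation unit the same
// single definition instead of a private copy per .cc file.
inline constexpr std::string_view kHeaderName = "header";

namespace {
// .cc file, namespace scope: constexpr already implies internal linkage, so
// no static is needed (the anonymous namespace just makes that explicit).
constexpr std::string_view kLocalName = "local";
}  // namespace

// Function scope: static makes kName a single object with static storage
// duration, so a view onto it stays valid after the function returns.
std::string_view GetName() {
  static constexpr char kName[] = "function";
  return kName;
}
```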

There are also arguments that using string_view (StringPiece?) by default encourages the rest of the code to also use string_view, helping to prevent repetitive `strlen` calls and copies to temporary `std::string` objects everywhere. `string_view` and `StringPiece` are also faster than `std::string&`. That's because the view types are (or should be) passed by value and hold a direct pointer to the string buffer. A std::string reference, on the other hand, is a pointer to an object that then holds a pointer to the data. That's two dereferences instead of one, and possibly two cache misses instead of one. These details are important; remember that old blog post from 2014:
"TL;DR: std::string is responsible for almost half of all allocations in the Chrome browser process; please be careful how you use it!"

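A small sketch of the string_view argument above, with hypothetical function names. It only shows which call sites need a temporary std::string; it does not measure anything.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Taking the view by value accepts literals, std::string objects, and
// sub-ranges without building a temporary std::string.
std::size_t CountChar(std::string_view text, char needle) {
  std::size_t count = 0;
  for (char c : text) {
    if (c == needle) ++count;
  }
  return count;
}

// Same logic through const std::string&: a string literal argument first has
// to be converted into a temporary std::string, which may allocate for
// longer strings.
std::size_t CountCharViaString(const std::string& text, char needle) {
  std::size_t count = 0;
  for (char c : text) {
    if (c == needle) ++count;
  }
  return count;
}
```

For instance, CountChar(std::string_view("banana").substr(2), 'a') inspects "nana" in place, while the std::string overload would need a copy to do the same.
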
Gabriel Charette

Nov 10, 2022, 8:02:48 PM11/10/22
to Jean-Philippe Gravel, Daniel Cheng, Reilly Grant, ma...@chromium.org, Alexei Svitkine, Gabriel Charette, Peter Kasting, Andrew Grieve, w...@chromium.org, Daniel Cheng, Chromium-dev, Etienne Bergeron
Thanks Mark for the in-depth reply. Of course, this makes sense. I was somehow convinced by your first reply and then forgot again, sorry for the confusion.

@Wez : I don't understand, none of my replies pertain to C/C++ strings. All of this is about how to define `const char[]`.

I'm curious about google3's stance to favor constexpr string_view, maybe because relocations aren't as big of a deal server-side? I'm not as familiar with this topic, @Daniel Cheng or others?

Daniel Cheng

Nov 15, 2022, 1:24:57 AM11/15/22
to Gabriel Charette, Jean-Philippe Gravel, Reilly Grant, ma...@chromium.org, Alexei Svitkine, Peter Kasting, Andrew Grieve, w...@chromium.org, Chromium-dev, Etienne Bergeron
I did some experiments for relocations using std::string_view. As long as nothing takes the address of the std::string_view, clang actually puts everything into .rodata, which is great. This is because it effectively constructs a std::string_view each time it's used:

    1664: 48 8d 1d eb 09 00 00          leaq    0x9eb(%rip), %rbx       # 0x2056 <_IO_stdin_used+0x56>
    166b: bf 2c 00 00 00                movl    $0x2c, %edi
    1670: 48 89 de                      movq    %rbx, %rsi
    1673: e8 68 fb ff ff                callq   0x11e0 <F(std::basic_string_view<char, std::char_traits<char> >)>

I also did some experiments where the std::string_view was repeatedly used, or where only data() was passed to my test function, and clang still preferred to "construct" std::string_view instances on demand.

Non-exhaustive list of things I did not test:
- what happens in very large binaries (to see if it prefers to put the std::string_view itself in .relro if the offsets are larger)
- what happens if a static constexpr std::string_view is passed in a lot of different places (I suspect this would not make a difference)

However, if *anything* takes the address of the std::string_view (e.g. a function takes a std::string_view by const *reference*), then this forces clang to generate a relocation entry. This includes passing a std::string_view to a templated function with a signature like this:

template <typename T>
void F(const T& string_view) { ... }

or like this:
template <typename T>
void F(T&& string_view) { ... }

*even* if the function never really depends on the address. There are also other ways that clang might be forced to emit relocations, e.g. constexpr arrays of constexpr std::string_view.

Given that the most common cases don't seem to require a relocation, it should not be a big concern for Chrome at this time. And we should also try to make sure we don't pass base::StringPiece by reference anywhere (outside templates).
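
A sketch of the two call shapes being contrasted, with hypothetical names. Whether a relocation is actually emitted depends on the compiler and flags; the observation above is for clang and is not re-verified here.

```cpp
#include <cstddef>
#include <string_view>

inline constexpr std::string_view kGreeting = "hello";

// By value: the caller can materialize the {pointer, length} pair at the
// call site, so kGreeting need not exist in memory as an object.
std::size_t LengthByValue(std::string_view sv) {
  return sv.size();
}

// By reference: the callee receives the address of a string_view object, so
// kGreeting has to be stored somewhere as an object containing a pointer.
// That stored pointer is what needs a relocation in position-independent
// code.
template <typename T>
std::size_t LengthByRef(const T& sv) {
  return sv.size();
}
```

Both calls, LengthByValue(kGreeting) and LengthByRef(kGreeting), return the same value; the difference is only in what the compiler has to emit for kGreeting.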

Daniel

P.S. for those interested, the test snippet I used was:

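#include <cstdio>
#include <random>
#include <string>
#include <string_view>

void __attribute__((noinline)) F(std::string_view sv) {
  std::string s(sv.data(), sv.size());

  std::random_device rd;
  // uniform_int_distribution is not specified for char-sized types such as
  // uint8_t, so draw an int in [0, 255] and narrow it below.
  std::uniform_int_distribution<int> dist(0, 255);
  for (auto& c : s) {
    c &= static_cast<char>(dist(rd));
  }
  puts(s.data());
}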

It is somewhat cargo-culted; the noinline is to try to simulate how the compiler would usually treat this function in Chrome, and the random bits are just an attempt to keep the optimizer from being too smart about the whole thing.