std::string_view.
We have span::span_from_cstring to do this
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Code-Review | +1 |
LGTM but I would undo the string_view changes - that's not a pattern we want to establish or encourage.
If we want a type that can turn into a span without the null and is used for string constants we may need to provide our own.
resource1->AppendData(kData.substr(0u, 4u));
first(4u)
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
LGTM but I would undo the string_view changes - that's not a pattern we want to establish or encourage.
If we want a type that can turn into a span without the null and is used for string constants we may need to provide our own.
Ah yeah, I was trying to find `span_from_cstring` but couldn't remember the name. 😊
Although, is `std::string_view` really the wrong pattern here? These variables are all strings, and `std::string_view` turns into `base::span` just fine. We even already use `std::string_view` for string constants not infrequently:
https://source.chromium.org/search?q=const(expr)%3F%5C%20(std::string_view%7Cbase::StringPiece).*%5C%22%20-file:third_party&ss=chromium
If we kept them as arrays and use `base::span_from_cstring` at the point of usage, it's more verbose, and error-prone because you might forget `span_from_cstring` and silently keep the NUL. `auto kData = base::span_from_cstring("abcde");` would avoid that, but it diverges from what other code in the project is already doing. And, in so far as we believe in capturing text vs. bytes in the type system, `std::string_view` retains that better. It's and just being interpreted as bytes when stuffed into a `Resource`.
There's even an `operator""sv` when you don't want to stick them in an intermediate variable. In the spirit of teaching people safer patterns, we have a better shot of saying "put the `sv` suffix on string literals and get the `std::string_view` type you've already learned" than saying "`base::span_from_cstring("foo")`, except when you want something that works with `std::string` because `span` to `string_view` is currently difficult." It's also most analogous to Rust, where string literals give you a `&'static str`, not a `&[u8; N]`.
To that end, `span_from_cstring` does have the advantage of keeping the compile-time size, but that's not very important here. Indeed Rust similarly does not keep compile-time size for `&str`.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
David BenjaminLGTM but I would undo the string_view changes - that's not a pattern we want to establish or encourage.
If we want a type that can turn into a span without the null and is used for string constants we may need to provide our own.
Ah yeah, I was trying to find `span_from_cstring` but couldn't remember the name. 😊
Although, is `std::string_view` really the wrong pattern here? These variables are all strings, and `std::string_view` turns into `base::span` just fine. We even already use `std::string_view` for string constants not infrequently:
https://source.chromium.org/search?q=const(expr)%3F%5C%20(std::string_view%7Cbase::StringPiece).*%5C%22%20-file:third_party&ss=chromiumIf we kept them as arrays and use `base::span_from_cstring` at the point of usage, it's more verbose, and error-prone because you might forget `span_from_cstring` and silently keep the NUL. `auto kData = base::span_from_cstring("abcde");` would avoid that, but it diverges from what other code in the project is already doing. And, in so far as we believe in capturing text vs. bytes in the type system, `std::string_view` retains that better. It's and just being interpreted as bytes when stuffed into a `Resource`.
There's even an `operator""sv` when you don't want to stick them in an intermediate variable. In the spirit of teaching people safer patterns, we have a better shot of saying "put the `sv` suffix on string literals and get the `std::string_view` type you've already learned" than saying "`base::span_from_cstring("foo")`, except when you want something that works with `std::string` because `span` to `string_view` is currently difficult." It's also most analogous to Rust, where string literals give you a `&'static str`, not a `&[u8; N]`.
To that end, `span_from_cstring` does have the advantage of keeping the compile-time size, but that's not very important here. Indeed Rust similarly does not keep compile-time size for `&str`.
The problem with string_view is that when you take its address, it requires it to be relocated. Cstring arrays don't have that issue. And while we work with them by reference, generic code does not.
If we had our own type, we could have span<char> and string_view both construct from it. Is there room to do better though and prevent relocations from happening?
Maybe not, in which case yes `constexpr string_view` seems better than span_from_c_string().
It would be nice if span didn't implicitly construct from `char []` though, to prevent the ambiguity of "char array" vs "c string", and force the code to use string_view (implicit conversion) or span() (with nul) or span_from_cstring() (without nul).
+dcheng
Thoughts?
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
David BenjaminLGTM but I would undo the string_view changes - that's not a pattern we want to establish or encourage.
If we want a type that can turn into a span without the null and is used for string constants we may need to provide our own.
danakjAh yeah, I was trying to find `span_from_cstring` but couldn't remember the name. 😊
Although, is `std::string_view` really the wrong pattern here? These variables are all strings, and `std::string_view` turns into `base::span` just fine. We even already use `std::string_view` for string constants not infrequently:
https://source.chromium.org/search?q=const(expr)%3F%5C%20(std::string_view%7Cbase::StringPiece).*%5C%22%20-file:third_party&ss=chromiumIf we kept them as arrays and use `base::span_from_cstring` at the point of usage, it's more verbose, and error-prone because you might forget `span_from_cstring` and silently keep the NUL. `auto kData = base::span_from_cstring("abcde");` would avoid that, but it diverges from what other code in the project is already doing. And, in so far as we believe in capturing text vs. bytes in the type system, `std::string_view` retains that better. It's and just being interpreted as bytes when stuffed into a `Resource`.
There's even an `operator""sv` when you don't want to stick them in an intermediate variable. In the spirit of teaching people safer patterns, we have a better shot of saying "put the `sv` suffix on string literals and get the `std::string_view` type you've already learned" than saying "`base::span_from_cstring("foo")`, except when you want something that works with `std::string` because `span` to `string_view` is currently difficult." It's also most analogous to Rust, where string literals give you a `&'static str`, not a `&[u8; N]`.
To that end, `span_from_cstring` does have the advantage of keeping the compile-time size, but that's not very important here. Indeed Rust similarly does not keep compile-time size for `&str`.
The problem with string_view is that when you take its address, it requires it to be relocated. Cstring arrays don't have that issue. And while we work with them by reference, generic code does not.
If we had our own type, we could have span<char> and string_view both construct from it. Is there room to do better though and prevent relocations from happening?
Maybe not, in which case yes `constexpr string_view` seems better than span_from_c_string().
It would be nice if span didn't implicitly construct from `char []` though, to prevent the ambiguity of "char array" vs "c string", and force the code to use string_view (implicit conversion) or span() (with nul) or span_from_cstring() (without nul).
+dcheng
Thoughts?
The problem with string_view is that when you take its address, it requires it to be relocated. Cstring arrays don't have that issue. And while we work with them by reference, generic code does not.
Ah, I see. Oooowwww.
I'm not sure what to do about that one. Thinking aloud...
It's also fun across compilation units. With C strings, in a file like `chrome/common/chrome_switches.h`, we're actually relying on `extern const char kAcceptLang[];` decaying to a pointer whenever one accesses it, because the other compilation units don't know the size of the array! I.e. that entire language construct is anti-`span` and deeply relies on NUL-termination to get its bounds.
I don't know if we even can, with our own type, replicate the cross-compilation unit behavior, short of wrapping everything in functions that return `string_view`/`span`/whatever. This ability to create an array that's usable, but with unknown size, is pretty deep in the language. (Also incredibly gross. Imagine trying to bounds-check `kAcceptLang[20]`.) And requiring the programmer count the size of the array and echo it into the header isn't reasonable.
I wonder if C++20 modules avoids this weirdness... would `export const char kAcceptLang[] = "accept-lang"` export a `const char[]` with unknown size or a `const char[11]` with known size? I'd guess it's the latter, but I don't know much about this. I guess, if we were willing to increase compile times *and* link times, we could use inline variables? Then the header is able determine the type.
I think the only way to avoid the relocation would be if it were a container type, rather than a view type. View types have to have pointers in them, so once you ODR-use them, you've got to emit the pointer globally.
I.e. if it would have to be the analog of a `std::array<char, N>`, instead of a `std::string_view`. Doing that then requires we parameterization on `N`, which gets fun because it means none of our patterns can involve uttering the type. So lots and lots of CTAD and/or `auto`. And then we'd still need to do something about the cross-file thing above. A pattern that only works for file-local constants is a little goofy because those are also the cases where it's less likely to matter. (Less likely to be ODR-used, more likely that the compiler will dissolve it all anyway.)
It would be nice if span didn't implicitly construct from char [] though
Agreed. Though I suspect attempting to patch that one around will cause more problems than it solves. :-( The curse of `"blah"` being a `char[N]`.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I found go/totw/140 internally, which seems to prefer `string_view`, but I think they just didn't care about the relocation cost. :-/
I suspect our choices are:
What would you like to be our path forward here, now that we have cstring_view.
Do you think this needs to also have StaticString and we should implement that before this proceeds? Should we leave them as char[] and wrap them when constructing span?
I suspect this is waiting for StaticString, but I want to be sure, in which case we should prioritize moving that head sooner rather than later.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I think either `cstring_view` and `string_view` are equivalent here in that these tests don't actually care about the NUL terminator. (You only need `cstring_view` if you care about the NUL, and none of this code cares.)
I don't feel very strongly. I think my default would be to just stick with `string_view` (i.e. this CL as-is) because it works and, in the context of this CL, we don't actually need `StaticString` to avoid the relocation. They're all function-local constants. But if you feel strongly that we should funnel everyone to `StaticString`, we can keep this parked.
(I think, realistically, going the `StaticString` route means we're going to get a mix of `(c)string_view` and `StaticString` no matter what we do, so I don't personally put much weight on trying to establish one pattern over another.)
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
They're all function-local constants.
These ones are not. But they are also not marked `static`.
I agree we will get a mix, but we can generate an error on `static std::string_view` and `static base::cstring_view` that points the user to StaticString for those.
That would still allow the conversion here in the test file. Would it also allow however regrettable use of c/string_view in production cc files? Is there a better shape to limit the impact of relocations?
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
These ones are not.
Ah yeah, missed those.
But they are also not marked `static`.
I mean, they're global, and in an anonymous namespace. The `static` keyword itself is kinda overloaded.
I agree we will get a mix, but we can generate an error on static std::string_view and static base::cstring_view that points the user to StaticString for those.
Sure.
TBH, I think we're making a mountain of a molehill here. When global `string_view`s are in tests, it doesn't matter in the first place. When they are file-local, as they are here, most of the time, the compiler will optimize out the relocations. The few relocations that sneak through also are not the end of the world compared to everything else we do. (I mean, every virtual method costs a relocation, and we're not banning those.)
Having `StaticString` as a preferred option sounds good, once we have it. If we want to build tooling to encourage it, I guess we could, but I'm not sure it's the best use of time or developer annoyance. Shrug.
But right now we don't have StaticString or that warning, so either this CL remains parked and we can't spanify this code, and I'll just keep rebasing it along in my checkout, or we decide this is good enough and land it. I prefer the latter, but if you prefer the former, I'm happy to keep it parked. 😊
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
LGTM
Yeah I agree about this being a test file.
Ok let's keep moving forward.
std::string_view.
We have span::span_from_cstring to do this
Acknowledged
resource1->AppendData(kData.substr(0u, 4u));
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Code-Review | +1 |
LGTM
void AppendInternal(const char* data, size_t);
I'm OK with this but I preferred it when we funneled everything to AppendInternal().
Maybe I'll feel differently about this once we don't have so many Append() methods
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
void AppendInternal(const char* data, size_t);
I'm OK with this but I preferred it when we funneled everything to AppendInternal().
Maybe I'll feel differently about this once we don't have so many Append() methods
Oh, the reason for this is that, once we spanify it, the right spelling for `AppendInternal` is `AppendInternal(span<const char>)`. But that's already one of the APIs we'd like to expose publicly as `Append(span<char>)`, so there's no point in having an `Append` vs `AppendInternal` split.
In the old, non-span version, it bottomed out at this one, which is *almost* one of the public APIs, except the public one has some `STRICTLY_TYPED_ARG` business.
Given that, do you still prefer `AppendInternal`? I can artificially split `void Append(base::span<const char> data)` into an inline wrapper over an `AppendInternal` if you prefer.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
void AppendInternal(const char* data, size_t);
David BenjaminI'm OK with this but I preferred it when we funneled everything to AppendInternal().
Maybe I'll feel differently about this once we don't have so many Append() methods
Oh, the reason for this is that, once we spanify it, the right spelling for `AppendInternal` is `AppendInternal(span<const char>)`. But that's already one of the APIs we'd like to expose publicly as `Append(span<char>)`, so there's no point in having an `Append` vs `AppendInternal` split.
In the old, non-span version, it bottomed out at this one, which is *almost* one of the public APIs, except the public one has some `STRICTLY_TYPED_ARG` business.
Given that, do you still prefer `AppendInternal`? I can artificially split `void Append(base::span<const char> data)` into an inline wrapper over an `AppendInternal` if you prefer.
I think I'll probably happier once it's simplified in the future :) So this is fine
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
4 is the latest approved patch-set.
No files were changed between the latest approved patch-set and the submitted one.
Spanify RawResource::AppendData
As part of this, I've also spanified parts of the SharedBuffer API. I
haven't removed the old SharedBuffer API just because there are rather a
lot of callers.
One subtlety: Blink seems to use char instead of unsigned char (aka
uint8_t) for bytes, so I've kept it at span<const char>. When binding a
span<const char> to a string literal, you pick up the NUL byte because C
is the worst. To avoid that, I transited things through
std::string_view.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |