PSA: base::StringPiece is now std::string_view

283 views
Skip to first unread message

David Benjamin

unread,
Jul 17, 2023, 5:12:03 PM7/17/23
to Chromium-dev, cxx, Dana Jansens, Daniel Cheng

Hi all,


We recently landed https://crrev.com/1161510 and https://crrev.com/1166103, which makes base::StringPiece into a type alias for std::string_view and allows std::string_view. Please feel free to start writing std::string_view instead in new code.


If you need to use operator<< with the u16string or wstring variants, you will need to include "base/strings/utf_ostream_operators.h", or simply include "base/strings/utf_string_conversions.h" and call base::UTF16ToUTF8 and base::WideToUTF8 explicitly. (These operators previously lived in a combination of "base/string_piece.h" and "base/logging.h".)


std::string_view (and types that convert to std::string_view) integrate better with std::string. This should reduce the need to decompose a std::string_view into data() and size(), which loses bounds checks.


Finally, as a reminder, whether it's base::StringPiece or std::string_view, never pass a view type's data() into a function that expects a NUL-terminated C string! StringPiece/string_view is not guaranteed to be NUL-terminated, so this is a buffer overflow. Always use data() and size() in pairs or, better yet, make the functions take a string_view directly.


A handy cheat sheet:


#include "base/string_piece.h" => #include <string_view>

#include "base/string_piece_forward.h" => #include <string_view>


base::StringPiece => std::string_view

base::StringPiece16 => std::u16string_view

base::WStringPiece => std::wstring_view


str.assign(view.data(), view.size()); => str.assign(view); or str = view;

str.append(view.data(), view.size()); => str.append(view); or str += view;


std::string str(view.data(), view.size()); => std::string str(view);

std::string str(view.data()); => Never do this! Use the above instead.


We don't have a concrete plan yet to migrate existing code, but that would be a good clean-up opportunity.


David

Greg Thompson

unread,
Jul 18, 2023, 7:12:36 AM7/18/23
to David Benjamin, Chromium-dev, cxx, Dana Jansens, Daniel Cheng
On Mon, Jul 17, 2023 at 11:12 PM David Benjamin <davi...@chromium.org> wrote:

Hi all,


We recently landed https://crrev.com/1161510 and https://crrev.com/1166103, which makes base::StringPiece into a type alias for std::string_view and allows std::string_view. Please feel free to start writing std::string_view instead in new code.


If you need to use operator<< with the u16string or wstring variants, you will need to include "base/strings/utf_ostream_operators.h", or simply include "base/strings/utf_string_conversions.h" and call base::UTF16ToUTF8 and base::WideToUTF8 explicitly. (These operators previously lived in a combination of "base/string_piece.h" and "base/logging.h".)


std::string_view (and types that convert to std::string_view) integrate better with std::string. This should reduce the need to decompose a std::string_view into data() and size(), which loses bounds checks.


Finally, as a reminder, whether it's base::StringPiece or std::string_view, never pass a view type's data() into a function that expects a NUL-terminated C string!


Does that apply to cases where you're certain that the piece/view was constructed from a string literal? Here's a totally contrived example:

std::string SayHiToMom() {
  static constexpr std::string_view kFormat("hi %s\n");
  return base::Stringprintf(kFormat.data(), "mom");
}

 
Yes, that's totally contrived, but I think I've seen cases where we have pieces constructed from literals where we rely on them being terminated. Is that not okay?

StringPiece/string_view is not guaranteed to be NUL-terminated, so this is a buffer overflow. Always use data() and size() in pairs or, better yet, make the functions take a string_view directly.


A handy cheat sheet:


#include "base/string_piece.h" => #include <string_view>

#include "base/string_piece_forward.h" => #include <string_view>


base::StringPiece => std::string_view

base::StringPiece16 => std::u16string_view

base::WStringPiece => std::wstring_view


str.assign(view.data(), view.size()); => str.assign(view); or str = view;

str.append(view.data(), view.size()); => str.append(view); or str += view;


std::string str(view.data(), view.size()); => std::string str(view);

std::string str(view.data()); => Never do this! Use the above instead.


We don't have a concrete plan yet to migrate existing code, but that would be a good clean-up opportunity.


David

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAF8qwaAbgQEeui-dZ_rhZtG_ea%3DCeOzN7pPygeZDDzuXhHng%2BQ%40mail.gmail.com.

Dana Jansens

unread,
Jul 18, 2023, 9:02:15 AM7/18/23
to Greg Thompson, David Benjamin, Chromium-dev, cxx, Daniel Cheng
On Tue, Jul 18, 2023 at 7:12 AM Greg Thompson <g...@chromium.org> wrote:
On Mon, Jul 17, 2023 at 11:12 PM David Benjamin <davi...@chromium.org> wrote:

Hi all,


We recently landed https://crrev.com/1161510 and https://crrev.com/1166103, which makes base::StringPiece into a type alias for std::string_view and allows std::string_view. Please feel free to start writing std::string_view instead in new code.


If you need to use operator<< with the u16string or wstring variants, you will need to include "base/strings/utf_ostream_operators.h", or simply include "base/strings/utf_string_conversions.h" and call base::UTF16ToUTF8 and base::WideToUTF8 explicitly. (These operators previously lived in a combination of "base/string_piece.h" and "base/logging.h".)


std::string_view (and types that convert to std::string_view) integrate better with std::string. This should reduce the need to decompose a std::string_view into data() and size(), which loses bounds checks.


Finally, as a reminder, whether it's base::StringPiece or std::string_view, never pass a view type's data() into a function that expects a NUL-terminated C string!


Does that apply to cases where you're certain that the piece/view was constructed from a string literal? Here's a totally contrived example:

std::string SayHiToMom() {
  static constexpr std::string_view kFormat("hi %s\n");
  return base::Stringprintf(kFormat.data(), "mom");
}

 
Yes, that's totally contrived, but I think I've seen cases where we have pieces constructed from literals where we rely on them being terminated. Is that not okay?

Let's say it's not a durable practice, the view may be modified by future code changes that introduce bugs, and the compiler won't help you. This example is very local, so it could come with scary comments that hope to guard against breaking it in the future. Personally, I'd work with char[] if I want a null-terminated constant in this example. We don't currently have a lightweight view type that promises null termination which would be more appropriate than char[]/char*.

David Benjamin

unread,
Jul 18, 2023, 11:47:11 AM7/18/23
to Dana Jansens, Greg Thompson, Chromium-dev, cxx, Daniel Cheng
On Tue, Jul 18, 2023, 09:02 Dana Jansens <dan...@chromium.org> wrote:
On Tue, Jul 18, 2023 at 7:12 AM Greg Thompson <g...@chromium.org> wrote:
On Mon, Jul 17, 2023 at 11:12 PM David Benjamin <davi...@chromium.org> wrote:

Hi all,


We recently landed https://crrev.com/1161510 and https://crrev.com/1166103, which makes base::StringPiece into a type alias for std::string_view and allows std::string_view. Please feel free to start writing std::string_view instead in new code.


If you need to use operator<< with the u16string or wstring variants, you will need to include "base/strings/utf_ostream_operators.h", or simply include "base/strings/utf_string_conversions.h" and call base::UTF16ToUTF8 and base::WideToUTF8 explicitly. (These operators previously lived in a combination of "base/string_piece.h" and "base/logging.h".)


std::string_view (and types that convert to std::string_view) integrate better with std::string. This should reduce the need to decompose a std::string_view into data() and size(), which loses bounds checks.


Finally, as a reminder, whether it's base::StringPiece or std::string_view, never pass a view type's data() into a function that expects a NUL-terminated C string!


Does that apply to cases where you're certain that the piece/view was constructed from a string literal? Here's a totally contrived example:

std::string SayHiToMom() {
  static constexpr std::string_view kFormat("hi %s\n");
  return base::Stringprintf(kFormat.data(), "mom");
}

 
Yes, that's totally contrived, but I think I've seen cases where we have pieces constructed from literals where we rely on them being terminated. Is that not okay?

Let's say it's not a durable practice, the view may be modified by future code changes that introduce bugs, and the compiler won't help you. This example is very local, so it could come with scary comments that hope to guard against breaking it in the future. Personally, I'd work with char[] if I want a null-terminated constant in this example. We don't currently have a lightweight view type that promises null termination which would be more appropriate than char[]/char*.

+1
 
You're right that it works, but it's fragile and makes it much harder to catch the bad cases that don't work. We could conceivably build a decently accurate linter for the bad cases. (Look for functions that call data without size.) But distinguishing those from the secretly NUL-terminated would be extra challenging. Especially as code gets moved around and more and more layers are introduced in between. And then things that computers have a hard time checking tend also to be things that humans have a hard time with. :-)

Relatedly, although they are guaranteed to be the same as of C++11, I'm personally a fan of using std::string::data() when I'm going to pair it with size() and std::string::c_str() when I'm relying on NUL termination. It helps reinforce this rule by saying data/size always come in pairs (better yet, use span and view types), while c_str is only available when you can rely on NUL termination.

David

Lei Zhang

unread,
Jul 25, 2023, 2:09:21 PM7/25/23
to David Benjamin, Chromium-dev, cxx, Dana Jansens, Daniel Cheng
On Mon, Jul 17, 2023 at 2:12 PM David Benjamin <davi...@chromium.org> wrote:
> We don't have a concrete plan yet to migrate existing code, but that would be a good clean-up opportunity.

How about getting rid of base::WStringPiece and base::StringPiece16
first? Those only have ~400 and ~800 uses in the code base,
respectively.

Lei Zhang

unread,
Nov 7, 2023, 12:34:19 PM11/7/23
to Chromium-dev, cxx, Dana Jansens, Daniel Cheng, David Benjamin
Just to update everyone on the base::StringPiece migration:

As of now, base/strings/string_piece.h just includes
string_piece_foward.h. All the files that depended on string_piece.h
for transitive includes have been fixed.

In the near future:
1) string_piece_foward.h is going away first. Its contents will be
moved over into string_piece.h.
2) base::WStringPiece and base::StringPiece16 will get replaced and be deleted.
3) base::StringPiece and base::BasicStringPiece will get replaced and
be deleted, along with string_piece.h.

If you would like to follow along as the migration progresses, the bug
report to star is https://crbug.com/691162 .
Reply all
Reply to author
Forward
0 new messages