zstring_view


Olaf van der Spek

May 13, 2015, 7:13:59 AM
to std-pr...@isocpp.org
C (platform) functions / libraries usually take strings as const char*. When a parameter in one of your functions has to be forwarded to such a function, what should the type of the parameter be? const char*? std::string? Both?
The disadvantage of const char* is that a std::string cannot be passed as is; the caller has to use c_str(). The disadvantage of std::string is the potentially unnecessary construction when a const char* is passed. The disadvantage of both is interface duplication.
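To make the trade-off concrete, here is a minimal sketch of the three signatures. `c_api_length` is a hypothetical stand-in for a C function (it simply calls `std::strlen`), and the wrapper names are illustrative, not from any real library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API that expects a null-terminated string.
inline std::size_t c_api_length(const char* s) { return std::strlen(s); }

// Option 1: const char*. A caller holding a std::string must write .c_str().
inline std::size_t wrap_ptr(const char* s) { return c_api_length(s); }

// Option 2: const std::string&. A caller holding a const char* pays for a
// temporary std::string that exists only to be passed along.
inline std::size_t wrap_str(const std::string& s) {
    return c_api_length(s.c_str());
}

// Option 3: both overloads. No conversions, but the interface is duplicated.
inline std::size_t wrap_both(const char* s) { return c_api_length(s); }
inline std::size_t wrap_both(const std::string& s) {
    return wrap_both(s.c_str());
}
```

With option 3, a string literal selects the `const char*` overload because the array-to-pointer conversion is a better match than constructing a `std::string`.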

Would it make sense to have a zstring_view for such cases? It'd be a null-terminated variant of string_view. Obviously it'd be less generic than string_view, but as C functions aren't going anywhere any time soon it might still be handy.
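A rough sketch of what such a type might look like follows. Everything here is an assumption for illustration: no `zstring_view` exists in the standard library, the member names are made up, and the sketch uses C++17 `std::string_view` rather than the experimental `string_view` available in 2015. Embedded nulls in a `std::string` are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical sketch; not a standard type.
class zstring_view {
public:
    // From a C string: the length is computed once and cached.
    zstring_view(const char* s) : data_(s), size_(std::strlen(s)) {}
    // From a std::string: c_str() is guaranteed to be null-terminated.
    zstring_view(const std::string& s) : data_(s.c_str()), size_(s.size()) {}

    const char* c_str() const { return data_; }
    std::size_t size() const { return size_; }

    // Dropping leading characters keeps the terminator in place.
    void remove_prefix(std::size_t n) { data_ += n; size_ -= n; }

    // Any other slicing loses the terminator, so it yields a string_view.
    operator std::string_view() const { return {data_, size_}; }

private:
    const char* data_;
    std::size_t size_;
};

// Hypothetical wrapper: one signature accepts both kinds of caller.
inline std::size_t c_length(zstring_view s) { return std::strlen(s.c_str()); }
```

A caller could then write `c_length(some_std_string)` or `c_length("literal")` without `.c_str()` and without constructing a temporary `std::string`.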

Nicol Bolas

May 13, 2015, 10:16:07 AM
to std-pr...@isocpp.org

So, you want to add an entire new type, just so that you don't have to type `.c_str()` after a std::string?

Equally importantly, `zstring_view` would not be a very useful view class by itself. Its API will be more limited than an actual `string_view`. You can chop off leading characters, but you can't do anything more than that, since it has to be null-terminated. You could never do Regex searches that return `zstring_view`s. And so forth.

The only useful thing you can do with it is pass it to a C-interface. And let's examine how useful that really is.

Let's say you're writing a function which internally needs to pass one of its parameters to a C API that takes null-terminated strings (not all C APIs pretend that strings are null-terminated). Here are your options for that parameter type:

* `const char *`: It works directly with any compatible type. If the user only has a `string_view`, then they'll have to allocate memory, probably with a `std::string`.
* `const std::string &`: It works directly with any compatible type. If the user doesn't have a `std::string` on hand, then memory will have to be allocated. But so long as their type can implicitly construct a `std::string`, they don't have to see it.
* `zstring_view`: It works directly with any compatible type. If the user only has a `string_view`, then they'll have to allocate memory, probably with a `std::string`.

In short... it's exactly the same as `const char *` from the user's perspective. It hasn't improved anything beyond not having to type `c_str()`. I'd rather see the existing option used, rather than some new class type.
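The case where the caller holds only a `string_view` can be sketched as follows. `c_api_length` and `call_with_view` are hypothetical names, and C++17 `std::string_view` is assumed. Whichever parameter type the wrapper uses, a copy like this has to happen somewhere to get a terminator in place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C API that needs a null terminator.
inline std::size_t c_api_length(const char* s) { return std::strlen(s); }

inline std::size_t call_with_view(std::string_view sv) {
    // sv.data() is not guaranteed to be null-terminated at sv.size(), so the
    // characters are copied into a std::string, which adds the terminator.
    const std::string owned(sv);
    return c_api_length(owned.c_str());
}
```

Passing `sv.data()` straight to the C function would read past the end of the view, for example when the view is a substring of a longer buffer.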

Personally, I prefer option 2. The user has to allocate memory, but they don't have to put it there explicitly. It happens invisibly as a temporary. Or at least, it can, depending on the source type.

Nevin Liber

May 13, 2015, 10:39:56 AM
to std-pr...@isocpp.org
On 13 May 2015 at 06:13, Olaf van der Spek <olafv...@gmail.com> wrote:
Would it make sense to have a zstring_view for such cases? It'd be a null-terminated variant of string_view. Obviously it'd be less generic than string_view, but as C functions aren't going anywhere any time soon it might still be handy.

I'd rather see something like this come from experience than from invention. It is unclear how useful it would be in practice. There are also some rather obvious interface questions, such as what the result of zstring_view("Embedded\0Here").size() is.
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

Matthew Fioravante

May 14, 2015, 9:47:16 PM
to std-pr...@isocpp.org


On Wednesday, May 13, 2015 at 10:39:56 AM UTC-4, Nevin ":-)" Liber wrote:
 what is the result of zstring_view("Embedded\0Here").size().

In order to maintain the invariant that strlen(zs.c_str()) == zs.length(), it would have to be 8. That is, whenever you construct a zstring_view from a string_view, the implementation must scan the string for nulls. This won't matter for string literals because the compiler will optimize out the search.
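The scan described here could look roughly like the sketch below. `length_up_to_null` is a hypothetical helper, not part of any proposal, and C++17 `std::string_view` is assumed. The `static_assert` illustrates the point about literals: with a compile-time constant the search can be evaluated by the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: the length a zstring_view would report for sv if it
// stops at the first embedded '\0', so that strlen(c_str()) == length().
// If no '\0' is found inside sv, the caller must still guarantee that a
// terminator follows the view.
constexpr std::size_t length_up_to_null(std::string_view sv) {
    const std::size_t pos = sv.find('\0');
    return pos == std::string_view::npos ? sv.size() : pos;
}

// Evaluated at compile time for a constant argument.
static_assert(length_up_to_null(std::string_view("Embedded\0Here", 13)) == 8);
```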


Nevin Liber

May 14, 2015, 11:06:31 PM
to std-pr...@isocpp.org
On 14 May 2015 at 20:47, Matthew Fioravante <fmatth...@gmail.com> wrote:
On Wednesday, May 13, 2015 at 10:39:56 AM UTC-4, Nevin ":-)" Liber wrote:
 what is the result of zstring_view("Embedded\0Here").size().

In order to maintain the invariant that strlen(zs.c_str()) == zs.length(), it would have to be 8. That is, whenever you construct a zstring_view from a string_view, the implementation must scan the string for nulls.

If that is your invariant, then I really don't understand your zmstring_view idea.  Do you plan on checking every write to make sure none of them write a '\0'?  Or does your zmstring_view have a different invariant, which seems incredibly messy for users to deal with?

If you really want something like this standardized, put it in your code base and get experience with what works and what doesn't.

Matthew Fioravante

May 14, 2015, 11:19:18 PM
to std-pr...@isocpp.org


On Thursday, May 14, 2015 at 11:06:31 PM UTC-4, Nevin ":-)" Liber wrote:
On 14 May 2015 at 20:47, Matthew Fioravante <fmatth...@gmail.com> wrote:
On Wednesday, May 13, 2015 at 10:39:56 AM UTC-4, Nevin ":-)" Liber wrote:
 what is the result of zstring_view("Embedded\0Here").size().

In order to maintain the invariant that strlen(zs.c_str()) == zs.length(), it would have to be 8. That is, whenever you construct a zstring_view from a string_view, the implementation must scan the string for nulls.

If that is your invariant, then I really don't understand your zmstring_view idea. 

I think the invariant has to hold, otherwise the type doesn't make any sense. It's like passing around a char* with a cached call to strlen() in a size_t. If you insert a null and don't update the cached length, your algorithm will be incorrect. Pre-computing and caching the size is important, particularly when you want to do operations on the end of the string, such as stripping trailing whitespace.
 
Do you plan on checking every write to make sure none of them write a '\0'? 

One solution is undefined behavior if you write a '\0' to a zmstring_view. Debug implementations could choose to check the writes at runtime. Using a type like this provides a superior interface to passing (char*,size_t) as both of these pieces of information are tied together within a single range object.
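As an illustration of the (char*, size_t) pairing and the trailing-whitespace case, here is a minimal sketch. The `zmstring_view` struct below is a hypothetical stand-in with public members, not the actual design under discussion; the only write of '\0' it performs is one that also updates the cached size.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical mutable, null-terminated view: a char* with its strlen cached.
struct zmstring_view {
    char* data;
    std::size_t size;  // invariant: std::strlen(data) == size
};

// Strip trailing whitespace in place. The cached size gives direct access to
// the end of the string, and the terminator is moved together with the size
// so the invariant still holds afterwards.
inline void strip_trailing_whitespace(zmstring_view& s) {
    while (s.size > 0 &&
           std::isspace(static_cast<unsigned char>(s.data[s.size - 1]))) {
        --s.size;
    }
    s.data[s.size] = '\0';
}
```

Code that wrote '\0' into the middle of `data` without adjusting `size` would break the invariant, which is the case the post suggests leaving undefined or checking in debug builds.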


 

Olaf van der Spek

May 15, 2015, 7:43:10 AM
to std-pr...@isocpp.org


On Wednesday, May 13, 2015 at 4:16:07 PM UTC+2, Nicol Bolas wrote:
So, you want to add an entire new type, just so that you don't have to type `.c_str()` after a std::string?

Yes. Are types too expensive for that?
 
Equally importantly, `zstring_view` would not be a very useful view class by itself. Its API will be more limited than an actual `string_view`. You can chop off leading characters, but you can't do anything more than that, since it has to be null-terminated. You could never do Regex searches that return `zstring_view`s. And so forth.

Obviously such functions would return a regular string_view.
 
Personally, I prefer option 2. The user has to allocate memory, but they don't have to put it there explicitly. It happens invisibly as a temporary. Or at least, it can, depending on the source type.

I am not a fan of such unnecessary allocations.

Why do you prefer 2? Do you agree that not having to use ().c_str() is a benefit?

Olaf van der Spek

May 15, 2015, 7:46:16 AM
to std-pr...@isocpp.org
On Wednesday, May 13, 2015 at 4:39:56 PM UTC+2, Nevin ":-)" Liber wrote:
On 13 May 2015 at 06:13, Olaf van der Spek <olafv...@gmail.com> wrote:
Would it make sense to have a zstring_view for such cases? It'd be a null-terminated variant of string_view. Obviously it'd be less generic than string_view, but as C functions aren't going anywhere any time soon it might still be handy.

I'd rather see something like this come from experience than from invention. It is unclear how useful it would be in practice. There are also some rather obvious interface questions, such as what the result of zstring_view("Embedded\0Here").size() is.

Same as string_view("Embedded\0Here").size() ? 
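For reference, a small sketch of what `string_view` itself reports, assuming C++17 `std::string_view` (the 2015 experimental version has the same constructors). The variable names are illustrative.

```cpp
#include <cassert>
#include <cstring>
#include <string_view>

using namespace std::string_view_literals;

// Constructed from const char*: the length comes from traits_type::length,
// which stops at the first '\0'.
inline constexpr std::string_view from_pointer("Embedded\0Here");

// Constructed with an explicit length (the sv literal does this): the
// embedded '\0' is just another character in the view.
inline constexpr std::string_view from_literal = "Embedded\0Here"sv;
```

So the answer depends on the constructor: `from_pointer.size()` is 8, while `from_literal.size()` is 13 even though `std::strlen(from_literal.data())` is still 8.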

Nicol Bolas

May 15, 2015, 9:23:45 AM
to std-pr...@isocpp.org
On Friday, May 15, 2015 at 7:43:10 AM UTC-4, Olaf van der Spek wrote:
On Wednesday, May 13, 2015 at 4:16:07 PM UTC+2, Nicol Bolas wrote:
So, you want to add an entire new type, just so that you don't have to type `.c_str()` after a std::string?

Yes. Are types too expensive for that?

Yes, they are.

Every new type you add has a cost, even if that cost is only in the programmer's head when he asks "what type should I use here?" Adding more types increases the cost of this question, as it makes it more likely for the programmer to get it wrong. To create a whole type for the primary purpose of avoiding the use of a member function increases this burden to no great advantage.

Equally importantly, `zstring_view` would not be a very useful view class by itself. Its API will be more limited than an actual `string_view`. You can chop off leading characters, but you can't do anything more than that, since it has to be null-terminated. You could never do Regex searches that return `zstring_view`s. And so forth.

Obviously such functions would return a regular string_view.
 
Personally, I prefer option 2. The user has to allocate memory, but they don't have to put it there explicitly. It happens invisibly as a temporary. Or at least, it can, depending on the source type.

I am not a fan of such unnecessary allocations.

And if I have a string_view, I need to do an allocation to get the null-terminator in place. So it's hardly "unnecessary". It's only "unnecessary" if the user just so happened to already have one of your `zstring_view`s.

Why do you prefer 2? Do you agree that not having to use ().c_str() is a benefit?

I gave you the reason: if you needed memory allocations, you don't have to litter your code with zstring_view-to-std::string conversions.

Olaf van der Spek

May 15, 2015, 9:26:32 AM
to std-pr...@isocpp.org
2015-05-15 15:23 GMT+02:00 Nicol Bolas <jmck...@gmail.com>:
>>> Personally, I prefer option 2. The user has to allocate memory, but they
>>> don't have to put it there explicitly. It happens invisibly as a temporary.
>>> Or at least, it can, depending on the source type.
>>
>>
>> I am not a fan of such unnecessary allocations.
>
>
> And if I have a string_view, I need to do an allocation to get the
> null-terminator in place. So it's hardly "unnecessary". It's only

Eh, it's either necessary or it's not. If you've already got a
null-terminated string but it's not a std::string then the allocation
is unnecessary...
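A small sketch of that situation, using a hypothetical `shares_buffer` function with a `const std::string&` parameter: when the caller passes an already null-terminated `const char*`, the function receives a copy of the characters rather than the original buffer.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper with a const std::string& parameter. It reports
// whether the characters it received live in the caller's original buffer.
inline bool shares_buffer(const std::string& s, const char* original) {
    return s.c_str() == original;
}
```

Calling `shares_buffer(p, p)` with a `const char* p` returns false, because the temporary `std::string` copies the characters into its own storage; passing an existing `std::string` together with its own `c_str()` returns true.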


--
Olaf