IMHO A grave design error in std::string_view will bring c++ into disrepute unless fixed

Richard Hodges

Sep 4, 2017, 4:09:30 AM
to std-dis...@isocpp.org


Here's a well-intentioned offering to produce a substring from either a string_view or a string in a uniform way:

#include <string>
#include <string_view>

std::string_view sub_string(std::string_view s, 
  std::size_t p, 
  std::size_t n = std::string_view::npos)
{
  return s.substr(p, n);
}

std::string sub_string(std::string&& s, 
  std::size_t p, 
  std::size_t n = std::string_view::npos)
{
  return s.substr(p, n);
}

std::string sub_string(std::string const& s, 
  std::size_t p, 
  std::size_t n = std::string_view::npos)
{
  return s.substr(p, n);
}

And here's how it will introduce random segfaults into user code.

int main()
{
  using namespace std::literals;

  auto source = "foobar"s;
  auto bar = sub_string(source, 3);

  // but uh-oh...
  bar = sub_string("foobar"s, 3);
  // now use bar at your peril...
}

gcc and clang don't produce any warnings here. I don't believe code reviews will find bugs like this reliably, and often neither will unit testing.

Allowing implicit conversions from std::string to std::string_view is all very nice, and I understand the intention - to allow algorithms to become more efficient by a simple, compatible interface change.

C++11 went a long way to removing C++'s reputation as a difficult language to get right.

This one design error in C++17 will re-award C++ the accolade of "most buggy and segfaulty language on the planet".

On grounds of safety alone, it is a design error and should be removed.

Michael Hava

Sep 4, 2017, 8:02:08 AM
to ISO C++ Standard - Discussion
I see your concern, but isn't this a classic example of "don't store references to temporaries"?
Maybe I'm missing something, but decltype(bar) should yield std::string and therefore be safe...
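
This observation can be checked at compile time. The following is a minimal sketch, assuming the three sub_string overloads exactly as posted above (repeated here only so the snippet compiles on its own); it shows that both calls in the first main() select an overload that returns an owning std::string.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// The three overloads from the original post.
std::string_view sub_string(std::string_view s, std::size_t p,
                            std::size_t n = std::string_view::npos)
{
  return s.substr(p, n);
}

std::string sub_string(std::string&& s, std::size_t p,
                       std::size_t n = std::string_view::npos)
{
  return s.substr(p, n);
}

std::string sub_string(std::string const& s, std::size_t p,
                       std::size_t n = std::string_view::npos)
{
  return s.substr(p, n);
}

// An lvalue std::string selects the const& overload: the result owns its characters.
static_assert(std::is_same_v<
    decltype(sub_string(std::declval<std::string&>(), 3)), std::string>);

// A temporary std::string selects the && overload: also an owning std::string.
static_assert(std::is_same_v<
    decltype(sub_string(std::declval<std::string>(), 3)), std::string>);
```

So with `auto bar = sub_string(source, 3);` the variable is a std::string, and the later assignment copies characters rather than storing a view.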

Richard Hodges

Sep 4, 2017, 8:07:11 AM
to std-dis...@isocpp.org
My mistake in the example was pointed out on Stack Overflow, where I have corrected it (and therefore reintroduced the bug despite the well-intentioned overloads).

Repeated here:

int main()
{
  using namespace std::literals;

  auto source = "foobar"s;

  auto bar = sub_string(std::string_view(source), 3);


  // but uh-oh...
  bar = sub_string("foobar"s, 3);
}
The problem for me is that string_view is a copyable object by interface, but a reference by behaviour. Since there is an automatic conversion from temporary strings to string_views, these ‘references that look like objects’ will be created by mistake by users who are not necessarily sure of which overload will be selected in any given scenario, particularly when templates are involved.

There is a name for the condition where a user is not sure of what object he’ll get out of a function call - it’s called a broken interface.
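
The mechanism behind the corrected example can be sketched without running any undefined behaviour. In the snippet below, make_string and safe_use are hypothetical names introduced only for illustration; make_string stands in for a call such as sub_string("foobar"s, 3) that yields a temporary std::string.

```cpp
#include <string>
#include <string_view>
#include <type_traits>

// Assigning a temporary std::string to an existing view is well-formed:
// the string converts implicitly and the view keeps pointing at its buffer.
static_assert(std::is_assignable_v<std::string_view&, std::string>);

// Hypothetical stand-in for a call that returns a temporary std::string.
inline std::string make_string() { return "foobar"; }

inline std::string_view::size_type safe_use()
{
  // std::string_view bar = make_string();   // compiles, but bar dangles after the ';'
  const std::string owned = make_string();   // keep an owning object alive instead
  const std::string_view bar = owned;        // view into storage that outlives the use
  return bar.size();
}
```

Once bar has been deduced as std::string_view, nothing at the assignment site distinguishes the safe form from the dangling one.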




On 4 Sep 2017, at 14:02, Michael Hava <m...@live.at> wrote:

> I see your concern, but isn't this a classic example of "don't store references to temporaries"?
> Maybe I'm missing something, but decltype(bar) should yield std::string and therefore be safe...


--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussio...@isocpp.org.
To post to this group, send email to std-dis...@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.

Michael Hava

Sep 4, 2017, 8:13:23 AM
to std-dis...@isocpp.org

Ok, now your example makes sense to me (was checking StackOverflow right now too).

I agree with you! This is a horrible issue – especially as most people are (apparently) starting to recommend using string_view for APIs.

This has a distinct auto_ptr-smell to it…


Thiago Macieira

Sep 4, 2017, 8:23:39 AM
to std-dis...@isocpp.org
On the subject:

You can try not to be an alarmist. Even if there's a design error and it's
grave, it may not call C++ into disrepute. Bugs and even design errors happen
and have happened before, and in other languages. One thing does not call the
whole into question.

Now, I will wait until you rephrase the subject into a more technical
description before I answer the rest. I have not read your email.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Richard Hodges

Sep 4, 2017, 8:31:12 AM
to std-dis...@isocpp.org
Forgive me, I am trying not to be alarmist, but I am alarmed.

Problem:

* string_view has the interface of a copyable object, but the semantics of a reference.

* string_view exposes a non-counted reference to the implementation of a value-semantic object

* string_view is implicitly constructible from said value-semantic object, even from a temporary.

* exposing the implementation of a destructed temporary object by reference while masquerading as a value type is tantamount to expressing an interface that lies.

* My deeply held concern is that this lying object will prove popular, its popularity will cause use in inappropriate situations (such as temporary objects) and the compiler cannot help the user to expose such incorrect use cases.

* These uses will result in a UB-fest in userland.

p_ha...@wargaming.net

Sep 4, 2017, 8:42:41 AM
to ISO C++ Standard - Discussion


On Monday, September 4, 2017 at 10:07:11 PM UTC+10, Richard Hodges wrote:
> My mistake in the example was pointed out on stack overflow where I have corrected it (and therefore reintroduced the bug despite the well intentioned overloads).
>
> Repeated here:
>
> int main()
> {
>   using namespace std::literals;
>
>   auto source = "foobar"s;
>   auto bar = sub_string(std::string_view(source), 3);
>
>   // but uh-oh...
>   bar = sub_string("foobar"s, 3);
> }
>
> The problem for me is that string_view is a copyable object by interface, but a reference by behaviour. Since there is an automatic conversion from temporary strings to string_views, these ‘references that look like objects’ will be created by mistake by users who are not necessarily sure of which overload will be selected in any given scenario, particularly when templates are involved.


This doesn't seem to have been discussed or addressed in P0254, but perhaps it's as simple as making std::basic_string::operator basic_string_view deleted for rvalues? It makes the interface less convenient if your std::basic_string_view is also only ever an rvalue, and would not introduce any issues as it would not outlive the relevant std::basic_string. It might therefore fail the test of "Does it make it easy to use with library methods"?
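
A minimal sketch of that idea follows, using a hypothetical owning_string wrapper rather than std::basic_string itself (the standard class does not behave this way). The conversion operator is ref-qualified so that only lvalues can be viewed implicitly.

```cpp
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical owning string type, used only to illustrate the idea.
class owning_string
{
public:
  explicit owning_string(std::string s) : data_(std::move(s)) {}

  // Lvalues may be viewed: the object has a name and a visible lifetime.
  operator std::string_view() const& noexcept { return data_; }

  // Rvalues may not: overload resolution picks this deleted operator for temporaries.
  operator std::string_view() && = delete;

private:
  std::string data_;
};

static_assert(std::is_convertible_v<owning_string&, std::string_view>);
static_assert(!std::is_convertible_v<owning_string, std::string_view>);
```

The cost is the one mentioned above: passing a temporary straight into a function that takes std::string_view is rejected too, even though that use does not dangle.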

This smells less like an auto_ptr issue, or even a dangling-reference issue, than it does a 'using an invalidated iterator' issue, or in this case, 'silently returning a reference to a const reference you got as a parameter'. I guess the latter *is* a dangling-reference issue.

As P0540 notes in passing, range-v3 catches you when you try to do this and effectively offers the same solution: it rejects rvalues piped into view::split.

Eyal Rozenberg

Sep 4, 2017, 8:43:53 AM
to std-dis...@isocpp.org, Richard Hodges
I wasn't involved in the standardization of string_view, but here's my
take as a developer only:

I don't share your alarm, nor your surprise about the behavior of the
code you cited.

You see, I think of a string_view as basically a plain pointer with a
length indication. And a plain pointer also "has the interface of a
copyable object, but the semantics of a reference." Suppose your
sub_string returned just a pointer to the beginning of the substring.
Your code would then be:

bar_ptr = sub_string("foobar"s, 3);

you can't expect to be able to use this pointer. Same thing for the
string view.
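
The analogy can be made concrete with a small sketch. The ptr_len struct and the functions sub_string_ptr, same_as_view and demo are hypothetical names introduced here for comparison; they are not part of the code under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// A hand-rolled "pointer with a length indication" (hypothetical type).
struct ptr_len
{
  const char* ptr;
  std::size_t len;
};

// Pointer-returning flavour of sub_string: nobody expects the result to own anything.
// Precondition: p <= s.size().
inline ptr_len sub_string_ptr(const std::string& s, std::size_t p)
{
  return {s.data() + p, s.size() - p};
}

// Returns true when a string_view taken from the same string carries exactly
// the same two pieces of information.
inline bool same_as_view(const std::string& s, std::size_t p)
{
  const std::string_view sv = std::string_view(s).substr(p);
  const ptr_len raw = sub_string_ptr(s, p);
  return sv.data() == raw.ptr && sv.size() == raw.len;
}

inline void demo()
{
  const std::string source = "foobar";
  assert(same_as_view(source, 3));   // the view is just {source.data() + 3, 3}
}
```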

Just my 2 cents,
Eyal

p_ha...@wargaming.net

Sep 4, 2017, 8:46:12 AM
to ISO C++ Standard - Discussion
On Monday, September 4, 2017 at 10:42:41 PM UTC+10, p_ha...@wargaming.net wrote:
> This smells less like an auto_ptr issue, or even a dangling-reference issue, as it does a 'using an invalidated iterator' issue, or in this case, 'silently returning a reference to a const reference you got as a parameter'. I guess the latter *is* a dangling-reference issue.

The reason this jumped out at me as an 'invalidated iterator issue' is that, in theory, a setup like MSVC debugging iterators could be used, so that a std::string_view built from a std::string is invalidated if the std::string's storage is released or moved, and this failure could be asserted at runtime.

It's not a _great_ solution, though. I wouldn't want to pay the runtime cost unless I was actively hunting an already-observed bug.
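
One way such run-time checking might look is sketched below. Everything here is an assumption made for illustration: tracked_string and tracked_view are hypothetical types, the liveness token is only one possible mechanism, and no standard library implementation is claimed to work this way. The sketch covers destruction of the owner only, not reallocation or moves.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical debug-only owner: carries a token that dies with the storage.
class tracked_string
{
public:
  explicit tracked_string(std::string s)
    : data_(std::move(s)), token_(std::make_shared<int>(0)) {}

  // Copying would share the token between two owners, so it is disabled here.
  tracked_string(const tracked_string&) = delete;
  tracked_string& operator=(const tracked_string&) = delete;

  const std::string& str() const noexcept { return data_; }
  std::weak_ptr<int> token() const noexcept { return token_; }

private:
  std::string data_;
  std::shared_ptr<int> token_;
};

// Hypothetical checked view: remembers the owner's token alongside the view.
class tracked_view
{
public:
  explicit tracked_view(const tracked_string& owner)
    : view_(owner.str()), token_(owner.token()) {}

  // True while the owning tracked_string is still alive.
  bool valid() const noexcept { return !token_.expired(); }

  // Asserts at run time instead of silently reading freed storage.
  std::string_view get() const
  {
    assert(valid() && "view outlived its string");
    return view_;
  }

private:
  std::string_view view_;
  std::weak_ptr<int> token_;
};

inline bool demo_detects_dangling()
{
  auto owner = std::make_unique<tracked_string>("foobar");
  tracked_view v(*owner);
  const bool before = v.valid();   // owner is alive here
  owner.reset();                   // storage released
  return before && !v.valid();
}
```

The extra shared_ptr allocation and weak_ptr check per view are the run-time cost referred to above.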

Bo Persson

Sep 4, 2017, 10:42:47 AM
to std-dis...@isocpp.org
On 2017-09-04 14:31, Richard Hodges wrote:
> Forgive me, I am trying not to be alarmist, but I am alarmed.
>
> Problem:
>
> * string_view has the interface of a copyable object, but the semantics of a reference.
>
> * string_view exposes a non-counter reference the implementation of a value_semantic object
>
> * string_view is implicitly constructible from said value-semantic object, even from a temporary.
>
> * exposing the implementation of a destructed temporary object by reference while masquerading as a value type is tantamount to expressing an interface that lies.
>
> * My deeply held concern is that this lying object will prove popular, it’s popularity will cause use in inappropriate situations (such as temporary objects) and the compiler cannot help the user to expose such incorrect use cases.
>
> * These uses will result in a UB-fest in userland.


It is interesting that the original std::string didn't have an implicit
conversion to char* - considered too dangerous - so we have to use
c_str() to explicitly get the pointer.

And now we have an implicit conversion to a char* and a length - I can
feel the same danger here too...
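
The asymmetry is easy to state as compile-time checks. This is a small sketch using only standard type traits:

```cpp
#include <string>
#include <string_view>
#include <type_traits>

// No implicit route from std::string to a raw pointer: c_str() or data() must be spelled out.
static_assert(!std::is_convertible_v<std::string, const char*>);

// The pointer-plus-length form, however, is reachable implicitly, from lvalues ...
static_assert(std::is_convertible_v<const std::string&, std::string_view>);

// ... and from temporaries alike.
static_assert(std::is_convertible_v<std::string&&, std::string_view>);
```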

Thiago Macieira

Sep 4, 2017, 11:10:29 AM
to std-dis...@isocpp.org
On Monday, 4 September 2017 09:31:07 -03 Richard Hodges wrote:
> Forgive me, I am trying not to be alarmist, but I am alarmed.
>
> Problem:
>
> * string_view has the interface of a copyable object, but the semantics of a
> reference.
>
> * string_view exposes a non-counter reference the implementation of a
> value_semantic object
>
> * string_view is implicitly constructible from said value-semantic object,
> even from a temporary.
>
> * exposing the implementation of a destructed temporary object by reference
> while masquerading as a value type is tantamount to expressing an interface
> that lies.
>
> * My deeply held concern is that this lying object will prove popular, it’s
> popularity will cause use in inappropriate situations (such as temporary
> objects) and the compiler cannot help the user to expose such incorrect use
> cases.
>
> * These uses will result in a UB-fest in userland.

I'm not alarmed.

All *_view APIs (and QStringView, for that matter) need to remember never to
return or store a view based on a parameter that was a view. If your API takes
a string_view and you need to store for later, you store a string, not
string_view.

As for mutating APIs that return, you can make them take non-const lvalue
references, just like std::as_const and qAsConst (can't pass a temporary). But
in many cases, that's damned-if-you-do, damned-if-you-don't. std::string_view's
API that returns new std::string_view objects would also be affected.

In general, if you can't be sure of how it's going to be used, DO NOT return
std::string_view in your API. The current Qt rule is that no API returns a
reference (except containers) or reference-like object, so nothing returns
QStringView.
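
The "take a view, store a string" rule can be sketched as follows. The label class and the make_label and demo functions are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical class illustrating the rule: accept a view, keep an owning copy.
class label
{
public:
  explicit label(std::string_view text) : text_(text) {}   // copies the characters

  // Returns by value, so callers never receive a reference-like object.
  std::string text() const { return text_; }

private:
  std::string text_;   // std::string, not std::string_view
};

inline label make_label()
{
  const std::string local = "foobar";
  return label(local);   // safe: the label owns its own copy once constructed
}

inline void demo()
{
  const label l = make_label();   // 'local' is gone, the copy is not
  assert(l.text() == "foobar");
}
```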

Richard Hodges

Sep 4, 2017, 11:53:16 AM
to std-dis...@isocpp.org
Library maintainers are (usually) knowledgeable, disciplined people, and once upon a time only library maintainers dallied with templates.

However, template expansion is now the norm in user code (for what is auto if it's not a template expansion shorthand?)

If string_view is unsafe as a return type (and I would argue that it is) and unsafe to be copied (in almost all scenarios, it is) then before introducing it, I would argue that there should be some protection in the language against misuse.

perhaps an attribute such as:

template<class Ch, …> struct [[non_storable]] string_view { … };

In which case, these uses would be legal:

foo(somethingThatReturnsStringView());

auto foo(std::string_view sv) { /* … */ }

auto x = std::string(somethingThatReturnsStringView());

but these would not:

auto sv = somethingThatReturnsStringView();  // illegal because the sv was stored
foo(sv);

but then this could be:

auto [[force_store_allowed]] sv = somethingThatReturnsStringView();

Something like this could be applied to any object-that’s-really-a-dangling-reference type.


Ville Voutilainen

Sep 4, 2017, 12:03:16 PM
to std-dis...@isocpp.org
On 4 September 2017 at 18:53, Richard Hodges <hodg...@gmail.com> wrote:
> but these would not:
>
> auto sv = somethingThatReturnsStringView(); // illegal because the sv was
> stored
> foo(sv);


So if I have

class OwnsAStringView
{
string_view just_gief_me_the_view();
};

and I happen to have a OwnsAStringView bar at hand, I'll do

auto sv = bar.just_gief_me_the_view();
foo(sv);

I wouldn't want that to be ill-formed just because in some other use
cases a string_view can be made to dangle.

Ville Voutilainen

Sep 4, 2017, 12:04:18 PM
to std-dis...@isocpp.org
On 4 September 2017 at 19:03, Ville Voutilainen
<ville.vo...@gmail.com> wrote:
> So if I have
>
> class OwnsAStringView
> {
> string_view just_gief_me_the_view();
> };


And before you ask, I mean

class OwnsAStringView
{
string_view just_gief_me_the_view();
/* the rest of it omitted for brevity */
};

Richard Hodges

Sep 4, 2017, 12:18:08 PM
to std-dis...@isocpp.org
I understand the dilemma. On the one hand we want succinctness and elegance of expression. On the other, safety.

At the moment, my firm opinion is that string_view is beautifully succinct and horribly dangerous.

It would not be so dangerous if we did not almost-always-auto, because then the type written at the point of use would give the user a clue that something is up and that he must be careful.

But in the call chain :

auto a = x(y(z(some_dependency)));

Where some or all of these functions are template functions (or lambdas with auto argument deduction) then even the most vigorous code inspection would in all likelihood fail to reveal a hidden implicit-string_view-from-temporary-string error awaiting someone who made a small change to the dependency’s type interface (replacing a std::string const& with a std::string_view, for example…). 

Possibly someone in a different team.

Because any way you want to cut it, std::string_view::substr() is very semantically different from std::string::substr(). The latter makes a brand new, clean object. The former exposes the guts of some unknown object from a potentially unknown source with potentially unknown lifetime.
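
The difference between the two substr() members can be checked directly. In this sketch the helper functions view_substr_aliases, string_substr_copies and demo are hypothetical names; the checks themselves rely only on standard behaviour.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Same spelling, different result types.
static_assert(std::is_same_v<
    decltype(std::declval<const std::string&>().substr(0)), std::string>);
static_assert(std::is_same_v<
    decltype(std::declval<std::string_view>().substr(0)), std::string_view>);

// Returns true when the view's substr points into the source buffer.
inline bool view_substr_aliases(const std::string& s)
{
  const std::string_view part = std::string_view(s).substr(0);
  return part.data() == s.data();
}

// Returns true when the string's substr owns separate storage.
inline bool string_substr_copies(const std::string& s)
{
  const std::string part = s.substr(0);
  return part.data() != s.data();
}

inline void demo()
{
  const std::string s = "foobar";
  assert(view_substr_aliases(s));    // shares s's characters
  assert(string_substr_copies(s));   // independent object
}
```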

But std::string_view is being touted as a ‘more efficient way to pass string data’.

It is not, it’s a very different way to pass string data. The interface should not be confusingly compatible with std::string’s. It should certainly not be implicitly convertible, since 

* it’s not the same kind of thing, and

* the implicit conversion is unsafe.

If previous c++ design decisions count for anything here, I could point out that std::ref() does not allow initialisation from r-value references. Can we guess why?  
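
For reference, the std::ref behaviour can be observed with a detection idiom. The can_ref trait below is a hypothetical helper written for this sketch; std::ref and its deleted rvalue overload are standard.

```cpp
#include <functional>
#include <string>
#include <type_traits>
#include <utility>

// Detects whether std::ref(expr) is well-formed for an expression of type T.
template <class T, class = void>
struct can_ref : std::false_type {};

template <class T>
struct can_ref<T, std::void_t<decltype(std::ref(std::declval<T>()))>>
  : std::true_type {};

// A named string can be wrapped ...
static_assert(can_ref<std::string&>::value);

// ... a temporary cannot: the rvalue overload of std::ref is deleted.
static_assert(!can_ref<std::string>::value);
```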



Brittany Friedman

Sep 4, 2017, 12:21:03 PM
to std-dis...@isocpp.org
On Mon, Sep 4, 2017 at 10:53 AM, Richard Hodges <hodg...@gmail.com> wrote:
> but then this could be:
>
> auto [[force_store_allowed]] sv = somethingThatReturnsStringView();

So you postulate that teaching people how string_view works is too "difficult ... to get right"

In particular, that experts, "Library maintainers" may have no problem with string_view but that, like templates, this feature may be dangerous when exposed to "user code".

So if I am understanding correctly, in your model, there exist people who are not knowledgeable enough to use string_view safely. These people will not, or at least have not, bothered to learn how to use the string_view API correctly.

Your solution is to require these people to type an additional attribute, [[force_store_allowed]].

Why do you believe that it is easier to teach someone the semantics of [[force_store_allowed]] so that it will not be abused, rather than simply teaching them how string_view works in the first place?

Teaching someone the meaning of [[force_store_allowed]] requires them to understand the problem with string_view (or at least dangling references in general), does it not?

It seems that you are making the problem harder to teach, not easier. 

Richard Hodges

Sep 4, 2017, 12:32:51 PM
to std-dis...@isocpp.org
It’s not that people cannot be taught how to use what is essentially a pointer to the guts of an object. People on the whole do understand how to use pointers, but a lot of effort has been spent encouraging them not to. For obvious reasons.

My previous email (I hope) provides an insight into the problems I foresee. Developers interface with other people’s code. If that code exports a string_view, or an interface changes to provide a string_view alternative (because someone was hoping to provide a performance boost to their users), then user code will still compile, silently converting string references to string_views where previously it would have made copies, and code will quietly, insidiously, behind the developers’ backs, become UB.
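
A sketch of the scenario described in this paragraph, under stated assumptions: first_word is a hypothetical library function, and the namespaces v1 and v2 stand for the library before and after the interface change.

```cpp
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical library function before the change: takes and returns owning strings.
namespace v1 {
inline std::string first_word(std::string const& s)
{
  return s.substr(0, s.find(' '));
}
}

// The same function after a well-meant "performance" change.
namespace v2 {
inline std::string_view first_word(std::string_view s)
{
  return s.substr(0, s.find(' '));
}
}

// Unchanged user code of the form
//   auto w = first_word(make_name());   // make_name() returns std::string
// deduces an owning string against v1 ...
static_assert(std::is_same_v<
    decltype(v1::first_word(std::declval<std::string>())), std::string>);

// ... and a view into the destroyed temporary against v2, with no edit at the call site.
static_assert(std::is_same_v<
    decltype(v2::first_word(std::declval<std::string>())), std::string_view>);
```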

I understand that the string_view is a much-loved concept and emotions may run high over the issue - we all value performance. 

But implicit conversions have already been established to be a bad thing - particularly when the converted-to thing is a guts-exposer of the donor object.

This is not the intent of conversion.

std::string_view(const std::string&) should be explicit.
std::string_view(std::string&&) should be deleted.

because a string_view *is not* a "kind of string".

A string_view is a pointer to the private implementation of a string. It’s as dangerous as std::string::c_str(). 

More so, because it’s implicit.
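
What the two requested rules would mean in practice can be sketched on a hypothetical strict_view type. std::string_view does not work this way, and the real conversion lives on std::string rather than in a string_view constructor; the sketch only illustrates the intended behaviour.

```cpp
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical view type with an explicit lvalue constructor and no rvalue constructor.
class strict_view
{
public:
  explicit strict_view(const std::string& s) noexcept : view_(s) {}
  strict_view(std::string&&) = delete;   // no views of temporaries at all

  std::string_view get() const noexcept { return view_; }

private:
  std::string_view view_;
};

// Construction from an lvalue is possible but must be written out ...
static_assert(std::is_constructible_v<strict_view, const std::string&>);
static_assert(!std::is_convertible_v<const std::string&, strict_view>);

// ... and construction from a temporary is rejected outright.
static_assert(!std::is_constructible_v<strict_view, std::string>);
```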

 

Brittany Friedman

Sep 4, 2017, 12:48:13 PM
to std-dis...@isocpp.org
On Mon, Sep 4, 2017 at 11:32 AM, Richard Hodges <hodg...@gmail.com> wrote:
> It’s not that people cannot be taught how to use what is essentially a pointer to the guts of an object. People on the whole do understand how to use pointers, but a lot of effort has been spent encouraging them not to. For obvious reasons.
>
> My previous email (I hope) provides an insight into the problems I foresee. Developers interface with other people’s code. If that code exports a string_view, or an interface changes to provide a string_view alternative (because someone was hoping to provide a performance boost to their users), then user code will still compile, silently converting string references to string_views where previously it would have made copies, and code will quietly, insidiously, behind the developer’s backs, become UB.
>
> I understand that the string_view is a much-loved concept and emotions may run high over the issue - we all value performance.
>
> But implicit conversions have already been established to be a bad thing - particularly when the converted-to thing is a guts-exposer of the donor object.
>
> This is not the intent of conversion.
>
> std::string_view(const std::string&) should be explicit.
> std::string_view(std::string&&) should be deleted.
>
> because a string_view *is not* a "kind of string”.
>
> A string_view is a pointer to the private implementation of a string. It’s as dangerous as std::string::c_str().
>
> More so, because it’s implicit.


I'm sorry, this argument has confused me a bit. Please help me understand.

My post was *specifically* talking about your [[force_store_allowed]] proposal.

I don't understand how your response directly addresses my concerns about [[force_store_allowed]].

auto [[force_store_allowed]] sv = somethingThatReturnsStringView();

Where is the implicit conversion in this code? 

Ville Voutilainen

Sep 4, 2017, 12:48:43 PM
to std-dis...@isocpp.org
On 4 September 2017 at 19:32, Richard Hodges <hodg...@gmail.com> wrote:
> It’s not that people cannot be taught how to use what is essentially a
> pointer to the guts of an object. People on the whole do understand how to
> use pointers, but a lot of effort has been spent encouraging them not to.
> For obvious reasons.
>
> My previous email (I hope) provides an insight into the problems I foresee.
> Developers interface with other people’s code. If that code exports a
> string_view, or an interface changes to provide a string_view alternative
> (because someone was hoping to provide a performance boost to their users),
> then user code will still compile, silently converting string references to
> string_views where previously it would have made copies, and code will
> quietly, insidiously, behind the developer’s backs, become UB.

Are these just problems that you foresee, or have you observed
programmers making
the mistakes that you're concerned about? Is there some actual
evidence of such mistakes
becoming commonplace?

> I understand that the string_view is a much-loved concept and emotions may
> run high over the issue - we all value performance.
>
> But implicit conversions have already been established to be a bad thing -
> particularly when the converted-to thing is a guts-exposer of the donor
> object.

A string_view doesn't expose the guts of a string, so I'm not sure
what you're talking about there.
The implicit conversion does turn a value-semantics type into a
reference-semantics type, but
it doesn't expose the guts of the former.

The problem with all of this is timing. string_view is about to ship
as an official standardized facility,
because C++17 is about to ship. To change it now would require an
immensely strong rationale,
and I'm not convinced we have that rationale here.

Also, not all conversions from rvalue strings to string_views are
dangerous. The forwarding ones
aren't, the returning ones are.

Richard Hodges

Sep 4, 2017, 12:54:07 PM
to std-dis...@isocpp.org
> My post was *specifically* talking about your [[force_store_allowed]] proposal.

Forgive me, this was not a proposal, merely the kind of thing that would be necessary to make implicit conversions safe. My proposal is that implicit conversions be disallowed.
 


Richard Hodges

Sep 4, 2017, 1:07:58 PM
to std-dis...@isocpp.org
> A string_view doesn't expose the guts of a string, so I'm not sure
> what you're talking about there.

  auto first_half(std::string_view sv) {  return sv.substr(0, sv.size() / 2); }   

  auto s = std::string(something_that_makes_strings());

  auto guts = first_half(s);      // guts is holding on to a reference to s's inner storage area. i.e. its guts.
  auto not_guts = s.substr(0, s.size() / 2);   // not_guts is a safe, clean object.

  // ...lots of other user code or calls through template functions here...
  auto y = std::move(s); // oops...

  anything(guts);   // UB
  anything(not_guts); // SAFE

The construction of guts is semantically nothing more than this:

    auto guts2 = std::make_tuple(s.c_str(), s.size() / 2);   // pointer to the start of s plus the length of its first half

It's definitely exposing s's guts and, if you're not careful, for longer than s's lifetime.

It's semantically similar to std::reference_wrapper, whose construction function std::ref has a deleted r-value reference form. For good reason.

Q: Have I seen specific examples in the wild? 

A: How would I know? The conversion is implicit. For short strings it will probably work by luck most of the time. Maybe even for long ones most of the time. Isn't that the nature of UB? Shouldn't we try to avoid it through careful interface design?

Any time a programmer says "I don't know whether I could even detect UB use of this object easily", it's a cue that the design of the interface is in error.

Or did the design principles of c++ recently change from "let's be sure" to "screw it, let's code and hope for the best"?




Brittany Friedman

Sep 4, 2017, 1:07:58 PM
to std-dis...@isocpp.org
On Mon, Sep 4, 2017 at 11:54 AM, Richard Hodges <hodg...@gmail.com> wrote:
>> My post was *specifically* talking about your [[force_store_allowed]] proposal.
>
> Forgive me, this was not a proposal, merely the kind of thing that would be necessary to make implicit conversions safe. My proposal is that implicit conversions be disallowed.


Okay, thanks. So you propose removing the implicit string->string_view conversions specifically? And maybe the string_view(string&&) constructor?

Do you propose a replacement for these features? Do you want something like an explicit .view() member function?

Thiago Macieira

Sep 4, 2017, 1:10:33 PM
to std-dis...@isocpp.org
On Monday, 4 September 2017 13:32:46 -03 Richard Hodges wrote:
> It’s not that people cannot be taught how to use what is essentially a
> pointer to the guts of an object. People on the whole do understand how to
> use pointers, but a lot of effort has been spent encouraging them not to.
> For obvious reasons.

Then don't encourage people to use std::string_view. Its problem is exactly
that of pointers: it points to some storage with unknown lifetime and no RAII
applies.

But no one is arguing for removal of pointers from the language. Both are
powerful features, though dangerous if misused.

Richard Hodges

Sep 4, 2017, 1:14:51 PM
to std-dis...@isocpp.org
> Okay, thanks. So you propose removing the implicit string->string_view conversions specifically? And maybe the string_view(string&&) constructor?

Yes.
I propose that specifically:

std::string_view::string_view(std::string&&) = delete;
std::string_view& std::string_view::operator=(std::string&&) = delete;

and that:

std::string_view::string_view(std::string const&) = delete;
std::string_view& std::string_view::operator=(std::string const&) = delete;

and that:

std::string_view to_string_view(const std::string&);

std::string_view to_string_view(const char*);

template<std::size_t N> std::string_view to_string_view(const char (&)[N]);

could be provided for convenience.



> Do you propose a replacement for these features? Do you want something like an explicit .view() member function?

I generally favour free functions over members, but since std::string offers c_str() there's no reason it could not offer .view()



Richard Hodges

Sep 4, 2017, 1:17:04 PM
to std-dis...@isocpp.org
> But no one is arguing for removal of pointers from the language. Both are powerful -- if dangerous if misused -- features.

This could be construed as being on the verge of a straw man argument, but I'll be generous.

I am not arguing for the removal of string_view. Merely for the removal of the dangerous implicit construction from std::string.




Matthew Woehlke

Sep 4, 2017, 1:30:11 PM
to std-dis...@isocpp.org, Thiago Macieira
On 2017-09-04 11:10, Thiago Macieira wrote:
> I'm not alarmed.
>
> All *_view APIs (and QStringView, for that matter) need to remember never to
> return or store a view based on a parameter that was a view. If your API takes
> a string_view and you need to store for later, you store a string, not
> string_view.

I would expand that further: *any* API needs to be very careful about
returning a pointer or reference to a temporary object. In particular,
any API that returns a pointer or reference that is based in some manner
on its input parameters (which, for class members, includes `this`)
needs to be very, very careful of those input parameters possibly being
temporary objects.

TBH, I'm sort-of ambivalent about allowing construction of a string_view
from a temporary string. On the one hand, we want to allow things like:

foo( string_view );
foo( "bar"s );

...which is perfectly safe (assuming that `foo` does not try to "hold on
to" the string data in any way), because the temporary string will not
go out of scope until the call to `foo` completes.

On the other hand, we would clearly like to forbid, or at least make it
easy for compilers to warn about, this:

string_view sv = "bar"s;

I'm not sure how to achieve both those objectives.
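
The two cases can be placed side by side in a short sketch. count_char and demo are hypothetical names introduced for illustration; the dangling line is deliberately left commented out.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical consumer: reads the characters during the call and keeps nothing.
constexpr std::size_t count_char(std::string_view text, char c)
{
  std::size_t n = 0;
  for (char ch : text)
    if (ch == c) ++n;
  return n;
}

inline std::size_t demo()
{
  using namespace std::literals;

  // Safe: the temporary string is destroyed at the end of the full-expression,
  // which is after count_char has returned.
  const std::size_t n = count_char("foobar"s, 'o');

  // Unsafe: the temporary is destroyed at the semicolon, leaving sv dangling.
  // std::string_view sv = "bar"s;

  return n;
}
```

Both forms use the same implicit conversion, which is why a rule on the conversion alone cannot accept one and reject the other.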

--
Matthew

Thiago Macieira

Sep 4, 2017, 1:47:29 PM
to std-dis...@isocpp.org
On Monday, 4 September 2017 14:17:01 -03 Richard Hodges wrote:
> > But no one is arguing for removal of pointers from the language. Both are
> > powerful
> -- if dangerous if misused -- features.
>
> This could be construed as being on the verge of a straw man argument, but
> I'll be generous.
>
> I am not arguing for the removal of string_view. Merely for the removal of
> the dangerous implicit construction from std::string.

I see your other email.

That would make it safer, at the expense of everyone -- even in fine
scenarios -- having to explicitly know that they are passing a
std::string_view. You can't add an API that takes both std::string and
std::string_view implicitly.

Richard Hodges

Sep 4, 2017, 1:49:20 PM
to std-dis...@isocpp.org
> I'm not sure how to achieve both those objectives.

by simply requiring an explicit cast:

foo(std::string_view("bar"s));


Ville Voutilainen

Sep 4, 2017, 1:49:42 PM
to std-dis...@isocpp.org
On 4 September 2017 at 20:14, Richard Hodges <hodg...@gmail.com> wrote:
>> Okay, thanks. So you propose removing the implicit string->string_view
>> conversions specifically? And maybe the string_view(string&&) constructor?
>
> Yes.
> I propose that specifically:
>
> std::string_view::string_view(std::string&&) = delete;
> std::string_view& std::string_view::operator=(std::string&&) = delete;
>
> and that:
>
> std::string_view::string_view(std::string const&) = delete;
> std::string_view& std::string_view::operator=(std::string const&) = delete;
>
> and that:
>
> std::string_view to_string_view(const std::string&);
>
> std::string_view to_string_view(const char*);
>
> template<std::size_t N> std::string_view to_string_view(const char (&)[N]);
>
> could be provided for convenience.


This recreates the dependency from string to string_view, which was
deliberately removed before
string_view was adopted.

Richard Hodges

Sep 4, 2017, 1:50:22 PM
to std-dis...@isocpp.org
> You can't add an API that takes both std::string and
> std::string_view implicitly.

Good, because the behaviour of a std::string_view is not similar to the behaviour of a std::string.



Matthew Woehlke

unread,
Sep 4, 2017, 1:50:51 PM9/4/17
to std-dis...@isocpp.org, Richard Hodges
On 2017-09-04 13:07, Richard Hodges wrote:
>> A string_view doesn't expose the guts of a string, so I'm not sure
>> what you're talking about there.
>
> It's definitely exposing s's guts, if you're not careful for longer than
> s's lifetime.

Before either of you get carried away, I just want to point out you are
having a semantic quibble over what is meant by "guts". You (Richard)
are thinking of obtaining a reference to internal state in a manner that
no longer enforces lifetime preservation of the owning object. As I
understand it, this is what Ville meant when he says "The implicit
conversion does turn a value-semantics type into a reference-semantics
type". Whereas Ville I am guessing is thinking of "guts" as "exposes
internal implementation details".

FWIW, I immediately understood Richard's intended meaning, but let's
please not get caught up in this :-). I feel confident we agree on the
nature of the problem, if not the terminology.

> Q: Have I seen specific examples in the wild?
>
> A: how would I know? The conversion is implicit. For short strings it will
> probably work by luck most of the time. Maybe even for long ones most of
> the time. Isn't that the nature of UB? Shouldn't we try to avoid it by careful
> interface design?

To that question, I would like to present
https://github.com/Kitware/kwiver/pull/270, along with the observation
that this is not the first instance of exactly this sort of bug in our code.

This isn't precisely the same issue, but it's in the same family of
issues, and thus possibly serves as real world evidence that, yes,
programmers do make mistakes here.

> Or did the design principles of c++ recently change from "let's be sure" to
> "screw it, let's code and hope for the best"?

Indeed.

I do have some concern that we're about to ship C++17 with essentially
this exact same "bug". I would argue that the example I cited, and this
discussion, are strong evidence that we need better tools to prevent
this sort of thing.

Since we don't have such tools currently, I would have some inclination
to adopt a "better safe than sorry" approach to string_view and require
calling an explicit `string::view()` method to pass a string to a
function taking a string_view. We can always go back and add an implicit
conversion. It is *much* harder to take away or pare back an overly
broad conversion once it is in a published standard.

(And yes, I'm aware — see my other mail — of why we probably do want the
conversions in their current form...)
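
A minimal sketch of what such an explicit opt-in could look like with today's language. std::string has no `view()` member, so the free function `view_of` below is a hypothetical stand-in, and `can_view` is only a detection helper invented for this illustration:

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical explicit conversion: the caller has to name it, so the switch
// from owning to non-owning semantics is visible at the call site.
inline std::string_view view_of(const std::string& s) noexcept
{
    return std::string_view(s.data(), s.size());
}

// Temporaries are rejected at compile time.
std::string_view view_of(std::string&&) = delete;

// Detection helper (assumption, illustration only): reports whether
// view_of(T) would compile.
template <class T>
auto can_view(int) -> decltype(view_of(std::declval<T>()), std::true_type{});
template <class T>
std::false_type can_view(...);

static_assert(decltype(can_view<std::string&>(0))::value, "lvalues are accepted");
static_assert(!decltype(can_view<std::string>(0))::value, "temporaries are rejected");

void demo_view_of()
{
    std::string source = "foobar";
    std::string_view bar = view_of(source).substr(3);  // source outlives bar
    assert(bar == "bar");
}
```

The cost of this shape is that it also rejects the harmless case of handing a temporary straight to a function that only reads it during the call.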

--
Matthew

Thiago Macieira

unread,
Sep 4, 2017, 1:50:55 PM9/4/17
to std-dis...@isocpp.org
On Monday, 4 September 2017 14:49:17 -03 Richard Hodges wrote:
> > I'm not sure how to achieve both those objectives.
>
> by simply requiring an explicit cast:
>
> foo(std::string_view("bar"s));

Explicitly doing something invalidates the premise of being implicit.

Richard Hodges

unread,
Sep 4, 2017, 1:51:51 PM9/4/17
to std-dis...@isocpp.org
> This recreates the dependency from string to string_view, which was deliberately removed before string_view was adopted.

an implicit constructor from std::string is about as dependent as it gets.

Also, this is not a valid argument against requiring an explicit construction.
 

Ville Voutilainen

unread,
Sep 4, 2017, 1:56:52 PM9/4/17
to std-dis...@isocpp.org
On 4 September 2017 at 20:07, Richard Hodges <hodg...@gmail.com> wrote:
>> A string_view doesn't expose the guts of a string, so I'm not sure
> what you're talking about there.
> The construction of guts is semantically nothing more than this:
>
> auto guts2 = std::make_tuple(s.c_str() + s.size() / 2, s.size() -
> s.size() / 2);

Except that's semantically completely different. However, that's
beside the point; if you don't
wish to pay heed to helpful suggestions to avoid questionable
terminology, that's your choice.

> Q: Have I seen specific examples in the wild?
>
> A: how would I know? The conversion is implicit. For short strings it will

If you don't know, what exactly is the evidence based on which you are
suggesting to make the
change you are suggesting to make?

Have you checked what happens to your example when it's run through a
Core Guidelines checker?

Does a sanitizer catch the mistake? Does valgrind?

> probably work by luck most of the time. Maybe even for long ones most of the
> time. Isn't that the nature of UB? Shouldn't we try to avoid it by careful
> interface design?

Well, no, we shouldn't avoid all UB by interface design. And where we
do avoid it, we need to do it
in an educated fashion. This issue is surely at least a minor wart in
string/string_view interoperability
API, but how big a wart and whether we should change that API is a
question that needs more rationale,
because string_view is all but shipping.

> Or did the design principles of c++ recently change from "let's be sure" to
> "screw it, let's code and hope for the best"?

They didn't change from something they never were to such hypothetical
principles, no.

Richard Hodges

unread,
Sep 4, 2017, 1:57:46 PM9/4/17
to std-dis...@isocpp.org
Explicitly doing something invalidates the premise of being implicit.

converting int to double is a reasonable implicit conversion.

converting std::string to const char* is an unreasonable implicit conversion (c.f.: there being no such implicit conversion operator in the STL).

implicitly converting std::string to std::string_view is, from a memory management, object lifetime management and code safety point of view *exactly the same* as implicitly converting to const char*.

You may think you want implicit conversions because they automatically speed up old code without a change.

But much of the time they will also introduce UB into code that used to work reliably - no editing required. 

That's not so cool.

requiring 10 key strokes in order to express intent unequivocally is not a high price to pay for code that remains functional.
  

Ville Voutilainen

unread,
Sep 4, 2017, 1:59:41 PM9/4/17
to std-dis...@isocpp.org
On 4 September 2017 at 20:51, Richard Hodges <hodg...@gmail.com> wrote:
>> This recreates the dependency from string to string_view, which was
>> deliberately removed before string_view was adopted.
>
> an implicit constructor from std::string is about as dependent as it gets.

Such a constructor doesn't exist. std::string has a conversion
operator that returns a std::string_view, but std::string_view
does not have constructors that take std::string.

> Also, this is not a valid argument against requiring an explicit
> construction.


I'm merely pointing out that what you're proposing doesn't work as-is.
You need to make the string's conversion operator
explicit.
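
To make the distinction concrete, here is a rough sketch with toy types (`toy_view`, `implicit_owner` and `explicit_owner` are hypothetical names, not the standard classes). The view type never mentions the owner; the owner supplies the conversion operator, and marking that operator `explicit` is what turns the implicit conversion off while still allowing a cast:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Toy non-owning view. Its API does not mention any owning type.
class toy_view {
public:
    toy_view(const char* data, std::size_t size) noexcept : data_(data), size_(size) {}
    const char* data() const noexcept { return data_; }
    std::size_t size() const noexcept { return size_; }
private:
    const char* data_;
    std::size_t size_;
};

// Owner with an implicit conversion operator.
struct implicit_owner {
    const char* text;
    std::size_t length;
    operator toy_view() const noexcept { return toy_view(text, length); }
};

// Same owner, but the conversion has to be spelled out by the caller.
struct explicit_owner {
    const char* text;
    std::size_t length;
    explicit operator toy_view() const noexcept { return toy_view(text, length); }
};

static_assert(std::is_convertible_v<implicit_owner, toy_view>, "implicit conversion exists");
static_assert(!std::is_convertible_v<explicit_owner, toy_view>, "no implicit conversion");
static_assert(std::is_constructible_v<toy_view, explicit_owner>, "a cast still works");

std::size_t length_of(toy_view v) { return v.size(); }

void demo_conversion()
{
    implicit_owner a{"abc", 3};
    explicit_owner b{"abcd", 4};
    assert(length_of(a) == 3);                         // converts silently
    assert(length_of(static_cast<toy_view>(b)) == 4);  // must be requested
}
```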

Richard Hodges

unread,
Sep 4, 2017, 2:00:57 PM9/4/17
to std-dis...@isocpp.org
> I'm merely pointing out that what you're proposing doesn't work as-is. 
> You need to make the string's conversion operator
> explicit.

Tomato, tomato. whatever. I'm glad we agree on the need for an explicit conversion.


Richard Hodges

unread,
Sep 4, 2017, 2:02:39 PM9/4/17
to std-dis...@isocpp.org
because string_view is all but shipping

a defect is a defect, even after shipping.

Many on the committee share my view.

Unfathomably, they seem to have allowed themselves to be browbeaten into accepting a sub-optimal library design.

Whoever pushed for this implicit interface ought to come out here and explain themselves. And volunteer to fix the bugs.


Ville Voutilainen

unread,
Sep 4, 2017, 2:04:05 PM9/4/17
to std-dis...@isocpp.org
On 4 September 2017 at 21:00, Richard Hodges <hodg...@gmail.com> wrote:
>> I'm merely pointing out that what you're proposing doesn't work as-is.
>> You need to make the string's conversion operator
>> explicit.
>
> Tomato, tomato. whatever. I'm glad we agree on the need for an explicit
> conversion.

If you wish to have anyone seriously listen to what you're proposing,
you need to describe
what you propose accurately, and avoid fuzzy mistakes that you think
don't matter. You're going
to deal with an audience that is immensely detail-oriented, and can
easily dismiss your suggestions
as so uneducated that they won't look at it twice, if it contains too
glaring errors and you shrug
them off that way.

Ville Voutilainen

unread,
Sep 4, 2017, 2:07:29 PM9/4/17
to std-dis...@isocpp.org
On 4 September 2017 at 21:02, Richard Hodges <hodg...@gmail.com> wrote:
>> because string_view is all but shipping
> a defect is a defect, even after shipping.

A defect that can break existing valid code is harder to fix than a
defect that doesn't.

> Many on the committee share my view.

They do? Are they willing to file NB comments, co-author your paper,
champion it through the committee?

> Unfathomably, they seem to have allowed themselves to be browbeaten into
> accepting a sub-optimal library design.

They seem to have kept all too quiet for a time span of a couple of
years, instead of doing their job
pointing the issue out and emphasizing and explaining that it's a major problem.

> Whoever pushed for this implicit interface ought to come out here and
> explain themselves. And volunteer to fix the bugs.

What makes you think the implicit conversion hasn't been explained
when it was proposed?
Can you not find the rationale for it in the string_view proposals?

Matthew Woehlke

unread,
Sep 4, 2017, 2:09:44 PM9/4/17
to std-dis...@isocpp.org, Richard Hodges
On 2017-09-04 13:50, Richard Hodges wrote:
>> You can't add an API that takes both std::string and
> std::string_view implicitly.
>
> Good. because the behaviour of a std::string_view is not similar to the
> behaviour of a std::string.

Not so good if said API only cares about getting a bag of characters.
Forcing callers to explicitly convert a std::string to a string_view in
such cases is widely perceived as unnecessarily obnoxious.

Also, I'm having trouble coming up with a case where passing a
std::string to a function taking a string_view is dangerous. It's
*returning* or *storing* a string_view that is dangerous...

Unfortunately, the implicit conversion doesn't provide a good way to
distinguish these cases.
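
For illustration, a small sketch of the safe case. `count_char` and `make_greeting` are hypothetical names; the point is only that a temporary std::string lives until the end of the full expression containing the call, so a function that merely reads its string_view parameter never sees a dangling view:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical consumer: reads the characters during the call and keeps nothing.
std::size_t count_char(std::string_view text, char c)
{
    std::size_t n = 0;
    for (char x : text) {
        if (x == c) {
            ++n;
        }
    }
    return n;
}

// Hypothetical producer returning an owning string by value.
std::string make_greeting() { return "hello, world"; }

std::size_t demo_parameter_use()
{
    // The temporary returned by make_greeting() is destroyed only after
    // count_char() has returned, so the view is valid for the whole call.
    return count_char(make_greeting(), 'o');
}
```

The trouble only starts once the view is returned from the function or stored somewhere that outlives the full expression.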

--
Matthew

T. C.

unread,
Sep 4, 2017, 3:13:32 PM9/4/17
to ISO C++ Standard - Discussion, hodg...@gmail.com
On Monday, September 4, 2017 at 1:50:51 PM UTC-4, Matthew Woehlke wrote:
Since we don't have such tools currently, 


Michael Kilburn

unread,
Sep 4, 2017, 3:29:48 PM9/4/17
to ISO C++ Standard - Discussion


On Monday, September 4, 2017 at 3:09:30 AM UTC-5, Richard Hodges wrote:


here's a well-intentioned offering to produce a substring from either a string-view or string in a uniform way:

#include <string>
#include <string_view>

std::string_view sub_string(std::string_view s, 
  std::size_t p, 
  std::size_t n = std::string_view::npos)
{
  return s.substr(p, n);
}
This is why (when designing my own "string_view", aka parray) I made the related constructor explicit, as I did in every other place where the ownership model changes. Forcing the user to do some extra typing provides some protection.

 


std::string sub_string(std::string&& s, 
  std::size_t p, 
  std::size_t n = std::string_view::npos)
{
  return s.substr(p, n);
}

std::string sub_string(std::string const& s, 
  std::size_t p, 
  std::size_t n = std::string_view::npos)
{
  return s.substr(p, n);
}

And here's how it will introduce random segfaults into user code.

int main()
{
  using namespace std::literals;

  auto source = "foobar"s;
  auto bar = sub_string(source, 3);

  // but uh-oh...
  bar = sub_string("foobar"s, 3);
  // now use bar at your peril...
}

gcc and clang don't produce any warnings here. I don't believe code reviews will find bugs like this reliably, and often neither will unit testing.

Allowing implicit conversions from std::string to std::string_view is all very nice, and I understand the intention - to allow algorithms to become more efficient by a simple, compatible interface change.

C++11 went a long way to removing c++'s reputation as a difficult language to get right.

This one design error in c++17 will re-award c++ the accolade of "most buggy and segfaultly language on the planet".

On grounds of safety alone, it is a design error and should be removed.

Matthew Woehlke

unread,
Sep 4, 2017, 3:41:21 PM9/4/17
to std-dis...@isocpp.org, T. C., hodg...@gmail.com
I was thinking of *enforcement* (as has been bandied about a bit in this
thread), not just static analysis. Or, put differently, tools *in the
language itself*.

--
Matthew

p_ha...@wargaming.net

unread,
Sep 4, 2017, 8:59:00 PM9/4/17
to ISO C++ Standard - Discussion
Maybe I'm looking in the wrong place, but in the chain of N3442 through N3921, there never seems to have been a question of whether string_view *should* have an implicit constructor from string, only the observation that the predecessors in Google, LLVM, and Bloomberg's codebases have it, and the benefits of it.

The entire final rationale in both the working papers and in the standard appears to be [string.view]/2:

[ Note: The library provides implicit conversions from const charT* and std::basic_string<charT, ...> to std::basic_string_view<charT, ...> so that user code can accept just std::basic_string_view<charT> as a non-templated parameter wherever a sequence of characters is expected. User-defined types should define their own implicit conversions to std::basic_string_view in order to interoperate with these functions. — end note ]

It's possibly slightly surprising that this note wasn't adjusted by P0254 which moved the std::basic_string implicit conversion into std::basic_string, which is nicely consistent with the second sentence of the note. The note's not incorrect, but an uncareful reading would possibly cause readers to expect that the two implicit conversions mentioned would be provided in the same place, and within the section to which the note is attached.

P0254 as I mentioned previously didn't address this question either, accepting the status quo.

So if questioning of this direction was done, it didn't show up in the string_view working papers themselves. I didn't track back into the predecessor codebases. I also haven't poked at the history of std-discussion or std-proposals reflectors, and other resources I'm aware of are committee-protected.

Interestingly, the Note in the standard refers to "parameters". So that rationale doesn't address the problematic use-case that spawned this discussion thread, which is about string_view as a *return* type.

We had a similar discussion locally (but with the experimental string_view with the implicit constructors) and we couldn't find a way to nicely specify in the code any mechanism to prevent this happening if the user wasn't paying attention.

Richard Hodges

unread,
Sep 5, 2017, 5:09:42 AM9/5/17
to std-dis...@isocpp.org
OK, I am sure the community is getting bored of this discussion. It's clear to me that people's minds are made up. 

So I'll accept the advice of my peers: "never return a string_view".

And leave you with this innocent little program which in good faith, obeys this sound advice.

#include <string_view>
#include <string>
#include <iostream>

std::string good_intention()   // this is OK, right? Returning a string? We're allowed to do that.
{
    return "abcdefghijklmnopqrstuvwxyz0123456789";
}

int main()
{
    auto s = std::string_view("good");   // all well and good. Almost always auto!
 
    // ...other stuff to distract us...

    s = good_intention();                // cool! performance and elegance!

    std::cout << s << std::endl;         // ahhhnd... boom!
}

The observant amongst us will see that:

a) If good_intention() returned a shorter string we'd probably never see a problem.

b) the last line of the program is UB, no matter how you cut it.

c) I have obeyed the best advice from the most experienced and knowledgeable people.

d) It took 3 lines of perfectly reasonable, value-semantic code to create invisible UB.

And just in case you don't believe me or can't see the problem, here it is demonstrated.


Bear in mind that I have never disobeyed the rule of "never return a string_view".

Now the rule has to be "never return a string if a user might assign it to a string_view". Anyone see a problem with that? Last time I checked library maintainers have no control over their user base, and the user base has been encouraged to use auto, almost always.

Very soon the rule will have to become "never use a string_view - it's just too dangerous in a world of template expansions and almost-always-auto".

All because string_view is implicitly convertible from an rvalue-reference to a string.

Corollary:

Objects that hold references are not the same as references. The c++ language has special protection for references bound directly to a returned object: the temporary's lifetime is extended to match the reference. string_view subverts that protection by isolating the assignment of the reference away from the point of construction.

Objects that hold references, when they allow implicit conversions, are landmines.

C++ has an explicit keyword - so that library maintainers can make it harder to lay landmines by mistake.

std::string_view could be safely convertible from std::string, but only if explicitly.
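
A short sketch of the difference being described, reusing good_intention() from the program above (`demo_lifetimes` is a hypothetical name). It shows only the two well-defined alternatives and leaves the dangling form commented out:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

std::string good_intention()
{
    return "abcdefghijklmnopqrstuvwxyz0123456789";
}

std::size_t demo_lifetimes()
{
    // A const reference bound directly to the returned temporary extends the
    // temporary's lifetime to the lifetime of the reference.
    const std::string& extended = good_intention();
    std::string_view over_extended = extended;  // valid while 'extended' is in scope

    // An owning copy is equally safe: the view refers to a named object.
    std::string owned = good_intention();
    std::string_view over_owned = owned;

    // No such extension applies to an object that merely holds a pointer:
    // std::string_view dangling = good_intention();  // compiles, dangles at the ';'

    return over_extended.size() + over_owned.size();
}
```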

I'm not on a hobby horse here. I just like to know that my code and that of my developers will work today and tomorrow. As of c++17, I will necessarily be less sure of that.

Sorry to bring the bad news about our new toy.

I'm sure I've ruffled feathers. Well, that doesn't matter. Your feathers will be more ruffled when your trading systems give random incorrect prices, your game crashes just before the user saved his game for the night, or your nuclear reactor tells you it can't shut down because, "3456%#$%^dfhjdfsghs - segfault".

Cheers.


--

Bjorn Reese

unread,
Sep 5, 2017, 6:03:27 AM9/5/17
to std-dis...@isocpp.org
On 09/04/2017 07:14 PM, Richard Hodges wrote:

> I propose that specifically:
>
> std::string_view::string_view(std::string&&) = delete;
> std::string_view& std::string_view::operator=(std::string&&) = delete;

That is what boost::string_view (and boost::string_ref) did, but it was
reverted due to user complaints:

https://lists.boost.org/Archives/boost/2017/03/233400.php
https://svn.boost.org/trac10/ticket/12917

Paul "TBBle" Hampson

unread,
Sep 5, 2017, 6:45:56 AM9/5/17
to ISO C++ Standard - Discussion
On Tuesday, 5 September 2017 19:09:42 UTC+10, Richard Hodges wrote:
OK, I am sure the community is getting bored of this discussion. It's clear to me that people's minds are made up. 

So I'll accept the advice of my peers: "never return a string_view".

Note that this wasn't the advice I inferred from the standard note, "so that user code can accept just std::basic_string_view<charT> as a non-templated parameter wherever a sequence of characters is expected", and some minimal historical research. What I got was "use string_view as a parameter type".

Helpfully, that also appears to be the closest thing to a usage rule reached on the Boost mailing list, in the discussion Bjorn Reese linked while I was writing this.

On the topic of minds being made up, this discussion *has* changed my mind somewhat, as I'm now happy with the current behaviour and the observation that outside of function parameters, string_view is as safe as a char*.

I'm also now thinking about zstring_view, but that's for std-proposals, not here.

Ville Voutilainen

unread,
Sep 5, 2017, 6:55:56 AM9/5/17
to std-dis...@isocpp.org
On 5 September 2017 at 12:09, Richard Hodges <hodg...@gmail.com> wrote:
> And leave you with this innocent little program which in good faith, obeys
> this sound advice.
>
> #include <string_view>
> #include <string>
> #include <iostream>
>
> std::string good_intention() // this is OK, right? Returning a string?
> We're allowed to do that.
> {
> return "abcdefghijklmnopqrstuvwxyz0123456789";
> }
>
> int main()
> {
> auto s = std::string_view("good"); // all well and good. Almost always
> auto!
>
> // ...other stuff to distract us...
>
> s = good_intention(); // cool! performance and elegance!
>
> std::cout << s << std::endl; // ahhhnd... boom!
> }
>
> The observant amongst us will see that:


The observant among us will see that once I made it use
std::experimental::string_view, I got this from clang-tidy:

bad.cpp:16:5: warning:
std::experimental::fundamentals_v1::basic_string_view outlives its
value [misc-dangling-handle]
s = good_intention(); // cool! performance and elegance!
^

Chances are that clang-tidy's handle-checker doesn't grok the current
string_view conversion operator of std::string,
but perhaps making that work isn't all that far away.

Somehow I fail to believe that this issue will lead to such amounts of
doom as you indicated.

Richard Hodges

unread,
Sep 5, 2017, 7:33:31 AM9/5/17
to std-dis...@isocpp.org
string_view is as safe as a char*

I agree 100%

The reason we have std::string is because char* is about as unsafe as it gets. Basically we're back to C. A regression of 30 years.

What I got was "use string_view as a parameter type"

That goes some way to helping but the assignability from string still exhibits the same problem when it's used as an argument:

int foo(std::string_view s) // refactored from std::string
{
    if(some_condition()) {
        s = good_intention();
    }
    return something_else(s); // boom!
}

The rules of string_view have now become:

* only accept a string_view as an argument, and
* never return it, and
* never re-assign it (In which case why not enforce that?)

and the obvious implication is that:

If we have a template function that does string-like stuff and returns a string-like result, you'd better litter that code with std::enable_if_t<not std::is_same<std::decay_t<StringLike>, std::string_view>::value>

because if you don't, someone somewhere is going to accept a string_view and call your template function, forgetting that it's going to return a string_view.

an example:

template<class StringLike>
auto get_base_currency(StringLike&& ccypair)
{
    return ccypair.substr(0, 3);
}

for which the safety fix is:

template<class StringLike>
auto get_base_currency(StringLike&& ccypair)
{
    return std::string(ccypair.substr(0, 3));
}

so we may as well have written:

auto get_base_currency(std::string const& ccypair) -> std::string
{
    return ccypair.substr(0, 3);
}
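
As a sketch of why the first, generic version is the risky one: its deduced return type silently follows the argument type. The static_asserts and `demo_base_currency` are additions for illustration only:

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// The unguarded generic version: the return type is whatever substr() yields.
template <class StringLike>
auto get_base_currency(StringLike&& ccypair)
{
    return ccypair.substr(0, 3);
}

// Given a std::string it returns an owning std::string...
static_assert(std::is_same_v<
    decltype(get_base_currency(std::declval<std::string&>())), std::string>);
// ...but given a std::string_view it returns another non-owning view.
static_assert(std::is_same_v<
    decltype(get_base_currency(std::declval<std::string_view&>())), std::string_view>);

void demo_base_currency()
{
    std::string pair = "EURUSD";

    std::string owned = get_base_currency(pair);
    assert(owned == "EUR");
    assert(owned.data() != pair.data());  // independent copy

    std::string_view view = pair;
    auto base = get_base_currency(view);
    assert(base == "EUR");
    assert(base.data() == pair.data());   // still points into 'pair'
}
```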

But anyway, I know, I know.

You've got your new toy. You know how to play with it and you don't see a problem.

Much of my 30-year career writing c++ has been spent picking up the mess after the kids have played with the new toys.

How many times have I had this conversation with a new programmer:
him: "your code doesn't work"
me: "what did you change?"
him: "nothing"
me: "what about this change here?"
him: "oh that, that doesn't matter..."

After string_view, this conversation could actually put the newbie in the right. He didn't change anything, some other dependency
created a string_view and his code magically stopped working.

But who cares? This looks cool and we can avoid typing 10 characters to use it.

The reality is that there will be an imperceptible increase in performance resulting from the use of string_view rather than
const std::string& in function arguments. It's a false optimisation, and it invites such unsafety that we must use static analysers
to even realise that a program is exhibiting UB.

Am I really the only person who sees a problem with that?



--

Paul "TBBle" Hampson

unread,
Sep 5, 2017, 8:02:15 AM9/5/17
to ISO C++ Standard - Discussion
On Tuesday, 5 September 2017 21:33:31 UTC+10, Richard Hodges wrote:
string_view is as safe as a char*

I agree 100%

The reason we have std::string is because char* is about as unsafe as it gets. Basically we're back to C. A regression of 30 years.

Except if you use it for ownership, char* isn't actually unsafe, as far as I know. And while in C we had a history of owning things in *, and just having to document when we did it, in C++ we have things like unique_ptr, string, and the STL containers to express ownership. The GSL even gives you owner<T> if you still need to own something in a raw pointer, and want to document it *usefully* (i.e. for a static analyzer and the casual reader)

The rest of the time, T* has reasonably-clear semantics, unless you're still writing C++ as "C with classes", assuming C semantics for any construct that looks the same as C. I did this for the first few years of C++, myself, so naturally I assume everyone starts that way. ^_^
 
What I got was "use string_view as a parameter type"

That goes some way to helping but the assignability from string still exhibits the same problem when it's used as an argument:

int foo(std::string_view s) // refactored from std::string
{
    if(some_condition()) {
        s = good_intention();
    }
    return something_else(s); // boom!
}

I wouldn't have considered "assigning to s" as "using s as a parameter", but it's not an unusual pattern.

In fact, a lot of the examples of bad behaviour here have relied on basic_string_view::operator=(basic_string_view). Perhaps that's the piece that needs to be explicit? Although that would interfere with assignment from char* with an overload, and perhaps also mess with a future span<char> (e.g. as seen in the GSL).
 
And as noted in another older discussion about explicitness in the other direction, it would break an expectation that copy-initialisation and copy-construction are either both present or both absent.

Matthew Woehlke

unread,
Sep 5, 2017, 10:30:44 AM9/5/17
to std-dis...@isocpp.org, Richard Hodges
On 2017-09-05 05:09, Richard Hodges wrote:
> s = good_intention(); // cool! performance and elegance!

Now, hang on just a minute... can someone remind me what is the valid
use case for *assigning* to a string_view from a temporary? When will
that not end in tears?

The use case for *construction* is passing a string to a function that
takes a string_view. In that case, the string isn't destroyed until
*after* the function call, so constructing from a temporary is safe.

When is it safe to *assign* from a temporary?

--
Matthew

Matthew Woehlke

unread,
Sep 5, 2017, 10:37:57 AM9/5/17
to std-dis...@isocpp.org, Richard Hodges
On 2017-09-05 07:33, Richard Hodges wrote:
> The reality is that there will be an imperceptible increase in performance
> resulting from the use of string_view rather than
> const std::string& in function arguments. It's a false optimisation, and it
> invites such unsafety that we must use static analysers
> to even realise that a program is exhibiting UB.

I feel the need to jump in here and remind folks that performance isn't
the only motivating use for string_view:

foo(std::string const&);

bar(QLatin1String const& s);
foo(s); // grr...

- vs. -

foo(std::string_view);

bar(QLatin1String const& s);
foo({s.begin(), s.end()}); // ah, bliss...

(Yeah, yeah, that's probably not how one would actually *construct* the
string_view, but that's not the point.)

--
Matthew

Nevin Liber

unread,
Sep 5, 2017, 10:41:16 AM9/5/17
to std-dis...@isocpp.org
On Tue, Sep 5, 2017 at 9:30 AM, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
On 2017-09-05 05:09, Richard Hodges wrote:
>     s = good_intention();                // cool! performance and elegance!

Now, hang on just a minute... can someone remind me what is the valid
use case for *assigning* to a string_view from a temporary? When will
that not end in tears?

It is like anything else with reference semantics.  If you know that the lifetime of the underlying data is valid, it is safe.  If you don't know that, it isn't.
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  +1-847-691-1404

Matthew Woehlke

unread,
Sep 5, 2017, 10:58:08 AM9/5/17
to std-dis...@isocpp.org, Nevin Liber
On 2017-09-05 10:40, Nevin Liber wrote:
> On Tue, Sep 5, 2017 at 9:30 AM, Matthew Woehlke <mwoehlk...@gmail.com>
> wrote:
>
>> On 2017-09-05 05:09, Richard Hodges wrote:
>>> s = good_intention(); // cool! performance and
>> elegance!
>>
>> Now, hang on just a minute... can someone remind me what is the valid
>> use case for *assigning* to a string_view from a temporary? When will
>> that not end in tears?
>>
>
> It is like anything else with reference semantics. If you know that the
> lifetime of the underlying data is valid, it is safe. If you don't know
> that, it isn't.

Sorry, let me amend that: from a temporary *std::string*...

Please give an example of how that can be safe.

If you can't, I would suggest that implies that said assignment operator
should be explicitly deleted, and hang the asymmetry...

--
Matthew

Ville Voutilainen

unread,
Sep 5, 2017, 11:37:42 AM9/5/17
to std-dis...@isocpp.org
On 5 September 2017 at 17:58, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
>>> Now, hang on just a minute... can someone remind me what is the valid
>>> use case for *assigning* to a string_view from a temporary? When will
>>> that not end in tears?
>>>
>>
>> It is like anything else with reference semantics. If you know that the
>> lifetime of the underlying data is valid, it is safe. If you don't know
>> that, it isn't.
>
> Sorry, let me amend that: from a temporary *std::string*...
>
> Please give an example of how that can be safe.
>
> If you can't, I would suggest that implies that said assignment operator
> should be explicitly deleted, and hang the asymmetry...

Except that as mentioned a couple of times, string_view's api doesn't
mention string, by design.
So it's not quite that simple.

Richard Hodges

unread,
Sep 5, 2017, 11:43:28 AM9/5/17
to std-dis...@isocpp.org
Except that as mentioned a couple of times, string_view's api doesn't mention string, by design.

But it's ok to point out that a design is faulty, right?




Ville Voutilainen

unread,
Sep 5, 2017, 12:02:11 PM9/5/17
to std-dis...@isocpp.org
On 5 September 2017 at 18:43, Richard Hodges <hodg...@gmail.com> wrote:
>> Except that as mentioned a couple of times, string_view's api doesn't
>> mention string, by design.
>
> But it's ok to point out that a design is faulty, right?


You can point out things that are faulty in your opinion to your
heart's content, but that's some distances away from
making an actual change.

Matthew Woehlke

unread,
Sep 5, 2017, 12:20:31 PM9/5/17
to std-dis...@isocpp.org
Hmm... yes, that does make it more difficult.

Okay, so we need a way for a conversion operator to express that if the
object being converted is a temporary, the resulting object is also a
temporary, and in such case, we need to control what operations can be
done with that temporary.

In particular, we would need a way to say that you can't construct a
non-temporary object via a conversion operator that returns a temporary.
I'm not sure if there is a reasonable language way for a class to
express that. Would it make sense for it to just be a blanket restriction?

We would also need a way to prevent assignment, but that might be hard,
since we don't also want to prevent:

foo(string_view s)
{
    string_view s2;
    s2 = std::move(s);
}

(...Or do we?)

IMHO, the still-most-critical question here is, if we get such features
in the future, will we be able to retrofit them into std::string without
breaking the world? (Note: I mainly care about ABI breakage. I *don't*
care about SC breaks if someone is using the conversion operator in a
dangerous way; their code was already broken, and making it fail to
compile is *good*.)

--
Matthew

Ville Voutilainen

unread,
Sep 5, 2017, 12:38:42 PM9/5/17
to std-dis...@isocpp.org
On 5 September 2017 at 19:20, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
> On 2017-09-05 11:37, Ville Voutilainen wrote:
>> On 5 September 2017 at 17:58, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
>>> I would suggest that implies that said assignment operator should
>>> be explicitly deleted, and hang the asymmetry...
>>
>> Except that as mentioned a couple of times, string_view's api
>> doesn't mention string, by design. So it's not quite that simple.
>
> Hmm... yes, that does make it more difficult.

We could try adding an assignment operator template that is disabled
if the incoming type is string_view itself.
That needs an implementation and a proposal, but it has a much better
chance of succeeding as a defect
report than making the conversion explicit.

> Okay, so we need a way for a conversion operator to express that if the
> object being converted is a temporary, the resulting object is also a
> temporary, and in such case, we need to control what operations can be
> done with that temporary.

That wouldn't be something that can be applied as a defect fix to C++17, though.

> We would also need a way to prevent assignment, but that might be hard,
> since we don't also want to prevent:
>
> foo(string_view s)
> {
> string_view s2;
> s2 = std::move(s);
> }
>
> (...Or do we?)

I don't think we want to prevent that. If your type is
copy-assignable, it should also be move-assignable.
Otherwise madness ensues.

> IMHO, the still-most-critical question here is, if we get such features
> in the future, will we be able to retrofit them into std::string without
> breaking the world? (Note: I mainly care about ABI breakage. I *don't*
> care about SC breaks if someone is using the conversion operator in a
> dangerous way; their code was already broken, and making it fail to
> compile is *good*.)


My quick guesstimate is that we could likely be able to retrofit that
in without abi breakage,
if we get such language facilities.

Richard Hodges

unread,
Sep 5, 2017, 1:23:20 PM9/5/17
to std-dis...@isocpp.org
>> Except that as mentioned a couple of times, string_view's api
>> doesn't mention string, by design. So it's not quite that simple.

Why is it so important that std::string_view is not dependent on std::string, when std::string is currently dependent on std::string_view?

What is the rationale for this seemingly arbitrary decision?

Because to reverse the dependency allows us to disallow broken constructors and conversions with the current language.

This whole std::string_view thing is starting to sound alarmingly like a religion. For a start, where's the empirical data to demonstrate that we'll see faster programs as a result of all this UB risk?

Last time I profiled a pricing server, it spent 90% of its time decoding character sequences into doubles and vice versa, and the rest of the time doing all the "slow" things like allocating memory and managing shared_ptr's and mutexes.

Completely contrary to expectations.




Richard Hodges

unread,
Sep 5, 2017, 1:25:01 PM9/5/17
to std-dis...@isocpp.org

On 5 September 2017 at 18:21, Eyal Rozenberg <eya...@technion.ac.il> wrote:
> Well, ok, technically, but there's a good reason the alternative constructor invocation has this form.
>
> On 09/05/2017 12:56 PM, Richard Hodges wrote:
>> I disagree.
>>
>> auto x = X(); is not "returning an object", it is the "alternative constructor notation" which is recommended in order to avoid the most vexing parse problem.

Care to elaborate?
 



On 5 September 2017 at 11:33, Eyal Rozenberg <eya...@technion.ac.il> wrote:
> (off the list)
>
> Umm, you're not accepting the advice of your peers - you are
> returning a string view in the "all well and good" line.
>
> However, assignment from a string to a string view may be problematic.
>
> Eyal
>
> <p_ha...@wargaming.net> wrote:
>> On Tuesday, September 5, 2017 at 4:07:29 AM UTC+10, Ville Voutilainen wrote:
>>> On 4 September 2017 at 21:02, Richard Hodges <hodg...@gmail.com> wrote:
>>>> Whoever pushed for this implicit interface ought to come out here and
>>>> explain themselves. And volunteer to fix the bugs.
>>>
>>> What makes you think the implicit conversion hasn't been explained
>>> when it was proposed?
>>> Can you not find the rationale for it in the string_view proposals?
>>
>> Maybe I'm looking in the wrong place, but in the chain of N3442
>> <http://wg21.link/N3442> through N3921 <http://wg21.link/N3921>,
>> there never seems to have been a question of whether string_view
>> *should* have an implicit constructor from string, only the
>> observation that the predecessors in Google, LLVM, and Bloomberg's
>> codebases have it, and the benefits of it.
>>
>> The entire final rationale in both the working papers and in the
>> standard appears to be [string.view]/2
>> <http://eel.is/c++draft/string.view#2>:
>>
>> /[ Note: The library provides implicit conversions from const charT*
>> and std::basic_string<charT, ...> to std::basic_string_view<charT, ...>
>> so that user code can accept just std::basic_string_view<charT>
>> as a non-templated parameter wherever a sequence of characters is
>> expected. User-defined types should define their own implicit
>> conversions to std::basic_string_view in order to interoperate with
>> these functions. — end note ]/
>>
>> It's possibly slightly surprising that this note wasn't adjusted by
>> P0254 <http://wg21.link/P0254> which moved the std::basic_string
>> implicit conversion into std::basic_string, which is nicely
>> consistent with the second sentence of the note. The note's not
>> incorrect, but an uncareful reading would possibly cause readers to
>> expect that the two implicit conversions mentioned would be provided
>> in the same place, and within the section to which the note is attached.
>>
>> P0254 <http://wg21.link/P0254> as I mentioned previously didn't
>> address this question either, accepting the /status quo/.
Nevin Liber

unread,
Sep 5, 2017, 1:40:10 PM9/5/17
to std-dis...@isocpp.org
On Tue, Sep 5, 2017 at 11:38 AM, Ville Voutilainen <ville.vo...@gmail.com> wrote:
We could try adding an assignment operator template that is disabled
if the incoming type is string_view itself.

Wouldn't that break:

const char* ApiThatKeepsStringSpaceAroundUntilExplicitlyReleased();
//...
string_view sv;
sv = ApiThatKeepsStringSpaceAroundUntilExplicitlyReleased();

I think the only way to solve this in general is via a trait, because we want to delete that assignment only to types that are managing the space directly.  Ugh.

Ville Voutilainen

unread,
Sep 5, 2017, 1:41:57 PM9/5/17
to std-dis...@isocpp.org
On 5 September 2017 at 20:39, Nevin Liber <ne...@eviloverlord.com> wrote:
> On Tue, Sep 5, 2017 at 11:38 AM, Ville Voutilainen
> <ville.vo...@gmail.com> wrote:
>>
>> We could try adding an assignment operator template that is disabled
>> if the incoming type is string_view itself.
>
>
> Wouldn't that break:
>
> const char* ApiThatKeepsStringSpaceAroundUntilExplicitlyReleased();
> //...
> string_view sv;
> sv = ApiThatKeepsStringSpaceAroundUntilExplicitlyReleased();

Well, we have experience on how to write these constraints from the
previous disambiguation
issues. But this is why I said it needs an implementation.

Matthew Woehlke

unread,
Sep 5, 2017, 2:18:24 PM9/5/17
to std-dis...@isocpp.org, Richard Hodges
On 2017-09-05 13:23, Richard Hodges wrote:
>>> Except that as mentioned a couple of times, string_view's api
>>> doesn't mention string, by design. So it's not quite that simple.
>
> Why is it so important that std::string_view is not dependent on
> std::string, when std::string is currently dependent on std::string_view?

Somewhat tangential answer, but... because it's important that it not be
implemented in a way that critical features (e.g. the safety features
we're trying to figure out in this thread) work for std::string but not
other libraries' string types.

At least, that may not be the original reason, but it seems a pertinent
reason for the purposes of this discussion...

> Because to reverse the dependency allows us to disallow broken constructors
> and conversions with the current language.

Yes... but *only for std::string*, which is itself a problem.

> This whole std::string_view thing is starting to sound alarmingly like a
> religion. For a start, where's the empirical data to demonstrate that we'll
> see faster programs as a result of all this UB risk?

...and as I've stated before, "faster programs" is only one of the
reasons for string_view. It's also to improve interoperability and
reduce use of `char const*`.

--
Matthew

Nevin Liber

unread,
Sep 5, 2017, 2:22:35 PM9/5/17
to std-dis...@isocpp.org
My point wasn't clear.  It isn't about writing the constraints (that is certainly doable); rather, we need a way to specify what types this constraint should be applied to, given that there are many different user defined string types out there, and not all of them directly manage the lifetime of the string space.  To me, that spells either a trait or some marker inside the class (comparators use a similar trick with is_transparent).  And I agree we need an implementation.
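A minimal sketch of the trait idea, under stated assumptions: `owns_string_storage`, `is_owning_temporary` and `checked_view` are hypothetical names invented for illustration, and `checked_view` is a stand-in wrapper rather than a proposed change to `std::string_view`. Assignment from a temporary of an opted-in owning type is rejected at compile time, while assignment from a plain `const char*` (as in the `ApiThatKeepsStringSpaceAroundUntilExplicitlyReleased` example) keeps working.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical opt-in trait: specialise to true for types that own their characters.
template <class T>
struct owns_string_storage : std::false_type {};

template <class C, class Tr, class A>
struct owns_string_storage<std::basic_string<C, Tr, A>> : std::true_type {};

// True when S (as deduced for a forwarding reference) is an rvalue of an owning type.
template <class S>
inline constexpr bool is_owning_temporary =
    !std::is_lvalue_reference_v<S> &&
    owns_string_storage<std::remove_cv_t<S>>::value;

// Hypothetical wrapper used only for illustration; this is not std::string_view.
class checked_view {
public:
    checked_view() = default;
    checked_view(std::string_view sv) : sv_(sv) {}

    // Rvalues of owning types: selected by overload resolution, then rejected.
    template <class S, std::enable_if_t<is_owning_temporary<S>, int> = 0>
    checked_view& operator=(S&&) = delete;

    // Anything else that converts to std::string_view is accepted.
    template <class S,
              std::enable_if_t<!is_owning_temporary<S> &&
                               std::is_convertible_v<S&&, std::string_view>, int> = 0>
    checked_view& operator=(S&& s) {
        sv_ = std::string_view(std::forward<S>(s));
        return *this;
    }

    std::string_view get() const { return sv_; }

private:
    std::string_view sv_;
};

static_assert(!std::is_assignable_v<checked_view&, std::string>);   // temporary owner
static_assert(std::is_assignable_v<checked_view&, std::string&>);   // named owner
static_assert(std::is_assignable_v<checked_view&, const char*>);    // non-owning pointer

// Stand-in for an API whose storage outlives the caller's view.
const char* api_owned_elsewhere() { return "static storage"; }

std::string_view demo() {
    checked_view v;
    v = api_owned_elsewhere();     // fine: the pointer does not own anything
    // v = std::string("oops");    // would not compile: deleted overload
    return v.get();
}
```

A user-defined string type that manages its own storage would opt in by specialising the trait; a type that only refers to storage held elsewhere would leave it at the default.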

David Brown

unread,
Sep 6, 2017, 4:01:03 AM9/6/17
to std-dis...@isocpp.org
Perhaps I am missing something obvious, or something that has already
been explained. I am not an expert in these matters - consider this the
opinion of a relative novice at C++.

But I don't really understand the importance of this. To the user, a
"string_view" is a view into a string. Sure, you might want to make
string_view objects from other things, such as C string literals. But
why should the /user/ care that string_view is disconnected from string?

A good API does two things:

1. It makes it easy to write correct code.
2. It makes it hard to write incorrect code.

It seems to me that string_view is failing on point 2 here - and failing
/badly/. You can keep telling us about reference semantics until you
are blue in the face, but the simple matter is that Richard's examples
look like perfectly good code, will compile without complaint, and some
might well work for some tests - yet they are broken. That is /not/ good.

If fixing that means sacrificing a design goal of string_view - that it
be independent of string - then it looks to me that that design goal
should be sacrificed. I cannot honestly say I understand the relevance
of that design goal, but I have difficulty believing that it trumps the
key requirements of all APIs.

Ville Voutilainen

unread,
Sep 6, 2017, 4:38:27 AM9/6/17
to std-dis...@isocpp.org
On 6 September 2017 at 09:59, David Brown <da...@westcontrol.com> wrote:
> If fixing that means sacrificing a design goal of string_view - that it
> be independent of string - then it looks to me that that design goal
> should be sacrificed. I cannot honestly say I understand the relevance
> of that design goal, but I have difficulty believing that it trumps the
> key requirements of all APIs.


I'm not suggesting that it trumps the key requirements of all APIs, but since
it was the design goal in switching from string_view constructors that take
strings to a conversion operator in string, it's something to keep in mind when
writing proposals to change the API of string_view, if someone decides to write
such a proposal.

Richard Hodges

unread,
Sep 6, 2017, 6:40:51 AM9/6/17
to std-dis...@isocpp.org
Is there a forum I can join so that I get early notification of proposals to be discussed for standards inclusion?

Since the golden days of c++11's Great Leap Forward, there seems to be more and more low quality code creeping into the standard with questionable design rationale.

I would like to help prevent this by offering my insights, which have been acquired through 30 years of writing c++, and fixing the dangerous stuff that young developers do in their blind enthusiasm.

string_view in its current form is **really bad** and should never have even got into the net, let alone slipped through it.

Where do I sign up please?




p_ha...@wargaming.net

unread,
Sep 6, 2017, 7:20:55 AM9/6/17
to ISO C++ Standard - Discussion
On Wednesday, September 6, 2017 at 8:40:51 PM UTC+10, Richard Hodges wrote:
Is there a forum I can join so that I get early notification of proposals to be discussed for standards inclusion?

The std-proposals sibling mailing-list to this one is the obvious short hop, but I expect most proposals don't appear there.

For the papers presented to and by the C++ Standards Committee, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/ are the archives of the mailings, published before and after each meeting and including WG21 meeting minutes although not necessarily the various contained Working Group or Study Group meeting minutes where most of the detailed discussion apparently occurs. The minutes of the most recent meeting (July 2017 in Toronto) were published as N4691 if you want to see how much information is there.

To get involved in the meetings themselves, have a read of https://isocpp.org/std/meetings-and-participation and the surrounding pages, and see what level of participation suits you best.

Richard Hodges

unread,
Sep 6, 2017, 9:03:02 AM9/6/17
to std-dis...@isocpp.org
Are you saying that there is no public forum where proposals can be critically scrutinised by the user community before submission, a-la-boost?

That would explain a lot.



Ricardo Fabiano de Andrade

unread,
Sep 6, 2017, 9:22:55 AM9/6/17
to std-dis...@isocpp.org
And as far as I know most of this is due to ISO rules/bureaucracy.
However, my personal opinion is that the committee does a great job trying to keep things transparent to the community despite those limitations.

I am with you that this specific string_view behavior seems problematic, but for the most part the standard is an awesome example of collective effort.
Let's not judge the whole by this sample :)

That said, string_view implementations have been in the wild for quite a while now and you or anyone else could have caught that sooner.
Unfortunately, none of us noticed it before, and maybe @Ville is right and there's not enough time to prevent that from shipping in C++17.

Now the question I pose is, if this string_view behavior is taken as a defect in C++17, what can be done about it and how much change can be afforded in a defect report?
As suggested by others, the simplest solution would be making std::string::operator string_view() explicit.
Would that be possible in a defect report?
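To make the trade-off concrete, here is a small sketch of what an explicit conversion operator does to calling code. It uses a hypothetical type, `owning_text`, rather than `std::string` itself, and `length_of` and `demo` are likewise invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical owning string whose conversion to a view is explicit.
class owning_text {
public:
    explicit owning_text(std::string s) : data_(std::move(s)) {}
    explicit operator std::string_view() const noexcept { return data_; }

private:
    std::string data_;
};

std::size_t length_of(std::string_view sv) { return sv.size(); }

// Implicit conversion (the form used when passing an argument) is no longer available...
static_assert(!std::is_convertible_v<const owning_text&, std::string_view>);
// ...but explicit construction of a view still is.
static_assert(std::is_constructible_v<std::string_view, const owning_text&>);

std::size_t demo() {
    owning_text t{"foobar"};
    // length_of(t);                        // error with an explicit operator
    return length_of(std::string_view(t));  // the caller must spell out the view
}
```

The same mechanism that would block an accidental assignment from a temporary also blocks the plain function call, which is why existing code that relies on the implicit conversion would stop compiling.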


Ville Voutilainen

unread,
Sep 6, 2017, 9:28:30 AM9/6/17
to std-dis...@isocpp.org
On 6 September 2017 at 16:22, Ricardo Fabiano de Andrade
<ricardofabi...@gmail.com> wrote:
> And as far as I know most of this is due to ISO rules/bureaucracy.

Not really. The committee could solicit open-ended input from random
individuals if it really wanted. It chooses not to.

> However, my personal opinion is that the committee does a great job trying
> to keep things transparent to the community despite those limitations.

The proceedings of the committee are certainly much more open than
they are required to be, and to some extent
more open than a strict interpretation of some bureaucratic rules
suggest they should be. We have P-numbered
papers in order to be able to make them available, in contrast to ISO
rules according to which N-papers must
appear in LiveLink and only in LiveLink.

> That said, string_view implementations are in the wild for quite while now
> and you or anyone else could have got that sooner.
> Unfortunately, none of us could notice it before and maybe @Ville is right
> and there's not enough time to prevent that from shipping into C++17.

I would guesstimate that fixing the assignment from temporaries might
be doable. Constructors are harder.

> Now the question I pose is, if this string_view behavior is taken as a
> defect in C++17, what can be done about it and how much change can be
> afforded into defect?
> As suggested by others, the simplest solution would be making the making
> explicit std::string::operator string_view()
> Would that be possible in a defect?

Possible.. maybe. Likely, I doubt it. A change like that will already
break valid code.

Howard Hinnant

unread,
Sep 6, 2017, 10:03:22 AM9/6/17
to ISO C++ Standard - Discussion
On Sep 6, 2017, at 9:28 AM, Ville Voutilainen <ville.vo...@gmail.com> wrote:
>
>> As suggested by others, the simplest solution would be making the making
>> explicit std::string::operator string_view()
>> Would that be possible in a defect?
>
> Possible.. maybe. Likely, I doubt it. A change like that will already
> break valid code.

I doubt it too.

One of the main selling points of string_view is that it could serve as a “universal” parameter to bind to “all things string”, for example:

https://github.com/HowardHinnant/date/blob/master/tz.h#L443-L453

whereas without it (for various reasons), one needs overloads, for example:

https://github.com/HowardHinnant/date/blob/master/tz.h#L508-L533

This is such a major feature of string_view that the motivation for breaking it would have to be very large.

This suggested change breaks that feature. I would vote strongly against it. Take out this feature, and you might as well take out string_view entirely.
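A minimal sketch of the "universal parameter" use, assuming a made-up function `count_vowels` (it is not taken from the linked date library): one signature accepts a `std::string`, a `const char*` and a string literal through the implicit conversions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// One signature instead of separate overloads for std::string, const char* and literals.
std::size_t count_vowels(std::string_view text) {
    constexpr std::string_view vowels = "aeiouAEIOU";
    std::size_t n = 0;
    for (char c : text) {
        if (vowels.find(c) != std::string_view::npos) ++n;
    }
    return n;
}

std::size_t demo() {
    std::string owned = "Europe/London";
    const char* raw = "UTC";
    return count_vowels(owned)               // std::string converts implicitly
         + count_vowels(raw)                 // so does const char*
         + count_vowels("America/Chicago");  // and a string literal
}
```

All three calls are safe because each argument outlives the call; the view never escapes the function.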

Like reference_wrapper, C++ programmers need to learn that types with the word “view” in the name have reference-like behavior.

Howard


p_ha...@wargaming.net

unread,
Sep 6, 2017, 10:15:56 AM9/6/17
to ISO C++ Standard - Discussion
On Wednesday, September 6, 2017 at 11:03:02 PM UTC+10, Richard Hodges wrote:
Are you saying that there is no public forum where proposals can be critically scrutinised by the user community before submission, a-la-boost?

I assume you mean "before acceptance"? I wasn't aware of boost having a pre-submission review process before the "Boost Library Submission Process".

If I have assumed correctly, then that's not what I'm saying at all.

Although it's not formally required (as far as I know, from the outside), most papers that're under consideration for acceptance into the standard should appear in the mailings I linked in my earlier reply. And papers that don't are very unlikely to get accepted on first-showing, unless they're quite trivial. Even trivial-seeming defect changes get "come back with a paper" responses from the working groups.

Also worth noting that the committee has expressed a strong preference for standardising existing practice, so particularly with library features, they're often coming *from* a large and well-used open-source codebase. In the present case, that's LLVM, Google Chromium, *and* Bloomberg BDE. They (plus Boost later) all put effectively *this* class in front of the public through their many and varied forums and contributions, half-a-decade before it became part of C++.

There's also the mechanisms for "Technical Specifications", which allow C++ features which the committee feels are valuable but not-yet-ready to bake for a bit and ensure the form they have taken for standardisation is correct and generally feasible. In the present case, that was the "Library Fundamentals TS", published in 2015, and containing string_view since November 2014, after 7 revisions over almost three years.

And since you compare it to Boost, when I first started learning about standardisation, the accepted wisdom seemed to be that to get a library into the C++ standard, your best bet _was_ to take it through Boost, and hence Boost was considered a lower-bar for entry than the C++ standard itself. Although that wisdom has shifted as the C++ library-distribution ecosystem has expanded, e.g. via github, I don't think the bar has been lowered. I simply believe there are now more ways to demonstrate that you have reached that bar.

Nicol Bolas

unread,
Sep 6, 2017, 11:56:50 AM9/6/17
to ISO C++ Standard - Discussion, da...@westcontrol.com


On Wednesday, September 6, 2017 at 4:01:03 AM UTC-4, David Brown wrote:
On 05/09/17 17:37, Ville Voutilainen wrote:
> On 5 September 2017 at 17:58, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
>>>> Now, hang on just a minute... can someone remind me what is the valid
>>>> use case for *assigning* to a string_view from a temporary? When will
>>>> that not end in tears?
>>>>
>>>
>>> It is like anything else with reference semantics.  If you know that the
>>> lifetime of the underlying data is valid, it is safe.  If you don't know
>>> that, it isn't.
>>
>> Sorry, let me amend that: from a temporary *std::string*...
>>
>> Please give an example of how that can be safe.
>>
>> If you can't, I would suggest that implies that said assignment operator
>> should be explicitly deleted, and hang the asymmetry...
>
> Except that as mentioned a couple of times, string_view's api doesn't
> mention string, by design.

Perhaps I am missing something obvious, or something that has already
been explained.  I am not an expert in these matters - consider this the
opinion of a relative novice at C++.

But I don't really understand the importance of this.

The importance of `string_view`'s independence from `std::string` is essentially why `string_view` exists.

The C++ community is filled with numerous string classes. Everybody has their own string types. While many C++ programmers do use `std::string`, many more never touch it.

`string_view` is intended to be a lingua franca among all contiguous string APIs. If you write a function that only reads from a string, you really don't care whether those bytes come from `std::string`, some fixed-length string, or any number of other alternatives. All you care about is that you're taking a non-modifiable contiguous array of bytes of some size. So you use `string_view` as the type in your parameter list, and everything is fine.

Users cannot modify `string_view`. They cannot give it constructors that take their string type. As such, they must rely on conversion operators on their string types that convert to `string_view`. So, since every other user's string type will have conversions to `string_view`, then it makes sense for `string` to also have one.
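As a sketch of that arrangement, assume a hypothetical third-party type `fixed_string16`. It opts in to the common vocabulary by providing its own conversion operator, so `std::string_view` needs no knowledge of it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical fixed-capacity string, standing in for "some library's string type".
class fixed_string16 {
public:
    explicit fixed_string16(const char* s) {
        size_ = std::strlen(s);
        if (size_ > buf_.size()) size_ = buf_.size();  // truncate, for brevity
        std::memcpy(buf_.data(), s, size_);
    }

    // The type itself provides the conversion to a view.
    operator std::string_view() const noexcept { return {buf_.data(), size_}; }

private:
    std::array<char, 16> buf_{};
    std::size_t size_ = 0;
};

// A read-only function written once against std::string_view.
bool starts_with_foo(std::string_view sv) { return sv.substr(0, 3) == "foo"; }

bool demo() {
    fixed_string16 s{"foobar"};
    return starts_with_foo(s);  // implicit conversion through the operator
}
```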

The committee is essentially eating its own dog food.

To the user, a
"string_view" is a view into a string.  Sure, you might want to make
string_view objects from other things, such as C string literals.  But
why should the /user/ care that string_view is disconnected from string?

A good API does two things:

1. It makes it easy to write correct code.
2. It makes it hard to write incorrect code.

It seems to me that string_view is failing on point 2 here - and failing
/badly/.  You can keep telling us about reference semantics until you
are blue in the face, but the simple matter is that Richard's examples
look like perfectly good code, will compile without complaint, and some
might well work for some tests - yet they are broken.  That is /not/ good.

Yes. But this is a general problem in C++. It is not specific to `string_view`. It will be just as much a problem for `span`, for range adaptors, and so forth.

If you have a general solution to this general language problem, then by all means, let's get it into the language. But until we have a way to fix this language hole, we should not gimp types by making them less useful.

Yes, this means that people can do the wrong thing if they have the wrong expectations about what a value means. But we should not use that as a way to inhibit the utility of the type.

The other thing you don't understand is that one of Richard's examples is not a problem of implicit conversion. It's a problem with using `string_view` at all. If you believe that you can replace a value return with a reference return, that's on you. That's your fault, and you have only yourself to blame. If you're saying that such a change looks to you "like perfectly good code", then the only way to avoid this is to not have `string_view` at all.
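The difference between a value return and a view return can be sketched as follows. `tail_copy` and `tail_view` are hypothetical names, and the dangling case is left commented out so the sketch itself stays well-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Returns an owning copy: the result is independent of the argument's lifetime.
std::string tail_copy(const std::string& s, std::size_t pos) { return s.substr(pos); }

// Returns a view: the result is only valid while the viewed characters live.
std::string_view tail_view(std::string_view s, std::size_t pos) { return s.substr(pos); }

std::string demo() {
    std::string source = "foobar";
    std::string a = tail_copy(std::string("foobar"), 3);  // fine: a owns "bar"
    std::string_view b = tail_view(source, 3);            // fine: source outlives b
    // std::string_view c = tail_view(std::string("foobar"), 3);
    //   compiles, but c would dangle once the temporary is destroyed
    return a + std::string(b);
}
```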

We should not ditch a perfectly good type just because it can be misused.

Nicol Bolas

unread,
Sep 6, 2017, 12:03:39 PM9/6/17
to ISO C++ Standard - Discussion
On Wednesday, September 6, 2017 at 10:15:56 AM UTC-4, p_ha...@wargaming.net wrote:
On Wednesday, September 6, 2017 at 11:03:02 PM UTC+10, Richard Hodges wrote:
Are you saying that there is no public forum where proposals can be critically scrutinised by the user community before submission, a-la-boost?

I assume you mean "before acceptance"? I wasn't aware of boost having a pre-submission review process before the "Boost Library Submission Process".

If I have assumed correctly, then that's not what I'm saying at all.

Although it's not formally required (as far as I know, from the outside), most papers that're under consideration for acceptance into the standard should appear in the mailings I linked in my earlier reply. And papers that don't are very unlikely to get accepted on first-showing, unless they're quite trivial. Even trivial-seeming defect changes get "come back with a paper" responses from the working groups.

Also worth noting that the committee has expressed a strong preference for standardising existing practice, so particularly with library features, they're often coming *from* a large and well-used open-source codebase. In the present case, that's LLVM, Google Chromium, *and* Bloomberg BDE. They (plus Boost later) all put effectively *this* class in front of the public through their many and varied forums and contributions, half-a-decade before it became part of C++.

There's also the mechanisms for "Technical Specifications", which allow C++ features which the committee feels are valuable but not-yet-ready to bake for a bit and ensure the form they have taken for standardisation is correct and generally feasible. In the present case, that was the "Library Fundamentals TS", published in 2015, and containing string_view since November 2014, after 7 revisions over almost three years.

I think this is a really important point.

Nobody sprung `string_view` onto anybody. It's been debated, discussed, and talked about for years before finally getting into C++17. And in pretty much every form of the proposal, implicit conversion from `string` to `string_view` has been there.

`string_view` is not new. There's prior art for it, some dating from before C++11. And the issue of improper usage is not new either. The committee undoubtedly looked at these issues and decided that the good outweighed the bad.

You can disagree with that assessment, but it is what it is.

Richard Hodges

unread,
Sep 6, 2017, 1:00:33 PM9/6/17
to std-dis...@isocpp.org
That said, string_view implementations are in the wild for quite while now and you or anyone else could have got that sooner.

I have seen string_view (or equivalent) in boost for quite some time. Never saw a need to use it as std::string const& was perfectly 
adequate and the occasional redundant copy of a short (SSO-optimised) string well below the radar for optimisation. 

It's literally never the problem. 

When I took a longer look at it and saw the inherent danger of dangling references I recoiled from it in horror and put it back in the box. 
When the stuff started coming out about string_view in c++17 I naively assumed that the committee would consider the safety angles and 
fix the faulty interface, as they've been pretty good at that up until 2014.
 
re there being "lots of strings in the wild", fair enough. However I've always felt that these new strings were totally un-necessary. There's 
approximately one valid customisation of std::string that's required beyond what's available - a string that handles utf8 encoding properly.
And even that can be solved with a custom char trait. You don't even need a new interface.

Every other string implementation is a complete waste of time and effort. It seems whenever someone wants to write a new library they feel 
compelled to write a new string. It's a pointless, fruitless exercise that just creates a headache for the library users. Because when you start 
using more than one library, you start having to write glue code to get them to talk. If they just used std::string, using the libraries would be 
**easier and more performant**.
 
However, I completely understand the rationale for wanting a common "view of chars".

I really see no reason why there need to be any conversion operators to create one. Create it with compatible begin() and end() iterators. 
Job done. Just like the rest of the standard library. No need for any surprise attacks from implicit conversion.
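A sketch of what explicit construction could look like, under some assumptions. As far as I know, C++17's `std::string_view` has no iterator-pair constructor (one was added in C++20), so this uses a pointer and a length instead. `make_view` is a hypothetical helper, and deleting its rvalue overload is an extra choice made here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: build a view explicitly from any contiguous char container.
template <class Container>
std::string_view make_view(const Container& c) {
    return std::string_view(std::data(c), std::size(c));
}

// Deleting the rvalue overload turns "view of a temporary" into a compile error.
template <class Container>
std::string_view make_view(const Container&&) = delete;

std::size_t demo() {
    std::string s = "foobar";
    std::vector<char> v = {'b', 'a', 'r'};
    std::string_view a = make_view(s);
    std::string_view b = make_view(v);
    // make_view(std::string("temp"));  // error: deleted overload
    return a.size() + b.size();
}
```

The cost is that a perfectly safe call such as `f(make_view(std::string("x")))` is rejected too, which is the convenience the implicit conversion was meant to provide.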

The explicit keyword exists to protect us from such surprises, and here's the c++ committee - the global authority - dancing around with a 
loaded weapon with the safety catch firmly off.

Shameful.



Michael Kilburn

unread,
Sep 6, 2017, 6:52:42 PM9/6/17
to ISO C++ Standard - Discussion


On Wednesday, September 6, 2017 at 9:03:22 AM UTC-5, Howard Hinnant wrote:
On Sep 6, 2017, at 9:28 AM, Ville Voutilainen <ville.vo...@gmail.com> wrote:
>
>> As suggested by others, the simplest solution would be making the making
>> explicit std::string::operator string_view()
>> Would that be possible in a defect?
>
> Possible.. maybe. Likely, I doubt it. A change like that will already
> break valid code.

I doubt it too.

One of the main selling points of string_view is that it could serve as a “universal” parameter to bind to “all things string”, for example:

https://github.com/HowardHinnant/date/blob/master/tz.h#L443-L453

whereas without it (for various reasons), one needs overloads, for example:  

https://github.com/HowardHinnant/date/blob/master/tz.h#L508-L533

Not really; it would only require the user to explicitly construct a string_view in order to use the API you provided. I had to wrangle with exactly the same problems when designing parray and ended up choosing this approach -- it requires the user to do more typing, but by doing so they acknowledge that the string data ownership rules change in this transformation.

 
This is such a major feature of string_view that the motivation for breaking it would have to be very large.

This suggested change breaks that feature.  I would vote strongly against it.  Take out this feature, and you might as well take out string_view entirely.

Like reference_wrapper, C++ programmers need to learn that types with the word “view” in the name have reference-like behavior.

What about a new rule: if a temporary std::string gets "captured" via string_view, its lifetime gets extended until the end of the scope? Similarly to:

string foo();
...
string const& r = foo();

Maybe implement this via new keyword like:

class string_view{
operator string_view const( [keepalive] string&& s) { ... }
};


now I am just kidding, tbh :-)
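
For contrast, here is a minimal sketch of the rule as it stands in C++17. The function names are made up for illustration. Binding a prvalue to a const reference extends the temporary's lifetime; a string_view gets no such extension, so the safe pattern is to keep a named owner alive.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper standing in for the `string foo();` above.
std::string foo() { return "temporary"; }

// Binding the returned prvalue to a const reference extends its lifetime
// to the end of the enclosing scope, so `r` is safe to use here.
std::size_t size_through_reference() {
    std::string const& r = foo();
    return r.size();
}

// string_view gets no lifetime extension. Keeping a named owner alive for
// as long as the view is used is the safe pattern.
std::size_t size_through_view() {
    std::string owner = foo();
    std::string_view sv = owner;  // valid only while `owner` lives
    return sv.size();
}
```

Writing `std::string_view sv = foo();` instead would compile, but the view would refer to a string destroyed at the end of that statement.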

 

Howard

Thiago Macieira

Sep 6, 2017, 7:24:52 PM
to std-dis...@isocpp.org
On Wednesday, 6 September 2017 10:02:55 -03 Richard Hodges wrote:
> Are you saying that there is no public forum where proposals can be
> critically scrutinised by the user community before submission, a-la-boost?

That's what the committee meetings are for. Having feedback from the mailing
lists before submitting such a paper is a good idea, but not a required step.

Debating submissions here also does not mean anything, if the arguments are
not taken to the committee by someone.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

p_ha...@wargaming.net

Sep 6, 2017, 9:06:13 PM
to ISO C++ Standard - Discussion
On Thursday, September 7, 2017 at 3:00:33 AM UTC+10, Richard Hodges wrote:
> I have seen string_view (or equivalent) in boost for quite some time. Never saw a need to use it as std::string const& was perfectly
> adequate and the occasional redundant copy of a short (SSO-optimised) string well below the radar for optimisation.

[...]

I don't think string_view-as-SSO-copy-avoidance was really a motivating use-case. More of a happy side-effect.

> re there being "lots of strings in the wild", fair enough. However I've always felt that these new strings were totally unnecessary. There's
> approximately one valid customisation of std::string that's required beyond what's available - a string that handles utf8 encoding properly.
> And even that can be solved with a custom char trait. You don't even need a new interface.


A custom trait (or the more-usual case I've seen for multiple string types, a custom allocator) means that you can't use const std::string& as your parameter type. In our codebase, until we implemented the same (earlier) string_ref proposal that Boost used, a lot of our public string-taking APIs were template functions taking std::basic_string variants. Usually that template was just a wrapper to extract data() and length(), and pass them through to another API, and hence also needed an overload for char*, to handle string literals without needless copies. Annoyingly, this remains a problem for APIs which need the null-termination. We hit that limitation more often than we need efficient substring operations (lots of C code underlying our codebase), but we do hit the latter too.
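
A rough sketch of the two API styles described above, with made-up function names (this is not the poster's actual code). The old style needs a template over `std::basic_string` variants plus a `const char*` overload; the view-based style needs one function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Shared implementation working on a (pointer, length) pair.
inline std::size_t count_spaces_impl(const char* data, std::size_t length) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < length; ++i) {
        if (data[i] == ' ') ++n;
    }
    return n;
}

// Old style: a template wrapper that accepts any traits/allocator variant
// and only extracts data() and length()...
template <class Traits, class Alloc>
std::size_t count_spaces_old(const std::basic_string<char, Traits, Alloc>& s) {
    return count_spaces_impl(s.data(), s.length());
}

// ...plus an overload so string literals are not copied into a std::string.
inline std::size_t count_spaces_old(const char* s) {
    return count_spaces_impl(s, std::strlen(s));
}

// View style: a single non-template function covers both call shapes.
inline std::size_t count_spaces(std::string_view s) {
    return count_spaces_impl(s.data(), s.size());
}
```

Two caveats. `std::string_view` accepts a `basic_string` with any allocator, but only with the default character traits. And `data()` on a view is not guaranteed to be null-terminated, which is the limitation mentioned above for C APIs.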

David Brown

Sep 7, 2017, 7:11:48 AM
to Nicol Bolas, ISO C++ Standard - Discussion
Thank you /very/ much for that post - it made a number of things a lot
clearer for me.

I have made some comments further down.

On 06/09/17 17:56, Nicol Bolas wrote:
>
>
> On Wednesday, September 6, 2017 at 4:01:03 AM UTC-4, David Brown wrote:
>
> On 05/09/17 17:37, Ville Voutilainen wrote:
> > On 5 September 2017 at 17:58, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
> >>>> Now, hang on just a minute... can someone remind me what is the valid
> >>>> use case for *assigning* to a string_view from a temporary? When will
> >>>> that not end in tears?
> >>>>
> >>>
> >>> It is like anything else with reference semantics. If you know that the
> >>> lifetime of the underlying data is valid, it is safe. If you don't know
> >>> that, it isn't.
> >>
> >> Sorry, let me amend that: from a temporary *std::string*...
> >>
> >> Please give an example of how that can be safe.
> >>
> >> If you can't, I would suggest that implies that said assignment operator
> >> should be explicitly deleted, and hang the asymmetry...
> >
> > Except that as mentioned a couple of times, string_view's api doesn't
> > mention string, by design.
>
> Perhaps I am missing something obvious, or something that has already
> been explained. I am not an expert in these matters - consider this the
> opinion of a relative novice at C++.
>
> But I don't really understand the importance of this.
>
>
> The importance of `string_view`'s independence from `std::string` is
> essentially why `string_view` /exists/.
>
> The C++ community is filled with numerous string classes. Everybody has
> their own string types. While many C++ programmers do use `std::string`,
> many more never touch it.
>
> `string_view` is intended to be a lingua franca among all contiguous
> string APIs. If you write a function that only reads from a string, you
> really don't care whether those bytes come from `std::string`, some
> fixed-length string, or any number of other alternatives. All you care
> about is that you're taking a non-modifiable contiguous array of bytes
> of some size. So you use `string_view` as the type in your parameter
> list, and everything is fine.
>

That makes a lot of sense.

But what this also means is that string_view cannot, in general, be
returned by functions as there is no common way of being sure that the
data it references stays valid.

Thus you can write common functions like :

void display_message(std::string_view s);

But you cannot write :

std::string_view find_first_word(std::string_view s);

I suppose such a function is simply impossible to write - it has to
depend on the way you are storing strings.

Once we have concepts, it might be possible to write template functions
using a "String" concept function such as:

String find_first_word(String s);

But that would be for the future!
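
For what it is worth, such a function can be written against string_view alone. Below is a minimal sketch, assuming words are separated by single spaces. The returned view points into the caller's buffer, so it is only valid while that buffer is.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Returns a view of the first space-delimited word in `s`.
// The result refers to the same underlying characters as `s`; it owns nothing.
constexpr std::string_view find_first_word(std::string_view s) {
    const auto start = s.find_first_not_of(' ');
    if (start == std::string_view::npos) {
        return {};  // empty or all spaces
    }
    const auto end = s.find(' ', start);
    // substr clamps the count, so npos means "up to the end of the view".
    return s.substr(start, end == std::string_view::npos ? end : end - start);
}
```

Calling it on a named `std::string` or on a string literal is fine. Calling it on a temporary `std::string` and keeping the result is the lifetime problem this thread is about.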


> Users cannot modify `string_view`. They cannot give it constructors that
> take their string type. As such, they must rely on conversion operators
> on their string types that convert to `string_view`. So, since every
> other user's string type will have conversions to `string_view`, then it
> makes sense for `string` to also have one.
>
> The committee is essentially eating its own dog food.
>
> To the user, a "string_view" is a view into a string. Sure, you might
> want to make string_view objects from other things, such as C string
> literals. But why should the /user/ care that string_view is
> disconnected from string?
>
> A good API does two things:
>
> 1. It makes it easy to write correct code.
> 2. It makes it hard to write incorrect code.
>
> It seems to me that string_view is failing on point 2 here - and failing
> /badly/. You can keep telling us about reference semantics until you
> are blue in the face, but the simple matter is that Richard's examples
> look like perfectly good code, will compile without complaint, and some
> might well work for some tests - yet they are broken. That is /not/ good.
>
>
> Yes. But this is a /general problem/ in C++. It is not specific to
> `string_view`. It will be just as much a problem for `span`, for range
> adaptors, and so forth.
>
> If you have a general solution to this general language problem, then by
> all means, let's get it into the language. But until we have a way to
> fix this language hole, we should not gimp types by making them less useful.
>
> Yes, this means that people can do the wrong thing if they have the
> wrong expectations about what a value means. But we should not use that
> as a way to inhibit the utility of the type.
>
> The other thing you don't understand is that one of Richard's examples
> is not a problem of implicit conversion. It's a problem with using
> `string_view` at all. If you believe that you can replace a value return
> with a reference return, that's /on you/. That's your fault, and you
> have only yourself to blame. If you're saying that such a change looks
> to you "like perfectly good code", then the only way to avoid this is to
> /not have `string_view` at all/.
>
> We should not ditch a perfectly good type just because it can be misused.


I fully appreciate the principle here. I just hoped there were some way
to catch errors at compile time, thus stopping unsafe use.

Maybe calling the type "string_ref" rather than "string_view" would make
it more obvious to users that it has reference semantics?


Thiago Macieira

Sep 7, 2017, 7:33:10 AM
to std-dis...@isocpp.org
On Thursday, 7 September 2017 08:11:39 -03 David Brown wrote:
> Maybe calling the type "string_ref" rather than "string_view" would make
> it more obvious to users that it has reference semantics?

Or it could lead people into believing it takes a reference count into the
data.

David Brown

Sep 7, 2017, 7:42:48 AM
to std-dis...@isocpp.org
On 07/09/17 13:32, Thiago Macieira wrote:
> On Thursday, 7 September 2017 08:11:39 -03 David Brown wrote:
>> Maybe calling the type "string_ref" rather than "string_view" would make
>> it more obvious to users that it has reference semantics?
>
> Or it could lead people into believing it takes a reference count into the
> data.
>

Yes. Getting good names that make usage clear to everyone is never easy!


Nicol Bolas

Sep 7, 2017, 9:54:41 AM
to ISO C++ Standard - Discussion, jmck...@gmail.com, da...@westcontrol.com
Nonsense. That is fundamentally no different from:

iterator find(Range &&rng, const T &value);

And yet, the Range TS will give us exactly such a function. You could call it with `find(vector{1, 2, 3}, 2)` and get an iterator to a destroyed value. You could call `find(get_something(), 2)` and get the same problem if it returns a prvalue.

Why is `string_view` so special in this regard?

> Once we have concepts, it might be possible to write template functions
> using a "String" concept function such as:
>
>         String find_first_word(String s);
>
> But that would be for the future!
 
This would be a template.

inkwizyt...@gmail.com

Sep 7, 2017, 6:31:53 PM
to ISO C++ Standard - Discussion, jmck...@gmail.com, da...@westcontrol.com
Probably because `string_view` works with more temporary data. I think you will more often see a function returning a `string` by value than a `vector<char>`.

> > Once we have concepts, it might be possible to write template functions
> > using a "String" concept function such as:
> >
> >         String find_first_word(String s);
> >
> > But that would be for the future!
>
> This would be a template.

But it could be a one-line inline template that just calls:
string_view find_first_word_impl(string_view x);
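
A sketch of that split, under the assumption that the caller's string type is implicitly convertible to `std::string_view` and constructible from one (true of `std::string`). The body of `find_first_word_impl` is illustrative only.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Non-template implementation: does the real work on a view.
inline std::string_view find_first_word_impl(std::string_view x) {
    const auto start = x.find_first_not_of(' ');
    if (start == std::string_view::npos) {
        return {};
    }
    const auto end = x.find(' ', start);
    return x.substr(start, end == std::string_view::npos ? end : end - start);
}

// One-line template wrapper: converts the view back into the caller's own
// string type, so the result owns its characters.
template <class String>
String find_first_word(const String& s) {
    return String(find_first_word_impl(s));
}
```

Because the wrapper returns an owning string, it stays safe even when called on a temporary, at the cost of one copy of the word.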


T. C.

Sep 7, 2017, 6:45:13 PM
to ISO C++ Standard - Discussion, jmck...@gmail.com, da...@westcontrol.com


On Thursday, September 7, 2017 at 9:54:41 AM UTC-4, Nicol Bolas wrote:
> Nonsense. That is fundamentally no different from:
>
> iterator find(Range &&rng, const T &value);
>
> And yet, the Range TS will give us exactly such a function. You could call it with `find(vector{1, 2, 3}, 2)` and get an iterator to a destroyed value. You could call `find(get_something(), 2)` and get the same problem if it returns a prvalue.

Except that the Range TS's find wraps the returned iterator in a dangling<> wrapper if the range is an rvalue.
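
The idea can be sketched in plain C++17. This is a simplified illustration, not the Range TS interface: `dangling` here is an empty placeholder type and `checked_find` is a made-up name. An rvalue range yields the placeholder instead of an iterator, so the result cannot be dereferenced by accident.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical placeholder returned when the range was a temporary.
struct dangling {};

// For an lvalue range, Range deduces to an lvalue reference and a real
// iterator is returned. For an rvalue range, any iterator would outlive
// the container, so the placeholder is returned instead.
template <class Range, class T>
auto checked_find([[maybe_unused]] Range&& rng, [[maybe_unused]] const T& value) {
    if constexpr (std::is_lvalue_reference_v<Range>) {
        return std::find(std::begin(rng), std::end(rng), value);
    } else {
        return dangling{};
    }
}
```

With this, `*checked_find(std::vector<int>{1, 2, 3}, 2)` fails to compile, because `dangling` has no `operator*`.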


Nicol Bolas

Sep 7, 2017, 11:52:46 PM
to ISO C++ Standard - Discussion, jmck...@gmail.com, da...@westcontrol.com, inkwizyt...@gmail.com
Ranges-v3 has plenty of temporary ranges. Consider `range::ints(5, 10)`; that is a range of integers from 5 to 10. Now, consider the difference between:

for(auto x: ints(5, 10))

and

for(auto x: ints(5, 10) | some_range_adapter)

One of these works, one does not.

This is a general C++ problem; it is in no way specific to `string_view`.

Richard Hodges

Sep 8, 2017, 12:59:57 AM
to std-dis...@isocpp.org, Nicol Bolas, da...@westcontrol.com
> Nonsense. That is fundamentally no different from:
> iterator find(Range &&rng, const T &value);

I would argue that it is very different.

Iterators look and feel like pointers. Their interface is pointery. There is a strong clue that one must take care. 

Even reference_wrapper's interface is pointery, which is a good thing.

In addition, neither iterators nor reference wrappers provide interfaces that masquerade as the real thing. All access to the referee is via an obvious de-referencing call.

string_view's interface looks and feels like a value_type. In fact it is a deliberate facsimile of std::string's interface, which is a well-known value type with well-behaved lifetime semantics.

So now we have two objects that look and feel exactly alike, one of which is completely safe, the other is in essence a raw pointer with no safety catch.

You are offered two hand grenades, the same weight. One is armed, the other not. You are invited to pull the pin on one. Wouldn't you, the hand-grenade user, wish to see a red mark on the live one? That's how I feel now about auto s = something_stringy();

The fact that string_view is implicitly constructible from string makes it all the more likely that someone will inadvertently create an invalid string_view and then call a method on it. There's no safety catch. No explicit conversion or operator* to warn the unwary coder.

It's completely contrary to the good practice of making it hard to do bad things.
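
To make the distinction concrete, here is a small sketch contrasting today's implicit conversion with a hypothetical view type whose constructor is explicit. `explicit_view` and `length_of` are invented for illustration and are not standard facilities.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical view type that can only be created from a std::string by
// spelling out the conversion at the call site.
class explicit_view {
public:
    explicit explicit_view(const std::string& s) noexcept : sv_(s) {}
    std::string_view get() const noexcept { return sv_; }

private:
    std::string_view sv_;
};

// std::string converts to std::string_view silently...
static_assert(std::is_convertible_v<std::string, std::string_view>);
// ...whereas the sketch type rejects implicit conversion but still allows
// an explicit construction.
static_assert(!std::is_convertible_v<std::string, explicit_view>);
static_assert(std::is_constructible_v<explicit_view, const std::string&>);

inline std::size_t length_of(explicit_view v) { return v.get().size(); }
```

A call then reads `length_of(explicit_view{s})`. Note that `explicit` does not prevent building a view of a temporary; it only makes the conversion visible in the source.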

I don't deny the usefulness of the string_view proxy object for marshalling between ill-behaved c++ apis (is this really anything more than a minor inconvenience?). I am affronted by its cavalier disregard for good design.

I am certain that in 2020 the community will look back and think, "man, that string_view was totally the auto_ptr of 2017 - we should deprecate that and bring out something safer".

In summary, if my code says:

auto x = y();

I want to be sure that either:

* x is a value, or

* If it isn't a value, I want to be sure that I must say *x or x->something() in order to get undefined behaviour.

 
Finally, on ranges.

Ranges are a convenient shorthand, that look kind-of like containers.

I see the parallel in your position, but there is a useful difference.

A range has the interface begin() and end(), which yield iterators (i.e. pointery things). There is still a clue.

I take the point about for(auto&& x  : make_range() | filter())

This is a problem. The design of a range and/or pipe should not allow this to compile. If it can't be made safe, it should not be in the standard library.







T. C.

Sep 8, 2017, 1:56:37 AM
to ISO C++ Standard - Discussion, jmck...@gmail.com, da...@westcontrol.com, inkwizyt...@gmail.com
> Ranges-v3 has plenty of temporary ranges. Consider `range::ints(5, 10)`; that is a range of integers from 5 to 10. Now, consider the difference between:
>
> for(auto x: ints(5, 10))
>
> and
>
> for(auto x: ints(5, 10) | some_range_adapter)
>
> One of these works, one does not.

This is not how this works either. ints(5, 10) is a view, and | is clever enough to move rvalue views into its result and reject rvalue containers.

David Brown

Sep 8, 2017, 4:51:15 AM
to Nicol Bolas, ISO C++ Standard - Discussion, inkwizyt...@gmail.com
On 08/09/17 05:52, Nicol Bolas wrote:

>
>
> Ranges-v3 has plenty of temporary ranges. Consider `range::ints(5, 10)`;
> that is a range of integers from 5 to 10. Now, consider the difference
> between:
>
> for(auto x: ints(5, 10))
>
> and
>
> for(auto x: ints(5, 10) | some_range_adapter)
>
> One of these works, one does not.
>
> This is a general C++ problem; it is in no way specific to `string_view`.


Then is there a way to solve it? Since this is apparently a more
general C++ problem, it is going to cause more issues. I fear it is
going to be too easy to write incorrect code here - as C++ gets more and
more "Pythonic", people will expect this kind of thing to work. And
they will expect that things that don't work, will fail to compile -
which is one of the reasons for using C++ and not a language like Python.

I can think of three ways out (without having actual solutions for any
of them) :

1. Make it all "just work". That would mean extending the lifetime of
temporaries when using classes with reference semantics, just as is done
with normal references to temporaries. It would also mean fixing the
limitation that currently exists with binding a reference to a temporary
returned by a function. So for example:

string_view sv = function_returning_string();

would be treated somewhat like:

auto temp = function_returning_string();
string_view sv = temp;


2. Make it fail to compile when it would not work. I don't know if that
can be done at the moment while also letting it work easily in useful
and valid cases.

3. Have implementations flag bad cases as warnings or errors at compile
time (/not/ just with run-time sanitizers).


What we don't need is a feature that looks like it is simple and easy to
use, but has subtle problems. It is one thing to say "don't use
references like this" when you are talking about clearly marked
references with & - it is another when you are talking about a class
that hides the details (like string_view, and presumably also some of
the ranges classes). It is great to have features that give more
flexibility for expert C++ programmers to write neat and efficient code.
But we should never forget that most C++ programmers are /not/ experts,
and it is not helpful to have features that are easily used incorrectly.

Yes, I know I am asking for the best of all worlds here - I am asking
for "magic", and giving no ideas on how to implement it. And I also
know that it is too late to make changes to C++17. But it is never too
late to think about potential problems. And it is certainly never too
late to find implementation workarounds or development aids. Perhaps
the answer is to have implementations of string_view (and other
reference classes) marked with [[reference_semantics]] attributes and
combine that with compiler warnings. It's okay to add a little extra
workload on the class library developers if it helps the end users.

Richard Hodges

Sep 8, 2017, 9:27:55 AM
to std-dis...@isocpp.org
> What you want is to force users to think about the fact that they're converting the string to a view.

YES!!! - 
Because it's a dangerous implicit conversion.


On 4 September 2017 at 19:41, Nicol Bolas <jmck...@gmail.com> wrote:
On Monday, September 4, 2017 at 1:30:11 PM UTC-4, Matthew Woehlke wrote:
On 2017-09-04 11:10, Thiago Macieira wrote:
> I'm not alarmed.
>
> All *_view APIs (and QStringView, for that matter) need to remember never to
> return or store a view based on a parameter that was a view. If your API takes
> a string_view and you need to store for later, you store a string, not
> string_view.

I would expand that further: *any* API needs to be very careful about
returning a pointer or reference to a temporary object. In particular,
any API that returns a pointer or reference that is based in some manner
on its input parameters (which, for class members, includes `this`)
needs to be very, very careful of those input parameters possibly being
temporary objects.

TBH, I'm sort-of ambivalent about allowing construction of a string_view
from a temporary string. On the one hand, we want to allow things like:

  foo( string_view );
  foo( "bar"s );

...which is perfectly safe (assuming that `foo` does not try to "hold on
to" the string data in any way), because the temporary string will not
go out of scope until the call to `foo` completes.

On the other hand, we would clearly like to forbid, or at least make it
easy for compilers to warn about, this:

  string_view sv = "bar"s;

I'm not sure how to achieve both those objectives.

The information which the compiler lacks to be able to do this is the knowledge that `string_view` refers to memory managed by `string`, and therefore if the `string` is destroyed before that `string_view`, then there's a problem.

Basically, it's a significant portion of the "pass temporary through a reference" problem.

We might be able to get at some of the low-hanging fruit by using an attribute. If a conversion operator/constructor is labeled [[reference]], then what it converts to is a reference to memory managed by the object. So if the compiler can tell for certain that the reference will outlive the object that created it, then there can be a compiler warning.
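
Clang later gained a vendor attribute along these lines, `[[clang::lifetimebound]]`. The sketch below assumes a Clang version that supports it and hides it behind a macro so other compilers simply ignore it. `tail_of` and `SKETCH_LIFETIMEBOUND` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Expand to the Clang attribute where available, to nothing elsewhere.
#if defined(__clang__)
#define SKETCH_LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define SKETCH_LIFETIMEBOUND
#endif

// The annotation states that the returned view refers to memory owned by
// `s`, so the result must not outlive the argument.
inline std::string_view tail_of(const std::string& s SKETCH_LIFETIMEBOUND,
                                std::size_t pos) {
    return std::string_view(s).substr(pos);
}
```

With the annotation in place, a compiler that understands it has the information needed to diagnose a call such as `std::string_view sv = tail_of(std::string("foobar"), 3);`, where the argument is a temporary. Whether a warning is actually issued depends on the compiler and its flags.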

Nicol Bolas

Sep 8, 2017, 10:50:13 AM
to ISO C++ Standard - Discussion, jmck...@gmail.com, inkwizyt...@gmail.com, da...@westcontrol.com
On Friday, September 8, 2017 at 4:51:15 AM UTC-4, David Brown wrote:
> On 08/09/17 05:52, Nicol Bolas wrote:
> >
> > Ranges-v3 has plenty of temporary ranges. Consider `range::ints(5, 10)`;
> > that is a range of integers from 5 to 10. Now, consider the difference
> > between:
> >
> > for(auto x: ints(5, 10))
> >
> > and
> >
> > for(auto x: ints(5, 10) | some_range_adapter)
> >
> > One of these works, one does not.
> >
> > This is a general C++ problem; it is in no way specific to `string_view`.
>
> Then is there a way to solve it?

Not without some form of language feature.

> Since this is apparently a more
> general C++ problem, it is going to cause more issues.  I fear it is
> going to be too easy to write incorrect code here - as C++ gets more and
> more "Pythonic", people will expect this kind of thing to work.

The correct term is "functional".

Also, I agree with you. The problem is that the C++ standard committee actually rejected a proposal that could have fixed this, on the (apparent) grounds that:

> the problem this aims to solve (a category of subtle object lifetime issues) is not big enough to warrant requiring such annotations.

I honestly have no idea what the committee was thinking there. Granted, that proposal is perhaps a bit more complex than is strictly needed (and using the `export` keyword is right out as an idea). But it would work to fix this problem.

> What we don't need is a feature that looks like it is simple and easy to
> use, but has subtle problems.

We already have plenty of those; what's one more?

Nicol Bolas

Sep 8, 2017, 11:18:43 AM
to ISO C++ Standard - Discussion, jmck...@gmail.com, da...@westcontrol.com
On Friday, September 8, 2017 at 12:59:57 AM UTC-4, Richard Hodges wrote:
> Nonsense. That is fundamentally no different from:
> iterator find(Range &&rng, const T &value);

I would argue that it is very different.

Iterators look and feel like pointers. Their interface is pointery. There is a strong clue that one must take care. 

Even reference_wrapper's interface is pointery, which is a good thing.

In addition, neither iterators nor reference wrappers provide interfaces that masquerade as the real thing. All access to the referee is via an obvious de-referencing call.

string_view's interface looks and feels like a value_type. In fact it is a deliberate facsimile of std::string's interface, which is a well-known value type with well-behaved lifetime semantics.

You could say the exact same thing about a generic iterator range type. A random-access iterator range type would naturally mimic significant parts of the interface of `deque`, `vector`, and `string`. It would have `operator[]`, `begin`, `end`, `size`, and so forth. A contiguous iterator range would even have a `data` member.

And yet, it is just as much of a reference type as `string_view`. The same is true of `span`.

C++ programmers are just going to have to get used to these kinds of types. Assuming they haven't already.

So now we have two objects that look and feel exactly alike, one of which is completely safe, the other is in essence a raw pointer with no safety catch.

You are offered two hand grenades, the same weight. One is armed, the other not. You are invited to pull the pin on one. Wouldn't you, the hand-grenade user, wish to see a red mark on the live one? That's how I feel now about auto s = something_stringy();

I don't understand your problem with `auto s = something_stringy()`. If that returns a `string_view`, then the API has an implicit understanding of who owns that view. If that returns a `string`, then we know who owns it.

If there is confusion as to what it returns, then it's a bad API. Further, I don't see how the implicit conversion to `string_view` in any way matters with that case, which is what this thread is supposed to be about.

The fact that string_view is implicitly constructible from string makes it all the more likely that someone will inadvertently create an invalid string_view and then call a method on it. There's no safety catch. No explicit conversion or operator* to warn the unwary coder.

It's completely contrary to the good practice of making it hard to do bad things.

I don't deny the usefulness of the string_view proxy object for marshalling between ill-behaved c++ apis (is this really anything more than a minor inconvenience?).

APIs are not "ill-behaved" just because they don't conform to the string type you believe they ought to use. We need `string_view` to address reality, not "ill-behaved c++ apis".

I am affronted by its cavalier disregard for good design.

I am certain that in 2020 the community will look back and think, "man, that string_view was totally the auto_ptr of 2017 - we should deprecate that and bring out something safer".

You act like `string_view` appeared ex-nihilo into the standard this year. It is based on extensive existing practice across several C++ projects (one of them being a C++ compiler), and has been part of the Library Fundamentals TS version 1 for quite some time.

If there was going to be any serious pushback on being implicitly convertible, it would already have happened by now.

in summary, if my code says:

auto x = y();

I want to be sure that either:

* x is a value, or

* If it isn't a value, I want to be sure that I must say *x or x->something() in order to get undefined behaviour.

Then your expectation is the problem here. There are many C++ types where this expectation is not met. Lazily evaluated expressions being a big case, but there are plenty of others.

If you believe `auto x = ...` should result in a value type or a pointer type, then your beliefs need to be adjusted, not the standard.

Also, if we ever get one of the "operator-dot" proposals through, then your expectation will be even less valid.

Finally, on ranges.

Ranges are a convenient shorthand, that look kind-of like containers.

I see the parallel in your position, but there is a useful difference.

a range has the interface begin() and end() which yields iterators (ie. pointery things). There is still a clue.

Except that one of the main points of ranges is that you don't end up doing a lot of "pointer" things with them. You tend to pass them to algorithms and adaptors as though they were containers. That kind of thing encourages rvalue-treatment.

Ricardo Fabiano de Andrade

Sep 8, 2017, 1:42:49 PM
to std-dis...@isocpp.org
On Fri, Sep 8, 2017 at 10:18 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, September 8, 2017 at 12:59:57 AM UTC-4, Richard Hodges wrote:
> Nonsense. That is fundamentally no different from:
> iterator find(Range &&rng, const T &value);

I would argue that it is very different.

Iterators look and feel like pointers. Their interface is pointery. There is a strong clue that one must take care. 

Even reference_wrapper's interface is pointery, which is a good thing.

In addition, neither iterators nor reference wrappers provide interfaces that masquerade as the real thing. All access to the referee is via an obvious de-referencing call.

string_view's interface looks and feels like a value_type. In fact it is a deliberate facsimile of std::string's interface, which is a well-known value type with well-behaved lifetime semantics.

You could say the exact same thing about a generic iterator range type. A random-access iterator range type would naturally mimic significant parts of the interface of `deque`, `vector`, and `string`. It would have `operator[]`, `begin`, `end`, `size`, and so forth. A contiguous iterator range would even have a `data` member.

And yet, it is just as much of a reference type as `string_view`. The same is true of `span`.

C++ programmers are just going to have to get used to these kinds of types. Assuming they haven't already.

Are you implying that the following will be / is supported?
random_access_range<int> a = vector<int>{ 1, 2, 3 };
 
Because that is what string_view does.


So now we have two objects that look and feel exactly alike, one of which is completely safe, the other is in essence a raw pointer with no safety catch.

You are offered two hand grenades, the same weight. One is armed, the other not. You are invited to pull the pin on one. Wouldn't you, the hand-grenade user, wish to see a red mark on the live one? That's how I feel now about auto s = something_stringy();

I don't understand your problem with `auto s = something_stringy()`. If that returns a `string_view`, then the API has an implicit understanding of who owns that view. If that returns a `string`, then we know who owns it.

If there is confusion as to what it returns, then it's a bad API. Further, I don't see how the implicit conversion to `string_view` in any way matters with that case, which is what this thread is supposed to be about.


I think the problem here is:
auto s = string_view{};
// ... few lines ...
s = something_stringy(); // it may return std::string
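
A sketch of how a view-like type could reject exactly that assignment while still allowing construction from a temporary at call sites. `guarded_view` is a made-up wrapper for illustration, not a proposed or existing standard type.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical wrapper around std::string_view.
class guarded_view {
public:
    guarded_view() = default;
    guarded_view(std::string_view sv) noexcept : sv_(sv) {}
    // Construction from any std::string, temporaries included, stays
    // allowed so a function taking guarded_view can be called with "bar"s.
    guarded_view(const std::string& s) noexcept : sv_(s) {}

    // Assignment from a named string is fine...
    guarded_view& operator=(const std::string& s) noexcept {
        sv_ = s;
        return *this;
    }
    // ...but assignment from a temporary string would always dangle.
    guarded_view& operator=(std::string&&) = delete;

    std::string_view get() const noexcept { return sv_; }

private:
    std::string_view sv_;
};

static_assert(std::is_constructible_v<guarded_view, std::string>);
static_assert(std::is_assignable_v<guarded_view&, std::string&>);
static_assert(!std::is_assignable_v<guarded_view&, std::string>);
```

With this type, `s = something_stringy();` stops compiling when `something_stringy()` returns a `std::string` by value. Copy-initialisation from a temporary, as in `guarded_view v = "bar"s;`, is still accepted by this sketch.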
 
The fact that string_view is implicitly constructible from string makes it all the more likely that someone will inadvertently create an invalid string_view and then call a method on it. There's no safety catch. No explicit conversion or operator* to warn the unwary coder.

It's completely contrary to the good practice of making it hard to do bad things.

I don't deny the usefulness of the string_view proxy object for marshalling between ill-behaved c++ apis (is this really anything more than a minor inconvenience?).

APIs are not "ill-behaved" just because they don't conform to the string type you believe they ought to use. We need `string_view` to address reality, not "ill-behaved c++ apis".

Agreed. But I think it could be made less error-prone were it not for the fact that it accepts assignment from temporaries (construction is fine).
 

I am affronted by its cavalier disregard for good design.

I am certain that in 2020 the community will look back and think, "man, that string_view was totally the auto_ptr of 2017 - we should deprecate that and bring out something safer".

You act like `string_view` appeared ex-nihilo into the standard this year. It is based on extensive existing practice across several C++ projects (one of them being a C++ compiler), and has been part of the Library Fundamentals TS version 1 for quite some time.

If there was going to be any serious pushback on being implicitly convertible, it would already have happened by now.

An oversight might have happened in this case.
But yes, there has been plenty of time for reviewing string_view.
Unfortunately, the greater C++ community in general doesn't try features until they are in the standard.
So some more concrete feedback about string_view being error-prone may come only in "2020".
 

in summary, if my code says:

auto x = y();

I want to be sure that either:

* x is a value, or

* If it isn't a value, I want to be sure that I must say *x or x->something() in order to get undefined behaviour.

Then your expectation is the problem here. There are many C++ types where this expectation is not met. Lazily evaluated expressions being a big case, but there are plenty of others.

If you believe `auto x = ...` should result in a value type or a pointer type, then your beliefs need to be adjusted, not the standard.

Also, if we ever get one of the "operator-dot" proposals through, then your expectation will be even less valid. 

Finally, on ranges.

Ranges are a convenient shorthand, that look kind-of like containers.

I see the parallel in your position, but there is a useful difference.

a range has the interface begin() and end() which yields iterators (ie. pointery things). There is still a clue.

Except that one of the main points of ranges is that you don't end up doing a lot of "pointer" things with them. You tend to pass them to algorithms and adaptors as though they were containers. That kind of thing encourages rvalue-treatment.
