#include <string>
#include <string_view>
std::string_view sub_string(std::string_view s,
std::size_t p,
std::size_t n = std::string_view::npos)
{
return s.substr(p, n);
}
std::string sub_string(std::string&& s,
std::size_t p,
std::size_t n = std::string_view::npos)
{
return s.substr(p, n);
}
std::string sub_string(std::string const& s,
std::size_t p,
std::size_t n = std::string_view::npos)
{
return s.substr(p, n);
}
int main()
{
using namespace std::literals;
auto source = "foobar"s;
auto bar = sub_string(source, 3);
// but uh-oh...
bar = sub_string("foobar"s, 3);
// now use bar at your peril...
}
gcc and clang don't produce any warnings here. I don't believe code reviews will find bugs like this reliably, and often neither will unit testing.int main()
{
using namespace std::literals;
auto source = "foobar"s;
auto bar = sub_string(std::string_view(source), 3);
// but uh-oh...
bar = sub_string("foobar"s, 3);
}On 4 Sep 2017, at 14:02, Michael Hava <m...@live.at> wrote:
I see your concern, but isn't this a classic example of "don't store references to temporaries"?Maybe I'm missing something, but decltype(bar) should yield std::string and therefore be safe...
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussio...@isocpp.org.
To post to this group, send email to std-dis...@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
Ok, now your example makes sense to me (was checking StackOverflow right now too).
I agree with you! This is a horrible issue – especially as most people are (apparently) starting to recommend using string_view for APIs.
This has a distinct auto_ptr-smell to it…
--
---
You received this message because you are subscribed to a topic in the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this topic, visit
https://groups.google.com/a/isocpp.org/d/topic/std-discussion/Gj5gt5E-po8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
std-discussio...@isocpp.org.
My mistake in the example was pointed out on stack overflow where I have corrected it (and therefore reintroduced the bug despite the well intentioned overloads).Repeated here:int main() { using namespace std::literals; auto source = "foobar"s; auto bar = sub_string(std::string_view(source), 3); // but uh-oh... bar = sub_string("foobar"s, 3); }The problem for me is that string_view is a copyable object by interface, but a reference by behaviour. Since there is an automatic conversion from temporary strings to string_views, these ‘references that look like objects’ will be created by mistake by users who are not necessarily sure of which overload will be selected in any given scenario, particularly when templates are involved.
This smells less like an auto_ptr issue, or even a dangling-reference issue, as it does a 'using an invalidated iterator' issue, or in this case, 'silently returning a reference to a const reference you got as a parameter'. I guess the latter *is* a dangling-reference issue.
but then this could be:auto [[force_store_allowed]] sv = somethingThatReturnsStringView();
It’s not that people cannot be taught how to use what is essentially a pointer to the guts of an object. People on the whole do understand how to use pointers, but a lot of effort has been spent encouraging them not to. For obvious reasons.My previous email (I hope) provides an insight into the problems I foresee. Developers interface with other people’s code. If that code exports a string_view, or an interface changes to provide a string_view alternative (because someone was hoping to provide a performance boost to their users), then user code will still compile, silently converting string references to string_views where previously it would have made copies, and code will quietly, insidiously, behind the developer’s backs, become UB.I understand that the string_view is a much-loved concept and emotions may run high over the issue - we all value performance.But implicit conversions have already been established to be a bad thing - particularly when the converted-to thing is a guts-exposer of the donor object.This is not the intent of conversion.std::string_view(const std::string&) should be explicit.std::string_view(std::string&&) should be deleted.because a string_view *is not* a "kind of string”.A string_view is a pointer to the private implementation of a string. It’s as dangerous as std::string::c_str().More so, because it’s implicit.
auto [[force_store_allowed]] sv = somethingThatReturnsStringView();
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+unsubscribe@isocpp.org.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+unsubscribe@isocpp.org.
> My post was *specifically* talking about your [[allow_force_store]] proposal.Forgive me, this was not a proposal, merely the kind of thing that would be necessary to make implicit conversions safe. My proposal is that implicit conversions be disallowed.
--
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+unsubscribe@isocpp.org.
Since we don't have such tools currently,
From this question on SO: https://stackoverflow.com/q/46032307/2015579here's a well-intentioned offering to produce a substring from either a string-view or string in a uniform way:#include <string> #include <string_view> std::string_view sub_string(std::string_view s, std::size_t p, std::size_t n = std::string_view::npos) { return s.substr(p, n); }
std::string sub_string(std::string&& s, std::size_t p, std::size_t n = std::string_view::npos) { return s.substr(p, n); } std::string sub_string(std::string const& s, std::size_t p, std::size_t n = std::string_view::npos) { return s.substr(p, n); }And here's how it will introduce random segfaults into user code.gcc and clang don't produce any warnings here. I don't believe code reviews will find bugs like this reliably, and often neither will unit testing.int main() { using namespace std::literals; auto source = "foobar"s; auto bar = sub_string(source, 3); // but uh-oh... bar = sub_string("foobar"s, 3);// now use bar at your peril...}Allowing implicit conversions from std::string to std::string_view is all very nice, and I understand the intention - to allow algorithms to become more efficient by a simple, compatible interface change.C++11 went a long way to removing c++'s reputation as a difficult language to get right.This one design error in c++17 will re-award c++ the accolade of "most buggy and segfaultly language on the planet".On grounds of safety alone, it is a design error and should be removed.
--
OK, I am sure the community is getting bored of this discussion. It's clear to me that people's minds are made up.So I'll accept the advice of my peers: "never return a string_view".
--
> string_view is as safe as a char*I agree 100%The reason we have std::string is because char* is about as unsafe as it gets. Basically we're back to C. A regression of 30 years.
> What I got was "use string_view as a parameter type"That goes some way to helping but the assignability from string still exhibits the same problem when its used as an argument:int foo(std::string_view s) // refactored from std::string{if(some_condition()) {s = good_intention();}return something_else(s); // boom!}
On 2017-09-05 05:09, Richard Hodges wrote:
> s = good_intention(); // cool! performance and elegance!
Now, hang on just a minute... can someone remind me what is the valid
use case for *assigning* to a string_view from a temporary? When will
that not end in tears?
> Well, ok, technically, but there's a good reason for the alternative constructor invocation has this form.
> On 09/05/2017 12:56 PM, Richard Hodges wrote:> I disagree.
> auto x = X(); is not "returning an object", it is the "alternative constructor notation" which is recommended in order to avoid the most vexing parse problem.
Well, ok, technically, but there's a good reason for the alternative constructor invocation has this form.
On 09/05/2017 12:56 PM, Richard Hodges wrote:
I disagree.
auto x = X(); is not "returning an object", it is the "alternative constructor notation" which is recommended in order to avoid the most vexing parse problem.
On 5 September 2017 at 11:33, Eyal Rozenberg <eya...@technion.ac.il <mailto:eya...@technion.ac.il>> wrote:
(off the list)
Umm, you're not accepting the advice of your peers - you are
returning a string view in the "all well and good" line.
However, assignment from a string to a string view may be problematic.
Eyal
<mailto:p_ha...@wargaming.net> <mailto:p_ha...@wargaming.net
<mailto:p_ha...@wargaming.net>>> wrote:
On Tuesday, September 5, 2017 at 4:07:29 AM UTC+10, Ville
Voutilainen wrote:
On 4 September 2017 at 21:02, Richard Hodges
<hodg...@gmail.com <mailto:hodg...@gmail.com>>
wrote:
> Whoever pushed for this implicit interface ought to
come out here and
> explain themselves. And volunteer to fix the bugs.
What makes you think the implicit conversion hasn't
been explained
when it was proposed?
Can you not find the rationale for it in the
string_view proposals?
Maybe I'm looking in the wrong place, but in the chain of N3442
there never seems to have been a question of whether
string_view
*should* have an implicit constructor from string, only the
observation that the predecessors in Google, LLVM, and
Bloomberg's
codebases have it, and the benefits of it.
The entire final rationale in both the working papers and
in the
standard appears to be [string.view]/2
<http://eel.is/c++draft/string.view#2
<http://eel.is/c++draft/string.view#2>>:
/[ Note: The library provides implicit conversions from
const charT*
and std::basic_string<charT, ...> to
std::basic_string_view<charT,
...> so that user code can accept just
std::basic_string_view<charT>
as a non-templated parameter wherever a sequence of
characters is
expected. User-defined types should define their own implicit
conversions to std::basic_string_view in order to
interoperate with
these functions. — end note ]/
It's possibly slightly surprising that this note wasn't
adjusted by
P0254 <http://wg21.link/P0254> which moved the
std::basic_string
implicit conversion into std::basic_string, which is nicely
consistent with the second sentence of the note. The note's not
incorrect, but an uncareful reading would possibly cause
readers to
expect that the two implicit conversions mentioned would be
provided
in the same place, and within the section to which the note
is attached.
P0254 <http://wg21.link/P0254> as I mentioned previously didn't
address this question either, accepting the /status quo/.
<mailto:std-discussion%2Bunsubscr...@isocpp.org>
<mailto:std-discussion+unsubscr...@isocpp.org
<mailto:std-discussion%2Bunsubscr...@isocpp.org>>.
To post to this group, send email to
std-dis...@isocpp.org <mailto:std-discussion@isocpp.org>
<mailto:std-discussion@isocpp.org
<mailto:std-discussion@isocpp.org>>.
Visit this group at
https://groups.google.com/a/isocpp.org/group/std-discussion/
--
---
You received this message because you are subscribed to the
Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from
it, send an email to std-discussion+unsubscribe@isocpp.org
<mailto:std-discussion%2Bunsubscr...@isocpp.org>
<mailto:std-discussion+unsubscri...@isocpp.org
<mailto:std-discussion%2Bunsubscr...@isocpp.org>>.
To post to this group, send email to std-dis...@isocpp.org
<mailto:std-discussion@isocpp.org>
<mailto:std-discussion@isocpp.org
<mailto:std-discussion@isocpp.org>>.
Visit this group at
https://groups.google.com/a/isocpp.org/group/std-discussion/
We could try adding an assignment operator template that is disabled
if the incoming type is string_view itself.
Is there a forum I can join so that I get early notification of proposals to be discussed for standards inclusion?
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussio...@isocpp.org.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+unsubscribe@isocpp.org.
To post to this group, send email to std-dis...@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+unsubscribe@isocpp.org.
Are you saying that there is no public forum where proposals can be critically scrutinised by the user community before submission, a-la-boost?
On 05/09/17 17:37, Ville Voutilainen wrote:
> On 5 September 2017 at 17:58, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
>>>> Now, hang on just a minute... can someone remind me what is the valid
>>>> use case for *assigning* to a string_view from a temporary? When will
>>>> that not end in tears?
>>>>
>>>
>>> It is like anything else with reference semantics. If you know that the
>>> lifetime of the underlying data is valid, it is safe. If you don't know
>>> that, it isn't.
>>
>> Sorry, let me amend that: from a temporary *std::string*...
>>
>> Please give an example of how that can be safe.
>>
>> If you can't, I would suggest that implies that said assignment operator
>> should be explicitly deleted, and hang the asymmetry...
>
> Except that as mentioned a couple of times, string_view's api doesn't
> mention string, by design.
Perhaps I am missing something obvious, or something that has already
been explained. I am not an expert in these matters - consider this the
opinion of a relative novice at C++.
But I don't really understand the importance of this.
To a the user, a
"string_view" is a view into a string. Sure, you might want to make
string_view objects from other things, such as C string literals. But
why should the /user/ care that string_view is disconnected from string?
A good API does two things:
1. It makes it easy to write correct code.
2. It makes it hard to write incorrect code.
It seems to me that string_view is failing on point 2 here - and failing
/badly/. You can keep telling us about reference semantics until you
are blue in the face, but the simple matter is that Richard's examples
look like perfectly good code, will compile without complain, and some
might well work for some tests - yet they are broken. That is /not/ good.
On Wednesday, September 6, 2017 at 11:03:02 PM UTC+10, Richard Hodges wrote:Are you saying that there is no public forum where proposals can be critically scrutinised by the user community before submission, a-la-boost?I assume you mean "before acceptance"? I wasn't aware of boost having a pre-submission review process before the "Boost Library Submission Process".If I have assumed correctly, then that's not what I'm saying at all.Although it's not formally required (as far as I know, from the outside), most papers that're under consideration for acceptance into the standard should appear in the mailings I linked in my earlier reply. And papers that don't are very unlikely to get accepted on first-showing, unless they're quite trivial. Even trivial-seeming defect changes get "come back with a paper" responses from the working groups.Also worth noting that the committee has expressed a strong preference for standardising existing practice, so particularly with library features, they're often coming *from* a large and well-used open-source codebase. In the present case, that's LLVM, Google Chromium, *and* Bloomberg BDE. They (plus Boost later) all put effectively *this* class in front of the public through their many and varied forums and contributions, half-a-decade before it became part of C++.There's also the mechanisms for "Technical Specifications", which allow C++ features which the committee feels are valuable but not-yet-ready to bake for a bit and ensure the form they have taken for standardisation is correct and generally feasible. In the present case, that was the "Library Fundamentals TS", published in 2015, and containing string_view since November 2014, after 7 revisions over almost three years.
On Sep 6, 2017, at 9:28 AM, Ville Voutilainen <ville.vo...@gmail.com> wrote:
>
>> As suggested by others, the simplest solution would be making the making
>> explicit std::string::operator string_view()
>> Would that be possible in a defect?
>
> Possible.. maybe. Likely, I doubt it. A change like that will already
> break valid code.
I doubt it too.
One of the main selling points of string_view is that it could serve as a “universal” parameter to bind to “all things string”, for example:
https://github.com/HowardHinnant/date/blob/master/tz.h#L443-L453
whereas without it (for various reasons), one needs overloads, for example:
https://github.com/HowardHinnant/date/blob/master/tz.h#L508-L533
This is such a major feature of string_view that the motivation for breaking it would have to be very large.
This suggested change breaks that feature. I would vote strongly against it. Take out this feature, and you might as well take out string_view entirely.
Like reference_wrapper, C++ programmers need to learn that types with the word “view” in the name have reference-like behavior.
Howard
I have seen string_view (or equivalent) in boost for quite some time. Never saw a need to use it as std::string const& was perfectlyadequate and the occasional redundant copy of a short (SSO-optimised) string well below the radar for optimisation.
re there being "lots of strings in the wild", fair enough. However I've always felt that these new strings were totally un-necessary. There'sapproximately one valid customisation of std::string that's required beyond whats's available - a string that handles utf8 encoding properly.And even that can be solved with a custom char trait. You don't even need a new interface.
iterator find(Range &&rng, const T &value);
Once we have concepts, it might be possible to write template functions
using a "String" concept function such as:
String find_first_word(String s);
But that would be for the future!
Once we have concepts, it might be possible to write template functions
using a "String" concept function such as:
String find_first_word(String s);
But that would be for the future!This would be a template.
string_view find_first_word_impl(string_view x);
Nonsense. That is fundamentally no different from:iterator find(Range &&rng, const T &value);And yet, the Range TS will give us exactly such a function. You could call it with `find(vector{1, 2, 3}, 2)` and get an iterator to a destroyed value. You could call `find(get_something(), 2)` and get the same problem if it returns a prvalue.
for(auto x: ints(5, 10))for(auto x: ints(5, 10) | some_range_adapter)
--
Ranges-v3 has plenty of temporary ranges. Consider `range::ints(5, 10)`; that is a range of integers from 5 to 10. Now, consider the difference between:for(auto x: ints(5, 10))andfor(auto x: ints(5, 10) | some_range_adapter)One of these works, one does not.
On Monday, September 4, 2017 at 1:30:11 PM UTC-4, Matthew Woehlke wrote:On 2017-09-04 11:10, Thiago Macieira wrote:
> I'm not alarmed.
>
> All *_view APIs (and QStringView, for that matter) need to remember never to
> return or store a view based on a parameter that was a view. If your API takes
> a string_view and you need to store for later, you store a string, not
> string_view.
I would expand that further: *any* API needs to be very careful about
returning a pointer or reference to a temporary object. In particular,
any API that returns a pointer or reference that is based in some manner
on its input parameters (which, for class members, includes `this`)
needs to be very, very careful of those input parameters possibly being
temporary objects.
TBH, I'm sort-of ambivalent about allowing construction of a string_view
from a temporary string. On the one hand, we want to allow things like:
foo( string_view );
foo( "bar"s );
...which is perfectly safe (assuming that `foo` does not try to "hold on
to" the string data in any way), because the temporary string will not
go out of scope until the call to `foo` completes.
On the other hand, we would clearly like to forbid, or at least make it
easy for compilers to warn about, this:
string_view sv = "bar"s;
I'm not sure how to achieve both those objectives.
The information which the compiler lacks to be able to do this is the knowledge that `string_view` refers to memory managed by `string`, and therefore if the `string` is destroyed before that `string_view`, then there's a problem.
Basically, it's a significant portion of the "pass temporary through a reference" problem.
We might be able to get at some of the low-hanging fruit by using an attribute. If a conversion operator/constructor is labeled [[reference]], then what it converts to is a reference to memory managed by the object. So if the compiler can tell for certain that the reference will outlive the object that created it, then there can be a compiler warning.
On 08/09/17 05:52, Nicol Bolas wrote:
>
>
> Ranges-v3 has plenty of temporary ranges. Consider `range::ints(5, 10)`;
> that is a range of integers from 5 to 10. Now, consider the difference
> between:
>
> |
> for(autox:ints(5,10))
> |
>
> and
>
> |
> for(autox:ints(5,10)|some_range_adapter)
> |
>
> One of these works, one does not.
>
> This is a general C++ problem; it is in no way specific to `string_view`.
Then is there a way to solve it?
Since this is apparently a more
general C++ problem, it is going to cause more issues. I fear it is
going to be too easy to write incorrect code here - as C++ gets more and
more "Pythonic", people will expect this kind of thing to work.
What we don't need is a feature that looks like it is simple and easy to
use, but has subtle problems.
> Nonsense. That is fundamentally no different from:> iterator find(Range &&rng, const T &value);I would argue that it is very different.Iterators look and feel like pointers. Their interface is pointery. There is a strong clue that one must take care.Even reference_wrapper's interface is pointery, which is a good thing.In addition, neither iterators nor reference wrappers provide interfaces that masquerade as the real thing. All access to the referee is via an obvious de-referencing call.string_view's interface looks and feels like a value_type. In fact it is a deliberate facsimile of std::string's interface, which is a well-known value type with well-behaved lifetime semantics.
So now we have two objects that look and feel exactly alike, one of which is completely safe, the other is in essence a raw pointer with no safety catch.You are offered two hand grenades, the same weight. One is armed, the other not. You are invited to pull the pin on one. Wouldn't you, the hand-grenade user, wish to see a red mark on the live one? That's how I feel now about auto s = something_stringy();
The fact that string_view is implicitly constructible from string makes it all the more likely that someone will inadvertently create an invalid string_view and then call a method on it. There's no safety catch. No explicit conversion or operator* to warn the unwary coder.It's completely contrary to the good practice of making it hard to do bad things.I don't deny the usefulness of the string_view proxy object for marshalling between ill-behaved c++ apis (is this really anything more than a minor inconvenience?).
I am affronted by its cavalier disregard for good design.I am certain that in 2020 the community will look back and think, "man, that string_view was totally the auto_ptr of 2017 - we should deprecate that and bring out something safer".
in summary, if my code says:auto x = y();I want to be sure that either:* x is a value, or* If it isn't a value, I want to be sure that I must say *x or x->something() in order to get undefined behaviour.
Finally, on ranges.Ranges are a convenient shorthand, that look kind-of like containers.I see the parallel in your position, but there is a useful difference.a range has the interface begin() and end() which yields iterators (ie. pointery things). There is still a clue.
On Friday, September 8, 2017 at 12:59:57 AM UTC-4, Richard Hodges wrote:> Nonsense. That is fundamentally no different from:> iterator find(Range &&rng, const T &value);I would argue that it is very different.Iterators look and feel like pointers. Their interface is pointery. There is a strong clue that one must take care.Even reference_wrapper's interface is pointery, which is a good thing.In addition, neither iterators nor reference wrappers provide interfaces that masquerade as the real thing. All access to the referee is via an obvious de-referencing call.string_view's interface looks and feels like a value_type. In fact it is a deliberate facsimile of std::string's interface, which is a well-known value type with well-behaved lifetime semantics.You could say the exact same thing about a generic iterator range type. A random-access iterator range type would naturally mimic significant parts of the interface of `deque`, `vector`, and `string`. It would have `operator[]`, `begin`, `end`, `size`, and so forth. A contiguous iterator range would even have a `data` member.And yet, it is just as much of a reference type as `string_view`. The same is true of `span`.C++ programmers are just going to have to get used to these kinds of types. Assuming they haven't already.
So now we have two objects that look and feel exactly alike, one of which is completely safe, the other is in essence a raw pointer with no safety catch.You are offered two hand grenades, the same weight. One is armed, the other not. You are invited to pull the pin on one. Wouldn't you, the hand-grenade user, wish to see a red mark on the live one? That's how I feel now about auto s = something_stringy();I don't understand your problem with `auto s = something_stringy()`. If that returns a `string_view`, then the API has an implicit understanding of who owns that view. If that returns a `string`, then we know who owns it.
If there is confusion as to what it returns, then it's a bad API. Further, I don't see how the implicit conversion to `string_view` in any way matters with that case, which is what this thread is supposed to be about.
The fact that string_view is implicitly constructible from string makes it all the more likely that someone will inadvertently create an invalid string_view and then call a method on it. There's no safety catch. No explicit conversion or operator* to warn the unwary coder.It's completely contrary to the good practice of making it hard to do bad things.I don't deny the usefulness of the string_view proxy object for marshalling between ill-behaved c++ apis (is this really anything more than a minor inconvenience?).APIs are not "ill-behaved" just because they don't conform to the string type you believe they ought to use. We need `string_view` to address reality, not "ill-behaved c++ apis".
I am affronted by its cavalier disregard for good design.I am certain that in 2020 the community will look back and think, "man, that string_view was totally the auto_ptr of 2017 - we should deprecate that and bring out something safer".You act like `string_view` appeared ex-nihilo into the standard this year. It is based on extensive existing practice across several C++ projects (one of them being a C++ compiler), and has been part of the Library Fundamentals TS version 1 for quite some time.If there was going to be any serious pushback on being implicitly convertible, it would already have happened by now.
in summary, if my code says:auto x = y();I want to be sure that either:* x is a value, or* If it isn't a value, I want to be sure that I must say *x or x->something() in order to get undefined behaviour.Then your expectation is the problem here. There are many C++ types where this expectation is not met. Lazily evaluated expressions being a big case, but there are plenty of others.If you believe `auto x = ...` should result in a value type or a pointer type, then your beliefs need to be adjusted, not the standard.Also, if we ever get one of the "operator-dot" proposals through, then your expectation will be even less valid.
Finally, on ranges.Ranges are a convenient shorthand, that look kind-of like containers.I see the parallel in your position, but there is a useful difference.a range has the interface begin() and end() which yields iterators (ie. pointery things). There is still a clue.Except that one of the main point of ranges is that you don't end up doing a lot of "pointer" things with them. You tend to pass them to algorithms and adaptors as though they were containers. That kind of thing encourages rvalue-treatment.
--