Update to string_ref proposal


Jeffrey Yasskin

Dec 26, 2012, 6:44:38 PM
to std-pr...@isocpp.org
Hi folks,

I finally have an update for the string_ref proposal, converting it to
a set of changes against the C++14 draft. The most recent version
lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
and I've attached a snapshot.

The LWG is still deciding whether this should aim for a TS or C++14
(or both?), and feedback from here will help inform that decision.
This paper currently assumes C++14.

Let me know what you think,
Jeffrey Yasskin
string_ref.html

Sommerlad Peter (peter.sommerlad@hsr.ch)

Dec 28, 2012, 5:05:36 AM
to std-pr...@isocpp.org
Hi,

> An alternate way to get compile-time string_refs would be to define a user-defined literal operator à la N3468. Arguably, basic_string_ref is a better candidate for the ""s suffix than basic_string since basic_string isn't a literal type.

As author of N3468, I can understand your wish to have a constexpr literal suffix operator for string_ref. However, I propose that you do not insist that operator"" s() be reserved for creating basic_string_ref.

It is a matter of teachability (string_ref will be "new"), and also of the fact that you may actually want a std::string variable when writing:

auto hi="hello"s;
hi.append(", world!"s);

which is one of my major reasons for contributing operator"" s().

We have two options:

provide

constexpr std::string_ref operator"" s(char const *, size_t);

in a separate namespace std::literals::string_ref, which makes it impossible to mix it with std::string operator"" s(char const *, size_t) in the same scope, or choose a different name for the suffix, e.g., sr:

constexpr std::string_ref operator"" sr(char const *, size_t);

or, if that seems too much to type,

constexpr std::string_ref operator"" r(char const *, size_t);
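A minimal sketch of the two options, under stated assumptions: C++17's std::string_view stands in for the proposed string_ref, the namespaces under demo are hypothetical, and the suffixes carry a leading underscore because suffixes without one are reserved for the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

namespace demo {

// Suffix that yields an owning std::string (the N3468 use case).
namespace string_literals {
inline std::string operator""_s(const char* p, std::size_t n) {
    return std::string(p, n);
}
}  // namespace string_literals

// Option 1: the same suffix in a separate namespace. Importing this together
// with string_literals into one scope makes "x"_s ambiguous.
namespace string_ref_literals {
constexpr std::string_view operator""_s(const char* p, std::size_t n) {
    return std::string_view(p, n);
}
}  // namespace string_ref_literals

// Option 2: a distinct suffix, which can coexist with the std::string one.
namespace distinct_suffix {
constexpr std::string_view operator""_sr(const char* p, std::size_t n) {
    return std::string_view(p, n);
}
}  // namespace distinct_suffix

}  // namespace demo

// The literal is a std::string, so append() is available.
inline std::string greet() {
    using namespace demo::string_literals;
    auto hi = "hello"_s;
    hi.append(", world!"_s);
    return hi;
}

// Both suffixes are visible here without conflict because they differ.
inline std::size_t mixed_length() {
    using namespace demo::string_literals;
    using namespace demo::distinct_suffix;
    constexpr auto view = "hello"_sr;  // compile-time, non-owning
    std::string bang = "!"_s;          // run-time, owning
    return view.size() + bang.size();
}
```

With option 1 the reader has to know which namespace is in scope to tell what type "hello"_s has; with option 2 the suffix alone answers that.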

Otherwise, keep on with the good work.

Regards
Peter.

--
Prof. Peter Sommerlad

Institut für Software: Bessere Software - Einfach, Schneller!
HSR Hochschule für Technik Rapperswil
Oberseestr 10, Postfach 1475, CH-8640 Rapperswil

http://ifs.hsr.ch http://cute-test.com http://linticator.com http://includator.com
tel:+41 55 222 49 84 == mobile:+41 79 432 23 32
fax:+41 55 222 46 29 == mailto:peter.s...@hsr.ch





Olaf van der Spek

Dec 28, 2012, 8:04:54 AM
to std-pr...@isocpp.org
On Thursday, December 27, 2012 12:44:38 AM UTC+1, Jeffrey Yasskin wrote:
I finally have an update for the string_ref proposal, converting it to
a set of changes against the C++14 draft. The most recent version
lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
and I've attached a snapshot.

The LWG is still deciding whether this should aim for a TS or C++14
(or both?), and feedback from here will help inform that decision.

What are the pros and cons?

std::string, vector and boost::iterator_range have pop_front() and pop_back(). String_ref isn't a container but pop_front() and pop_back() would still be useful. Could they be added?
Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?

Is there still time for bike shedding? str_ref is ever so slightly shorter. :p

String_ref obviously can't be null-terminated, but C strings aren't going anywhere any time soon unfortunately. Maybe a future zstr_ref proposal could address this.

Beman Dawes

Dec 29, 2012, 8:42:49 AM
to std-pr...@isocpp.org
On Fri, Dec 28, 2012 at 8:04 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Thursday, December 27, 2012 12:44:38 AM UTC+1, Jeffrey Yasskin wrote:
>>
>> I finally have an update for the string_ref proposal, converting it to
>> a set of changes against the C++14 draft. The most recent version
>> lives at
>> https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
>> and I've attached a snapshot.

> Is there still time for bike shedding? str_ref is ever so slightly shorter.

Names are important, and I've been planning to start such a discussion next
week when more people will be reading this list. But if you want to
start such a discussion now, feel free to do so. However, please start
a separate message thread, with a subject line like "[string-ref] Name
suggestion" so that (1) those that don't care can skip reading the
thread and (2) other technical comments don't get lost in the noise.

Thanks,

--Beman

Olaf van der Spek

Dec 29, 2012, 11:26:53 AM
to std-pr...@isocpp.org
On Sat, Dec 29, 2012 at 2:42 PM, Beman Dawes <bda...@acm.org> wrote:
>> Is there still time for bike shedding? str_ref is ever so slightly shorter.
>
> Names are important, and I've been planning to start such a discussion next
> week when more people will be reading this list. But if you want to
> start such a discussion now, feel free to do so. However, please start
> a separate message thread, with a subject line like "[string-ref] Name
> suggestion" so that (1) those that don't care can skip reading the
> thread and (2) other technical comments don't get lost in the noise.

I'll let you have the honor.

BTW, you've got mail.

--
Olaf

Beman Dawes

Dec 30, 2012, 1:55:55 PM
to Jeffrey Yasskin, std-pr...@isocpp.org
On Wed, Dec 26, 2012 at 6:44 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
> Hi folks,
>
> I finally have an update for the string_ref proposal, converting it to
> a set of changes against the C++14 draft. The most recent version
> lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
> and I've attached a snapshot.

From the note in basic.string.ref:

"... User-defined types should define their own implicit conversions
to std::basic_string_ref in order to interoperate with these
functions."

The concern is that this requires an invasive change to the UDT.
That's a particular problem for 3rd party UDT's or UDT's the developer
has no control over.

Have you considered a generic constructor instead? For example,

template <class String>
basic_string_ref(const String& s)
  : _beg(::std::string_ref_begin(s)), _end(::std::string_ref_end(s)) {}

Where _beg and _end are the private data members.

The user is granted permission to provide the two adapter functions in
namespace std.

For a TS, helper functions could be provided (in std) for std::basic_string:

// basic_string_ref helpers
template <class charT, class traits, class Allocator>
inline
const charT* string_ref_begin(const basic_string<charT, traits, Allocator>& s)
{
  return s.c_str();
}
template <class charT, class traits, class Allocator>
inline
const charT* string_ref_end(const basic_string<charT, traits, Allocator>& s)
{
  return s.c_str() + s.size();
}

A quick test of this is available at https://github.com/Beman/string_ref_ex.

--Beman

Dean Michael Berris

Jan 1, 2013, 8:15:25 PM
to std-pr...@isocpp.org, Jeffrey Yasskin
On Mon, Dec 31, 2012 at 5:55 AM, Beman Dawes <bda...@acm.org> wrote:
> On Wed, Dec 26, 2012 at 6:44 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> Hi folks,
>>
>> I finally have an update for the string_ref proposal, converting it to
>> a set of changes against the C++14 draft. The most recent version
>> lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
>> and I've attached a snapshot.
>
> From the note in basic.string.ref:
>
> "... User-defined types should define their own implicit conversions
> to std::basic_string_ref in order to interoperate with these
> functions."
>
> The concern is that this requires an invasive change to the UDT.
> That's a particular problem for 3rd party UDT's or UDT's the developer
> has no control over.
>
> Have you considered a generic constructor instead? For example,
>
> template <class String>
> basic_string_ref(const String& s)
> : _beg(::std::string_ref_begin(s)), _end(::std::string_ref_end(s)) {}
>
> Where _beg and _end are the private data members.
>
> The user is granted permission to provide the two adapter functions in
> namespace std.
>

Why not just rely on ADL?
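A rough sketch of what the ADL variant could look like, assuming a simplified char-only string_ref in a hypothetical namespace demo and a made-up third_party::Buffer type. The constructor calls the adapters unqualified, so overloads declared next to the adapted type are found by argument-dependent lookup and nothing has to be added to namespace std.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace demo {

// Adapters for std::string. They are declared before the constructor template
// so that ordinary unqualified lookup finds them (ADL would only search std).
inline const char* string_ref_begin(const std::string& s) { return s.data(); }
inline const char* string_ref_end(const std::string& s) { return s.data() + s.size(); }

// Simplified stand-in for the proposed class; char only, no traits.
class string_ref {
public:
    // Unqualified calls: overloads in the namespace of String are found by ADL
    // at the point of instantiation.
    template <class String>
    string_ref(const String& s)
        : beg_(string_ref_begin(s)), end_(string_ref_end(s)) {}

    const char* data() const { return beg_; }
    std::size_t size() const { return static_cast<std::size_t>(end_ - beg_); }

private:
    const char* beg_;
    const char* end_;
};

}  // namespace demo

namespace third_party {

// Hypothetical string-like type that knows nothing about string_ref.
struct Buffer {
    const char* ptr;
    std::size_t len;
};

// Adapters live beside the type they adapt.
inline const char* string_ref_begin(const Buffer& b) { return b.ptr; }
inline const char* string_ref_end(const Buffer& b) { return b.ptr + b.len; }

}  // namespace third_party
```

For example, demo::string_ref r(third_party::Buffer{"abcdef", 3}) would refer to the first three characters. A type with no visible adapters fails to compile rather than converting silently.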

--
Dean Michael Berris
Google

Jeffrey Yasskin

Jan 2, 2013, 12:08:45 AM
to std-pr...@isocpp.org
On Fri, Dec 28, 2012 at 8:04 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Thursday, December 27, 2012 12:44:38 AM UTC+1, Jeffrey Yasskin wrote:
>>
>> I finally have an update for the string_ref proposal, converting it to
>> a set of changes against the C++14 draft. The most recent version
>> lives at
>> https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
>> and I've attached a snapshot.
>>
>> The LWG is still deciding whether this should aim for a TS or C++14
>> (or both?), and feedback from here will help inform that decision.
>>
> What are the pros and cons?

Overall, we want string_ref in an official document of some sort as
soon as possible so that other TS'en can depend on it.

The bar for including string_ref in a TS is lower, so it's more likely
string_ref would get into the next one. It would also serve as initial
content for the "utility" TS, which may make it easier to add new
things. Putting it in a TS will allow us to change the class in
incompatible ways if we discover problems. However, a TS will be
explicitly beta and may require user code that adopts string_ref to
change the namespace of the class it uses when string_ref is
incorporated into a future C++ standard, even if we don't make any
changes to the class.

C++14 is more official and is a firmer basis for other TS'en. If we
have consensus that string_ref is the right thing to do, we shouldn't
delay it artificially by including an unnecessary TS step. Putting it
directly into the draft standard would let us stop discussing the
whole proposal repeatedly and just discuss changes. (It's possible
putting it in a TS would have the same effect, but I'm less confident
of that.)

Beman may have other tradeoffs to add.

> std::string, vector and boost::iterator_range have pop_front() and
> pop_back(). String_ref isn't a container but pop_front() and pop_back()
> would still be useful. Could they be added?
> Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?

When we discussed N3350
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html#classstd_1_1range_1a3c743e4e4b85290682234af4c2d7e50f),
there was an argument that pop_front is expected to destroy the
elements it pops, since it does so everywhere in the standard that it
currently exists. The committee seemed to prefer another name. Clearly
pop_front()==remove_prefix(1), so I don't see an urgent need to add
another method for this purpose. If I get clear direction from the
committee that they want to overload pop_front to have this meaning,
I'd be happy to change it back, but I don't expect to get that
direction.

> Is there still time for bike shedding? str_ref is ever so slightly shorter.
> :p

+1 for Beman's suggestion of starting a new thread for this. I don't
intend to participate much on that thread if at all, but I'm happy to
change the proposal if the discussion leans toward a particular other
option.

> String_ref obviously can't be null-terminated, but C strings aren't going
> anywhere any time soon unfortunately. Maybe a future zstr_ref proposal could
> address this.

Definitely a future proposal. I'll add a note to the paper mentioning this.

Also, sorry for forgetting your .data()/.size() constructor
suggestion. I've added a TODO to my local copy of the paper so I won't
forget it again.

Jeffrey

Jeffrey Yasskin

Jan 2, 2013, 12:17:46 AM
to Beman Dawes, std-pr...@isocpp.org
On Sun, Dec 30, 2012 at 1:55 PM, Beman Dawes <bda...@acm.org> wrote:
> On Wed, Dec 26, 2012 at 6:44 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> Hi folks,
>>
>> I finally have an update for the string_ref proposal, converting it to
>> a set of changes against the C++14 draft. The most recent version
>> lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
>> and I've attached a snapshot.
>
> From the note in basic.string.ref:
>
> "... User-defined types should define their own implicit conversions
> to std::basic_string_ref in order to interoperate with these
> functions."
>
> The concern is that this requires an invasive change to the UDT.
> That's a particular problem for 3rd party UDT's or UDT's the developer
> has no control over.
>
> Have you considered a generic constructor instead? For example,
>
> template <class String>
> basic_string_ref(const String& s)
> : _beg(::std::string_ref_begin(s)), _end(::std::string_ref_end(s)) {}
>
> Where _beg and _end are the private data members.
>
> The user is granted permission to provide the two adapter functions in
> namespace std.

I've had bad luck with this kind of adaptation function causing ODR
violations in the past. That is:

Library A provides an adaptation point. Library B defines a type that
could logically be used with library A but doesn't define the
adaptation point.
Libraries C and D want to use B with A and both define the adaptation point.
Binary E wants to use both libraries C and D. ODR violation: Boom,
either at link time (good but hard to recover from if you can't modify
libraries C or D, and if you can modify them, why can't you modify
library B?) or run time (very bad).

That said, the standard's style is generally to allow this kind of
extension, so I'll give in if you think I should add it.

> For a TS, helper functions could be provided (in std) for std::basic_string:

For basic_string, it's easy enough just to define the conversion
constructor. The difference would be in the implicit conversions
allowed with "string_ref sr(non_string_type)", where going through a
string_ref_{begin,end} template would allow fewer and would thereby
allow fewer accidental dangling pointers, which is a plus.

Jeffrey

Olaf van der Spek

Jan 3, 2013, 7:51:10 AM
to std-pr...@isocpp.org
On Wed, Jan 2, 2013 at 6:08 AM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
> C++14 is more official and is a firmer basis for other TS'en. If we
> have consensus that string_ref is the right thing to do, we shouldn't
> delay it artificially by including an unnecessary TS step. Putting it
> directly into the draft standard would let us stop discussing the
> whole proposal repeatedly and just discuss changes. (It's possible
> putting it in a TS would have the same effect, but I'm less confident
> of that.)

Your list of open questions is short (mine is a bit longer ;), what's
stopping string_ref from getting into C++14?
AFAIK the concept is good, it's just some details that need to be worked out.
It's a shame a reference implementation isn't available yet (in Boost).

> Beman may have other tradeoffs to add.
>
>> std::string, vector and boost::iterator_range have pop_front() and
>> pop_back(). String_ref isn't a container but pop_front() and pop_back()
>> would still be useful. Could they be added?
>> Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?
>
> When we discussed N3350
> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html#classstd_1_1range_1a3c743e4e4b85290682234af4c2d7e50f),
> there was an argument that pop_front is expected to destroy the
> elements it pops, since it does so everywhere in the standard that it
> currently exists. The committee seemed to prefer another name. Clearly
> pop_front()==remove_prefix(1), so I don't see an urgent need to add

Not really, semantics are different if range is empty. Remove_prefix
and remove_suffix aren't symmetric in that case either.
I like pop_back() much better, it's not a problem in
boost::iterator_range (AFAIK) and I've no idea what destroying a char
would imply.
The difference with an owning container is already indicated by the type name.

Wouldn't clear() have to be removed too?

> another method for this purpose. If I get clear direction from the
> committee that they want to overload pop_front to have this meaning,
> I'd be happy to change it back, but I don't expect to get that
> direction.

> Also, sorry for forgetting your .data()/.size() constructor
> suggestion. I've added a TODO to my local copy of the paper so I won't
> forget it again.

Using data() and size() probably isn't right, it should be begin() and
end(). The point is being able to construct a string_ref from other
string-like types.

> starts_with(), ends_with()

Is a non-member variant being provided too? Otherwise it's not easily
usable on std::string and other string-like types.

BTW, don't forget my comments at
https://groups.google.com/a/isocpp.org/forum/?hl=en&fromgroups=#!topic/std-proposals/ZUnktXzj0RE
--
Olaf

Zhihao Yuan

Jan 3, 2013, 12:58:49 PM
to std-pr...@isocpp.org
On Thu, Jan 3, 2013 at 6:51 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
>> starts_with(), ends_with()
>
> Is a non-member variant being provided too? Otherwise it's not easily
> usable on std::string and other string-like types.

The proposal added them on basic.string for an appropriate English
reading.

BTW, just curious, can I use

string_ref("prefixtree").starts_with(string("prefix"));

?

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___________________________________________________
4BSD -- http://4bsd.biz/

Olaf van der Spek

Jan 3, 2013, 1:02:02 PM
to std-pr...@isocpp.org
On Thu, Jan 3, 2013 at 6:58 PM, Zhihao Yuan <lic...@gmail.com> wrote:
> On Thu, Jan 3, 2013 at 6:51 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
>>> starts_with(), ends_with()
>>
>> Is a non-member variant being provided too? Otherwise it's not easily
>> usable on std::string and other string-like types.
>
> The proposal added them on basic.string for an appropriate English
> reading.

What about other string-like types?

> BTW, just curious, can I use
>
> string_ref("prefixtree").starts_with(string("prefix"));

Why not this?

string_ref("prefixtree").starts_with("prefix")

This is even shorter but requires non-member functions:

starts_with("prefixtree", "prefix")
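A minimal sketch of such non-member functions, assuming C++17's std::string_view as a stand-in for string_ref and a hypothetical namespace demo. Taking both arguments as views lets literals, std::string, and other convertible types be passed without naming the view type at the call site.

```cpp
#include <cassert>
#include <string>
#include <string_view>

namespace demo {

// Hypothetical non-member form; the argument order is (haystack, needle).
inline bool starts_with(std::string_view haystack, std::string_view needle) {
    return haystack.size() >= needle.size() &&
           haystack.compare(0, needle.size(), needle) == 0;
}

inline bool ends_with(std::string_view haystack, std::string_view needle) {
    return haystack.size() >= needle.size() &&
           haystack.compare(haystack.size() - needle.size(), needle.size(),
                            needle) == 0;
}

}  // namespace demo
```

Nothing in the signature stops a caller from swapping the two arguments, which is the drawback the member form avoids.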

Olaf

Beman Dawes

Jan 4, 2013, 6:58:44 AM
to std-pr...@isocpp.org
On Tue, Jan 1, 2013 at 8:15 PM, Dean Michael Berris
<dbe...@googlers.com> wrote:
> On Mon, Dec 31, 2012 at 5:55 AM, Beman Dawes <bda...@acm.org> wrote:
...
>> Have you considered a generic constructor instead? For example,
>>
>> template <class String>
>> basic_string_ref(const String& s)
>> : _beg(::std::string_ref_begin(s)), _end(::std::string_ref_end(s)) {}
>>
>> Where _beg and _end are the private data members.
>>
>> The user is granted permission to provide the two adapter functions in
>> namespace std.
>>
>
> Why not just rely on ADL?

No good reason, and that's what the little prototype did initially.
But I don't understand ADL pitfalls very well, so avoid using it when
there is an alternative.

I'd defer to library experts if they have a strong opinion, and also
to Jeffrey since it is his proposal.

--Beman

Jeffrey Yasskin

Jan 4, 2013, 1:59:27 PM
to std-pr...@isocpp.org
On Thu, Jan 3, 2013 at 12:51 PM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Wed, Jan 2, 2013 at 6:08 AM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> C++14 is more official and is a firmer basis for other TS'en. If we
>> have consensus that string_ref is the right thing to do, we shouldn't
>> delay it artificially by including an unnecessary TS step. Putting it
>> directly into the draft standard would let us stop discussing the
>> whole proposal repeatedly and just discuss changes. (It's possible
>> putting it in a TS would have the same effect, but I'm less confident
>> of that.)
>
> Your list of open questions is short (mine is a bit longer ;), what's
> stopping string_ref from getting into C++14?
> AFAIK the concept is good, it's just some details that need to be worked out.
> It's a shame a reference implementation isn't available yet (in Boost).

I think the details are what people are worried about. Personally, I
think we can get them worked out by Bristol and get string_ref into
C++14, but there's enough doubt in the minds of more experienced
committee members that I'm not certain.

Note that the feedback we need from this list is not "which target
should we aim for". It's "which pieces are right and wrong".

>> Beman may have other tradeoffs to add.
>>
>>> std::string, vector and boost::iterator_range have pop_front() and
>>> pop_back(). String_ref isn't a container but pop_front() and pop_back()
>>> would still be useful. Could they be added?
>>> Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?
>>
>> When we discussed N3350
>> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html#classstd_1_1range_1a3c743e4e4b85290682234af4c2d7e50f),
>> there was an argument that pop_front is expected to destroy the
>> elements it pops, since it does so everywhere in the standard that it
>> currently exists. The committee seemed to prefer another name. Clearly
>> pop_front()==remove_prefix(1), so I don't see an urgent need to add
>
> Not really, semantics are different if range is empty. Remove_prefix
> and remove_suffix aren't symmetric in that case either.

Now that I'm looking at them again, I think I gave
remove_{prefix,suffix} the wrong semantics. If n>size(), remove_prefix
throws out_of_range, while remove_suffix returns the whole string.
That can't be right. Should they be:

remove_prefix(n) is equivalent to "*this = substr(min(n, size()), npos)"
remove_suffix(n) is equivalent to "*this = substr(0, size() - min(n, size()))"

Or should they Require that n <= size()?
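To make the two candidate semantics concrete, here is a sketch written as free functions over C++17's std::string_view, used as a stand-in for string_ref. The function names and the demo namespace are hypothetical, and the "Requires" variant is modelled with assert, which is only one way an implementation might check the precondition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

namespace demo {

// Clamping semantics: n larger than size() simply empties the view.
inline void remove_prefix_clamped(std::string_view& s, std::size_t n) {
    s = s.substr(std::min(n, s.size()));
}

inline void remove_suffix_clamped(std::string_view& s, std::size_t n) {
    s = s.substr(0, s.size() - std::min(n, s.size()));
}

// Precondition semantics: the caller must guarantee n <= size().
inline void remove_prefix_checked(std::string_view& s, std::size_t n) {
    assert(n <= s.size());
    s = s.substr(n);
}

inline void remove_suffix_checked(std::string_view& s, std::size_t n) {
    assert(n <= s.size());
    s = s.substr(0, s.size() - n);
}

}  // namespace demo
```

Both variants treat prefix and suffix symmetrically; they differ only in whether an oversized n is defined behaviour or a caller error.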

> I like pop_back() much better, it's not a problem in
> boost::iterator_range (AFAIK) and I've no idea what destroying a char
> would imply.
> The difference with an owning container is already indicated by the type name.

I thought pop_back was fine, but the committee disagreed, so it's
changed. You seem to like assuming away programmer mistakes, while I
and the committee want to catch as many as possible within the bounds
of C++, so we're likely to disagree on things like this.

> Wouldn't clear() have to be removed too?

Apparently not. Nobody objected to it.

>> another method for this purpose. If I get clear direction from the
>> committee that they want to overload pop_front to have this meaning,
>> I'd be happy to change it back, but I don't expect to get that
>> direction.
>
>> Also, sorry for forgetting your .data()/.size() constructor
>> suggestion. I've added a TODO to my local copy of the paper so I won't
>> forget it again.
>
> Using data() and size() probably isn't right, it should be begin() and
> end(). The point is being able to construct a string_ref from other
> string-like types.

We'll have to wait for contiguous_iterator_tag then, or take Beman's
suggestion for string_ref_begin() (contiguous_begin()? ew, wart.). I'm
happy with leaving that constructor out of string_ref for the
initially standardized version and adding it later. string_ref doesn't
have to be complete the first time, it just has to be better than what
we have.

>> starts_with(), ends_with()
>
> Is a non-member variant being provided too? Otherwise it's not easily
> usable on std::string and other string-like types.

No, the paper includes, "The non-member equivalents produce calls that
are somewhat ambiguous between starts_with(haystack, needle) vs
starts_with(needle, haystack), while haystack.starts_with(needle) is
the only English reading of the member version. These queries apply
equally well to basic_string, so I've added them there too."

Zhihao and you got the syntax I expect for non-string_ref types:
string_ref("prefixtree").starts_with("prefix")

I think I've answered all of them now. What have I missed?

Olaf van der Spek

Jan 4, 2013, 3:54:50 PM
to std-pr...@isocpp.org
On Fri, Jan 4, 2013 at 7:59 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
> Note that the feedback we need from this list is not "which target
> should we aim for". It's "which pieces are right and wrong".

Ah, ok

>>> Beman may have other tradeoffs to add.
>>>
>>>> std::string, vector and boost::iterator_range have pop_front() and
>>>> pop_back(). String_ref isn't a container but pop_front() and pop_back()
>>>> would still be useful. Could they be added?
>>>> Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?
>>>
>>> When we discussed N3350
>>> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html#classstd_1_1range_1a3c743e4e4b85290682234af4c2d7e50f),
>>> there was an argument that pop_front is expected to destroy the
>>> elements it pops, since it does so everywhere in the standard that it
>>> currently exists. The committee seemed to prefer another name. Clearly
>>> pop_front()==remove_prefix(1), so I don't see an urgent need to add
>>
>> Not really, semantics are different if range is empty. Remove_prefix
>> and remove_suffix aren't symmetric in that case either.
>
> Now that I'm looking at them again, I think I gave
> remove_{prefix,suffix} the wrong semantics. If n>size(), remove_prefix
> throws out_of_range, while remove_suffix returns the whole string.
> That can't be right. Should they be:
>
> remove_prefix(n) is equivalent to "*this = substr(min(n, size()), npos)"
> remove_suffix(n) is equivalent to "*this = substr(0, size() - min(n, size()))"
>
> Or should they Require that n <= size()?

I'd go for the latter. Those functions are defined as noexcept.

>> I like pop_back() much better, it's not a problem in
>> boost::iterator_range (AFAIK) and I've no idea what destroying a char
>> would imply.
>> The difference with an owning container is already indicated by the type name.
>
> I thought pop_back was fine, but the committee disagreed, so it's
> changed. You seem to like assuming away programmer mistakes, while I
> and the committee want to catch as many as possible within the bounds
> of C++, so we're likely to disagree on things like this.

I'm not sure that's it.
pop_back() is existing practice in both Boost and D (and the idea was
to standardize existing practice wasn't it?)
Textually pop_back and remove_suffix seem equivalent (to me), nothing
tells me their behaviour is (kinda) different.
Would remove_suffix(1) really avoid mistakes?

http://dlang.org/phobos/std_range.html

>> Wouldn't clear() have to be removed too?
>
> Apparently not. Nobody objected to it.

Seems inconsistent.

> We'll have to wait for contiguous_iterator_tag then, or take Beman's
> suggestion for string_ref_begin() (contiguous_begin()? ew, wart.). I'm
> happy with leaving that constructor out of string_ref for the
> initially standardized version and adding it later. string_ref doesn't
> have to be complete the first time, it just has to be better than what
> we have.

That's easy :p
Waiting for contiguous_iterator_tag or explicit charT* seems fine.

>>> starts_with(), ends_with()
>>
>> Is a non-member variant being provided too? Otherwise it's not easily
>> usable on std::string and other string-like types.
>
> No, the paper includes, "The non-member equivalents produce calls that
> are somewhat ambiguous between starts_with(haystack, needle) vs
> starts_with(needle, haystack), while haystack.starts_with(needle) is
> the only English reading of the member version. These queries apply
> equally well to basic_string, so I've added them there too."
>
> Zhihao and you got the syntax I expect for non-string_ref types:
> string_ref("prefixtree").starts_with("prefix")

Having to include "string_ref" there doesn't seem entirely right.

>> BTW, don't forget my comments at
>> https://groups.google.com/a/isocpp.org/forum/?hl=en&fromgroups=#!topic/std-proposals/ZUnktXzj0RE
>
> I think I've answered all of them now. What have I missed?

The 3 stoi overloads, they look unclean. Wouldn't adding an overload
also have the potential to break code due to ambiguity?


--
Olaf

Jeffrey Yasskin

Jan 4, 2013, 4:58:32 PM
to std-pr...@isocpp.org
On Fri, Jan 4, 2013 at 12:54 PM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Fri, Jan 4, 2013 at 7:59 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> Now that I'm looking at them again, I think I gave
>> remove_{prefix,suffix} the wrong semantics. If n>size(), remove_prefix
>> throws out_of_range, while remove_suffix returns the whole string.
>> That can't be right. Should they be:
>>
>> remove_prefix(n) is equivalent to "*this = substr(min(n, size()), npos)"
>> remove_suffix(n) is equivalent to "*this = substr(0, size() - min(n, size()))"
>>
>> Or should they Require that n <= size()?
>
> I'd go for the latter. Those functions are defined as noexcept.

noexcept would imply the first, I think. We don't generally have
noexcept functions with Requires clauses. (I'm sure there are some,
but for example string::front() isn't noexcept.) Google's and LLVM's
versions of this, though, assert(size()>=n).

>>> I like pop_back() much better, it's not a problem in
>>> boost::iterator_range (AFAIK) and I've no idea what destroying a char
>>> would imply.
>>> The difference with an owning container is already indicated by the type name.
>>
>> I thought pop_back was fine, but the committee disagreed, so it's
>> changed. You seem to like assuming away programmer mistakes, while I
>> and the committee want to catch as many as possible within the bounds
>> of C++, so we're likely to disagree on things like this.
>
> I'm not sure that's it.
> pop_back() is existing practice in both Boost and D (and the idea was
> to standardize existing practice wasn't it?)

remove_back is existing practice in Google's version of this. LLVM's
version uses drop_back. Bloomberg doesn't define it. So I don't think
existing practice decides this.

> Textually pop_back and remove_suffix seem equivalent (to me), nothing
> tells me their behaviour is (kinda) different.
> Would remove_suffix(1) really avoid mistakes?
>
> http://dlang.org/phobos/std_range.html
>
>>> BTW, don't forget my comments at
>>> https://groups.google.com/a/isocpp.org/forum/?hl=en&fromgroups=#!topic/std-proposals/ZUnktXzj0RE
>>
>> I think I've answered all of them now. What have I missed?
>
> The 3 stoi overloads, they look unclean. Wouldn't adding an overload
> also have the potential to break code due to ambiguity?

If a class has an implicit conversion to both const char* and
std::string (or has a template conversion operator), and is being
passed directly to stoi(), that could break. This seems less likely
than either that const char* is being passed to stoi() or that a class
with only an implicit conversion to std::string is being passed.
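The breaking case can be sketched as follows. Everything here is a hypothetical stand-in: to_int replaces stoi, std::string_view (C++17) replaces string_ref, and Both is an invented class with the two conversions described above. The functions are only declared and used in unevaluated contexts, so the traits just report whether a call would compile.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Overload set before the change: std::string only.
namespace before {
int to_int(const std::string&);
}

// Overload set after adding const char* and string_ref-like overloads.
namespace after {
int to_int(const std::string&);
int to_int(const char*);
int to_int(std::string_view);
}

// Converts implicitly to both const char* and std::string.
struct Both {
    operator const char*() const;
    operator std::string() const;
};

// Converts implicitly to std::string only.
struct OnlyString {
    operator std::string() const;
};

// Detect whether before::to_int(T) is a well-formed call.
template <class T, class = void>
struct ok_before : std::false_type {};
template <class T>
struct ok_before<T, std::void_t<decltype(before::to_int(std::declval<T>()))>>
    : std::true_type {};

// Detect whether after::to_int(T) is a well-formed call.
template <class T, class = void>
struct ok_after : std::false_type {};
template <class T>
struct ok_after<T, std::void_t<decltype(after::to_int(std::declval<T>()))>>
    : std::true_type {};

// Both compiled against the old overload set, but the new set offers two
// equally good user-defined conversions, so the call becomes ambiguous.
static_assert(ok_before<Both>::value);
static_assert(!ok_after<Both>::value);

// The more common callers keep working.
static_assert(ok_after<OnlyString>::value);
static_assert(ok_after<const char*>::value);
static_assert(ok_after<std::string>::value);
```

Only the class with both conversions is affected, which matches the expectation that the breakage is the less likely case.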

It's definitely annoying to have to deal with backwards compatibility
like this, but string_ref exists primarily for the convenience of
people using C++, and only secondarily to clean up the standard.

Jeffrey

Olaf van der Spek

Jan 4, 2013, 5:11:30 PM
to std-pr...@isocpp.org
On Fri, Jan 4, 2013 at 10:58 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>>> Or should they Require that n <= size()?
>>
>> I'd go for the latter. Those functions are defined as noexcept.
>
> noexcept would imply the first, I think. We don't generally have
> noexcept functions with Requires clauses. (I'm sure there are some,
> but for example string::front() isn't noexcept.) Google's and LLVM's
> versions of this, though, assert(size()>=n).

I assumed require would be the precondition and breaking it would be UB.

>>>> I like pop_back() much better, it's not a problem in
>>>> boost::iterator_range (AFAIK) and I've no idea what destroying a char
>>>> would imply.
>>>> The difference with an owning container is already indicated by the type name.
>>>
>>> I thought pop_back was fine, but the committee disagreed, so it's
>>> changed. You seem to like assuming away programmer mistakes, while I
>>> and the committee want to catch as many as possible within the bounds
>>> of C++, so we're likely to disagree on things like this.
>>
>> I'm not sure that's it.
>> pop_back() is existing practice in both Boost and D (and the idea was
>> to standardize existing practice wasn't it?)
>
> remove_back is existing practice in Google's version of this. LLVM's
> version uses drop_back. Bloomberg doesn't define it. So I don't think
> existing practice decides this.

All use back instead of suffix.
LLVM's defaults N to 1. Is Google's version public?

>> Textually pop_back and remove_suffix seem equivalent (to me), nothing
>> tells me their behaviour is (kinda) different.
>> Would remove_suffix(1) really avoid mistakes?
>>
>> http://dlang.org/phobos/std_range.html



--
Olaf

Jeffrey Yasskin

unread,
Jan 4, 2013, 5:16:02 PM1/4/13
to std-pr...@isocpp.org
On Fri, Jan 4, 2013 at 2:11 PM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Fri, Jan 4, 2013 at 10:58 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> remove_back is existing practice in Google's version of this. LLVM's
>> version uses drop_back. Bloomberg doesn't define it. So I don't think
>> existing practice decides this.
>
> All use back instead of suffix.
> LLVM's defaults N to 1. Is Google's version public?

https://code.google.com/p/re2/source/browse/re2/stringpiece.h,
http://www.icu-project.org/apiref/icu4c/classicu_1_1StringPiece.html,
and https://code.google.com/searchframe#OAMlx_jo-ck/src/base/string_piece.h&type=cs&l=102
(Chrome) all look derived from Google's version. I typo'ed
remove_back, it's actually remove_suffix.

Gabriel Dos Reis

unread,
Jan 4, 2013, 6:43:36 PM1/4/13
to std-pr...@isocpp.org
Olaf van der Spek <olafv...@gmail.com> writes:

[...]

| > I thought pop_back was fine, but the committee disagreed, so it's
| > changed. You seem to like assuming away programmer mistakes, while I
| > and the committee want to catch as many as possible within the bounds
| > of C++, so we're likely to disagree on things like this.
|
| I'm not sure that's it.
| pop_back() is existing practice in both Boost and D (and the idea was
| to standardize existing practice wasn't it?)

No, we are not required to standardize "existing practice" if in
hindsight it is not such a bright idea.

Beman Dawes

unread,
Jan 5, 2013, 8:52:00 AM1/5/13
to std-pr...@isocpp.org
On Wed, Jan 2, 2013 at 12:08 AM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
> On Fri, Dec 28, 2012 at 8:04 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
>> On Thursday, December 27, 2012 12:44:38 AM UTC+1, Jeffrey Yasskin wrote:
>>>
....
>>> The LWG is still deciding whether this should aim for a TS or C++14
>>> (or both?), and feedback from here will help inform that decision.
>>>
>> What are the pros and cons?
>
> Overall, we want string_ref in an official document of some sort as
> soon as possible so that other TS'en can depend on it.

Agreed.

> The bar for including string_ref in a TS is lower, so it's more likely
> string_ref would get into the next one. It would also serve as initial
> content for the "utility" TS, which may make it easier to add new
> things. Putting it in a TS will allow us to change the class in
> incompatible ways if we discover problems. However, a TS will be
> explicitly beta and may require user code that adopts string_ref to
> change the namespace of the class it uses when string_ref is
> incorporated into a future C++ standard, even if we don't make any
> changes to the class.

That's a nice summary of the TS target.

> C++14 is more official and is a firmer basis for other TS'en. If we
> have consensus that string_ref is the right thing to do, we shouldn't
> delay it artificially by including an unnecessary TS step. Putting it
> directly into the draft standard would let us stop discussing the
> whole proposal repeatedly and just discuss changes. (It's possible
> putting it in a TS would have the same effect, but I'm less confident
> of that.)

Having "consensus that string_ref is the right thing to do" with
reasonably correct proposed wording is good enough for a TS, but for
the standard itself there are a lot of additional hurdles. Some
committee members may want more real-world experience, others may be
concerned about the volume of changes to existing components, the
impact on books about C++, possible ABI breakage, maturity of wording,
impact on teaching, etc. These concerns often come from folks who do
not follow the doings of the LWG closely, so only arise late in the
process and tend to delay progress. Whether any of these concerns will
arise for string_ref is hard to predict, but I would not be
surprised if at least a few of them surface.

> Beman may have other tradeoffs to add.

I rate the probability of finishing technical work on a Library
Utility TS in 2013 as 70% or better, and by 2014 as 95%. I rate the
probability of finishing technical work on C++14 in 2013 as 25%, by
2014 as 65%, and by 2015 as 95%.

Remember too that if other TS's are gated on string_ref, and
string_ref is part of C++14, we would have to hold the other TS's
waiting for a delayed C++14.

--Beman

Jeffrey Yasskin

unread,
Jan 10, 2013, 7:54:08 PM1/10/13
to std-pr...@isocpp.org
I've uploaded a new version to
http://jeffrey.yasskin.info/cxx/2012/string_ref.html with updates from
this thread. I'm planning to send this off to Clark tomorrow, although
if https://groups.google.com/a/isocpp.org/d/topic/std-proposals/5sW8yp5i8mo/discussion
keeps agreeing on string_view, I may switch to that name first.

Nicol Bolas

unread,
Jan 10, 2013, 8:48:09 PM1/10/13
to std-pr...@isocpp.org


On Thursday, January 10, 2013 4:54:08 PM UTC-8, Jeffrey Yasskin wrote:
I've uploaded a new version to
http://jeffrey.yasskin.info/cxx/2012/string_ref.html with updates from
this thread. I'm planning to send this off to Clark tomorrow, although
if https://groups.google.com/a/isocpp.org/d/topic/std-proposals/5sW8yp5i8mo/discussion
keeps agreeing on string_view, I may switch to that name first.

In the section on user-defined conversion, you said:

Ultimately, I think we want to allow this conversion based on detecting contiguous ranges. Any constructor we add to work around that is going to look like a wart in a couple years. I think we'll be better off making users explicitly convert when they can't add an appropriate conversion operator, and then we can add the optimal constructor when contiguous iterators make it into the library.

This assumes that every string type out there can be detected as a "contiguous range", which is a concept that doesn't exist even in Boost.Range. It's hard to know what users would have to provide or if such things can be provided without modifying the type.

Also, it seems to accept a significant short-term problem for the sake of a long-term goal, when it's not at all clear how far off that goal is, or whether the eventual contiguous iterator and contiguous range proposals will even fit the needs of basic_string_ref. Current discussions on the Range mailing list are starting to talk about things like ranges as primitive types with no iterators. If they go that route, I'm not sure that's going to be a viable solution for basic_string_ref going forward.

I don't like the idea of making a proposal somewhat incomplete (you do recognize the need for user-defined conversions) on the assumption that in a year or two someone will come along with a fix. I say we give the user the ability to provide something that will work and be extensible now. If the eventual Range proposal comes along that can't handle things this way, then we already have our solution. If another solution appears, we can use that. It may leave a wart, but it will be a very small and harmless one. And it's not like anyone will really notice one more wart in our string classes...

Olaf van der Spek

unread,
Jan 11, 2013, 3:52:06 AM1/11/13
to std-pr...@isocpp.org
On Fri, Jan 11, 2013 at 2:48 AM, Nicol Bolas <jmck...@gmail.com> wrote:
> I don't like the idea of making a proposal somewhat incomplete (you do
> recognize the need for user-defined conversions) on the assumption that in a
> year or two someone will come along with a fix. I say we give the user the
> ability to provide something that will work and be extensible now. If the
> eventual Range proposal comes along that can't handle things this way, then
> we already have our solution. If another solution appears, we can use that.
> It may leave a wart, but it will be a very small and harmless one. And it's
> not like anyone will really notice one more wart in our string classes...

We don't have to wait a year or two, we could write a proposal to fix
the contiguous range detection issue tomorrow.
Iterators aren't going away any time soon (I expect), so basing a
solution on them seems ok.

I'd go for the addition of an explicit conversion to pointer on
contiguous iterators, such that this works:

std::array<char, 10> v;
char* b(v.begin());
char* e(v.end());

Seems simple and useful in other situations too.
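
Until a conversion like that exists, the same pointer pair is available from containers that expose data() and size(). A minimal sketch, where first_ptr and last_ptr are hypothetical helper names:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers: pointer to the first element and one past the last,
// for any container whose storage is contiguous and exposed through data().
template <class Container>
auto first_ptr(Container& c) -> decltype(c.data()) {
    return c.data();
}

template <class Container>
auto last_ptr(Container& c) -> decltype(c.data()) {
    return c.data() + c.size();
}
```

With these, char* b = first_ptr(v); char* e = last_ptr(v); gives the range for a std::array<char, 10> v without relying on the iterator type being a raw pointer.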
--
Olaf

Nicol Bolas

unread,
Jan 11, 2013, 4:43:55 AM1/11/13
to std-pr...@isocpp.org


On Friday, January 11, 2013 12:52:06 AM UTC-8, Olaf van der Spek wrote:
On Fri, Jan 11, 2013 at 2:48 AM, Nicol Bolas <jmck...@gmail.com> wrote:
> I don't like the idea of making a proposal somewhat incomplete (you do
> recognize the need for user-defined conversions) on the assumption that in a
> year or two someone will come along with a fix. I say we give the user the
> ability to provide something that will work and be extensible now. If the
> eventual Range proposal comes along that can't handle things this way, then
> we already have our solution. If another solution appears, we can use that.
> It may leave a wart, but it will be a very small and harmless one. And it's
> not like anyone will really notice one more wart in our string classes...

We don't have to wait a year or two, we could write a proposal to fix
the contiguous range detection issue tomorrow.

True. But if we put basic_string_ref/view into a TS, it won't necessarily have the contiguous range fix as part of the TS. So there will be no extensibility mechanism. Also, if that proposal doesn't make it for C++14 and this one does, then again, we won't have an extensibility mechanism.

This is one of those "system integration" problems that comes with having multiple proposals in parallel, all with dependencies on each other. A proposal that has progressed further hand-waves some of its features by saying, "well, that proposal will take care of that case," rather than dealing with the problem internally and being functionally complete.

At the very least, there should be some fall-back mechanism in this proposal which can be used if the real fix isn't available by the time the final paper, whether a TS or C++14, is released. At least that way, we can be sure that this proposal will stand on its own.

Oh, and one of the nice things about TS's is that you can change them when you pull them into the standard. So we don't have to stick to an extensibility mechanism we don't like when we move the class to C++14 standard.

Olaf van der Spek

unread,
Jan 11, 2013, 5:15:05 AM1/11/13
to std-pr...@isocpp.org
On Fri, Jan 11, 2013 at 10:43 AM, Nicol Bolas <jmck...@gmail.com> wrote:
>> We don't have to wait a year or two, we could write a proposal to fix
>> the contiguous range detection issue tomorrow.
>
>
> True. But if we put basic_string_ref/view into a TS, it won't necessarily
> have the contiguous range fix as part of the TS. So there will be no
> extensibility mechanism. Also, if that proposal doesn't make it for C++14
> and this one does, then again, we won't have an extensibility mechanism.

Actually can't we just define operator string_ref() on other classes?
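
Something like the following sketch is what I have in mind. It assumes a C++17 compiler, with std::string_view standing in for the proposed string_ref; my_string and length_of are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical string class that we own and can therefore modify.
class my_string {
public:
    my_string(const char* p, std::size_t n) : ptr_(p), len_(n) {}

    // Implicit, non-copying conversion to a view of our characters.
    operator std::string_view() const noexcept {
        return std::string_view(ptr_, len_);
    }

private:
    const char* ptr_;
    std::size_t len_;
};

// A function written once against the view type.
std::size_t length_of(std::string_view s) { return s.size(); }
```

length_of(my_string("hello", 5)) then compiles without any explicit conversion at the call site.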

> Oh, and one of the nice things about TS's is that you can change them when
> you pull them into the standard. So we don't have to stick to an
> extensibility mechanism we don't like when we move the class to C++14
> standard.

You can, but breaking compatibility still isn't nice.


--
Olaf

Nevin Liber

unread,
Jan 11, 2013, 10:18:27 AM1/11/13
to std-pr...@isocpp.org
On 11 January 2013 02:52, Olaf van der Spek <olafv...@gmail.com> wrote:
We don't have to wait a year or two, we could write a proposal to fix
the contiguous range detection issue tomorrow.

FYI:  I'm working on a contiguous_iterator_tag proposal for Bristol (but it probably won't be done before the mailing deadline next week).
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

Olaf van der Spek

unread,
Jan 11, 2013, 10:35:36 AM1/11/13
to std-pr...@isocpp.org
On Fri, Jan 11, 2013 at 4:18 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
> On 11 January 2013 02:52, Olaf van der Spek <olafv...@gmail.com> wrote:
>>
>> We don't have to wait a year or two, we could write a proposal to fix
>> the contiguous range detection issue tomorrow.
>
>
> FYI: I'm working on a contiguous_iterator_tag proposal for Bristol (but it
> probably won't be done before the mailing deadline next week).

Great!
Isn't that deadline today though?

Could you also consider a conversion operator to T* for such iterators?

--
Olaf

Nevin Liber

unread,
Jan 11, 2013, 11:02:13 AM1/11/13
to std-pr...@isocpp.org
On 11 January 2013 09:35, Olaf van der Spek <olafv...@gmail.com> wrote:
On Fri, Jan 11, 2013 at 4:18 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
> On 11 January 2013 02:52, Olaf van der Spek <olafv...@gmail.com> wrote:
>>
>> We don't have to wait a year or two, we could write a proposal to fix
>> the contiguous range detection issue tomorrow.
>
>
> FYI:  I'm working on a contiguous_iterator_tag proposal for Bristol (but it
> probably won't be done before the mailing deadline next week).

Great!
Isn't that deadline today though?

It could be, in which case it still won't be done by this mailing.
 
Could you also consider a conversion operator to T* for such iterators?

Yes, I am putting that in there (leaning against requiring a conversion from T* to iterator, though). 

Nicol Bolas

unread,
Jan 11, 2013, 11:17:10 AM1/11/13
to std-pr...@isocpp.org


On Friday, January 11, 2013 2:15:05 AM UTC-8, Olaf van der Spek wrote:
On Fri, Jan 11, 2013 at 10:43 AM, Nicol Bolas <jmck...@gmail.com> wrote:
>> We don't have to wait a year or two, we could write a proposal to fix
>> the contiguous range detection issue tomorrow.
>
>
> True. But if we put basic_string_ref/view into a TS, it won't necessarily
> have the contiguous range fix as part of the TS. So there will be no
> extensibility mechanism. Also, if that proposal doesn't make it for C++14
> and this one does, then again, we won't have an extensibility mechanism.

Actually can't we just define operator string_ref() on other classes?

The point would be for code we don't own. Like CString and the myriad of other string classes used by the multitude of C++ libraries out there.

Olaf van der Spek

unread,
Jan 11, 2013, 11:30:44 AM1/11/13
to std-pr...@isocpp.org
On Fri, Jan 11, 2013 at 5:17 PM, Nicol Bolas <jmck...@gmail.com> wrote:
> The point would be for code we don't own. Like CString and the myriad of
> other string classes used by the multitude of C++ libraries out there.

I hope the owners of these libs include support themselves.


--
Olaf

Nicol Bolas

unread,
Jan 11, 2013, 12:08:06 PM1/11/13
to std-pr...@isocpp.org

Reality always ensues. That's why we like customization points to be free functions, so that they can be implemented without modifying the class.
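
The thread doesn't pin down what such a mechanism would look like, so the following is only one hypothetical shape: a free function found by argument-dependent lookup. It assumes C++17, with std::string_view standing in for string_ref; vendor::legacy_string, as_string_view and count_spaces are all invented names.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace vendor {
// Stand-in for a third-party string class that we cannot modify.
class legacy_string {
public:
    legacy_string(const char* p, std::size_t n) : p_(p), n_(n) {}
    const char* buffer() const { return p_; }
    std::size_t length() const { return n_; }

private:
    const char* p_;
    std::size_t n_;
};

// Hypothetical customization point, written by the user rather than the
// vendor. It lives in the class's namespace so that ADL can find it.
inline std::string_view as_string_view(const legacy_string& s) {
    return std::string_view(s.buffer(), s.length());
}
}  // namespace vendor

// Generic code calls the customization point unqualified.
template <class S>
std::size_t count_spaces(const S& s) {
    std::size_t n = 0;
    for (char c : as_string_view(s)) {
        if (c == ' ') ++n;
    }
    return n;
}
```

Note that this still isn't an implicit conversion: a function taking the view type directly would need as_string_view(s) at the call site.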

Jeffrey Yasskin

unread,
Jan 11, 2013, 12:46:54 PM1/11/13
to std-pr...@isocpp.org
See also https://groups.google.com/a/isocpp.org/d/msg/std-proposals/8t5EFJfLn0I/26ihRCq8N2sJ.
I'm unconvinced that there's enough benefit from making up an external
adaptation mechanism to outweigh the fact that we know it's going to
be obsolete "soon", so it won't be in this mailing's paper. If there's
an outcry from more people, I can always add it to the pre-Bristol
mailing.

Thanks,
Jeffrey

Nicol Bolas

unread,
Jan 11, 2013, 1:08:56 PM1/11/13
to std-pr...@isocpp.org


On Friday, January 11, 2013 9:46:54 AM UTC-8, Jeffrey Yasskin wrote:
On Fri, Jan 11, 2013 at 9:08 AM, Nicol Bolas <jmck...@gmail.com> wrote:
>
>
> On Friday, January 11, 2013 8:30:44 AM UTC-8, Olaf van der Spek wrote:
>>
>> On Fri, Jan 11, 2013 at 5:17 PM, Nicol Bolas <jmck...@gmail.com> wrote:
>> > The point would be for code we don't own. Like CString and the myriad of
>> > other string classes used by the multitude of C++ libraries out there.
>>
>> I hope the owners of these libs include support themselves.
>
>
> Reality always ensues. That's why we like customization points to be free
> functions, so that they can be implemented without modifying the class.

See also https://groups.google.com/a/isocpp.org/d/msg/std-proposals/8t5EFJfLn0I/26ihRCq8N2sJ.

That's going to be no less of a problem than the eventual solution with contiguous iterators/ranges, because they'll still have to use existing customization points to expose them (std::begin/end). And those can still conflict between libraries. So I don't see how the eventual solution fixes this problem.

Also, I'm not sure I see the logic in not providing this for that reason. Given users of libraries A and users of libraries B, the only people who will encounter that problem are the intersection of those users. You are suggesting that we inconvenience the union of the users of libraries A and B. That's hurting a lot more people, which seems to me to not be a good thing.

Ville Voutilainen

unread,
Jan 11, 2013, 1:12:15 PM1/11/13
to std-pr...@isocpp.org
On 11 January 2013 20:08, Nicol Bolas <jmck...@gmail.com> wrote:
> Also, I'm not sure I see the logic in not providing this for that reason.
> Given users of libraries A and users of libraries B, the only people who
> will encounter that problem are the intersection of those users. You are
> suggesting that we inconvenience the union of the users of libraries A and
> B. That's hurting a lot more people, which seems to me to not be a good
> thing.

I fail to see what this storm in a teacup is about. I can use the
(const char*, size) constructor
to construct a string_view from most existing library strings.

tvan...@gmail.com

unread,
Jan 11, 2013, 2:04:06 PM1/11/13
to std-pr...@isocpp.org
I think the issue is auto conversion. 

Can a function that takes a string_view automatically take a CString, or do I need to call a conversion?

Sorry for top-posting.

Tony


Ville Voutilainen

unread,
Jan 11, 2013, 2:12:56 PM1/11/13
to std-pr...@isocpp.org
On 11 January 2013 21:04, <tvan...@gmail.com> wrote:
I think the issue is auto conversion. 

Can a function that takes a string_view automatically take a CString, or do I need to call a conversion. 

How many people does it help if string_view can do the automatic conversion, but string can't? Or do we plan
to allow string to perform such conversions too?

Vicente J. Botet Escriba

unread,
Jan 11, 2013, 3:06:01 PM1/11/13
to std-pr...@isocpp.org
On 11/01/13 01:54, Jeffrey Yasskin wrote:
Hi,

I remember that someone suggested removing the clear() member function,
as its behavior doesn't correspond to string::clear(). I don't remember
the outcome of this remark.

Vicente

Jeffrey Yasskin

unread,
Jan 11, 2013, 3:25:23 PM1/11/13
to std-pr...@isocpp.org
On Fri, Jan 11, 2013 at 12:06 PM, Vicente J. Botet Escriba
<vicent...@wanadoo.fr> wrote:
> On 11/01/13 01:54, Jeffrey Yasskin wrote:
IIRC, Olaf suggested it (because I didn't take his suggestion to
rename remove_prefix to pop_front), and I decided not to take the
suggestion unless I hear wider agreement.

Nicol Bolas

unread,
Jan 11, 2013, 3:40:43 PM1/11/13
to std-pr...@isocpp.org

We shouldn't limit new classes just because of limitations on old ones. Making the same mistake over and over again isn't helping anyone. Consistency isn't as important as correctness and useful functionality.

Furthermore, that CString->std::wstring conversion would have to be a copy, whereas the conversion to std::wstring_ref/view would not be. That's why it's far more useful to have this implicit conversion because it takes no real time or performance. Whereas a conversion to basic_string would be something significant.
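
A small sketch of the difference, assuming C++17 with std::wstring_view standing in for wstring_ref; observation and mutate_and_observe are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <string_view>

struct observation {
    wchar_t seen_by_view;
    wchar_t seen_by_copy;
};

// Builds a view and a copy of the same source, then edits the source in place.
observation mutate_and_observe() {
    std::wstring source = L"abc";
    std::wstring_view view = source;  // O(1): stores a pointer and a length
    std::wstring copy = source;       // O(N): copies every character
    source[0] = L'x';                 // in-place edit, no reallocation
    return {view[0], copy[0]};
}
```

The view sees the edit because it shares the source's buffer, while the copy keeps the old character because it paid for its own storage.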

Ville Voutilainen

unread,
Jan 11, 2013, 3:45:00 PM1/11/13
to std-pr...@isocpp.org
On 11 January 2013 22:40, Nicol Bolas <jmck...@gmail.com> wrote:
>> How many people does it help if string_view can do the automatic
>> conversion, but string can't? Or do we plan
>> to allow string to perform such conversions too?
> We shouldn't limit new classes just because of limitations on old ones.
> Making the same mistake over and over again isn't helping anyone.
> Consistency isn't as important as correctness and useful functionality.

Well consistency and convenience would be the argument for having such
conversions
for string as well...

> Furthermore, that CString->std::wstring conversion would have to be a copy,
> whereas the conversion to std::wstring_ref/view would not be. That's why
> it's far more useful to have this implicit conversion because it takes no
> real time or performance. Whereas a conversion to basic_string would be
> something significant.

...but this is a reasonable argument against it.

Vicente J. Botet Escriba

unread,
Jan 12, 2013, 7:12:26 AM1/12/13
to std-pr...@isocpp.org
On 11/01/13 21:25, Jeffrey Yasskin wrote:
> On Fri, Jan 11, 2013 at 12:06 PM, Vicente J. Botet Escriba
> <vicent...@wanadoo.fr> wrote:
>> On 11/01/13 01:54, Jeffrey Yasskin wrote:
The same argument that renamed pop_front to remove_prefix should apply:
the clear function doesn't have the same behavior on string_ref and
string, so either the function is removed or a different name is used.
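
To illustrate the difference, here is a sketch with a hypothetical cut-down view type (tiny_view is not the proposed class):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical minimal view with the clear() under discussion.
struct tiny_view {
    const char* ptr;
    std::size_t len;

    // Resets this view to empty. The characters it referred to are untouched,
    // unlike std::string::clear(), which erases the string's own contents.
    void clear() noexcept {
        ptr = nullptr;
        len = 0;
    }
};
```

After std::string owner = "hello"; tiny_view r{owner.data(), owner.size()}; r.clear(); the view is empty but owner still holds "hello". Calling owner.clear() is what actually discards the characters.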

Vicente


Olaf van der Spek

unread,
Jan 12, 2013, 10:31:20 AM1/12/13
to std-pr...@isocpp.org
On Fri, Jan 11, 2013 at 9:25 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> I remember that someone suggested removing the clear() member function, as
>> its behavior doesn't correspond to string::clear(). I don't remember the
>> outcome of this remark.
>
> IIRC, Olaf suggested it (because I didn't take his suggestion to
> rename remove_prefix to pop_front), and I decided not to take the
> suggestion unless I hear wider agreement.

You make me sound like a bad guy. :p
I was merely surprised with the inconsistency. I don't want clear() to
be renamed.


On Fri, Jan 11, 2013 at 8:12 PM, Ville Voutilainen
<ville.vo...@gmail.com> wrote:
> How many people does it help if string_view can do the automatic conversion,
> but string can't? Or do we plan

Lots

> to allow string to perform such conversions too?

I assume we do.

--
Olaf

Sebastian Gesemann

unread,
Jan 22, 2013, 7:21:28 AM1/22/13
to std-pr...@isocpp.org
Just a comment on the string_ref proposal regarding one of the
comments to the "why not change ... ?" questions:

> Make basic_string_ref<char> mutable
> … and use basic_string_ref<const char> for the constant case. The
> constant case is enough more common than the mutable case that it
> needs to be the default. Making the mutable case the default would
> prevent passing string literals into string_ref parameters, [...]
> We could use typedef basic_string_ref<const char> string_ref to
> make the immutable case the default while still supporting the
> mutable case using the same template. I haven't gone this way
> because it would complicate the template's definition without
> significantly helping users.

I think that in any case -- whether mutability will be included or not
-- the name of the pseudo string reference type to a constant string
should reflect the constness: string_cref. So, with mutability support
we could simply define the type aliases like this:

using string_cref = basic_string_ref<const char>;
using string_ref = basic_string_ref<char>;

and without mutability support I would prefer adding 'c' also to
basic_string_ref as well:

using string_cref = basic_string_cref<char>;

just my two cents,
Sebastian

cart...@gmail.com

unread,
Jan 22, 2013, 9:37:51 AM1/22/13
to std-pr...@isocpp.org
When one of the stated goals of the proposal is: 

Note: The library provides implicit conversions from const charT* and std::basic_string<charT, ...> to std::basic_string_ref<charT, ...> so that user code can accept just std::basic_string_ref<charT> as a parameter wherever a sequence of characters is expected

it seems like a failure to me that many of the string operations are declared with a pair of const charT* and basic_string_ref overloads. E.g.,

constexpr int compare(basic_string_ref s) const noexcept;
constexpr int compare(const charT* s) const noexcept;

If a single overload on basic_string_ref isn't good enough for basic_string_ref members, why should it be acceptable for "user code"?


Add a sub-subclause "x.y.2 basic_string_ref iterator support [string.ref.iterators]"
typedef implementation-defined const_iterator;
A random-access, contiguous iterator type.

Where is the notion of a contiguous iterator defined?


In the declaration of the interface, const charT& basic_string_ref::at( size_t pos ) const 
  1. Is declared with a parameter of type size_t instead of size_type
  2. is not declared constexpr
which is inconsistent with the detailed description later on: constexpr const_reference at( size_type pos ) const.

Jeffrey Yasskin

unread,
Jan 22, 2013, 12:43:01 PM1/22/13
to std-pr...@isocpp.org
The thread you want is
https://groups.google.com/a/isocpp.org/d/topic/std-proposals/5sW8yp5i8mo/discussion.
I'm going to ignore naming discussions on this thread.

Thanks,
Jeffrey

Olaf van der Spek

unread,
Jan 27, 2013, 9:15:06 AM1/27/13
to std-pr...@isocpp.org
On Thursday, December 27, 2012 12:44:38 AM UTC+1, Jeffrey Yasskin wrote:
I finally have an update for the string_ref proposal, converting it to
a set of changes against the C++14 draft. The most recent version
lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
and I've attached a snapshot.

explicit basic_string(basic_string_ref<charT, traits> str, const Allocator& a = Allocator()); 

Why is this string constructor explicit? 

I expect a lot of code to be like:

string_ref f();

string a = f(); // construction
a = f(); // assignment

I've heard the cost argument, but string construction is known to be expensive and construction from const char* or const string& is as expensive but implicit.
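
A sketch of what the explicit constructor means for that code. It assumes a C++17 toolchain, with std::string_view standing in for string_ref; get_name and make_copy are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical function returning a view, like f() above.
std::string_view get_name() { return "hello"; }

// The constructor exists but is explicit, so only direct-initialization works.
static_assert(std::is_constructible<std::string, std::string_view>::value,
              "std::string a(view) compiles");
static_assert(!std::is_convertible<std::string_view, std::string>::value,
              "std::string a = view does not");

std::string make_copy() {
    std::string a(get_name());      // OK: direct-initialization
    // std::string b = get_name();  // ill-formed: the constructor is explicit
    a = get_name();                 // OK in C++17: assignment from a view
    return a;
}
```

So under these assumptions the assignment line compiles as written, and only the copy-initialization form needs to be spelled std::string a(f()).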

Jeffrey Yasskin

unread,
Mar 15, 2013, 2:41:43 AM3/15/13
to std-pr...@isocpp.org
On Tue, Jan 22, 2013 at 6:37 AM, <cart...@gmail.com> wrote:
> When one of the stated goals of the proposal is:
>
>> Note: The library provides implicit conversions from const charT* and
>> std::basic_string<charT, ...> to std::basic_string_ref<charT, ...> so that
>> user code can accept just std::basic_string_ref<charT> as a parameter
>> wherever a sequence of characters is expected
>
>
> it seems like a failure to me that many of the string operations are
> declared with a pair of const charT* and basic_string_ref overloads. E.g.,
>
>> constexpr int compare(basic_string_ref s) const noexcept;
>> constexpr int compare(const charT* s) const noexcept;
>
>
> If a single overload on basic_string_ref isn't good enough for
> basic_string_ref members, why should it be acceptable for "user code"?

The standard has to be much more careful about allowing optimization
than user code does. That is, if I don't add the charT* overload, then
implementations aren't allowed to optimize the case of a very long
charT* string for which it could bail out of compare() before
calculating the whole length. This was requested in the LWG discussion
in Portland.
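
As a rough illustration of the optimization the extra overload permits, here is a comparison that never computes the C string's length up front. compare_cstr is a hypothetical free function, not the wording from the paper.

```cpp
#include <cassert>
#include <cstddef>

// Compares the characters [data, data + size) with the NUL-terminated string s.
// Returns a negative value, zero, or a positive value, like compare().
// It stops at the first difference, so a very long s is never fully scanned.
int compare_cstr(const char* data, std::size_t size, const char* s) {
    for (std::size_t i = 0; i < size; ++i) {
        if (s[i] == '\0') return 1;  // s ended first, so it is the shorter one
        const unsigned char a = static_cast<unsigned char>(data[i]);
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if (a != b) return a < b ? -1 : 1;
    }
    // All size characters matched; s is equal only if it ends here too.
    return s[size] == '\0' ? 0 : -1;
}
```

A version that first converted s to a string_ref would have to run strlen over all of s even when the first character already differs.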

In the codebases I've seen that actually use string_ref (LLVM, Google,
and Chromium), the worry about long charT* strings has never been an
actual problem, and functions are just declared with a single
string_ref overload. The worry about backward-compatibility that
forced all the non-basic_string_ref members to also include a
basic_string overload has also never been a problem with these
codebases.

>> Add a sub-subclause "x.y.2 basic_string_ref iterator support
>> [string.ref.iterators]"
>> typedef implementation-defined const_iterator;
>> A random-access, contiguous iterator type.
>
>
> Where is the notion of a contiguous iterator defined?

It wasn't. Is now. Thanks. :)

> In the declaration of the interface, const charT& basic_string_ref::at(
> size_t pos ) const
>
> Is declared with a parameter of type size_t instead of size_type
> is not declared constexpr
>
> which is inconsistent with the detailed description later on: constexpr
> const_reference at( size_type pos ) const.

Also fixed. Thanks. Sorry for the long delay in getting back to you.

http://htmlpreview.github.com/?https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_view.html

Olaf van der Spek

unread,
Oct 10, 2014, 3:33:44 PM10/10/14
to std-pr...@isocpp.org
Somebody? 

Ville Voutilainen

unread,
Oct 10, 2014, 4:41:35 PM10/10/14
to std-pr...@isocpp.org
http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4081.html#string.view.ops
"This conversion is explicit to avoid accidental O(N) operations on
type mismatches."

Olaf van der Spek

unread,
Oct 11, 2014, 8:31:10 AM10/11/14
to std-pr...@isocpp.org
On Fri, Oct 10, 2014 at 10:41 PM, Ville Voutilainen
<ville.vo...@gmail.com> wrote:
>>> Why is this string constructor explicit?
>>>
>>> I expect a lot of code to be like:
>>>
>>> string_ref f();
>>>
>>> string a = f(); // construction
>>> a = f(); // assignment
>>>
>>> I've heard the cost argument, but string construction is known to be
>>> expensive and construction from const char* or const string& is as expensive
>>> but implicit.
>>
>>
>> Somebody?
>
> http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4081.html#string.view.ops
> "This conversion is explicit to avoid accidental O(N) operations on
> type mismatches."

That's what I said. Just to be clear, this is inconsistent with the
string constructor from char*, isn't it?


--
Olaf

Ville Voutilainen

unread,
Oct 11, 2014, 8:42:38 AM10/11/14
to std-pr...@isocpp.org
On 11 October 2014 15:31, Olaf van der Spek <olafv...@gmail.com> wrote:
>>>> I've heard the cost argument, but string construction is known to be
>>>> expensive and construction from const char* or const string& is as expensive
>>>> but implicit.
>>>
>>>
>>> Somebody?
>>
>> http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4081.html#string.view.ops
>> "This conversion is explicit to avoid accidental O(N) operations on
>> type mismatches."
>
> That's what I said. Just to be clear, this is inconsistent with the
> string constructor from char*, isn't it?

Oh, pardon me, I misread that signature to be capable of converting to a string
with a different character type. Yes, it's inconsistent with the string
constructor from char*; apparently string_view is trying to be more
protective. Do remember that the string constructor from char* was made
non-explicit because it was thought very common to invoke a
void foo(const std::string&);
with
foo("literal");
Apparently the same kind of commonality hasn't been convincing for string_view.
I'll dig through some committee discussions to see what, if anything, has been
said about this point.

Matthew Fioravante

unread,
Oct 23, 2014, 11:56:59 PM10/23/14
to std-pr...@isocpp.org


On Friday, October 10, 2014 4:41:35 PM UTC-4, Ville Voutilainen wrote:

http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4081.html#string.view.ops
"This conversion is explicit to avoid accidental O(N) operations on
type mismatches."

I'm not sure I agree with this decision. In my view (pun not intended), string_view should become the default way to pass around const references to strings. That is, passing around const std::string& and const char* will disappear in favor of std::string_view (by value). Similarly, we won't pass around const std::vector<T>& anymore but instead std::array_view<T>. Not only are the views more efficient (one less indirection), but they also make interfaces much more flexible by accepting a range-like concept instead of a hard container type. Having to say foo(std::string_view("abc")) is very clumsy.

The most common use of the const char* constructor is for string literals. Constructing a string_view from a string literal will not be O(N) because the compiler will be able to inline the call. Constructing a string_view from an arbitrary const char* will require an O(N) strlen(), but that's already the reality of dealing with null-terminated strings: you call strlen() over and over again at each level of the call stack. This is one reason why this proposal is so important and why null-terminated strings need to go away. Once string_view becomes the standard, const char* will start to go away except for C compatibility and other edge cases, and the O(N) issue will be a corner case, not a real problem.


Matthew Fioravante

Oct 24, 2014, 12:14:46 AM10/24/14
to std-pr...@isocpp.org
By-value string_view should become the de facto way to pass around read-only strings, and const string& and const char* will go away. Similarly, we won't pass around const vector<T>& anymore, in favor of array_view<T> (by value).

Passing the views by value is more efficient because there is one less indirection. It's the same as directly passing the raw pointer to the character array plus the length, instead of passing a pointer to the pointer to the character array (const string&). More importantly, it also makes our interfaces much more flexible because the string data can be sourced from anything (string, char[], char*, a user-defined string class, etc.).
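
The flexibility argument can be sketched as follows. This assumes std::string_view as later standardized in C++17 rather than the draft discussed in this thread; count_char and count_from_many_sources are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical function: counts occurrences of c in read-only text.
// The view is taken by value: it is just a pointer plus a length.
std::size_t count_char(std::string_view text, char c) {
    std::size_t n = 0;
    for (char ch : text) {
        if (ch == c) ++n;
    }
    return n;
}

// Hypothetical helper: feeds count_char from differently sourced data.
std::size_t count_from_many_sources() {
    std::string owned = "banana";
    char buffer[] = "banana";
    const char* raw = "xxbananaxx";
    return count_char(owned, 'a')                          // from std::string
         + count_char("banana", 'a')                       // from a string literal
         + count_char(buffer, 'a')                         // from a char array
         + count_char(std::string_view(raw + 2, 6), 'a');  // from pointer + length
}
```

The same signature serves all four callers without copying any characters; a const std::string& parameter would force a temporary std::string for every source except the first.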

The most common use case of a char* constructor is for string literals. In this use case, the call to strlen() will trivially be optimized out. When string_view becomes standard practice, const char* will show up less often (only for legacy code and C compatibility), and thus the potential to have sneaky O(N) strlen() calls hidden in your code will be reduced.
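
As a rough illustration of the literal case, and assuming std::string_view as later standardized in C++17 (where the const char* constructor is constexpr), the length of a literal can be computed entirely at compile time. runtime_length is a hypothetical helper showing the contrasting run-time case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// For a literal, the length scan can be done by the compiler:
// the static_assert below is evaluated at compile time.
constexpr std::string_view greeting = "hello";
static_assert(greeting.size() == 5);

// Hypothetical helper: for an arbitrary pointer, the scan for the
// terminating null happens at run time, just like std::strlen(p).
std::size_t runtime_length(const char* p) {
    return std::string_view(p).size();
}
```

Whether an optimizer removes the scan in a non-constexpr context depends on the compiler and on inlining, so the sketch only demonstrates the guaranteed compile-time case.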

Finally, when users really do want to construct from an arbitrary const char*, they should already know that they have to worry about strlen() calls, because that's the nature of null-terminated strings. You have to call strlen() over and over again at each level of the call stack.

In order to make string_view the de facto string type, it must be easy to use. If we cannot say foo("abc") but instead have to say foo(string_view("abc")), it's going to be a big impediment to string_view. I believe that in this case, ease of use should win out over potential performance bugs due to hidden strlen() calls, which will go away once string_view is the normal way to represent read-only strings.

Olaf van der Spek

Oct 25, 2014, 10:43:19 AM10/25/14
to std-pr...@isocpp.org
On Fri, Oct 24, 2014 at 5:56 AM, Matthew Fioravante
<fmatth...@gmail.com> wrote:
> I'm not sure I agree with this decision. In my view (pun not intended),
> string_view should become the default way to pass around const references to
> string. That is passing around const std::string& and const char* will
> disappear in favor of std::string_view (by value). Similarly we won't pass
> around const std::vector<T>& anymore but instead std::array_view<T>. Not

std::array_view<const T> ;)

> only are the views more efficient (one less indirection), but they also make
> interfaces much more flexible by accepting a range like concept instead of
> hard container type. Having to say foo(std::string_view("abc")) is very
> clumsy.

foo("abc"sv) might work, but I think foo("abc") is a bit cleaner.
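
For comparison, here is how the two spellings look with std::string_view as later standardized in C++17; this is an assumption about the final library, not about the draft being discussed. foo is a hypothetical stand-in for the function in the thread, and call_both_ways is added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for the foo discussed in the thread.
std::size_t foo(std::string_view s) { return s.size(); }

// Hypothetical helper: calls foo with and without the literal suffix.
std::size_t call_both_ways() {
    using namespace std::literals;  // brings operator""sv into scope
    return foo("abc"sv)  // length comes from the literal itself, no scan
         + foo("abc");   // implicit const char* -> string_view in C++17
}
```

In the standardized library both calls compile; the sv suffix mainly matters when the literal contains embedded null characters or when an overload set needs disambiguation.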

Nicola Gigante

Oct 25, 2014, 11:04:00 AM10/25/14
to std-pr...@isocpp.org
Hi all,

In my opinion, the inability to call a function with a string literal without extra syntax
is going to kill the usage of string_view, especially in code written by novices or
less experienced programmers, who are precisely the people who should be most
encouraged to use safe, high-level facilities.

The O(n) strlen() call on a string literal is a false problem anyway. Compilers inline
calls like that.

But even if we’re worried by the cost of a non-inlined strlen() being overlooked,
why don’t we provide a non-explicit constructor that specifically works only with
string literals? i.e.

template<size_t N>
basic_string_view(const CharT (&literal)[N]);

This way, conversions from string literals would be convenient and costly conversions
from unknown-length strings would be safely made explicit.

What are the negative aspects of this solution?

Bye,
Nicola

Olaf van der Spek

Oct 25, 2014, 11:09:36 AM10/25/14
to std-pr...@isocpp.org
On Sat, Oct 25, 2014 at 5:03 PM, Nicola Gigante
<nicola....@gmail.com> wrote:
> template<size_t N>
> basic_string_view(const CharT (&literal)[N]);
>
> This way, conversions from string literals would be convenient and costly conversions
> from unknown-length strings would be safely made explicit.
>
> What are the negative aspects of this solution?

It 'works' for arrays (buffers) too but doesn't use the right size
unless the string consumes the entire buffer.
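
The pitfall can be reproduced with a small sketch. literal_view is a hypothetical minimal type written only to illustrate a constructor that takes a reference to an array; it is not part of any proposal text, and the helper functions are likewise made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal view type, only to illustrate the idea.
struct literal_view {
    const char* data;
    std::size_t size;

    // Binds to any char array, not only to string literals.
    // N counts the terminating null of a literal, hence N - 1.
    template <std::size_t N>
    constexpr literal_view(const char (&arr)[N]) : data(arr), size(N - 1) {}
};

// A literal "abc" is an array of 4 chars, so the size is 3 as intended.
std::size_t size_from_literal() {
    return literal_view("abc").size;
}

// A partially filled buffer is also an array, so the constructor is
// selected and reports the buffer capacity minus one, not the text length.
std::size_t size_from_buffer() {
    char buffer[16] = "abc";
    return literal_view(buffer).size;
}
```

The type system cannot tell a string literal from any other char array, so the array constructor silently yields 15 for a buffer holding a 3-character string.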


--
Olaf

Nicola Gigante

Oct 25, 2014, 11:30:08 AM10/25/14
to std-pr...@isocpp.org
Good point!

Bye,
Nicola

Billy Donahue

Oct 26, 2014, 11:24:47 PM10/26/14