Update to string_ref proposal

Jeffrey Yasskin

Dec 26, 2012, 6:44:38 PM
to std-pr...@isocpp.org
Hi folks,

I finally have an update for the string_ref proposal, converting it to
a set of changes against the C++14 draft. The most recent version
lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
and I've attached a snapshot.

The LWG is still deciding whether this should aim for a TS or C++14
(or both?), and feedback from here will help inform that decision.
This paper currently assumes C++14.

Let me know what you think,
Jeffrey Yasskin
string_ref.html

Sommerlad Peter (peter.sommerlad@hsr.ch)

Dec 28, 2012, 5:05:36 AM
to std-pr...@isocpp.org
Hi,

> An alternate way to get compile-time string_refs would be to define a user-defined literal operator à la N3468. Arguably, basic_string_ref is a better candidate for the ""s suffix than basic_string since basic_string isn't a literal type.

As the author of N3468, I understand your wish for a constexpr literal suffix operator for string_ref. However, I propose that you not insist that operator"" s() be usable only to create basic_string_ref.

It is a matter of teachability (string_ref will be "new"), and also of the fact that you may actually want a std::string variable when writing:

auto hi="hello"s;
hi.append(", world!"s);

which is one of my major reasons for contributing operator"" s().

We have two options:

provide

constexpr std::string_ref operator"" s(char const *, size_t);

in a separate namespace std::literals::string_ref, which makes mixing std::string operator"" s(char const *, size_t) in the same scope impossible, or choose a different name for the suffix, e.g., sr:

constexpr std::string_ref operator"" sr(char const *, size_t);

or, if that seems too much to type,

constexpr std::string_ref operator"" r(char const *, size_t);
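The second option can be sketched roughly as follows. This is only an illustration under stated assumptions: string_ref here is a hypothetical minimal stand-in (a pointer and a length), the namespace names are made up, and the suffixes are spelled _s and _sr because suffixes without a leading underscore are reserved for the standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical minimal stand-in for the proposed string_ref: a non-owning
// (pointer, length) pair that is a literal type.
struct string_ref {
    const char* ptr;
    std::size_t len;
    constexpr const char* data() const { return ptr; }
    constexpr std::size_t size() const { return len; }
};

namespace demo {
namespace string_literals {
// Owning literal: the result is a std::string and can be appended to.
inline std::string operator""_s(const char* p, std::size_t n) {
    return std::string(p, n);
}
}  // namespace string_literals

namespace string_ref_literals {
// Non-owning literal under a different suffix, so both suffixes can be
// brought into the same scope without clashing.
constexpr string_ref operator""_sr(const char* p, std::size_t n) {
    return string_ref{p, n};
}
}  // namespace string_ref_literals
}  // namespace demo
```

With both namespaces in scope, "hello"_s yields a std::string that supports append(), while "hello"_sr yields a string_ref usable in constant expressions.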

Otherwise, keep on with the good work.

Regards
Peter.
> --
>
>
>
> <string_ref.html>

--
Prof. Peter Sommerlad

Institut für Software: Bessere Software - Einfach, Schneller!
HSR Hochschule für Technik Rapperswil
Oberseestr 10, Postfach 1475, CH-8640 Rapperswil

http://ifs.hsr.ch http://cute-test.com http://linticator.com http://includator.com
tel:+41 55 222 49 84 == mobile:+41 79 432 23 32
fax:+41 55 222 46 29 == mailto:peter.s...@hsr.ch





Olaf van der Spek

Dec 28, 2012, 8:04:54 AM
to std-pr...@isocpp.org
On Thursday, December 27, 2012 12:44:38 AM UTC+1, Jeffrey Yasskin wrote:
I finally have an update for the string_ref proposal, converting it to
a set of changes against the C++14 draft. The most recent version
lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
and I've attached a snapshot.

The LWG is still deciding whether this should aim for a TS or C++14
(or both?), and feedback from here will help inform that decision.

What are the pros and cons?

std::string, vector and boost::iterator_range have pop_front() and pop_back(). String_ref isn't a container but pop_front() and pop_back() would still be useful. Could they be added?
Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?

Is there still time for bike shedding? str_ref is ever so slightly shorter. :p

String_ref obviously can't be null-terminated, but C strings aren't going anywhere any time soon unfortunately. Maybe a future zstr_ref proposal could address this.
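A small sketch of why this matters, using a hypothetical minimal string_ref (pointer plus length) and a made-up helper to_owned(): a view into the middle of a buffer has no terminator of its own, so a C API reads past it unless the characters are copied first.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical minimal stand-in for the proposed string_ref.
struct string_ref {
    const char* ptr;
    std::size_t len;
};

// Copies the referenced characters into a std::string, whose c_str() is
// guaranteed to be null-terminated.
inline std::string to_owned(string_ref r) {
    return std::string(r.ptr, r.len);
}
```

For example, with string_ref key{"key=value", 3}, std::strlen(key.ptr) sees all nine characters, while to_owned(key) is exactly "key".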

Beman Dawes

Dec 29, 2012, 8:42:49 AM
to std-pr...@isocpp.org
On Fri, Dec 28, 2012 at 8:04 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Thursday, December 27, 2012 12:44:38 AM UTC+1, Jeffrey Yasskin wrote:
>>
>> I finally have an update for the string_ref proposal, converting it to
>> a set of changes against the C++14 draft. The most recent version
>> lives at
>> https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
>> and I've attached a snapshot.

> Is there still time for bike shedding? str_ref is ever so slightly shorter.

Names are important, and I've been planning to start such a discussion next
week when more people will be reading this list. But if you want to
start such a discussion now, feel free to do so. However, please start
a separate message thread, with a subject line like "[string-ref] Name
suggestion" so that (1) those that don't care can skip reading the
thread and (2) other technical comments don't get lost in the noise.

Thanks,

--Beman

Olaf van der Spek

Dec 29, 2012, 11:26:53 AM
to std-pr...@isocpp.org
On Sat, Dec 29, 2012 at 2:42 PM, Beman Dawes <bda...@acm.org> wrote:
>> Is there still time for bike shedding? str_ref is ever so slightly shorter.
>
> Names are important, and I've been planning to start such a discussion next
> week when more people will be reading this list. But if you want to
> start such a discussion now, feel free to do so. However, please start
> a separate message thread, with a subject line like "[string-ref] Name
> suggestion" so that (1) those that don't care can skip reading the
> thread and (2) other technical comments don't get lost in the noise.

I'll let you have the honor.

BTW, you've got mail.

--
Olaf

Beman Dawes

Dec 30, 2012, 1:55:55 PM
to Jeffrey Yasskin, std-pr...@isocpp.org
On Wed, Dec 26, 2012 at 6:44 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
> Hi folks,
>
> I finally have an update for the string_ref proposal, converting it to
> a set of changes against the C++14 draft. The most recent version
> lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
> and I've attached a snapshot.

From the note in basic.string.ref:

"... User-defined types should define their own implicit conversions
to std::basic_string_ref in order to interoperate with these
functions."

The concern is that this requires an invasive change to the UDT.
That's a particular problem for 3rd party UDT's or UDT's the developer
has no control over.

Have you considered a generic constructor instead? For example,

template <class String>
basic_string_ref(const String& s)
  : _beg(::std::string_ref_begin(s)), _end(::std::string_ref_end(s)) {}

Where _beg and _end are the private data members.

The user is granted permission to provide the two adapter functions in
namespace std.

For a TS, helper functions could be provided (in std) for std::basic_string:

// basic_string_ref helpers
template <class charT, class traits, class Allocator>
inline
const charT* string_ref_begin(const basic_string<charT, traits, Allocator>& s)
{
  return s.c_str();
}
template <class charT, class traits, class Allocator>
inline
const charT* string_ref_end(const basic_string<charT, traits, Allocator>& s)
{
  return s.c_str() + s.size();
}

A quick test of this is available at https://github.com/Beman/string_ref_ex.
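To make the idea concrete, here is a self-contained sketch of the same pattern. It is not the code from the repository above: the namespace sketch stands in for std, string_ref is a non-template simplification of basic_string_ref, and thirdparty::Buffer is a made-up type that the user cannot modify.

```cpp
#include <cstddef>
#include <string>

// Hypothetical third-party type that cannot be changed.
namespace thirdparty {
struct Buffer {
    const char* chars;
    std::size_t count;
};
}  // namespace thirdparty

namespace sketch {
// Adapter functions for std::string, mirroring the helpers above.
inline const char* string_ref_begin(const std::string& s) { return s.data(); }
inline const char* string_ref_end(const std::string& s) { return s.data() + s.size(); }

// Adapter functions supplied by the user for the third-party type. They are
// declared before string_ref so the qualified calls below can see them.
inline const char* string_ref_begin(const thirdparty::Buffer& b) { return b.chars; }
inline const char* string_ref_end(const thirdparty::Buffer& b) { return b.chars + b.count; }

class string_ref {
public:
    // Generic constructor: any type with the two adapters converts.
    template <class String>
    string_ref(const String& s)
        : beg_(sketch::string_ref_begin(s)), end_(sketch::string_ref_end(s)) {}

    const char* data() const { return beg_; }
    std::size_t size() const { return static_cast<std::size_t>(end_ - beg_); }

private:
    const char* beg_;
    const char* end_;
};
}  // namespace sketch
```

Both sketch::string_ref(std::string("abc")) style conversions and conversions from thirdparty::Buffer then work without touching either type.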

--Beman

Dean Michael Berris

Jan 1, 2013, 8:15:25 PM
to std-pr...@isocpp.org, Jeffrey Yasskin
On Mon, Dec 31, 2012 at 5:55 AM, Beman Dawes <bda...@acm.org> wrote:
> On Wed, Dec 26, 2012 at 6:44 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> Hi folks,
>>
>> I finally have an update for the string_ref proposal, converting it to
>> a set of changes against the C++14 draft. The most recent version
>> lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
>> and I've attached a snapshot.
>
> From the note in basic.string.ref:
>
> "... User-defined types should define their own implicit conversions
> to std::basic_string_ref in order to interoperate with these
> functions."
>
> The concern is that this requires an invasive change to the UDT.
> That's a particular problem for 3rd party UDT's or UDT's the developer
> has no control over.
>
> Have you considered a generic constructor instead? For example,
>
> template <class String>
> basic_string_ref(const String& s)
> : _beg(::std::string_ref_begin(s)), _end(::std::string_ref_end(s)) {}
>
> Where _beg and _end are the private data members.
>
> The user is granted permission to provide the two adapter functions in
> namespace std.
>

Why not just rely on ADL?
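A rough sketch of what I mean, with made-up names (sketch::string_ref as a simplified stand-in, thirdparty::Buffer as an example type): the constructor calls the adapters unqualified, so they are found by argument-dependent lookup in the namespace of the argument type rather than in std.

```cpp
#include <cstddef>

namespace sketch {
class string_ref {
public:
    template <class String>
    string_ref(const String& s)
        // Unqualified calls: looked up via ADL in the namespace of String
        // when the constructor is instantiated.
        : beg_(string_ref_begin(s)), end_(string_ref_end(s)) {}

    const char* data() const { return beg_; }
    std::size_t size() const { return static_cast<std::size_t>(end_ - beg_); }

private:
    const char* beg_;
    const char* end_;
};
}  // namespace sketch

// Hypothetical string-like type. The adapters live next to the type, in its
// own namespace, and may be declared after string_ref.
namespace thirdparty {
struct Buffer {
    const char* chars;
    std::size_t count;
};
inline const char* string_ref_begin(const Buffer& b) { return b.chars; }
inline const char* string_ref_end(const Buffer& b) { return b.chars + b.count; }
}  // namespace thirdparty
```

The catch is that the adapters have to live in the type's namespace, which for a true third-party type may be as awkward as adding them to std.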

--
Dean Michael Berris
Google

Jeffrey Yasskin

Jan 2, 2013, 12:08:45 AM
to std-pr...@isocpp.org
On Fri, Dec 28, 2012 at 8:04 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Thursday, December 27, 2012 12:44:38 AM UTC+1, Jeffrey Yasskin wrote:
>>
>> I finally have an update for the string_ref proposal, converting it to
>> a set of changes against the C++14 draft. The most recent version
>> lives at
>> https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
>> and I've attached a snapshot.
>>
>> The LWG is still deciding whether this should aim for a TS or C++14
>> (or both?), and feedback from here will help inform that decision.
>>
> What are the pros and cons?

Overall, we want string_ref in an official document of some sort as
soon as possible so that other TS'en can depend on it.

The bar for including string_ref in a TS is lower, so it's more likely
string_ref would get into the next one. It would also serve as initial
content for the "utility" TS, which may make it easier to add new
things. Putting it in a TS will allow us to change the class in
incompatible ways if we discover problems. However, a TS will be
explicitly beta and may require user code that adopts string_ref to
change the namespace of the class it uses when string_ref is
incorporated into a future C++ standard, even if we don't make any
changes to the class.

C++14 is more official and is a firmer basis for other TS'en. If we
have consensus that string_ref is the right thing to do, we shouldn't
delay it artificially by including an unnecessary TS step. Putting it
directly into the draft standard would let us stop discussing the
whole proposal repeatedly and just discuss changes. (It's possible
putting it in a TS would have the same effect, but I'm less confident
of that.)

Beman may have other tradeoffs to add.

> std::string, vector and boost::iterator_range have pop_front() and
> pop_back(). String_ref isn't a container but pop_front() and pop_back()
> would still be useful. Could they be added?
> Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?

When we discussed N3350
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html#classstd_1_1range_1a3c743e4e4b85290682234af4c2d7e50f),
there was an argument that pop_front is expected to destroy the
elements it pops, since it does so everywhere in the standard that it
currently exists. The committee seemed to prefer another name. Clearly
pop_front()==remove_prefix(1), so I don't see an urgent need to add
another method for this purpose. If I get clear direction from the
committee that they want to overload pop_front to have this meaning,
I'd be happy to change it back, but I don't expect to get that
direction.

> Is there still time for bike shedding? str_ref is ever so slightly shorter.
> :p

+1 for Beman's suggestion of starting a new thread for this. I don't
intend to participate much on that thread if at all, but I'm happy to
change the proposal if the discussion leans toward a particular other
option.

> String_ref obviously can't be null-terminated, but C strings aren't going
> anywhere any time soon unfortunately. Maybe a future zstr_ref proposal could
> address this.

Definitely a future proposal. I'll add a note to the paper mentioning this.

Also, sorry for forgetting your .data()/.size() constructor
suggestion. I've added a TODO to my local copy of the paper so I won't
forget it again.

Jeffrey

Jeffrey Yasskin

Jan 2, 2013, 12:17:46 AM
to Beman Dawes, std-pr...@isocpp.org
On Sun, Dec 30, 2012 at 1:55 PM, Beman Dawes <bda...@acm.org> wrote:
> On Wed, Dec 26, 2012 at 6:44 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> Hi folks,
>>
>> I finally have an update for the string_ref proposal, converting it to
>> a set of changes against the C++14 draft. The most recent version
>> lives at https://github.com/google/cxx-std-draft/blob/string-ref-paper/string_ref.html,
>> and I've attached a snapshot.
>
> From the note in basic.string.ref:
>
> "... User-defined types should define their own implicit conversions
> to std::basic_string_ref in order to interoperate with these
> functions."
>
> The concern is that this requires an invasive change to the UDT.
> That's a particular problem for 3rd party UDT's or UDT's the developer
> has no control over.
>
> Have you considered a generic constructor instead? For example,
>
> template <class String>
> basic_string_ref(const String& s)
> : _beg(::std::string_ref_begin(s)), _end(::std::string_ref_end(s)) {}
>
> Where _beg and _end are the private data members.
>
> The user is granted permission to provide the two adapter functions in
> namespace std.

I've had bad luck with this kind of adaptation function causing ODR
violations in the past. That is:

Library A provides an adaptation point. Library B defines a type that
could logically be used with library A but doesn't define the
adaptation point.
Libraries C and D want to use B with A and both define the adaptation point.
Binary E wants to use both libraries C and D. ODR violation: Boom,
either at link time (good but hard to recover from if you can't modify
libraries C or D, and if you can modify them, why can't you modify
library B?) or run time (very bad).

That said, the standard's style is generally to allow this kind of
extension, so I'll give in if you think I should add it.

> For a TS, helper functions could be provided (in std) for std::basic_string:

For basic_string, it's easy enough just to define the conversion
constructor. The difference would be in the implicit conversions
allowed with "string_ref sr(non_string_type)", where going through a
string_ref_{begin,end} template would allow fewer and would thereby
allow fewer accidental dangling pointers, which is a plus.

Jeffrey

Olaf van der Spek

Jan 3, 2013, 7:51:10 AM
to std-pr...@isocpp.org
On Wed, Jan 2, 2013 at 6:08 AM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
> C++14 is more official and is a firmer basis for other TS'en. If we
> have consensus that string_ref is the right thing to do, we shouldn't
> delay it artificially by including an unnecessary TS step. Putting it
> directly into the draft standard would let us stop discussing the
> whole proposal repeatedly and just discuss changes. (It's possible
> putting it in a TS would have the same effect, but I'm less confident
> of that.)

Your list of open questions is short (mine is a bit longer ;), what's
stopping string_ref from getting into C++14?
AFAIK the concept is good, it's just some details that need to be worked out.
It's a shame a reference implementation isn't available yet (in Boost).

> Beman may have other tradeoffs to add.
>
>> std::string, vector and boost::iterator_range have pop_front() and
>> pop_back(). String_ref isn't a container but pop_front() and pop_back()
>> would still be useful. Could they be added?
>> Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?
>
> When we discussed N3350
> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html#classstd_1_1range_1a3c743e4e4b85290682234af4c2d7e50f),
> there was an argument that pop_front is expected to destroy the
> elements it pops, since it does so everywhere in the standard that it
> currently exists. The committee seemed to prefer another name. Clearly
> pop_front()==remove_prefix(1), so I don't see an urgent need to add

Not really, semantics are different if range is empty. Remove_prefix
and remove_suffix aren't symmetric in that case either.
I like pop_back() much better, it's not a problem in
boost::iterator_range (AFAIK) and I've no idea what destroying a char
would imply.
The difference with an owning container is already indicated by the type name.

Wouldn't clear() have to be removed too?

> another method for this purpose. If I get clear direction from the
> committee that they want to overload pop_front to have this meaning,
> I'd be happy to change it back, but I don't expect to get that
> direction.

> Also, sorry for forgetting your .data()/.size() constructor
> suggestion. I've added a TODO to my local copy of the paper so I won't
> forget it again.

Using data() and size() probably isn't right, it should be begin() and
end(). The point is being able to construct a string_ref from other
string-like types.

> starts_with(), ends_with()

Is a non-member variant being provided too? Otherwise it's not easily
usable on std::string and other string-like types.

BTW, don't forget my comments at
https://groups.google.com/a/isocpp.org/forum/?hl=en&fromgroups=#!topic/std-proposals/ZUnktXzj0RE
--
Olaf

Zhihao Yuan

Jan 3, 2013, 12:58:49 PM
to std-pr...@isocpp.org
On Thu, Jan 3, 2013 at 6:51 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
>> starts_with(), ends_with()
>
> Is a non-member variant being provided too? Otherwise it's not easily
> usable on std::string and other string-like types.

The proposal added them on basic.string for an appropriate English
reading.

BTW, just curious, can I use

string_ref("prefixtree").starts_with(string("prefix"));

?

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___________________________________________________
4BSD -- http://4bsd.biz/

Olaf van der Spek

Jan 3, 2013, 1:02:02 PM
to std-pr...@isocpp.org
On Thu, Jan 3, 2013 at 6:58 PM, Zhihao Yuan <lic...@gmail.com> wrote:
> On Thu, Jan 3, 2013 at 6:51 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
>>> starts_with(), ends_with()
>>
>> Is a non-member variant being provided too? Otherwise it's not easily
>> usable on std::string and other string-like types.
>
> The proposal added them on basic.string for an appropriate English
> reading.

What about other string-like types?

> BTW, just curious, can I use
>
> string_ref("prefixtree").starts_with(string("prefix"));

Why not this?

string_ref("prefixtree").starts_with("prefix")

This is even shorter but requires non-member functions:

starts_with("prefixtree", "prefix")

Olaf

Beman Dawes

Jan 4, 2013, 6:58:44 AM
to std-pr...@isocpp.org
On Tue, Jan 1, 2013 at 8:15 PM, Dean Michael Berris
<dbe...@googlers.com> wrote:
> On Mon, Dec 31, 2012 at 5:55 AM, Beman Dawes <bda...@acm.org> wrote:
...
>> Have you considered a generic constructor instead? For example,
>>
>> template <class String>
>> basic_string_ref(const String& s)
>> : _beg(::std::string_ref_begin(s)), _end(::std::string_ref_end(s)) {}
>>
>> Where _beg and _end are the private data members.
>>
>> The user is granted permission to provide the two adapter functions in
>> namespace std.
>>
>
> Why not just rely on ADL?

No good reason, and that's what the little prototype did initially.
But I don't understand ADL pitfalls very well, so avoid using it when
there is an alternative.

I'd defer to library experts if they have a strong opinion, and also
to Jeffrey since it is his proposal.

--Beman

Jeffrey Yasskin

Jan 4, 2013, 1:59:27 PM
to std-pr...@isocpp.org
On Thu, Jan 3, 2013 at 12:51 PM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Wed, Jan 2, 2013 at 6:08 AM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> C++14 is more official and is a firmer basis for other TS'en. If we
>> have consensus that string_ref is the right thing to do, we shouldn't
>> delay it artificially by including an unnecessary TS step. Putting it
>> directly into the draft standard would let us stop discussing the
>> whole proposal repeatedly and just discuss changes. (It's possible
>> putting it in a TS would have the same effect, but I'm less confident
>> of that.)
>
> Your list of open questions is short (mine is a bit longer ;), what's
> stopping string_ref from getting into C++14?
> AFAIK the concept is good, it's just some details that need to be worked out.
> It's a shame a reference implementation isn't available yet (in Boost).

I think the details are what people are worried about. Personally, I
think we can get them worked out by Bristol and get string_ref into
C++14, but there's enough doubt in the minds of more experienced
committee members that I'm not certain.

Note that the feedback we need from this list is not "which target
should we aim for". It's "which pieces are right and wrong".

>> Beman may have other tradeoffs to add.
>>
>>> std::string, vector and boost::iterator_range have pop_front() and
>>> pop_back(). String_ref isn't a container but pop_front() and pop_back()
>>> would still be useful. Could they be added?
>>> Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?
>>
>> When we discussed N3350
>> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html#classstd_1_1range_1a3c743e4e4b85290682234af4c2d7e50f),
>> there was an argument that pop_front is expected to destroy the
>> elements it pops, since it does so everywhere in the standard that it
>> currently exists. The committee seemed to prefer another name. Clearly
>> pop_front()==remove_prefix(1), so I don't see an urgent need to add
>
> Not really, semantics are different if range is empty. Remove_prefix
> and remove_suffix aren't symmetric in that case either.

Now that I'm looking at them again, I think I gave
remove_{prefix,suffix} the wrong semantics. If n>size(), remove_prefix
throws out_of_range, while remove_suffix returns the whole string.
That can't be right. Should they be:

remove_prefix(n) is equivalent to "*this = substr(min(n, size()), npos)"
remove_suffix(n) is equivalent to "*this = substr(0, size() - min(n, size()))"

Or should they Require that n <= size()?
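To compare the two options side by side, here is a sketch on a hypothetical minimal string_ref (pointer plus length). The _clamped and _checked suffixes are only there to keep both variants in one struct; the assert in the second pair stands in for a Requires clause.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical minimal stand-in for the proposed string_ref.
struct string_ref {
    const char* ptr;
    std::size_t len;

    std::size_t size() const { return len; }

    // Option 1: clamp n to size(), so an oversized n empties the view.
    // Matches "*this = substr(min(n, size()), npos)".
    void remove_prefix_clamped(std::size_t n) {
        n = std::min(n, len);
        ptr += n;
        len -= n;
    }
    // Matches "*this = substr(0, size() - min(n, size()))".
    void remove_suffix_clamped(std::size_t n) {
        len -= std::min(n, len);
    }

    // Option 2: require n <= size(); violating it is a caller bug.
    void remove_prefix_checked(std::size_t n) {
        assert(n <= len);
        ptr += n;
        len -= n;
    }
    void remove_suffix_checked(std::size_t n) {
        assert(n <= len);
        len -= n;
    }
};
```

Under option 1 the two functions are symmetric for n > size(): both leave an empty string_ref. Under option 2 the same call is simply outside the contract.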

> I like pop_back() much better, it's not a problem in
> boost::iterator_range (AFAIK) and I've no idea what destroying a char
> would imply.
> The difference with an owning container is already indicated by the type name.

I thought pop_back was fine, but the committee disagreed, so it's
changed. You seem to like assuming away programmer mistakes, while I
and the committee want to catch as many as possible within the bounds
of C++, so we're likely to disagree on things like this.

> Wouldn't clear() have to be removed too?

Apparently not. Nobody objected to it.

>> another method for this purpose. If I get clear direction from the
>> committee that they want to overload pop_front to have this meaning,
>> I'd be happy to change it back, but I don't expect to get that
>> direction.
>
>> Also, sorry for forgetting your .data()/.size() constructor
>> suggestion. I've added a TODO to my local copy of the paper so I won't
>> forget it again.
>
> Using data() and size() probably isn't right, it should be begin() and
> end(). The point is being able to construct a string_ref from other
> string-like types.

We'll have to wait for contiguous_iterator_tag then, or take Beman's
suggestion for string_ref_begin() (contiguous_begin()? ew, wart.). I'm
happy with leaving that constructor out of string_ref for the
initially standardized version and adding it later. string_ref doesn't
have to be complete the first time, it just has to be better than what
we have.

>> starts_with(), ends_with()
>
> Is a non-member variant being provided too? Otherwise it's not easily
> usable on std::string and other string-like types.

No, the paper includes, "The non-member equivalents produce calls that
are somewhat ambiguous between starts_with(haystack, needle) vs
starts_with(needle, haystack), while haystack.starts_with(needle) is
the only English reading of the member version. These queries apply
equally well to basic_string, so I've added them there too."

Zhihao and you got the syntax I expect for non-string_ref types:
string_ref("prefixtree").starts_with("prefix")
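As a sketch of how that reads in practice, assuming a simplified non-template string_ref with implicit constructors from const char* and std::string (the real class is a template over charT and traits):

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical simplified string_ref, for illustration only.
class string_ref {
public:
    string_ref(const char* s) : ptr_(s), len_(std::strlen(s)) {}
    string_ref(const std::string& s) : ptr_(s.data()), len_(s.size()) {}

    std::size_t size() const { return len_; }

    // haystack.starts_with(needle): the needle converts implicitly.
    bool starts_with(string_ref needle) const {
        return needle.len_ <= len_ &&
               std::memcmp(ptr_, needle.ptr_, needle.len_) == 0;
    }
    bool ends_with(string_ref needle) const {
        return needle.len_ <= len_ &&
               std::memcmp(ptr_ + (len_ - needle.len_), needle.ptr_,
                           needle.len_) == 0;
    }

private:
    const char* ptr_;
    std::size_t len_;
};
```

In this sketch both string_ref("prefixtree").starts_with("prefix") and string_ref("prefixtree").starts_with(std::string("prefix")) compile, because the argument converts to string_ref at the call.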

I think I've answered all of them now. What have I missed?

Olaf van der Spek

Jan 4, 2013, 3:54:50 PM
to std-pr...@isocpp.org
On Fri, Jan 4, 2013 at 7:59 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
> Note that the feedback we need from this list is not "which target
> should we aim for". It's "which pieces are right and wrong".

Ah, ok

>>> Beman may have other tradeoffs to add.
>>>
>>>> std::string, vector and boost::iterator_range have pop_front() and
>>>> pop_back(). String_ref isn't a container but pop_front() and pop_back()
>>>> would still be useful. Could they be added?
>>>> Maybe pop_front(size_t n) could also be added (instead of remove_prefix)?
>>>
>>> When we discussed N3350
>>> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3350.html#classstd_1_1range_1a3c743e4e4b85290682234af4c2d7e50f),
>>> there was an argument that pop_front is expected to destroy the
>>> elements it pops, since it does so everywhere in the standard that it
>>> currently exists. The committee seemed to prefer another name. Clearly
>>> pop_front()==remove_prefix(1), so I don't see an urgent need to add
>>
>> Not really, semantics are different if range is empty. Remove_prefix
>> and remove_suffix aren't symmetric in that case either.
>
> Now that I'm looking at them again, I think I gave
> remove_{prefix,suffix} the wrong semantics. If n>size(), remove_prefix
> throws out_of_range, while remove_suffix returns the whole string.
> That can't be right. Should they be:
>
> remove_prefix(n) is equivalent to "*this = substr(min(n, size()), npos)"
> remove_suffix(n) is equivalent to "*this = substr(0, size() - min(n, size()))"
>
> Or should they Require that n <= size()?

I'd go for the latter. Those functions are defined as noexcept.

>> I like pop_back() much better, it's not a problem in
>> boost::iterator_range (AFAIK) and I've no idea what destroying a char
>> would imply.
>> The difference with an owning container is already indicated by the type name.
>
> I thought pop_back was fine, but the committee disagreed, so it's
> changed. You seem to like assuming away programmer mistakes, while I
> and the committee want to catch as many as possible within the bounds
> of C++, so we're likely to disagree on things like this.

I'm not sure that's it.
pop_back() is existing practice in both Boost and D (and the idea was
to standardize existing practice wasn't it?)
Textually pop_back and remove_suffix seem equivalent (to me), nothing
tells me their behaviour is (kinda) different.
Would remove_suffix(1) really avoid mistakes?

http://dlang.org/phobos/std_range.html

>> Wouldn't clear() have to be removed too?
>
> Apparently not. Nobody objected to it.

Seems inconsistent.

> We'll have to wait for contiguous_iterator_tag then, or take Beman's
> suggestion for string_ref_begin() (contiguous_begin()? ew, wart.). I'm
> happy with leaving that constructor out of string_ref for the
> initially standardized version and adding it later. string_ref doesn't
> have to be complete the first time, it just has to be better than what
> we have.

That's easy :p
Waiting for contiguous_iterator_tag or explicit charT* seems fine.

>>> starts_with(), ends_with()
>>
>> Is a non-member variant being provided too? Otherwise it's not easily
>> usable on std::string and other string-like types.
>
> No, the paper includes, "The non-member equivalents produce calls that
> are somewhat ambiguous between starts_with(haystack, needle) vs
> starts_with(needle, haystack), while haystack.starts_with(needle) is
> the only English reading of the member version. These queries apply
> equally well to basic_string, so I've added them there too."
>
> Zhihao and you got the syntax I expect for non-string_ref types:
> string_ref("prefixtree").starts_with("prefix")

Having to include "string_ref" there doesn't seem entirely right.

>> BTW, don't forget my comments at
>> https://groups.google.com/a/isocpp.org/forum/?hl=en&fromgroups=#!topic/std-proposals/ZUnktXzj0RE
>
> I think I've answered all of them now. What have I missed?

The 3 stoi overloads, they look unclean. Wouldn't adding an overload
also have the potential to break code due to ambiguity?


--
Olaf

Jeffrey Yasskin

Jan 4, 2013, 4:58:32 PM
to std-pr...@isocpp.org
On Fri, Jan 4, 2013 at 12:54 PM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Fri, Jan 4, 2013 at 7:59 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> Now that I'm looking at them again, I think I gave
>> remove_{prefix,suffix} the wrong semantics. If n>size(), remove_prefix
>> throws out_of_range, while remove_suffix returns the whole string.
>> That can't be right. Should they be:
>>
>> remove_prefix(n) is equivalent to "*this = substr(min(n, size()), npos)"
>> remove_suffix(n) is equivalent to "*this = substr(0, size() - min(n, size()))"
>>
>> Or should they Require that n <= size()?
>
> I'd go for the latter. Those functions are defined as noexcept.

noexcept would imply the first, I think. We don't generally have
noexcept functions with Requires clauses. (I'm sure there are some,
but for example string::front() isn't noexcept.) Google's and LLVM's
versions of this, though, assert(size()>=n).

>>> I like pop_back() much better, it's not a problem in
>>> boost::iterator_range (AFAIK) and I've no idea what destroying a char
>>> would imply.
>>> The difference with an owning container is already indicated by the type name.
>>
>> I thought pop_back was fine, but the committee disagreed, so it's
>> changed. You seem to like assuming away programmer mistakes, while I
>> and the committee want to catch as many as possible within the bounds
>> of C++, so we're likely to disagree on things like this.
>
> I'm not sure that's it.
> pop_back() is existing practice in both Boost and D (and the idea was
> to standardize existing practice wasn't it?)

remove_back is existing practice in Google's version of this. LLVM's
version uses drop_back. Bloomberg doesn't define it. So I don't think
existing practice decides this.

> Textually pop_back and remove_suffix seem equivalent (to me), nothing
> tells me their behaviour is (kinda) different.
> Would remove_suffix(1) really avoid mistakes?
>
> http://dlang.org/phobos/std_range.html
>
>>> BTW, don't forget my comments at
>>> https://groups.google.com/a/isocpp.org/forum/?hl=en&fromgroups=#!topic/std-proposals/ZUnktXzj0RE
>>
>> I think I've answered all of them now. What have I missed?
>
> The 3 stoi overloads, they look unclean. Wouldn't adding an overload
> also have the potential to break code due to ambiguity?

If a class has an implicit conversion to both const char* and
std::string (or has a template conversion operator), and is being
passed directly to stoi(), that could break. This seems less likely
than either that const char* is being passed to stoi() or that a class
with only an implicit conversion to std::string is being passed.
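A sketch of the breaking case, using made-up functions parse_old and parse_new rather than the actual stoi overloads from the paper, and only two overloads instead of three for brevity. The can_parse_new trait is also hypothetical; it just lets the ambiguity be checked at compile time instead of producing an error.

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {
// Before: a single overload taking std::string.
inline int parse_old(const std::string& s) { return std::stoi(s); }

// After: an additional overload for another string-like parameter type.
inline int parse_new(const std::string& s) { return std::stoi(s); }
inline int parse_new(const char* s) { return std::stoi(std::string(s)); }

// Converts implicitly to both parameter types.
struct Both {
    operator const char*() const { return "42"; }
    operator std::string() const { return "42"; }
};

// Converts implicitly to std::string only.
struct OnlyString {
    operator std::string() const { return "7"; }
};

// True if parse_new(T) resolves to exactly one best overload.
template <class T, class = void>
struct can_parse_new : std::false_type {};
template <class T>
struct can_parse_new<T, decltype(void(parse_new(std::declval<T>())))>
    : std::true_type {};
}  // namespace sketch
```

Here parse_old(Both{}) compiles, parse_new(Both{}) is ambiguous, and parse_new(OnlyString{}) keeps working, which is the more common case.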

It's definitely annoying to have to deal with backwards compatibility
like this, but string_ref exists primarily for the convenience of
people using C++, and only secondarily to clean up the standard.

Jeffrey

Olaf van der Spek

Jan 4, 2013, 5:11:30 PM
to std-pr...@isocpp.org
On Fri, Jan 4, 2013 at 10:58 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>>> Or should they Require that n <= size()?
>>
>> I'd go for the latter. Those functions are defined as noexcept.
>
> noexcept would imply the first, I think. We don't generally have
> noexcept functions with Requires clauses. (I'm sure there are some,
> but for example string::front() isn't noexcept.) Google's and LLVM's
> versions of this, though, assert(size()>=n).

I assumed require would be the precondition and breaking it would be UB.

>>>> I like pop_back() much better, it's not a problem in
>>>> boost::iterator_range (AFAIK) and I've no idea what destroying a char
>>>> would imply.
>>>> The difference with an owning container is already indicated by the type name.
>>>
>>> I thought pop_back was fine, but the committee disagreed, so it's
>>> changed. You seem to like assuming away programmer mistakes, while I
>>> and the committee want to catch as many as possible within the bounds
>>> of C++, so we're likely to disagree on things like this.
>>
>> I'm not sure that's it.
>> pop_back() is existing practice in both Boost and D (and the idea was
>> to standardize existing practice wasn't it?)
>
> remove_back is existing practice in Google's version of this. LLVM's
> version uses drop_back. Bloomberg doesn't define it. So I don't think
> existing practice decides this.

All use back instead of suffix.
LLVM's defaults N to 1. Is Google's version public?

>> Textually pop_back and remove_suffix seem equivalent (to me), nothing
>> tells me their behaviour is (kinda) different.
>> Would remove_suffix(1) really avoid mistakes?
>>
>> http://dlang.org/phobos/std_range.html



--
Olaf

Jeffrey Yasskin

Jan 4, 2013, 5:16:02 PM1/4/13
to std-pr...@isocpp.org
On Fri, Jan 4, 2013 at 2:11 PM, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Fri, Jan 4, 2013 at 10:58 PM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
>> remove_back is existing practice in Google's version of this. LLVM's
>> version uses drop_back. Bloomberg doesn't define it. So I don't think
>> existing practice decides this.
>
> All use back instead of suffix.
> LLVM's defaults N to 1. Is Google's version public?

https://code.google.com/p/re2/source/browse/re2/stringpiece.h,
http://www.icu-project.org/apiref/icu4c/classicu_1_1StringPiece.html,
and https://code.google.com/searchframe#OAMlx_jo-ck/src/base/string_piece.h&type=cs&l=102
(Chrome) all look derived from Google's version. I typo'ed
remove_back, it's actually remove_suffix.
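
For comparison, the two naming styles mentioned here can be sketched side by side. The type piece is hypothetical; remove_suffix follows the in-place style of the StringPiece family, and drop_back follows the LLVM style, where N defaults to 1 and a new value is returned.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal type showing both styles on one class.
struct piece {
    const char* ptr;
    std::size_t len;

    // In-place, always with an explicit count.
    void remove_suffix(std::size_t n) {
        assert(n <= len);
        len -= n;
    }

    // Value-returning, with the count defaulting to 1.
    piece drop_back(std::size_t n = 1) const {
        assert(n <= len);
        piece result = { ptr, len - n };
        return result;
    }
};
```

Neither form touches the underlying characters; only the length of the referenced range changes.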

Gabriel Dos Reis

Jan 4, 2013, 6:43:36 PM1/4/13
to std-pr...@isocpp.org
Olaf van der Spek <olafv...@gmail.com> writes:

[...]

| > I thought pop_back was fine, but the committee disagreed, so it's
| > changed. You seem to like assuming away programmer mistakes, while I
| > and the committee want to catch as many as possible within the bounds
| > of C++, so we're likely to disagree on things like this.
|
| I'm not sure that's it.
| pop_back() is existing practice in both Boost and D (and the idea was
| to standardize existing practice wasn't it?)

No, we are not required to standardize "existing practice" if in
hindsight it is not such a bright idea.

Beman Dawes

Jan 5, 2013, 8:52:00 AM1/5/13
to std-pr...@isocpp.org
On Wed, Jan 2, 2013 at 12:08 AM, Jeffrey Yasskin <jyas...@googlers.com> wrote:
> On Fri, Dec 28, 2012 at 8:04 AM, Olaf van der Spek <olafv...@gmail.com> wrote:
>> On Thursday, December 27, 2012 12:44:38 AM UTC+1, Jeffrey Yasskin wrote:
>>>
....
>>> The LWG is still deciding whether this should aim for a TS or C++14
>>> (or both?), and feedback from here will help inform that decision.
>>>
>> What are the pros and cons?
>
> Overall, we want string_ref in an official document of some sort as
> soon as possible so that other TS'en can depend on it.

Agreed.

> The bar for including string_ref in a TS is lower, so it's more likely
> string_ref would get into the next one. It would also serve as initial
> content for the "utility" TS, which may make it easier to add new
> things. Putting it in a TS will allow us to change the class in
> incompatible ways if we discover problems. However, a TS will be
> explicitly beta and may require user code that adopts string_ref to
> change the namespace of the class it uses when string_ref is
> incorporated into a future C++ standard, even if we don't make any
> changes to the class.

That's a nice summary of the TS target.

> C++14 is more official and is a firmer basis for other TS'en. If we
> have consensus that string_ref is the right thing to do, we shouldn't
> delay it artificially by including an unnecessary TS step. Putting it
> directly into the draft standard would let us stop discussing the
> whole proposal repeatedly and just discuss changes. (It's possible
> putting it in a TS would have the same effect, but I'm less confident
> of that.)

Having "consensus that string_ref is the right thing to do" with
reasonably correct proposed wording is good enough for a TS, but for
the standard itself there are a lot of additional hurdles. Some
committee members may want more real-world experience, others may be
concerned about the volume of changes to existing components, the
impact on books about C++, possible ABI breakage, maturity of wording,
impact on teaching, etc. These concerns often come from folks who do
not follow the doings of the LWG closely, so only arise late in the
process and tend to delay progress. Whether any of these concerns will
arise for string_ref is hard to predict, but I would not be
surprised if at least a few of them surface.

> Beman may have other tradeoffs to add.

I rate the probability of finishing technical work on a Library
Utility TS in 2013 as 70% or better, and by 2014 as 95%. I rate the
probability of finishing technical work on C++14 in 2013 as 25%, by
2014 as 65%, and by 2015 as 95%.

Remember too that if other TS's are gated on string_ref, and
string_ref is part of C++14, we would have to hold the other TS's
waiting for a delayed C++14.

--Beman

Jeffrey Yasskin

Jan 10, 2013, 7:54:08 PM1/10/13
to std-pr...@isocpp.org
I've uploaded a new version to
http://jeffrey.yasskin.info/cxx/2012/string_ref.html with updates from
this thread. I'm planning to send this off to Clark tomorrow, although
if https://groups.google.com/a/isocpp.org/d/topic/std-proposals/5sW8yp5i8mo/discussion
keeps agreeing on string_view, I may switch to that name first.

Nicol Bolas

Jan 10, 2013, 8:48:09 PM1/10/13
to std-pr...@isocpp.org


On Thursday, January 10, 2013 4:54:08 PM UTC-8, Jeffrey Yasskin wrote:
> I've uploaded a new version to
> http://jeffrey.yasskin.info/cxx/2012/string_ref.html with updates from
> this thread. I'm planning to send this off to Clark tomorrow, although
> if https://groups.google.com/a/isocpp.org/d/topic/std-proposals/5sW8yp5i8mo/discussion
> keeps agreeing on string_view, I may switch to that name first.

In the section on user-defined conversion, you said:

Ultimately, I think we want to allow this conversion based on detecting contiguous ranges. Any constructor we add to work around that is going to look like a wart in a couple years. I think we'll be better off making users explicitly convert when they can't add an appropriate conversion operator, and then we can add the optimal constructor when contiguous iterators make it into the library.

This assumes that every string type out there can be detected as a "contiguous range", which is a concept that doesn't exist even in Boost.Range. It's hard to know what users would have to provide or if such things can be provided without modifying the type.

Also, it seems to accept a significant short-term problem for the sake of a long-term goal, when it's not entirely clear how far off that goal is, or whether the eventual contiguous iterator and contiguous range proposals will even fit the needs of basic_string_ref. Current discussions on the Range mailing list are starting to talk about things like ranges as primitive types with no iterators. If they go that route, I'm not sure that's going to be a viable solution for basic_string_ref going forward.

I don't like the idea of making a proposal somewhat incomplete (you do recognize the need for user-defined conversions) on the assumption that in a year or two someone will come along with a fix. I say we give the user the ability to provide something that will work and be extensible now. If the eventual Range proposal comes along that can't handle things this way, then we already have our solution. If another solution appears, we can use that. It may leave a wart, but it will be a very small and harmless one. And it's not like anyone will really notice one more wart in our string classes...
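
One possible shape for such a user-provided hook is a traits class that users specialize for their own string types. This is only a sketch: string_ref_access, str_ref, and LegacyString are hypothetical names, none of them comes from the paper, and the paper may have considered a different mechanism entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical customization point: users specialize this per string type.
template <class T>
struct string_ref_access;

// Hypothetical minimal reference type.
struct str_ref {
    const char* ptr;
    std::size_t len;

    str_ref(const char* p, std::size_t n) : ptr(p), len(n) {}

    // Accepts any T for which string_ref_access<T> has been specialized;
    // other types fail to compile when this constructor is instantiated.
    template <class T>
    str_ref(const T& s)
        : ptr(string_ref_access<T>::data(s)),
          len(string_ref_access<T>::size(s)) {}
};

// Stand-in for a third-party string class that cannot be modified.
struct LegacyString {
    std::string storage;
    const char* GetBuffer() const { return storage.c_str(); }
    std::size_t GetLength() const { return storage.size(); }
};

// The user-written adapter lives outside the third-party class.
template <>
struct string_ref_access<LegacyString> {
    static const char* data(const LegacyString& s) { return s.GetBuffer(); }
    static std::size_t size(const LegacyString& s) { return s.GetLength(); }
};

inline std::size_t count_chars(str_ref r) { return r.len; }
```

With the specialization in place, a LegacyString can be passed wherever a str_ref parameter is expected, without any change to LegacyString itself.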

Olaf van der Spek

Jan 11, 2013, 3:52:06 AM1/11/13
to std-pr...@isocpp.org
On Fri, Jan 11, 2013 at 2:48 AM, Nicol Bolas <jmck...@gmail.com> wrote:
> I don't like the idea of making a proposal somewhat incomplete (you do
> recognize the need for user-defined conversions) on the assumption that in a
> year or two someone will come along with a fix. I say we give the user the
> ability to provide something that will work and be extensible now. If the
> eventual Range proposal comes along that can't handle things this way, then
> we already have our solution. If another solution appears, we can use that.
> It may leave a wart, but it will be a very small and harmless one. And it's
> not like anyone will really notice one more wart in our string classes...

We don't have to wait a year or two, we could write a proposal to fix
the contiguous range detection issue tomorrow.
Iterators aren't going away any time soon (I expect), so basing a
solution on them seems ok.

I'd go for the addition of an explicit conversion to pointer on
contiguous iterators, such that this works:

std::array<char, 10> v;
char* b(v.begin());
char* e(v.end());

Seems simple and useful in other situations too.
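
For contrast, the snippet above is not guaranteed to compile today, because std::array's iterator type is implementation-defined and need not be a raw pointer. A portable way to get the same pointer pair with the current library is sketched below; begin_ptr and end_ptr are hypothetical helper names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Pointer to the first element of the array's contiguous storage.
template <std::size_t N>
char* begin_ptr(std::array<char, N>& a) { return a.data(); }

// One-past-the-end pointer, computed from data() and size().
template <std::size_t N>
char* end_ptr(std::array<char, N>& a) { return a.data() + a.size(); }
```

This works for std::array, std::vector, and std::string, which all expose data() and size(), but it needs the container itself rather than just a pair of iterators, which is the gap the suggested conversion would close.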
--
Olaf

Nicol Bolas

Jan 11, 2013, 4:43:55 AM1/11/13
to std-pr...@isocpp.org


On Friday, January 11, 2013 12:52:06 AM UTC-8, Olaf van der Spek wrote:
> On Fri, Jan 11, 2013 at 2:48 AM, Nicol Bolas <jmck...@gmail.com> wrote:
>> I don't like the idea of making a proposal somewhat incomplete (you do
>> recognize the need for user-defined conversions) on the assumption that in a
>> year or two someone will come along with a fix. I say we give the user the
>> ability to provide something that will work and be extensible now. If the
>> eventual Range proposal comes along that can't handle things this way, then
>> we already have our solution. If another solution appears, we can use that.
>> It may leave a wart, but it will be a very small and harmless one. And it's
>> not like anyone will really notice one more wart in our string classes...
>
> We don't have to wait a year or two, we could write a proposal to fix
> the contiguous range detection issue tomorrow.

True. But if we put basic_string_ref/view into a TS, it won't necessarily have the contiguous range fix as part of the TS. So there will be no extensibility mechanism. Also, if that proposal doesn't make it for C++14 and this one does, then again, we won't have an extensibility mechanism.

This is one of those "system integration" problems that comes with having multiple proposals in parallel, all with dependencies on each other. A proposal that has progressed further hand-waves some of its features by saying, "well, that proposal will take care of that case," rather than dealing with the problem internally and being functionally complete.

At the very least, there should be some fall-back mechanism in this proposal which can be used if the real fix isn't available by the time the final paper, whether a TS or C++14, is released. At least that way, we can be sure that this proposal will stand on its own.

Oh, and one of the nice things about TS's is that you can change them when you pull them into the standard. So we don't have to stick to an extensibility mechanism we don't like when we move the class to C++14 standard.

Olaf van der Spek

Jan 11, 2013, 5:15:05 AM1/11/13
to std-pr...@isocpp.org
On Fri, Jan 11, 2013 at 10:43 AM, Nicol Bolas <jmck...@gmail.com> wrote:
>> We don't have to wait a year or two, we could write a proposal to fix
>> the contiguous range detection issue tomorrow.
>
>
> True. But if we put basic_string_ref/view into a TS, it won't necessarily
> have the contiguous range fix as part of the TS. So there will be no
> extensibility mechanism. Also, if that proposal doesn't make it for C++14
> and this one does, then again, we won't have an extensibility mechanism.

Actually can't we just define operator string_ref() on other classes?
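
A conversion operator of this kind could look like the sketch below, for a class whose source we control. text_ref stands in for string_ref, and Buffer is a hypothetical string-like class; neither is taken from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for string_ref.
struct text_ref {
    const char* ptr;
    std::size_t len;
};

// A string-like class we own, so we are free to add members to it.
class Buffer {
public:
    explicit Buffer(const char* s) {
        while (*s) chars_.push_back(*s++);
    }

    // Implicit conversion to the reference type; no copy of the characters.
    operator text_ref() const {
        text_ref r = { chars_.data(), chars_.size() };
        return r;
    }

private:
    std::vector<char> chars_;
};

inline std::size_t length_of(text_ref r) { return r.len; }
```

The resulting text_ref is only valid while the Buffer it came from is alive and unmodified.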

> Oh, and one of the nice things about TS's is that you can change them when
> you pull them into the standard. So we don't have to stick to an
> extensibility mechanism we don't like when we move the class to C++14
> standard.

You can, but breaking compatibility still isn't nice.


--
Olaf

Nevin Liber

Jan 11, 2013, 10:18:27 AM1/11/13
to std-pr...@isocpp.org
On 11 January 2013 02:52, Olaf van der Spek <olafv...@gmail.com> wrote:
> We don't have to wait a year or two, we could write a proposal to fix
> the contiguous range detection issue tomorrow.

FYI:  I'm working on a contiguous_iterator_tag proposal for Bristol (but it probably won't be done before the mailing deadline next week).
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

Olaf van der Spek

Jan 11, 2013, 10:35:36 AM1/11/13
to std-pr...@isocpp.org
On Fri, Jan 11, 2013 at 4:18 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
> On 11 January 2013 02:52, Olaf van der Spek <olafv...@gmail.com> wrote:
>>
>> We don't have to wait a year or two, we could write a proposal to fix
>> the contiguous range detection issue tomorrow.
>
>
> FYI: I'm working on a contiguous_iterator_tag proposal for Bristol (but it
> probably won't be done before the mailing deadline next week).

Great!
Isn't that deadline today though?

Could you also consider a conversion operator to T* for such iterators?

--
Olaf

Nevin Liber

Jan 11, 2013, 11:02:13 AM1/11/13
to std-pr...@isocpp.org
On 11 January 2013 09:35, Olaf van der Spek <olafv...@gmail.com> wrote:
> On Fri, Jan 11, 2013 at 4:18 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
>> On 11 January 2013 02:52, Olaf van der Spek <olafv...@gmail.com> wrote:
>>>
>>> We don't have to wait a year or two, we could write a proposal to fix
>>> the contiguous range detection issue tomorrow.
>>
>>
>> FYI:  I'm working on a contiguous_iterator_tag proposal for Bristol (but it
>> probably won't be done before the mailing deadline next week).
>
> Great!
> Isn't that deadline today though?

It could be, in which case it still won't be done by this mailing.

> Could you also consider a conversion operator to T* for such iterators?

Yes, I am putting that in there (leaning against requiring a conversion from T* to iterator, though). 
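
A rough sketch of how the two pieces could fit together follows. Everything in namespace sketch is hypothetical: the tag, the iterator class, and the choice to make the conversion explicit are assumptions for illustration, not the contents of the proposal being drafted.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

namespace sketch {

// Hypothetical tag that refines random access with a contiguity promise.
struct contiguous_iterator_tag : std::random_access_iterator_tag {};

// Minimal iterator over contiguous storage; only the parts relevant here.
template <class T>
class contig_iter {
public:
    typedef contiguous_iterator_tag iterator_category;
    typedef T value_type;
    typedef std::ptrdiff_t difference_type;
    typedef T* pointer;
    typedef T& reference;

    // Construction from T* is explicit, so there is no implicit
    // conversion from pointer to iterator.
    explicit contig_iter(T* p) : p_(p) {}

    T& operator*() const { return *p_; }
    contig_iter& operator++() { ++p_; return *this; }
    contig_iter operator+(std::ptrdiff_t n) const { return contig_iter(p_ + n); }

    // Explicit conversion from iterator to T*.
    explicit operator T*() const { return p_; }

private:
    T* p_;
};

}  // namespace sketch
```

Because direct-initialization considers explicit conversion functions, the `char* b(it);` spelling suggested earlier in the thread works with this sketch, while `char* b = it;` does not.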

Nicol Bolas

Jan 11, 2013, 11:17:10 AM1/11/13
to std-pr...@isocpp.org


On Friday, January 11, 2013 2:15:05 AM UTC-8, Olaf van der Spek wrote:
> On Fri, Jan 11, 2013 at 10:43 AM, Nicol Bolas <jmck...@gmail.com> wrote:
>>> We don't have to wait a year or two, we could write a proposal to fix
>>> the contiguous range detection issue tomorrow.
>>
>>
>> True. But if we put basic_string_ref/view into a TS, it won't necessarily
>> have the contiguous range fix as part of the TS. So there will be no
>> extensibility mechanism. Also, if that proposal doesn't make it for C++14
>> and this one does, then again, we won't have an extensibility mechanism.
>
> Actually can't we just define operator string_ref() on other classes?

The point would be for code we don't own. Like CString and the myriad of other string classes used by the multitude of C++ libraries out there.
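
Without an extensibility mechanism, the fallback for a class we don't own is an explicit conversion at each call site, roughly as sketched here. chars_ref stands in for string_ref, VendorString stands in for a third-party class such as CString, and its Chars()/Count() interface is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for string_ref.
struct chars_ref {
    const char* ptr;
    std::size_t len;
};

// Stand-in for a third-party string class; we cannot add a conversion
// operator to it. The member names are invented for this sketch.
struct VendorString {
    std::string storage;
    const char* Chars() const { return storage.c_str(); }
    std::size_t Count() const { return storage.size(); }
};

// Free helper that every caller has to invoke explicitly.
inline chars_ref as_ref(const VendorString& s) {
    chars_ref r = { s.Chars(), s.Count() };
    return r;
}

inline bool starts_with_slash(chars_ref r) {
    return r.len > 0 && r.ptr[0] == '/';
}
```

Every call then reads starts_with_slash(as_ref(path)) rather than starts_with_slash(path), which is the per-call cost being objected to here.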

Olaf van der Spek

Jan 11, 2013, 11:30:44 AM1/11/13