string view constructed from iterator pair?


Evan Teran

Oct 29, 2015, 2:24:31 AM
to ISO C++ Standard - Future Proposals
I was just looking at the fine details of string_view, and it occurred to me that being able to construct one from an iterator pair might make sense. Assuming that string_view internally just stores a Ch* and a size_type, I *think* it could be defined like this:

template <class Ran>
string_view(Ran first, Ran last) {
    s_ = &*first;
    n_ = distance(first, last);
}

or something similar. At first glance, this seems like it would be useful. Algorithms that we use to find a portion of a string often return an iterator, which could then be fed into string_view directly :-)

Is there any reason not to support this? Is there something obvious I'm missing? I suppose we could always do this manually with the string_view(const Ch*, size_type) constructor, but I think accepting iterators directly would make for cleaner, more readable code.
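For reference, the manual route mentioned above might look like the following minimal sketch. It assumes C++17's std::string_view (the thread predates it), and before_delim is a hypothetical helper name chosen for illustration, not anything proposed in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: the part of `s` before the first `delim`,
// or all of `s` if `delim` does not occur.
std::string_view before_delim(const std::string& s, char delim) {
    // The algorithm hands back an iterator...
    const auto it = std::find(s.begin(), s.end(), delim);
    // ...which is converted by hand into a length for the
    // (pointer, size) constructor.
    const auto len = static_cast<std::size_t>(it - s.begin());
    assert(len <= s.size());
    return std::string_view(s.data(), len);
}
```

With an iterator-pair constructor, the last three lines of the function would collapse into something like string_view(s.begin(), it), which is the readability gain being asked about.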

Jeffrey Yasskin

Oct 29, 2015, 3:02:54 AM
to std-pr...@isocpp.org
We need to be able to identify "contiguous" iterators in order to
accept iterators in string_view's constructor. Otherwise, ((&*first) +
distance(first, last) - 1) might not be equal to &*(last-1).

Brent Friedman

Oct 29, 2015, 3:25:24 AM
to std-pr...@isocpp.org
Even if you did know that the iterators were contiguous, the proposed implementation does not work.

string_view(end(), end()) should create a legal string_view, but dereferencing end as in the given implementation is UB.

Evan Teran

Oct 29, 2015, 10:24:27 AM
to ISO C++ Standard - Future Proposals

@Jeffrey, that's a fair point, but it could probably be addressed by saying "the result is undefined if the iterators are not part of the same contiguous sequence" or similar. There are lots of things in C++ where passing the wrong parameters is simply undefined behavior: just about every algorithm, for example.

@Brent, another valid point, but easy enough to fix:

template <class Ran>
string_view(Ran first, Ran last) {
    if(first != last) {
        s_ = &*first;
        n_ = distance(first, last);
    } else {
        s_ = nullptr;
        n_ = 0;
    }
}

Both of these points are reasonable, but neither seems insurmountable. Anyway, thanks for the quick thoughts on this.

Bengt Gustafsson

Oct 29, 2015, 11:44:03 AM
to ISO C++ Standard - Future Proposals


string_view(end(), end()) should create a legal string_view, but dereferencing end as in the given implementation is UB.

Enlighten me: I don't see what the problem is here. The created string_view has two equal iterators, so n_ would be 0. Of course, it is not allowed to access its elements using, for instance, operator[], but that's always true of an empty string_view, isn't it?

The only thing that the "solution" Evan suggests changes is that the subtraction of two equal iterators is replaced with a != test. I can't see how the subtraction could be a problem for an iterator type that has operator- defined. That would mean there are iterator values that compare equal but don't return 0 when subtracted from each other. That's bizarre (even though it is of course possible to construct such classes with (im)proper overloads). I see the problem for non-random-access iterators where one end can be some kind of sentinel, but then operator- is not defined anyway, and typically the end has a different iterator type, which the suggested API does not support.

Evan Teran

Oct 29, 2015, 11:56:54 AM
to ISO C++ Standard - Future Proposals
@Bengt,

So I think the part that Brent was objecting to was the unconditional "s_ = &*first;" in the initial version. The concern is the dereference (followed by taking the address) of end() when given string_view(end(), end()). But as shown, this is trivially worked around.

I agree with your assessment of the iterators. I also thought it was important to insist that the iterators be random access, for a couple of reasons:

1. std::distance would be O(1)
2. last could not be a sentinel (I think a sentinel is incompatible with being random access, right?)

We could simplify the constructor slightly like this:

template <class Ran>
string_view(Ran first, Ran last) {
    n_ = distance(first, last);
    if(n_ != 0) {
        s_ = &*first;        
    } else {
        s_ = nullptr;
    }
}

Then we don't need to compare the iterators directly; we only care whether there is any distance between them. This avoids the bizarre case of iterators that are equal but have a nonzero distance (which is honestly evil; I hope no one does that).
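Since a constructor cannot be added to std::string_view from outside, the distance-first version can be tried out as a free function. This is only a sketch under assumptions: it uses C++17's std::string_view, view_from is a hypothetical name, and the iterators are still assumed (not checked) to be contiguous and to belong to the same sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <memory>
#include <string>
#include <string_view>

// Hypothetical free-function stand-in for the proposed constructor.
// Precondition (NOT checked): [first, last) is a valid range over
// contiguous char storage.
template <class Ran>
std::string_view view_from(Ran first, Ran last) {
    const auto n = std::distance(first, last);
    assert(n >= 0);
    if (n == 0) {
        // Empty range: never dereference `first`, which may be end().
        return std::string_view();  // data() == nullptr, size() == 0
    }
    return std::string_view(std::addressof(*first),
                            static_cast<std::size_t>(n));
}
```

For a std::string s holding "hello", view_from(s.begin() + 1, s.begin() + 3) views "el", and view_from(s.end(), s.end()) yields an empty view without touching end().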


Nicol Bolas

Oct 29, 2015, 12:14:21 PM
to ISO C++ Standard - Future Proposals
On Thursday, October 29, 2015 at 11:56:54 AM UTC-4, Evan Teran wrote:
I agree with your assessment of the iterators. I also thought it was important to insist that the iterators be random access, for a couple of reasons:

They don't have to be random access; they have to be *contiguous*. That's a higher standard. `deque` provides random access iterators, but you can't stick `deque<char>` iterators into a `string_view`.
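The difference between random access and contiguous can be probed at run time. The sketch below is a hypothetical debugging aid (is_contiguous_run is an invented name), not something a real constructor could afford, since it walks the whole range.

```cpp
#include <cassert>
#include <deque>
#include <iterator>
#include <list>
#include <memory>
#include <string>

// Hypothetical debugging aid: returns true if every element of
// [first, last) sits immediately after its predecessor in memory.
// Runs in O(n), so it is for illustration only.
template <class It>
bool is_contiguous_run(It first, It last) {
    if (first == last) {
        return true;  // an empty range is trivially contiguous
    }
    for (It next = std::next(first); next != last; ++first, ++next) {
        if (std::addressof(*next) != std::addressof(*first) + 1) {
            return false;
        }
    }
    return true;
}
```

A std::string range passes this check. A std::list<char> with two or more elements fails it, and a large std::deque<char> would typically fail it as well, because its storage is split into separately allocated blocks even though its iterators are random access.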

Brent Friedman

Oct 29, 2015, 12:24:33 PM
to std-pr...@isocpp.org
Evan,

Yes that does address the issue I was raising. Of course, if strings and vectors used pointer iterators then we wouldn't need the if test or the overload at all.

I think this constructor has a lot of room to be accidentally abused, however.

One way to work around this is with a "make_*" style utility that uses domain knowledge to convert iterators. For example,

string_view make_string_view(array<char>::iterator b, array<char>::iterator e);
string_view make_string_view(vector<char>::iterator b, vector<char>::iterator e);
string_view make_string_view(string::iterator b, string::iterator e);

You can then overload make_string_view to support your own iterator types if you know that they are contiguous.

I'll also point out that contiguous iterators are not sufficient; begin must also precede end in terms of underlying memory location. That is, it doesn't work with reverse_iterator (and cannot be made to).
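To illustrate the reverse_iterator point: the characters covered by a reverse range can still be viewed, but only in forward (memory) order, by going through base(). The following is a sketch assuming C++17's std::string_view; forward_view is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: views the characters covered by the reverse
// range [rfirst, rlast) in FORWARD order. The reversed order itself
// cannot be expressed, because a string_view always reads upward
// in memory from its start pointer.
std::string_view forward_view(std::string::const_reverse_iterator rfirst,
                              std::string::const_reverse_iterator rlast) {
    // The reverse range [rfirst, rlast) covers the same elements as
    // the forward range [rlast.base(), rfirst.base()).
    const auto first = rlast.base();
    const auto last = rfirst.base();
    const auto n = last - first;
    assert(n >= 0);
    return n == 0 ? std::string_view()
                  : std::string_view(&*first, static_cast<std::size_t>(n));
}
```

For a string holding "hello", the reverse range from crbegin() to crbegin() + 2 visits 'o' then 'l', while the resulting view reads "lo".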

Nicol Bolas

Oct 29, 2015, 12:54:29 PM
to ISO C++ Standard - Future Proposals
On Thursday, October 29, 2015 at 12:24:33 PM UTC-4, Brent Friedman wrote:
Evan,

Yes that does address the issue I was raising. Of course, if strings and vectors used pointer iterators then we wouldn't need the if test or the overload at all.

Even if those two types use pointer iterators, that shouldn't mean that contiguous iterators ought to be pointers. What about range adaptors? Some of those can be contiguous iterators, but they're certainly not pointers.

Of course, you wouldn't be able to use range adaptors in string_views either. They're more for template code. So there's another strike against this overload. `string_view` is meant for arrays of characters, which in C++ is specified by pointers.

Ultimately, what you want isn't for std::string and std::vector iterators to be pointers. You want them to be convertible to pointers (and vice-versa).

Evan Teran

Oct 29, 2015, 1:07:42 PM
to ISO C++ Standard - Future Proposals

Fair enough. Regarding the contiguous and ordering issues, personally, I file them under the "give the function bad input, get UB" category, but I agree that it's a real concern. If too many inputs result in UB, then it's too easy to misuse or abuse. I think that is enough to make this a non-starter.

That being said, I actually like the "make_string_view" idea, which could have overloads for the various things that we know will work. I have another thought on string_views, but it's mostly unrelated, so I'll make that a separate post.

Thanks for the feedback.

Tony V E

Oct 29, 2015, 1:16:07 PM
to Standard Proposals
On Thu, Oct 29, 2015 at 12:24 PM, Brent Friedman <fourt...@gmail.com> wrote:
Evan,

Yes that does address the issue I was raising. Of course, if strings and vectors used pointer iterators then we wouldn't need the if test or the overload at all.

I think this constructor has a lot of room to be accidentally abused, however.

One way to work around this is with a "make_*" style utility that uses domain knowledge to convert iterators. For example,

string_view make_string_view(array<char>::iterator b, array<char>::iterator e);
string_view make_string_view(vector<char>::iterator b, vector<char>::iterator e);
string_view make_string_view(string::iterator b, string::iterator e);

Just note that array<char>::iterator may be char *, and so might vector<char>::iterator, and string::iterator.
So is that 3 function declarations or 1?
 

Brent Friedman

Oct 29, 2015, 1:17:57 PM
to std-pr...@isocpp.org
Just note that array<char>::iterator may be char *, and so might vector<char>::iterator, and string::iterator.
So is that 3 function declarations or 1?

Yep, you're right. This is another reason why I think the iterators should be pointers. The inconsistency between implementations makes cross-platform development awful.

But I don't want to take this thread too far off-topic. 

Bjorn Reese

Oct 29, 2015, 1:33:19 PM
to std-pr...@isocpp.org
On 10/29/2015 06:16 PM, Tony V E wrote:
>
>
> On Thu, Oct 29, 2015 at 12:24 PM, Brent Friedman <fourt...@gmail.com
> <mailto:fourt...@gmail.com>> wrote:

> One way to work around this is with a "make_*" style utility that
> uses domain knowledge to convert iterators. For example,
>
> string_view make_string_view(array<char>::iterator b,
> array<char>::iterator e);
> string_view make_string_view(vector<char>::iterator b,
> vector<char>::iterator e);
> string_view make_string_view(string::iterator b, string::iterator e);
>
>
> Just note that array<char>::iterator may be char *, and so might
> vector<char>::iterator, and string::iterator.
> So is that 3 function declarations or 1?

An alternative is to let the factory be part of a trait. Something like
this:

template <typename T>
struct string_view_trait {};

// Specialization keyed on the container type, not the iterator type.
template <typename CharT, size_t N>
struct string_view_trait<std::array<CharT, N>>
{ static string_view make(typename std::array<CharT, N>::iterator,
typename std::array<CharT, N>::iterator); };

etc.
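Filled out a little, the trait idea might look like the sketch below. Because the specializations are keyed on the container type rather than the iterator type, they stay distinct even on implementations where several containers share one iterator type. Assumptions: C++17's std::string_view, char elements only, const iterators, and a hypothetical shared helper named make_view_unchecked.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical shared helper: trusts the caller that It is a
// contiguous iterator over char and that [first, last) is valid.
template <class It>
std::string_view make_view_unchecked(It first, It last) {
    const auto n = last - first;
    assert(n >= 0);
    return n == 0 ? std::string_view()
                  : std::string_view(std::addressof(*first),
                                     static_cast<std::size_t>(n));
}

// Primary template left undefined on purpose: containers that are
// not known to be contiguous fail to compile.
template <class Container>
struct string_view_trait;

template <std::size_t N>
struct string_view_trait<std::array<char, N>> {
    using iterator = typename std::array<char, N>::const_iterator;
    static std::string_view make(iterator first, iterator last) {
        return make_view_unchecked(first, last);
    }
};

template <>
struct string_view_trait<std::string> {
    using iterator = std::string::const_iterator;
    static std::string_view make(iterator first, iterator last) {
        return make_view_unchecked(first, last);
    }
};

template <>
struct string_view_trait<std::vector<char>> {
    using iterator = std::vector<char>::const_iterator;
    static std::string_view make(iterator first, iterator last) {
        return make_view_unchecked(first, last);
    }
};
```

A call then names the container explicitly, for example string_view_trait<std::string>::make(s.cbegin(), s.cend()). The cost is that the caller has to spell out the container type, which is less convenient than a plain constructor.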

Olaf van der Spek

Nov 2, 2015, 2:04:38 PM
to ISO C++ Standard - Future Proposals
How often do you use iterators with strings? Note that string's member functions are mostly index-based.

Nevin Liber

Nov 2, 2015, 2:08:53 PM
to std-pr...@isocpp.org
On 2 November 2015 at 13:04, Olaf van der Spek <olafv...@gmail.com> wrote:

How often do you use iterators with strings?

Very.  Basically, any time I apply an algorithm to them, especially predicates such as the ones found in the Boost String Algorithms Library.
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

Russell Greene

Nov 2, 2015, 4:29:55 PM
to std-pr...@isocpp.org

Any time you have an iterator, you can have a char*, and that is also assuming that Ran is contiguous, which it isn't guaranteed to be.



Evan Teran

Nov 2, 2015, 4:33:29 PM
to ISO C++ Standard - Future Proposals
Sure, you can always convert a std::string::iterator to a char* using something like &*it, but why require the extra syntax just to say "starting here"? An iterator already represents the concept of "here".

Nevin Liber

Nov 2, 2015, 4:37:27 PM
to std-pr...@isocpp.org
On 2 November 2015 at 15:29, Russell Greene <russell...@gmail.com> wrote:

Any time you have an iterator, you can have a char*, and that is also assuming that Ran is contiguous, which it isn't guaranteed to be.


How is that situation different from having two contiguous iterators pointing into two different collections?  I'm not sure what you are pointing out...