Add std::begin and std::end for regular expression iterators

frederik...@gmail.com

Feb 20, 2018, 3:35:28 AM
to ISO C++ Standard - Future Proposals
Currently, it is not possible to use extended for-loops with stuff like std::sregex_iterator, because there is no .begin()/.end() pair and no std::begin/std::end pair for std::sregex_iterator -- adding an overload for std::sregex_iterator to the std-namespace is trivial and doing so will allow using them in extended for-loops.

I'm already doing this in production code and so far, have found no issues with it.

My implementation is fairly simple: std::begin(std::sregex_iterator&) returns its argument and std::end(std::sregex_iterator&) returns a default-constructed std::sregex_iterator:


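#include <regex>

namespace std {
    // end: one shared default-constructed iterator marks the end of every sequence
    auto end(std::sregex_iterator& ) -> std::sregex_iterator& {
        static std::sregex_iterator end_of_sequence_iterator;
        return end_of_sequence_iterator;
    }

    // begin: the iterator itself is the start of the range
    auto begin(std::sregex_iterator& b) -> std::sregex_iterator& {
        return b;
    }
}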
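
For illustration, here is a minimal sketch of how a loop might look with these overloads in place. The helper `split_words` and its regex are hypothetical, not from the post. One caveat that seems worth stating: the standard does not permit user code to add overloads to namespace std, so this is formally undefined behaviour, even though it tends to work with mainstream implementations.

```cpp
#include <regex>
#include <string>
#include <vector>

// The overloads described in the post. Caveat: user code is not allowed to
// add overloads to namespace std, so this is formally undefined behaviour
// even where it compiles and behaves as intended.
namespace std {
inline auto begin(std::sregex_iterator& b) -> std::sregex_iterator& {
    return b;
}

inline auto end(std::sregex_iterator&) -> std::sregex_iterator& {
    static std::sregex_iterator end_of_sequence_iterator;
    return end_of_sequence_iterator;
}
}  // namespace std

// Hypothetical helper: collects every run of non-whitespace characters.
std::vector<std::string> split_words(const std::string& text) {
    static const std::regex word(R"(\S+)");
    std::vector<std::string> words;
    // The range expression is the begin iterator itself; argument-dependent
    // lookup finds the begin/end overloads declared above.
    for (const std::smatch& m :
         std::sregex_iterator(text.begin(), text.end(), word)) {
        words.push_back(m.str());
    }
    return words;
}
```

The loop works because range-based for falls back to argument-dependent lookup of `begin` and `end` when the range type has no such members, and `std::sregex_iterator` lives in namespace std.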

Nicol Bolas

Feb 20, 2018, 9:54:20 AM
to ISO C++ Standard - Future Proposals, frederik...@gmail.com
On Tuesday, February 20, 2018 at 3:35:28 AM UTC-5, frederik...@gmail.com wrote:
> Currently, it is not possible to use extended for-loops with stuff like std::sregex_iterator, because there is no .begin()/.end() pair and no std::begin/std::end pair for std::sregex_iterator --

You don't use range-based for on iterators; you use them on ranges. That's why it's called "range-based for".

What we ought to have is an `sregex_range` type.

frederik...@gmail.com

Feb 20, 2018, 4:52:54 PM
to ISO C++ Standard - Future Proposals, frederik...@gmail.com

As long as I can use a "range-based" for loop, I'm fine. The changes needed in the library are tiny if you accept that a std::sregex_iterator is the beginning of the range and the end is whatever it is; no new types are required, and things fit naturally into how the library and the language already work.

If you want to add an `sregex_range` type, I'm sure that can be done better with a generic range type for all iterator-like types, rather than with a specialized `sregex_range`.
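
A generic range of that kind might look roughly like the sketch below. The names `iterator_range`, `make_range` and `find_numbers` are hypothetical, and the sketch assumes that a default-constructed iterator is the end-of-sequence value, which is true for the regex iterators but not for iterators in general.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical generic range: pairs a begin iterator with an end iterator.
// Assumption: a default-constructed Iterator is the end-of-sequence value,
// as it is for std::regex_iterator and std::regex_token_iterator.
template <typename Iterator>
class iterator_range {
public:
    explicit iterator_range(Iterator first, Iterator last = Iterator())
        : first_(std::move(first)), last_(std::move(last)) {}

    Iterator begin() const { return first_; }
    Iterator end() const { return last_; }

private:
    Iterator first_;
    Iterator last_;
};

// Hypothetical convenience function so the iterator type is deduced.
template <typename Iterator>
iterator_range<Iterator> make_range(Iterator first, Iterator last = Iterator()) {
    return iterator_range<Iterator>(std::move(first), std::move(last));
}

// Example use: gather all runs of digits in `text`.
std::vector<std::string> find_numbers(const std::string& text) {
    static const std::regex digits(R"(\d+)");
    std::vector<std::string> out;
    for (const std::smatch& m :
         make_range(std::sregex_iterator(text.begin(), text.end(), digits))) {
        out.push_back(m.str());
    }
    return out;
}
```

Because the range is an ordinary class with `begin()` and `end()` members, nothing has to be added to namespace std.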

Maybe sregex_iterator shouldn't have had _iterator in its name.

An `sregex_range` could be implemented quite easily as well, though: just combine a begin and an end, with the end defaulting to a default-constructed iterator.


This should do the trick:

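struct sregex_range : std::pair<std::sregex_iterator, std::sregex_iterator> {
    // The end defaults to a default-constructed (end-of-sequence) iterator.
    sregex_range(std::sregex_iterator begin, std::sregex_iterator end = std::sregex_iterator()) : pair(begin, end) { }
    // const, so that a const sregex_range can be iterated as well
    std::sregex_iterator begin() const { return first; }
    std::sregex_iterator end() const { return second; }
};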

Arthur O'Dwyer

Feb 20, 2018, 5:35:12 PM
to ISO C++ Standard - Future Proposals, frederik...@gmail.com
On Tuesday, February 20, 2018 at 12:35:28 AM UTC-8, frederik...@gmail.com wrote:
> Currently, it is not possible to use extended for-loops with stuff like std::sregex_iterator, because there is no .begin()/.end() pair and no std::begin/std::end pair for std::sregex_iterator -- adding an overload for std::sregex_iterator to the std-namespace is trivial and doing so will allow using them in extended for-loops.
>
> I'm already doing this in production code and so far, have found no issues with it.

Correct. I provided iterable iterators as part of the sample code for my SCM Challenge puzzle at CppCon last year, IIRC.

I have a published proposal on the topic here:
It was in the pre-ABQ 2017 mailing, if I recall correctly; but it was not discussed in LEWG because I forgot about it.
I would be happy for anybody else to pick it up and run with it.

–Arthur

Nicol Bolas

Feb 20, 2018, 6:41:15 PM
to ISO C++ Standard - Future Proposals, frederik...@gmail.com
The thing is, the regex library as a whole needs a full-fledged modernization pass. Ranges should be a part of that (but proper RangeTS versions, with iterator/sentinel pairs). The values of regex matches/iterators deal a lot in `std::string` or `charT*`s; these are obvious places where `string_view`s should be provided. And I'm sure other people have other ideas.
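
To make the iterator/sentinel idea concrete, here is a rough sketch in plain C++17 (not the Ranges TS itself). The names `regex_sentinel`, `regex_matches` and `count_matches` are hypothetical. It relies only on the fact that, since C++17, range-based for accepts `begin()` and `end()` of different types.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>

// Hypothetical sentinel type: an empty tag meaning "no more matches".
struct regex_sentinel {};

// Hypothetical range over all matches of `re` in `text`. begin() and end()
// return different types, which range-based for accepts since C++17.
class regex_matches {
public:
    class iterator {
    public:
        explicit iterator(std::sregex_iterator it) : it_(std::move(it)) {}
        const std::smatch& operator*() const { return *it_; }
        iterator& operator++() { ++it_; return *this; }
        // The sentinel is reached once the wrapped iterator equals a
        // default-constructed (end-of-sequence) iterator.
        friend bool operator!=(const iterator& lhs, regex_sentinel) {
            return lhs.it_ != std::sregex_iterator();
        }
    private:
        std::sregex_iterator it_;
    };

    // Both arguments must outlive the range; neither is copied here.
    regex_matches(const std::string& text, const std::regex& re)
        : first_(text.begin(), text.end(), re) {}

    iterator begin() const { return iterator(first_); }
    regex_sentinel end() const { return {}; }

private:
    std::sregex_iterator first_;
};

// Example use: count the matches of `re` in `text`.
std::size_t count_matches(const std::string& text, const std::regex& re) {
    std::size_t n = 0;
    for (const std::smatch& m : regex_matches(text, re)) {
        (void)m;  // only counting here
        ++n;
    }
    return n;
}
```

With a sentinel, the end of the range carries no state at all, so there is no need to store or copy a second iterator.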

frederik...@gmail.com

Feb 21, 2018, 2:08:22 AM
to ISO C++ Standard - Future Proposals, frederik...@gmail.com
On Wednesday, February 21, 2018 at 00:41:15 UTC+1, Nicol Bolas wrote:
> The thing is, the regex library as a whole needs a full-fledged modernization pass. Ranges should be a part of that (but proper RangeTS versions, with iterator/sentinel pairs). The values of regex matches/iterators deal a lot in `std::string` or `charT*`s; these are obvious places where `string_view`s should be provided. And I'm sure other people have other ideas.

Using string_view instead of string is definitely welcome, but from my point of view this is not the major problem; it's a non-functional enhancement that only deals with efficiency (yikes! a C++ developer who doesn't think efficiency is everything!). Also, it's fairly trivial to add.
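
As a rough idea of what a user-side version could look like today, here is a C++17 sketch. The helpers `as_view` and `word_views` are hypothetical, not part of the library. It assumes the sub-match was produced by a search over the very string passed in.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: a view of one sub-match inside `subject`.
// Assumption: `sm` came from a search over `subject` itself, so its
// iterators point into that string.
std::string_view as_view(const std::string& subject, const std::ssub_match& sm) {
    if (!sm.matched) {
        return {};
    }
    const auto offset = static_cast<std::size_t>(sm.first - subject.begin());
    const auto length = static_cast<std::size_t>(sm.length());
    return std::string_view(subject).substr(offset, length);
}

// Example use: views of every word in `subject`, without copying characters.
// The views are valid only while `subject` is alive and unmodified.
std::vector<std::string_view> word_views(const std::string& subject) {
    static const std::regex word(R"(\w+)");
    std::vector<std::string_view> views;
    for (std::sregex_iterator it(subject.begin(), subject.end(), word), last;
         it != last; ++it) {
        views.push_back(as_view(subject, (*it)[0]));
    }
    return views;
}
```

The offset is computed from the subject's own `begin()` rather than by dereferencing the sub-match iterator, which avoids touching a past-the-end position for an empty match at the end of the string.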

Do you know of any other suggestions, besides Arthur's (who actually did the work I was suggesting)?