istream_iterator delimiter(s)


Jeaye Wilkerson

Apr 11, 2014, 3:27:38 AM
to std-pr...@isocpp.org
Just as it's possible to write something like:

vector<string> vec; // ...
copy(begin(vec), end(vec), ostream_iterator<string>(cout, "\n"));

I propose it be possible to write something like:

vector<string> vec;
istringstream iss; // some input stream with several lines of data
copy(istream_iterator<string>(iss, "\n"), istream_iterator<string>(), back_inserter(vec)); // à la getline, but accepts an arbitrary number of delimiters

The new, optional second parameter of the istream_iterator constructor is the set of delimiter characters. It would default to all whitespace characters.

Pros:

  • Ability to specify arbitrary tokenization delimiters without the need for extensive ctype specialization or something else nasty
    • Reading something like an IP address (127.0.0.1) into its components can now be expressed in terms of istream_iterator's delimiter
  • Relatively familiar approach, à la ostream_iterator and getline
  • Backward compatible and non-intrusive

Cons:

  • Locale problems?
  • Only supplied for the istream_iterator<string> specialization? (It could also work with other types!)

istringstream iss{ "0.1|0.2|0.3" }; // current istream_iterator will choke on this
copy(istream_iterator<double>(iss, "|"), istream_iterator<double>(), ostream_iterator<double>(cout, " ")); // prints "0.1 0.2 0.3"

I know that each of these can be implemented with custom input iterators; that's not the point.
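For comparison, here is a minimal sketch of the ctype-based workaround the Pros list calls nasty, as it can be written today. The names pipe_is_space and read_pipe_separated are made up for illustration; the facet simply reclassifies '|' as whitespace so the ordinary istream_iterator<double> skips over it.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet: the classic "C" classification table, plus '|' marked as whitespace.
struct pipe_is_space : std::ctype<char> {
    pipe_is_space() : std::ctype<char>(make_table()) {}

    static const mask* make_table() {
        // Copy the classic table once; it must outlive every facet instance.
        static std::vector<mask> table(classic_table(), classic_table() + table_size);
        table['|'] |= space;
        return table.data();
    }
};

// Hypothetical helper: reads '|'-separated doubles from text.
std::vector<double> read_pipe_separated(const std::string& text) {
    std::istringstream iss(text);
    // The locale takes ownership of the facet and deletes it when no longer referenced.
    iss.imbue(std::locale(iss.getloc(), new pipe_is_space));
    return std::vector<double>(std::istream_iterator<double>(iss),
                               std::istream_iterator<double>{});
}
```

This works, but the delimiter is baked into a locale facet rather than stated at the point where the iterator is constructed, which is the inconvenience the proposal is aimed at.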

Jeaye Wilkerson

Apr 11, 2014, 11:47:55 PM
to std-pr...@isocpp.org
Defaulting to all whitespace characters would mean that istream_iterator<string> would by default return only a single string before quitting. Making the argument optional would break all existing istream_iterator<string> code.
 
The idea is that the normal code for finding when to start still remains; the delimiter just says when to end. With that said, istream_iterator<string> would skip preceding whitespace as expected and read all non-whitespace, as expected, until it hits one of the delimiters. I understand the desire for an implementation; one may be provided.
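One possible reading of those semantics, sketched as a free function rather than as an iterator. The names read_token and tokenize are hypothetical, and the sketch assumes that interior whitespace which is not itself a delimiter is kept in the token and that the terminating delimiter is consumed; the post does not settle either point.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: skip leading whitespace, then read characters into out
// until one of the characters in delims is seen (assumed to be consumed) or
// input ends. Returns false only when no token could be read at all.
bool read_token(std::istream& in, std::string& out, const std::string& delims) {
    out.clear();
    in >> std::ws;  // the usual "find where to start" step
    char c;
    while (in.get(c)) {
        if (delims.find(c) != std::string::npos) {
            return true;  // a delimiter ends the token
        }
        out.push_back(c);
    }
    return !out.empty();  // last token may end at end of input
}

// Hypothetical helper: collect every token in text.
std::vector<std::string> tokenize(const std::string& text, const std::string& delims) {
    std::istringstream iss(text);
    std::vector<std::string> tokens;
    std::string token;
    while (read_token(iss, token, delims)) {
        tokens.push_back(token);
    }
    return tokens;
}
```

Under these assumptions, tokenize("127.0.0.1", ".") yields the four components of the address, and a delimiter set of "\n" behaves much like repeated getline calls that also trim leading whitespace.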

David Krauss

Apr 12, 2014, 12:22:13 AM
to std-pr...@isocpp.org
On 2014–04–12, at 11:47 AM, Jeaye Wilkerson <con...@jeaye.com> wrote:

Defaulting to all whitespace characters would mean that istream_iterator<string> would by default return only a single string before quitting. Making the argument optional would break all existing istream_iterator<string> code.
 
The idea is that the normal code for finding when to start still remains, the delimiter just says when to end. With that said, istream_iterator<string> would skip preceding whitespace as expected, read all non-whitespace, as expected, until it hits one of the delimiters.

And then it quits and becomes equal to the singular default value std::istream_iterator<string>(), having read at most one single string given the default delimiter set. This isn’t very useful.

You might consider defining an extractor class instead, containing a reference to a string and a set of delimiters. Attempting to extract a delimiter sets failbit. Instantiating istream_iterator with the extractor will provide the desired semantics.
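A rough sketch of the extractor idea, with one deliberate departure: istream_iterator needs a default-constructible value type, so instead of holding a reference to a string and a runtime delimiter set, this version stores the string by value and fixes a single delimiter as a template parameter. The names delimited_token and split_on_dots are invented for illustration, and extraction is delegated to std::getline rather than to hand-written failbit handling.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical value type: one token terminated by the character Delim.
template <char Delim>
struct delimited_token {
    std::string value;
};

template <char Delim>
std::istream& operator>>(std::istream& in, delimited_token<Delim>& tok) {
    // std::getline reads up to Delim and discards it. It sets failbit when
    // nothing at all could be extracted, which is what ends the iteration.
    return std::getline(in, tok.value, Delim);
}

// Hypothetical helper: split text on '.' using an unmodified istream_iterator.
std::vector<std::string> split_on_dots(const std::string& text) {
    using token = delimited_token<'.'>;
    std::istringstream iss(text);
    std::vector<std::string> parts;
    std::transform(std::istream_iterator<token>(iss), std::istream_iterator<token>{},
                   std::back_inserter(parts),
                   [](const token& t) { return t.value; });
    return parts;
}
```

The trade-off is that the delimiter becomes part of the type, so a runtime-chosen delimiter set would need a different design, for example a stream manipulator or a custom iterator.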
