Overloading the default c++ scanner

jacobnavia

Jul 8, 2017, 6:19:52 PM
This program reads words from a file and shows the words and their
positions.

 1 #include <fstream>
 2 #include <iostream>
 3 #include <string>
 4 #include <map>
 5 #include <vector>
 6
 7 int main(int argc, char **argv)
 8 {
 9     if (argc == 2) {
10         std::map<std::string, std::vector<size_t>> positions;
11
12         std::ifstream fin(argv[1]);
13         std::string word;
14         while (fin >> word)
15             positions[word].push_back((size_t)fin.tellg() - word.length());
16         fin.close();
17
18         for (auto &pair : positions) {
19             std::cout << pair.first;
20             for (auto &pos : pair.second)
21                 std::cout << " " << pos;
22             std::cout << "\n";
23         }
24     }
25 }

The problem here is that the scanner treats "word" and "word,word"
as two different words.

Apparently it just searches for the first non-blank character.

How can we change the >> operator to call a user defined function to
scan the characters?

My C++ knowledge doesn't go that far.

Ian Collins

Jul 8, 2017, 7:17:01 PM
I would treat this as an input filtering problem and provide a custom
streambuf that converts punctuation to whitespace before the stream
operator sees it.

--
Ian

Öö Tiib

Jul 8, 2017, 7:29:54 PM
Annoying habit to put these line numbers there to make it non-copyable.

>
> The problem here is that the scanner treats "word" and "word,word"
> as two different words.
>
> Apparently it just searches for the first non-blank character.

You want to specify more delimiters in istream than white space?

> How can we change the >> operator to call a user defined function to
> scan the characters?
>
> My C++ knowledge doesn't go that far.

To my knowledge, you must imbue the stream with a locale that is
suitably screwed up for your taste.

#include <locale>
#include <iostream>
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>
#include <sstream>

class Delims
    : public std::ctype<char>
{
    mask t_[table_size];
public:
    explicit Delims(size_t refs = 0)
        : std::ctype<char>(&t_[0], false, refs)
    {
        // Start from the classic table, then classify the extra
        // delimiters as whitespace.
        std::copy_n(classic_table(), table_size, t_);
        t_['-'] = (mask)space;
        t_['\''] = (mask)space;
        t_[','] = (mask)space;
    }
};


int main() {
    std::istringstream input("Subway,McDonald's and Burger-King.");
    // that new doesn't leak: the locale takes ownership of the facet
    std::locale x(std::locale::classic(), new Delims);
    input.imbue(x);

    std::copy(std::istream_iterator<std::string>(input),
              std::istream_iterator<std::string>(),
              std::ostream_iterator<std::string>(std::cout, "\n")
    );
}

Who uses the C++ streams for anything but for training students to think?

Alf P. Steinbach

Jul 8, 2017, 7:50:15 PM
The only possible customization that I recall is to define the notion of
whitespace, and I'm not even sure where to look for that.

But, you can overload `operator>>`, and in that overload, provide your
own custom scanning.

One way to do that is to define a distinct `Word` type, for use as the
overload's (second) formal argument.

Bypassing that complexity, you can just define an ordinary function like

    auto word_from( istream& stream )
        -> string
    {
        // Your custom scanning logic here.
    }

And by /starting/ with that approach, you can at the end use it to
implement an operator>> overload, if you deem that to be desirable.


Cheers!,

- Alf

jacobnavia

Jul 9, 2017, 11:14:08 AM
Thanks to all for your answers
jacob