
istringstream constructor performance


RRick

Nov 21, 2009, 11:55:06 PM
I have written some C-style static methods that convert strings to
doubles, ints, longs, etc. using the C++ string stream objects. Each
converter method uses a locally constructed istringstream object and
its >> operator to make the conversion. A typical method looks like:

double ToDouble( const string & str)
{
    std::istringstream iss( str);
    double num;
    iss >> num;
    return num;
}

I have heard that the C routines (sprintf and sscanf) are supposedly
faster than the corresponding C++ objects. I ran a couple of tests
and found that:
* sprintf and ostringstream are about the same in performance
(within a couple of percent).
* On the other hand, istringstream (ISS) takes about 82% longer
than sscanf. I tried a couple of variations where I used
the ISS clear and str methods instead of the constructor, but that
didn't change the performance. When I made the ISS object static, it
took only 12% longer than sscanf.

This was compiled with GNU g++ 4.2.4 on Linux 2.6.24. (I did try
posting this message on gnu.g++.help, but got no reply.) The times
for each test for 5 million conversions were:

    ISS local    2.79 sec
    ISS static   1.78 sec
    sscanf       1.52 sec
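
For anyone who wants to repeat the measurement, here is a rough sketch of how the three variants could be timed. It is not the original test program: the names ToDoubleLocal, ToDoubleStatic, ToDoubleScanf and TimeConversions are made up for illustration, and it assumes a C++11 compiler for <chrono>.

```cpp
#include <chrono>
#include <cstdio>
#include <sstream>
#include <string>

// Variant 1: construct a fresh istringstream on every call.
double ToDoubleLocal(const std::string& str) {
    std::istringstream iss(str);
    double num = 0.0;
    iss >> num;
    return num;
}

// Variant 2: reuse one static stream (not safe for concurrent callers).
double ToDoubleStatic(const std::string& str) {
    static std::istringstream iss;
    iss.clear();
    iss.str(str);
    double num = 0.0;
    iss >> num;
    return num;
}

// Variant 3: plain sscanf.
double ToDoubleScanf(const std::string& str) {
    double num = 0.0;
    std::sscanf(str.c_str(), "%lf", &num);
    return num;
}

// Calls `convert` on `input` `iterations` times and returns the elapsed
// wall-clock time in seconds. The running sum is stored in *sink so the
// optimizer cannot discard the loop.
template <typename Fn>
double TimeConversions(Fn convert, const std::string& input,
                       long iterations, double* sink) {
    const auto start = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (long i = 0; i < iterations; ++i) {
        sum += convert(input);
    }
    const auto stop = std::chrono::steady_clock::now();
    *sink = sum;
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling TimeConversions once per variant with 5000000 iterations and printing the three results should give a comparable table. The absolute numbers will depend on the compiler, the library version and the optimization flags.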

The following static ISS mods made a big difference in performance and
shortened the total run time by 35%. If I remove the static
declaration, the run times go from 1.78 seconds back to 2.79 seconds.

double ToDouble( const string & str)
{
    static std::istringstream iss;

    iss.clear(); iss.str( str);
    double num;
    iss >> num;
    return num;
}

For some reason, running the default ISS constructor takes up a lot of
time. Why is this? I have looked at some of the GNU g++ header
source files, but nothing obvious shows up. Could this be an issue with
the basic_stringbuf or basic_streambuf classes?

Any ideas on how to fix this? I realize a string converter class
could be created with a local ISS object, but at this point I prefer
the ease of use of the C style methods. I'm looking for ways to make
the construction of the ISS object faster.


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Francis Glassborow

Nov 22, 2009, 8:13:28 AM
RRick wrote:

>
> double ToDouble( const string & str)
> {
> static std::istringstream iss;
>
> iss.clear(); iss.str( str);
> double num;
> iss >> num;
> return num;
> }
>
> For some reason, running the default ISS constructor takes up a lot of
> time. Why is this? I have looked at some of the GNU g++ header
> source files, but nothing obvious shows up. Could this be an issue with
> the basic_stringbuf or basic_streambuf classes?
>
> Any ideas on how to fix this? I realize a string converter class
> could be created with a local ISS object, but at this point I prefer
> the ease of use of the C style methods. I'm looking for ways to make
> the construction of the ISS object faster.

The basic problem is that you are imposing the overhead of
construction/destruction on every use. Stream classes of all kinds have
relatively expensive ctors/dtors. When you switch to static instances you
pay that overhead only once rather than on every entry to the function.

If speed is an issue, you might try inlining ToDouble. Many compilers will
ignore the request because of the static local data, but a good optimising
whole program compiler should be able to cope. Note that it has to be whole
program because every TU has to be able to use the same static data element.

Claudio Pacati

Nov 22, 2009, 8:16:28 AM
On 22 Nov, 05:55, RRick <Rickara...@comcast.net> wrote:
> I have written some C style static methods that convert strings to
> doubles, ints, longs, etc. using the C++ string stream objects.
<snip>

>
> I have heard that the C routines (sprintf and sscanf) are supposedly
> faster than the corresponding C++ objects. I ran a couple of tests
> and found that:
> * sprintf and ostringstream are about the same in performance
> (within a couple of percent).
> * On the other hand, istringstream (ISS) takes about 82% longer
> than sscanf. I tried a couple of variations where I used
> the ISS clear and str methods instead of the constructor, but that
> didn't change the performance. When I made the ISS object static, it
> took only 12% longer than sscanf.

Did you remember to test performance with optimization turned on?
It makes a big difference with the C++ standard library, at least with
gcc.

Regards, Claudio Pacati

Ulrich Eckhardt

Nov 23, 2009, 6:32:55 AM
RRick wrote:
> I have written some C style static methods that convert strings to
> doubles, ints, longs, etc. using the C++ string stream objects. Each
> of the converter methods use a locally constructed istringstream
> object which uses the >> operator to make the conversions. A typical
> method looks like:
>
> double ToDouble( const string & str)
> {
> std::istringstream iss( str);
> double num;
> iss >> num;
> return num;
> }

Note: This code is broken. The problem is that it fails to check whether the
conversion succeeded, which it must do to get meaningful results.
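
To make the point concrete, here is a minimal sketch of a checked variant. The bool-plus-out-parameter signature and the rule that trailing characters count as an error are my assumptions, not something from the original code.

```cpp
#include <sstream>
#include <string>

// Returns true and stores the result in `out` only if the whole string
// parses as a double. `out` is left untouched on failure.
bool ToDouble(const std::string& str, double& out) {
    std::istringstream iss(str);
    double num = 0.0;
    if (!(iss >> num)) {
        return false;          // no number could be extracted
    }
    iss >> std::ws;            // tolerate trailing whitespace
    if (!iss.eof()) {
        return false;          // trailing garbage such as "1.5abc"
    }
    out = num;
    return true;
}
```

A caller would then write something like `double d; if (!ToDouble(text, d)) { /* report error */ }` instead of silently using an unspecified value.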

> I have heard that the C routines (sprintf and sscanf) are supposedly
> faster than the corresponding C++ objects.

C++ IOStreams are a complex system of formatting and parsing plugins for
various types which supports customizations for different cultural
conventions (locales). The printf/scanf functions also know locales, but are
much simpler, allowing them to theoretically be faster, which they also
typically are.

> I ran a couple of tests
> and found that:
> * sprintf and ostringstream are about the same in performance
> (within a couple of percent).
> * On the other hand, istringstream (ISS) takes about 82% longer
> than the sscanf times.

Not surprising. In particular the allocation of dynamic memory for streams
is a big overhead.

> I tried a couple of variations where I used the ISS clear and str methods
> instead of the constructor, but that didn't change the performance. When
> I made the ISS object static, it took only 12% longer than sscanf.

The latter variant will break. If you feed it nonparsable data even
once, the stream state will be set to 'fail'. Hence my suggestion to
check for errors.
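
A small sketch of that failure mode, with hypothetical helper names. It contrasts a reused stream whose state is never reset with one that calls clear() before each use; whether a reused stream stays stuck depends entirely on that call.

```cpp
#include <sstream>
#include <string>

// Reuses the stream WITHOUT resetting its state between calls.
// Once any error or end-of-file flag is set, later extractions fail.
bool ParseReusedNoClear(std::istringstream& iss, const std::string& str,
                        double& out) {
    iss.str(str);
    return static_cast<bool>(iss >> out);
}

// Reuses the stream but resets the state flags first, so an earlier
// failure does not leak into this call.
bool ParseReusedWithClear(std::istringstream& iss, const std::string& str,
                          double& out) {
    iss.clear();
    iss.str(str);
    return static_cast<bool>(iss >> out);
}
```

With the first helper, parsing "oops" and then "2.5" on the same stream fails both times. With the second, the "2.5" call succeeds again, but the caller still has to look at the return value to notice the bad input.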

That said, there is the strtod() function, which does only parsing of
numbers into a double. This avoids the overhead of both IOStreams and the
format-string parsing of sscanf(). The only overhead is that the radix char
is locale-dependent, but that could even be necessary.
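
A sketch of what a strtod()-based converter could look like. The function name and the decision to reject trailing characters and out-of-range values are my assumptions.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Converts `str` with strtod(). Returns true and stores the value in
// `out` only if the whole string was consumed and the value is in range.
bool ToDoubleStrtod(const std::string& str, double& out) {
    const char* begin = str.c_str();
    char* end = nullptr;
    errno = 0;
    const double value = std::strtod(begin, &end);
    if (end == begin) {
        return false;          // nothing was converted
    }
    if (errno == ERANGE) {
        return false;          // overflow or underflow
    }
    if (*end != '\0') {
        return false;          // trailing characters after the number
    }
    out = value;
    return true;
}
```

No stream object and no format string are involved. The decimal point that strtod() accepts still follows the current C locale.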

> Any ideas on how to fix this? I realize a string converter class
> could be created with a local ISS object, but at this point I prefer
> the ease of use of the C style methods. I'm looking for ways to make
> the construction of the ISS object faster.

For your info: I find the idea of creating objects for parsing a single
value strange, too. Also, it really creates lots of unnecessary overhead.
I'd go with strtod() or even a locale-independent version thereof. The
locale-independent one is especially important when parsing files or similar
input.

Uli

RRick

Nov 23, 2009, 6:57:36 AM

I realize the problem is with the constructor, but I wonder what is a
default istringstream (ISS) constructor/destructor doing that takes up
35% of the total run. I would expect the ISS str method to take up a
big portion of the time, but not the default constructor.

As for optimization of my program, it is not too important here
because the issue is with the ISS constructor, which is compiled
when the compiler itself is built and then stored in the C++ library.
The default flags for building the compiler are -g -O2, which should
be good enough.


I did run the test program through valgrind using the callgrind tool.
That was interesting. The ISS construction and destruction takes
around 23% (15.5% for construction, 7.5% for destruction) of the total
time. 75% of the ISS construction time is spent creating a Locale
object and calling a _M_cache_locale method. This is what is taking so
much time.

The destructor looks correct because it is spending most of its time
deallocating the basic_stringbuf. What is odd is that two Locales
and two calls to _M_cache_locale are made per ISS constructor
call. Perhaps one is used for the istringstream and one for the
stringbuf object. Whatever is happening in the ISS construction,
it doesn't look very efficient. I always assume caches to be
lightning fast, but this doesn't seem to be the case here.


There are a few loose ends in this analysis. First, I can account for
only 23% of the total time spent in the ISS constructor/destructor but
not the 35% of the total run. I don't know where the other 12% went.
The local ISS construction of the locale object and caching looks
bad. Why two calls? But assume only one set of calls were made; that
would save only 8% of the total run. An improvement, but not much of
one.

Nick Hounsome

Nov 23, 2009, 4:45:33 PM

Personal preference: strtod

If you have a high pain threshold it is possible to use the stream
conversion facets directly but it is not pretty and almost impossible
to find any helpful documentation. In particular I know of no C++ book
that covers facets to any useful extent.
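
For what it is worth, here is a rough sketch of what using the facet directly can look like. The names DirectNumGet and ParseDoubleWithFacet are invented for this example. It instantiates std::num_get over plain character pointers, which needs a small derived class because the facet's destructor is protected, and it keeps one std::ios object around purely as a holder for formatting flags and the locale.

```cpp
#include <ios>
#include <locale>
#include <string>

// num_get has a protected destructor, so a derived type is needed to
// create an instance outside of a locale.
struct DirectNumGet : std::num_get<char, const char*> {
    DirectNumGet() : std::num_get<char, const char*>() {}
};

// Parses `str` with the num_get facet, without any stringstream.
// Leading whitespace is NOT skipped here; that is normally done by the
// stream, not by the facet.
bool ParseDoubleWithFacet(const std::string& str, double& out) {
    // Constructed once; only supplies format flags and a locale.
    static std::ios format_state(nullptr);
    static DirectNumGet facet;

    const char* first = str.data();
    const char* last = first + str.size();
    std::ios_base::iostate err = std::ios_base::goodbit;
    double value = 0.0;
    const char* stop = facet.get(first, last, format_state, err, value);

    if (err & std::ios_base::failbit) {
        return false;          // no valid number
    }
    if (stop != last) {
        return false;          // trailing characters
    }
    out = value;
    return true;
}
```

The expensive stream construction happens once for format_state rather than per call. Like any function-local static, the shared objects deserve a second look before this is used from several threads.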

RRick

Nov 24, 2009, 12:32:29 AM
Ulrich is correct that the code for ToDouble is not complete. For
clarity, I left out the error checking. Also, a static ISS object
will not work in general unless it is protected from multiple
accesses. The static object was only a test to see what effect the
constructor had on the total time.
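
One possible way around the multiple-access problem, sketched under the assumption that a C++11 compiler is available: give each thread its own stream with thread_local. The function name is made up and error checking is again left out for brevity.

```cpp
#include <sstream>
#include <string>

// Each thread constructs its own stream on first use and then reuses
// it, so no locking is needed and the constructor cost is paid once
// per thread instead of once per call.
double ToDoubleThreadLocal(const std::string& str) {
    thread_local std::istringstream iss;
    iss.clear();               // drop eof/fail flags from the last call
    iss.str(str);
    double num = 0.0;
    iss >> num;
    return num;
}
```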


Using valgrind/callgrind I can find no allocating of resources in the
ISS constructor or its base classes. The only allocation I can find
is in the ISS str method that sets the string. There is some
deallocating in the ISS destructor but that is a result of the str
method.


I do like the strto* method idea. That set of methods sounds like a
good way to circumvent the overhead of the ISS objects. This does
force the string values to be locale independent, but that might not
be all that limiting.


This set of utility routines is composed of two types of converters:
ToXxx and ToString( Xxx num), where Xxx is a double, long, int, etc.

The ToString is particularly useful since std::string does not do
numeric conversions. Its main use is in creating messages for
exceptions (or anything else) that needs a string value. My code
tends to have lots of ostringstream objects littered about that do one
or maybe two conversions and then go out of scope.

The ToString is also helpful in creating string entries for logs. The
ostream << operator is not atomic and if multiple processes/threads
use <<, the logs are corrupted with jumbled messages. One solution is
to send out a single string per entry via the low level write.
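
A sketch of that idea, assuming a POSIX system. FormatLogLine and WriteLogLine are illustrative names, not part of my utility set. The entry is fully assembled in memory first and then handed to write() in one call.

```cpp
#include <sstream>
#include <string>
#include <unistd.h>   // POSIX write()

// Builds one complete log entry, including the newline, in memory.
template <typename T>
std::string FormatLogLine(const std::string& tag, const T& value) {
    std::ostringstream oss;
    oss << tag << ": " << value << '\n';
    return oss.str();
}

// Hands the whole entry to the descriptor in a single write() call.
// Returns true if every byte was accepted by that one call.
bool WriteLogLine(int fd, const std::string& line) {
    const ssize_t written = ::write(fd, line.data(), line.size());
    return written == static_cast<ssize_t>(line.size());
}
```

Whether concurrent writes can still interleave depends on the kind of file and the size of the entry, so this reduces the jumbling rather than guaranteeing it away.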

I haven't used the ToXxx methods all that much. They do tend to be
overlooked in favor of io/ifstreams. If your input parsing can be
done at the point of contact, a single istream is the way to go.

RRick

Nov 24, 2009, 10:59:45 AM

> If you have a high pain threshold it is possible to use the stream
> conversion facets directly but it is not pretty and almost impossible
> to find any helpful documentation. In particular I know of no C++ book
> that covers facets to any useful extent.

Facets are called from the _M_cache_locale method that takes up a lot
of time. The basic_ios class has Facet pointer storage for
_M_ctype, _M_num_get, and _M_num_put.

Any explanation of facets you have would be of interest. I probably
won't use them directly, since I'm trying to be compiler indifferent,
but I do have an interest in what is going on here.

Claudio Pacati

Nov 28, 2009, 3:42:52 AM
> This set of utility routines is composed of two types of converters:
> ToXxx and ToString( Xxx num), where Xxx is a double, long, int, etc.
>
> The ToString is particularly useful since std::string does not do
> numeric conversions. Its main use is in creating messages for
> exceptions (or anything else) that needs a string value. My code
> tends to have lots of ostringstream objects littered about that do one
> or maybe two conversions and then go out of scope.

Why define separate methods, all of them using ostringstream, when
you can use a single template?
In my personal set of standard utilities I have a single

template<typename T>
std::string toString(const T& t, size_t prec=-1, size_t acc=-1)

where prec and acc are precision and accuracy, meaningful only for
floating point types (the default -1 means: use the standard of
ostringstream).
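
A minimal sketch of how such a template might be implemented. It is a simplification, not the actual utility: it takes only a precision argument, uses int instead of size_t so that -1 can be tested directly, and leaves out the acc parameter.

```cpp
#include <sstream>
#include <string>

// Converts any streamable value to a string. A negative `prec` keeps
// the default precision of ostringstream; otherwise `prec` is passed
// to precision(), which only affects floating point output.
template <typename T>
std::string toString(const T& t, int prec = -1) {
    std::ostringstream oss;
    if (prec >= 0) {
        oss.precision(prec);
    }
    oss << t;
    return oss.str();
}
```

For example, toString(42) gives "42", and toString(3.14159, 3) gives "3.14" because the default float format counts significant digits.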

regards, Claudio Pacati
