Efficiency of std::string.

Eric Tetz

Nov 22, 1999

I just started reading "Effective C++" by Scott Meyers. In the introduction he
says, "As for raw char*-based strings, you shouldn't use those antique
throwbacks unless you have a VERY good reason. Well-implemented string types
can now be superior to char*s in virtually every way - including efficiency."

How I want to believe this! I would love to use std::string instead of naked
character buffers. However, every informal benchmark I've ever done (I do one
each time I get an update of my compiler) shows std::string to be VASTLY slower
than C-style strings. The last test I did, in preparation for this post, showed
std::string as 8 times slower than C-style strings. This just doesn't make any
sense - even if std::string was just a wrapper for C strings, I should do better
than that.

At this point I'm assuming my test code is hopelessly naive... can somebody
show me the error of my ways and restore my faith in std::string (and thereby
resolve the mystery of Dr. Meyers' comment)?

// Compare std::string and C style null-terminated strings
// for small string operations...

#include <cstdio>
#include <cstring>   // strcpy, strcat, strcmp
#include <ctime>
#include <string>

using std::string;

void TestString()
{
    string a, b, c;

    a.reserve (128);
    b.reserve (128);
    c.reserve (128);

    clock_t begin, end;

    begin = clock ();

    for (int i = 0; i < 1000000; ++i)
    {
        a = "Hello";
        b = " world";
        c = "! How'ya doin'?";
        a += b;
        a += c;
        c = "Hello world! What's up?";
        if (c != a)
            c = "Doh!";
    }

    end = clock ();

    printf ("TestString elapsed time: %2.1fl\n",
            static_cast <double> ((end - begin) / CLOCKS_PER_SEC));
}

void TestPChar()
{
    char a [128];
    char b [128];
    char c [128];

    clock_t begin, end;

    begin = clock ();

    for (int i = 0; i < 1000000; ++i)
    {
        strcpy (a, "Hello");
        strcpy (b, " world");
        strcpy (c, "! How'ya doin'?");
        strcat (a, b);
        strcat (a, c);
        strcpy (c, "Hello world! What's up?");
        if (strcmp (c, a) == 0)
            strcpy (c, "Doh!");
    }

    end = clock ();

    printf ("TestPChar elapsed time: %2.1fl\n",
            static_cast <double> ((end - begin) / CLOCKS_PER_SEC));
}

int main()
{
    TestPChar();
    TestString();

    return 0;
}
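
A side note on the timing itself: in the printf calls above, the division by CLOCKS_PER_SEC happens before the cast to double, so where clock_t is an integer type the reported time is a whole number of seconds. A minimal sketch of a helper that keeps the fraction (elapsed_seconds is a hypothetical name, not part of the posted test):

```cpp
#include <cassert>
#include <ctime>

// Convert a clock() interval to seconds using floating-point division.
// Casting one operand to double before dividing keeps the fractional part.
double elapsed_seconds(std::clock_t begin, std::clock_t end)
{
    assert(end >= begin);  // assumes the tick counter did not wrap around
    return static_cast<double>(end - begin) / CLOCKS_PER_SEC;
}
```

With this, each test would print elapsed_seconds(begin, end) with a plain "%.2f" format.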

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]

Carl Barron

Nov 23, 1999

Eric Tetz <et...@accolade.com> wrote:

> I just started reading "Effective C++" by Scott Meyers. In the introduction he
> says, "As for raw char*-based strings, you shouldn't use those antique
> throwbacks unless you have a VERY good reason. Well-implemented string types
> can now be superior to char*s in virtually every way - including efficiency."
>
> How I want to believe this! I would love to use std::string instead of naked
> character buffers. However, every informal benchmark I've ever done (I do one
> each time I get an update of my compiler) shows std::string to be VASTLY slower
> than C-style strings. The last test I did, in preparation for this post, showed
> std::string as 8 times slower than C-style strings. This just doesn't make any
> sense - even if std::string was just a wrapper for C strings, I should do better
> than that.
>
> At this point I'm assuming my test code is hopelessly naive... can somebody
> show me the error of my ways and restore my faith in std::string (and thereby
> resolve the mystery of Dr. Meyers' comment)?
>

[code snipped]

If I use string's member functions assign() and append(), I do get
std::string to be twice as fast as C strings. If I use your original
code, I get C strings to be 6.5 times faster than std::string. In my
faster version, the loop in TestString becomes:

for(int i=0;i<100000;++i)
{
    a.assign("Hello");
    b.assign(" world");
    c.assign("! How are you doin'?");
    a.append(b);
    a.append(c);
    c.assign("Hello world! What's up?");
    if(c!=a)
        c.assign("Dah!");
}

Internal

Nov 23, 1999

Just a question of my own...

It seems kind of odd to me that in the VC++ version of the STL that string
will grow the buffer if the minimum size is less than what has been
reserved. If I've reserved room for 128 characters, why would the string
class need to reallocate the buffer when the minimum size is 31?

Confusing...

--
Justin Rudd

Eric Tetz <et...@accolade.com> wrote in message
news:MPG.12a33979f...@pumba.class.udg.mx...

[snip sample code]

Axel Schmitz-Tewes

Nov 23, 1999

I tried your code with my compilers and see a big difference in run time
between the release and debug versions.

                     TestPChar   TestString
Watcom 11
  debug version:        2.01       17.03
  release version:      2.01        5.01

MSVC6
  debug version:        2.01       32.01
  release version:      2.01        3.01

I used only the standard switches.

It looks like setting some optimization switches by hand could make the
string class feel acceptable, but AFAIK getting it faster than
C-string handling seems hopeless.

regards,
axel

Alexei Zakharov

Nov 23, 1999

Eric Tetz <et...@accolade.com> wrote in message
news:MPG.12a33979f...@pumba.class.udg.mx...
> I just started reading "Effective C++" by Scott Meyers. In the introduction he
> says, "As for raw char*-based strings, you shouldn't use those antique
> throwbacks unless you have a VERY good reason. Well-implemented string types
> can now be superior to char*s in virtually every way - including efficiency."
>
> How I want to believe this! I would love to use std::string instead of naked
> character buffers. However, every informal benchmark I've ever done (I do one
> each time I get an update of my compiler) shows std::string to be VASTLY slower
> than C-style strings. The last test I did, in preparation for this post, showed
> std::string as 8 times slower than C-style strings. This just doesn't make any
> sense - even if std::string was just a wrapper for C strings, I should do better
> than that.

Strange... With VC++ 6.0 I get 2.6 sec and 3.0 sec for char* and string
respectively, whereas gcc 2.90 gives 0.8 and 5.3!

Vetle Roeim

Nov 23, 1999

* Eric Tetz

> I just started reading "Effective C++" by Scott Meyers. In the introduction he
> says, "As for raw char*-based strings, you shouldn't use those antique
> throwbacks unless you have a VERY good reason. Well-implemented string types
> can now be superior to char*s in virtually every way - including efficiency."
>
> How I want to believe this! I would love to use std::string instead of naked
> character buffers. However, every informal benchmark I've ever done (I do one
> each time I get an update of my compiler) shows std::string to be VASTLY slower
> than C-style strings. The last test I did, in preparation for this post, showed
> std::string as 8 times slower than C-style strings. This just doesn't make any
> sense - even if std::string was just a wrapper for C strings, I should do better
> than that.
>
> At this point I'm assuming my test code is hopelessly naive... can somebody
> show me the error of my ways and restore my faith in std::string (and thereby
> resolve the mystery of Dr. Meyers' comment)?

I suspect that the speed of std::string depends a lot on the
implementation. I ran your code through two compilers, GNU C++
Compiler 2.95.1 and Sun Workshop C++ Compiler 5.0.

Output from code compiled with g++:
TestPChar elapsed time: 5.0l
TestString elapsed time: 8.0l

Output from code compiled with Sun C++ Compiler:
TestPChar elapsed time: 3.0l
TestString elapsed time: 18.0l

As you can see, both were slower with std::string, but the Sun C++
Compiler was a lot slower. Of course, this isn't exactly a
"cleanroom" test, to say the least.

I'm afraid I can't restore your faith, but I doubt I'll go back to
C-style strings.. ever. std::string is _so_ much easier to use.

(The code was compiled and run on a Sun Ultra 1, running Solaris 5.7)

vr
--
Vetle Roeim
Departement of Informatics, University of Oslo, Norway

f u cn rd ths u r prbly a gk

Internal

Nov 23, 1999

I don't know whose implementation you are using, but I see the same thing as
you using VC++ 6.0. From what I've seen, even though you call reserve, as
soon as you call operator =(char*), it deallocates the buffer and allocates
a new one. So you can reserve a meg (a bit drastic I know), but as soon as
you assign 1 char, that meg gets freed and another buffer is allocated.

SGI's (and therefore STLPort) allocators are supposed to be much faster.
I've never had a chance to benchmark them though. Maybe someone else on
this list has. I'd be interested in seeing how much faster SGI's
implementation is.

I know this isn't much help but at least you aren't alone :-)

--
Justin Rudd

Eric Tetz <et...@accolade.com> wrote in message
news:MPG.12a33979f...@pumba.class.udg.mx...

[snip sample code]

Marc Lepage

Nov 23, 1999

Eric Tetz wrote:
>
> I just started reading "Effective C++" by Scott Meyers. In the introduction he
> says, "As for raw char*-based strings, you shouldn't use those antique
> throwbacks unless you have a VERY good reason. Well-implemented string types
> can now be superior to char*s in virtually every way - including efficiency."
>
> How I want to believe this! I would love to use std::string instead of naked
> character buffers. However, every informal benchmark I've ever done (I do one
> each time I get an update of my compiler) shows std::string to be VASTLY slower
> than C-style strings. The last test I did, in preparation for this post, showed
> std::string as 8 times slower than C-style strings. This just doesn't make any
> sense - even if std::string was just a wrapper for C strings, I should do better
> than that.
>
> At this point I'm assuming my test code is hopelessly naive... can somebody
> show me the error of my ways and restore my faith in std::string (and thereby
> resolve the mystery of Dr. Meyers' comment)?
>

Your test code is *not* equivalent. The std::string code will always work; it
will never overflow. The char* code *happens* to work with short strings, but
will overflow with longer strings.

Put differently, you have optimized the latter code by making assumptions about
input size, that you have not made in the former code.

They are incomparable. Apples vs. oranges.
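
To make the overflow point concrete, here is a minimal sketch of what the char* version would need in order to refuse an append that does not fit, which std::string handles by growing. The helper checked_append is a hypothetical name introduced only for illustration; it is not part of the posted benchmark.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Append src to the NUL-terminated string in dst, which has room for
// `capacity` chars in total. Returns false and leaves dst unchanged if the
// result would not fit; plain strcat performs no such check.
bool checked_append(char* dst, std::size_t capacity, const char* src)
{
    const std::size_t used = std::strlen(dst);
    const std::size_t extra = std::strlen(src);
    assert(used < capacity);  // dst must already hold a valid string
    if (extra >= capacity - used)
        return false;  // no room for src plus the terminator
    std::memcpy(dst + used, src, extra + 1);  // +1 copies the terminator too
    return true;
}
```

The extra strlen calls and the branch are roughly the bookkeeping that the fixed 128-byte buffers let the original TestPChar skip.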

--
Marc Lepage
Software Developer
Molecular Mining Corporation
http://www.molecularmining.com/

Pete Becker

Nov 23, 1999

Internal wrote:
>
> Just a question of my own...
>
> It seems kind of odd to me that in the VC++ version of the STL that string
> will grow the buffer if the minimum size is less than what has been
> reserved. If I've reserved room for 128 characters, why would the string
> class need to reallocate the buffer when the minimum size is 31?
>
> Confusing...
>

How did you determine that the VC++ implementation does this? The code
for the reserve function simply sets aside the requested amount of
space. The reported timings in other replies in this thread don't seem
to indicate that there is any reallocation taking place.

--
Pete Becker
Dinkumware, Ltd.
http://www.dinkumware.com

Andy Philpotts

Nov 23, 1999

In article <1e1p7f6.1uq...@buf-ny1-16.ix.netcom.com>,
cbar...@ix.netcom.com says...

> Eric Tetz <et...@accolade.com> wrote:
>
>
> if I use strings member functions assign() and append() I do get
> std::string to be twice as fast as C strings. If I use your original
> code I get C strins to be 6.5 times faster than std::string. The loop in
> TestString becomes in my faster version.
>
> for(int i=0;i<100000;++i)
> {
> a.assign("Hello");
> b.assign(" world");
> c.assign("! How are you doin'?");
> a.append(b);
> a.append(c);

> c.assign("Hello world! What's up?");
> if(c!=a)
> c.assign("Dah!");
> }
>

This seems well strange, it suggests that

a = "Hello" is very different to a.assign("Hello") and/or that
a.append(b) is very different to a += b.

Why should that be so, if assign and append are so damn fast why not
define the = and += as inlines based on the fast versions...

I think what we are dealing with here (in general with std::string vs
char*) is a combination of poor implementation and dubious compiler
optimization.

I would further observe that the performance aspects of code can (in many
of the cases I meet in real life) be quite superfluous compared to the
time to develop and debug. In this respect I find that std::string
clearly wins the day for all but very competent C programmers (a rare
breed these days).

--
Andy Philpotts
Please replace "127.0.0.1" with "Reciprocal.com" when replying by e-mail!

John Hickin

Nov 23, 1999

Alexei Zakharov wrote:
>

>
> Strange... VC++ 6.0 makes code with 2.6 sec and 3.0 sec for char* and string
> respectively. Although gcc 2.90 -- 0.8 and 5.3!
>

Compiler: VC6

Target      char*    string
Debug:      1.01     25.01
Release:    3.01      3.01

Can anybody make sense of this?

Bill Wade

Nov 24, 1999

Eric Tetz wrote in message ...

>I just started reading "Effective C++" by Scott Meyers. In the introduction he
>says, "As for raw char*-based strings, you shouldn't use those antique
>throwbacks unless you have a VERY good reason. Well-implemented string types
>can now be superior to char*s in virtually every way - including efficiency."
>
>At this point I'm assuming my test code is hopelessly naive... can somebody
>show me the error of my ways and restore my faith in std::string (and thereby
>resolve the mystery of Dr. Meyers' comment)?

Your test code tests three string operations: assignment, concatenation, and
equality comparison. std::string is safer than char* for all three
operations.

When both arguments are the same type (either both string or both char*) and
have enough capacity char* will usually be faster for short arguments, but
string operations have some advantages when the arguments get longer. When
the argument types are mixed (one char* and one string), you'll usually get
the slowest results.

For assignment strcpy has to
    copy characters
    test for end-of-string

For assignment from char* to std::string the operations are:
    make sure there is enough room (if not, make more room)
    copy characters
    test for end-of-string
At best this is probably 50% more expensive than strcpy.

For assignment from std::string to std::string you can compute lengths in
O(1) time so assignment can be about as fast as strcpy. However string to
string copies don't need to look for a terminating NULL so on some platforms
the actual copy can be faster (copy four or eight bytes at a time).

In single-threaded applications copy-on-write (COW) can make many
assignments happen in O(1) time, but this may actually increase the number
of times memory is allocated or freed, so its biggest benefits occur when
strings are long.

For concatenation, strcat has to find the end of the first string and then
perform strcpy. std::string should be able to find the end of the first
string in O(1) time, but this doesn't help much for small strings. The
strcpy operation was discussed above.

For equality comparison, strcmp has to look at each character until a
mismatch or zero is found. std::string comparison should probably test
lengths first (in O(1) time), making it faster for your example. In
principle it is possible for strings to test contents in blocks (four or
eight characters at a time), but char_traits may mess things up enough that
it is difficult to do in practice.
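
The length-first idea for equality can be sketched as follows. This is only an illustration for plain std::string (char elements); equal_length_first is a hypothetical name, and no claim is made that any of the libraries tested in this thread compare strings this way.

```cpp
#include <cstring>
#include <string>

// Equality test that compares the stored lengths before any characters.
bool equal_length_first(const std::string& lhs, const std::string& rhs)
{
    if (lhs.size() != rhs.size())
        return false;  // O(1) rejection: no characters are examined
    // Same length: compare the raw bytes in one block operation.
    return std::memcmp(lhs.data(), rhs.data(), lhs.size()) == 0;
}
```

In the benchmark, "Hello world! How'ya doin'?" and "Hello world! What's up?" have different lengths, so a check like this would return before looking at a single character, whereas strcmp has to walk the common prefix first.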

Anders J. Munch

Nov 24, 1999

Marc Lepage wrote in message <383AB701...@molecularmining.com>...

>Your test code is *not* equivalent. The std::string code will always work; it
>will never overflow. The char* code *happens* to work with short strings, but
>will overflow with longer strings.
>
>Put differently, you have optimized the latter code by making assumptions about
>input size, that you have not made in the former code.
>
>They are incomparable. Apples vs. oranges.

This was my first thought as well. So I rewrote TestPChar() to use
dynamically resized C strings, the realloc way. Included below. To my
surprise, it is _still_ faster, though now only by a factor of two, whereas
the original is almost four times faster (BCB4). Puzzling.

Maybe the conclusion is that the "Well-implemented string types" Scott
mentions do not exist yet??

Another possibility is that the std::string implementations tested use
copy-on-write, which the strings in this test don't benefit from because
they are too small and too often modified?

- Anders

Realloc-version of TestPChar:

#include <cstdlib>   // realloc, free

void TestPChar()
{
    char* a = NULL;
    char* b = NULL;
    char* c = NULL;

    std::clock_t begin, end;

    begin = std::clock ();

    for (int i = 0; i < 1000000; ++i)
    {
        // quick hack: yes, I know these macros are bad code in
        // oh-so-many-ways.
        // Each realloc asks for one extra byte for the terminating '\0'.
        #define STRCAT(A,B) A = (char*) realloc(A, strlen(A)+strlen(B)+1); \
                            strcat(A,B)
        #define STRCPY(A,B) A = (char*) realloc(A, strlen(B)+1); \
                            strcpy(A,B)

        // STRCOPY_CONST: Fast assignment for literal strings;
        // replacing with STRCPY makes very little difference
        #define STRCOPY_CONST(a, S) a = (char*)realloc(a, sizeof(S)); \
                                    strcpy(a,S)

        STRCOPY_CONST(a, "Hello");
        STRCOPY_CONST(b, " world");
        STRCOPY_CONST(c, "! How'ya doin'?");
        STRCAT(a, b);
        STRCAT(a, c);
        STRCPY(c, "Hello world! What's up?");

        if (strcmp (c, a) == 0)
        {
            STRCOPY_CONST(c, "Doh!");
        }
    }

    end = std::clock ();

    // Release the buffers once the timed loop is done.
    free(a);
    free(b);
    free(c);

    std::printf ("TestPChar elapsed time: %2.1fl\n",
                 static_cast <double> ((end - begin) / CLOCKS_PER_SEC));
}

Internal

Nov 24, 1999

Pete Becker <peteb...@acm.org> wrote in message
news:383ABE99...@acm.org...

> How did you determine that the VC++ implementation does this? The code
> for the reserve function simply sets aside the requested amount of
> space. The reported timings in other replies in this thread don't seem
> to indicate that there is any reallocation taking place.

I looked at the code. If you look at the VC++ implementation of
assign(const char* s, size_type n) (which is what operator = calls), you
will see a call for _Grow(n,true). Inside _Grow you will hit the last else
block.

Basically it looks like this...

if(_Trim && (_MIN_SIZE < _Res || _Res < _N))

Well _Trim is true from the true passed in. _MIN_SIZE is 31 and _Res is 159
from the call to reserve(which allocates some extra room). So it calls
_Tidy which deallocates the buffer of 159 characters and creates a buffer of
31 characters (with a call to _Copy).

Now the next time through the loop, assign does NOT reallocate the buffer
(because _MIN_SIZE and _Res are equal). It will reuse the buffer that has
already been allocated. So deallocating the buffer obtained from reserve
probably doesn't have much of an adverse effect on the timings.

But my other question still stands. Why does string reallocate the buffer
when we reserve 128 characters??

In fact if you change a = "Hello" to a="Hello<30spaces>", it will always
reallocate the buffer of a because after you assign the string, _Res is 65.
Well that makes _MIN_SIZE < _Res always true. So if you have a string that
has a buffer of 32 characters and you assign one character, it will
deallocate the buffer and reallocate a buffer of size 31.
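
The condition quoted above can be modelled on its own to check the three cases described. This is a hypothetical stand-alone model whose parameter names mirror the identifiers in the post; it is not the library source, and discards_buffer is a name made up for the sketch.

```cpp
#include <cstddef>

// Model of: if (_Trim && (_MIN_SIZE < _Res || _Res < _N))
// Returns true when the existing buffer would be thrown away.
bool discards_buffer(bool trim, std::size_t min_size,
                     std::size_t reserved, std::size_t needed)
{
    return trim && (min_size < reserved || reserved < needed);
}
```

Plugging in the numbers from the post: (true, 31, 159, 5) is true, so the reserved buffer is dropped on the first assignment; (true, 31, 31, 5) is false, so later iterations reuse the 31-character buffer; and (true, 31, 65, 1) is true, which is the "always reallocates" case.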

--
Justin Rudd

Pete Becker

Nov 24, 1999

Pete Becker wrote:
>
> Internal wrote:
> >
> > Just a question of my own...
> >
> > It seems kind of odd to me that in the VC++ version of the STL that string
> > will grow the buffer if the minimum size is less than what has been
> > reserved. If I've reserved room for 128 characters, why would the string
> > class need to reallocate the buffer when the minimum size is 31?
> >
> > Confusing...
> >
>

> How did you determine that the VC++ implementation does this? The code
> for the reserve function simply sets aside the requested amount of
> space. The reported timings in other replies in this thread don't seem
> to indicate that there is any reallocation taking place.
>

Never mind...

--
Pete Becker
Dinkumware, Ltd.
http://www.dinkumware.com

Alex Vinokur

Nov 24, 1999

In article <MPG.12a33979f...@pumba.class.udg.mx>, Eric Tetz
<et...@accolade.com> wrote:

[snip]

> However, every informal benchmark I've every
> done (I do one
> each time I get an update of my compiler) shows std::string to be
> VASTLY slower
> than C-Style strings.

[snip]

See http://alexvn.homepage.com/alexvn.html
Click on :
- Performance : C vs. STL (#1)
- Performance : C vs. STL (#2)
- Performance : Access to array, vector, basic_string

Alex

Carl Barron

Nov 24, 1999

Andy Philpotts <And...@127.0.0.1> wrote:

Correct, something was wrong with my test code above [since deleted]. After
recoding the test program, I get the operator version essentially the same as
the assign/append version, but if I specialize the operations to just copy
data like the str*() functions do, then I get a much better comparison.

Conclusion: this implementation of std::string does not take advantage
of the fact that there is enough space in the strings not to require
temporaries and copies.
I added a fourth test that uses pointers and iterators to do the
copying/appending. That gives a std::string test that is slower than the
<cstring> functions, but actually not too bad in comparison.
[Done with CodeWarrior Pro 5.2/4.0 strings...]

My actual results were [clock_ticks, not seconds..]
starting...
559:using operator =,+=
555:using assign,append
179:using C strings
228:modified workings
done!!!

code follows:
/* tester.h */
#pragma once
#include <ctime>
#include <string>

struct tester
{
virtual const char *name()const=0;
virtual std::clock_t timer()=0;
virtual ~tester(){}
};

struct test_one:public tester
{
const char *name() const {return "using operator =,+=";}
std::clock_t timer();
};

struct test_two:public tester
{
const char *name() const {return "using assign,append";}
std::clock_t timer();
};

struct test_three:public tester
{
const char *name() const {return "using C strings";}
std::clock_t timer();
};

struct test_four:public tester
{
static void copy(const char *,std::string &);
static void add(std::string &,std::string &);

const char *name()const {return "modified workings";}
std::clock_t timer();
};

/* tester.cp */

#include "tester.h"
#include <cstring>
#include <string>
using namespace std;

void test_four::copy(const char *in,std::string &out)
{
std::string::iterator it = out.begin();
while(*in)
{
*it = *in++;
++it;
}
}

void test_four::add(std::string &a,std::string &b)
{
std::string::iterator i=a.end(),j=b.begin();
while(j!=b.end())
{
*i = *j;
++i;
++j;
}
}

std::clock_t test_one::timer()
{
string a, b, c;

a.reserve (128);
b.reserve (128);
c.reserve (128);

clock_t begin, end;

begin = clock ();

for (int i = 0; i < 1000000; ++i)
{

a = "Hello";
b = " world";
c = "! How'ya doin'?";
a += b;
a += c;
c = "Hello world! What's up?";
if (c != a)
c = "Doh!";
}

end = clock ();
return end-begin;
}

std::clock_t test_four::timer()
{
string a,b,c;
a.reserve(128);
b.reserve(128);
c.reserve(128);

clock_t begin,end;

begin = clock();

for (int i = 0; i < 1000000; ++i)
{

copy("Hello",a);
copy("world",b);
copy("! How yq doin'",c);
add(a,b);
add(a,c);
if(c!=a)
copy("Dah!",c);
}
end = clock();
return end-begin;
}

std::clock_t test_two::timer()
{
string a, b, c;

a.reserve (128);
b.reserve (128);
c.reserve (128);

clock_t begin, end;

begin = clock ();

for (int i = 0; i < 1000000; ++i)
{

a.assign("Hello");
b.assign(" world");

c.assign("! How'ya doin'?");
a.append(b);
a.append(c);
c.assign("Hello world! What's up?");
if (c != a)
c.assign("Doh!");
}

end = clock ();
return end-begin;
}

std::clock_t test_three::timer()

{
char a [128];
char b [128];
char c [128];

clock_t begin, end;

begin = clock ();

for (int i = 0; i < 1000000; ++i)
{

strcpy (a, "Hello");
strcpy (b, " world");
strcpy (c, "! How'ya doin'?");
strcat (a, b);
strcat (a, c);

strcpy (c, "Hello world! What's up?");

if (strcmp (c, a) == 0)

strcpy (c, "Doh!");
}

end = clock ();
return end-begin;
}
/* driver.cp */
#include "tester.h"
#include <iostream>
#include <iomanip>
#include <algorithm>

namespace
{
struct print
{
std::ostream &os;
print(std::ostream &a):os(a){}
void operator () (tester *x)
{
os << std::setw(10) << x->timer() << ':' <<
x->name() << std::endl;
}
};

test_one one;
test_two two;
test_three three;
test_four four;
tester *tests[] = {&one,&two,&three,&four};
}

int main()
{
    std::cout << "starting...\n";
    std::for_each(tests,tests+4,print(std::cout));
    std::cout << "done!!!\n";
    return 0;
}

Vetle Roeim

Nov 24, 1999

* Anders J. Munch

> Marc Lepage wrote in message <383AB701...@molecularmining.com>...
> >Your test code is *not* equivalent. The std::string code will always work; it
> >will never overflow. The char* code *happens* to work with short strings, but
> >will overflow with longer strings.
> >
> >Put differently, you have optimized the latter code by making assumptions about
> >input size, that you have not made in the former code.
> >
> >They are incomparable. Apples vs. oranges.
>
> This was my first thought as well. So I rewrote TestPChar() to use dynamically
> resized C strings, the realloc way. Included below. To my surprise, it is
> _still_ faster, though now only by a factor of two, whereas the original is
> almost four times faster (BCB4). Puzzling.
>
> Maybe the conclusion is that the "Well-implemented string types" Scott
> mentions, do not exist yet??

Have you read Carl Barron's response to your article?
(Message-ID: <1e1p7f6.1uq...@buf-ny1-16.ix.netcom.com>)

vr
--
Vetle Roeim
Departement of Informatics, University of Oslo, Norway

f u cn rd ths u r prbly a gk

Pierre Baillargeon

Nov 24, 1999

"Anders J. Munch" wrote:
>
> This was my first thought as well. So I rewrote TestPChar() to use dynamically
> resized C strings, the realloc way. Included below. To my surprise, it is
> _still_ faster, though now only by a factor of two, whereas the original is
> almost four times faster (BCB4). Puzzling.

This is still not equivalent, although it shows a real-life difference
that can occur in real code. The basic problems are:

- some compilers "know" the string C functions and replace them with
inline assembly code (Visual C++ 6.0 does it in Release mode).

- Your strings are constant and visible in the compilation unit.

Put these two together and you end up with highly optimized code. It can
happen that all you do in a real program is manipulate constant strings
declared locally in a single function. But that is unlikely.

A better test would declare the strings in another compilation unit (and, to
be paranoid, not static). The fact that the compiler generates inline
assembly is fair, since it can do so all the time and it is a valid
advantage of the C functions.
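
A single-file approximation of that suggestion, offered only as a sketch: reading the source pointers through volatile objects keeps the compiler from assuming it knows which literal is being copied. This is not the separate-compilation-unit setup described above, and the names g_hello, g_world and build_greeting are made up for the example.

```cpp
#include <cstring>

// Volatile pointer objects: each read must actually happen, so the compiler
// cannot treat the pointed-to string as a known constant at the call site.
const char* volatile g_hello = "Hello";
const char* volatile g_world = " world";

// Builds "Hello world" in dst using the same strcpy/strcat calls as the
// benchmark. dst must have room for at least 12 chars.
void build_greeting(char* dst)
{
    const char* hello = g_hello;  // volatile read: value unknown at compile time
    const char* world = g_world;
    std::strcpy(dst, hello);
    std::strcat(dst, world);
}
```

The compiler is still free to inline strcpy and strcat themselves, which matches the point that the inline expansion is a fair advantage of the C functions.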

Daniel T.

Nov 24, 1999

In article <383AC9DD...@nortelnetworks.com>, "John Hickin"
<hic...@nortelnetworks.com> wrote:

>Alexei Zakharov wrote:
>>
>
>>
>> Strange... VC++ 6.0 makes code with 2.6 sec and 3.0 sec for char* and string
>> respectively. Although gcc 2.90 -- 0.8 and 5.3!
>>
>
>Compiler: VC6
>
>Target char* string
>Debug: 1.01 25.01
>Release: 3.01 3.01
>
>Can anybody make sense of this?

Chances are that in debug mode, no inlining is taking place...

Marc Lepage

Nov 24, 1999

"Anders J. Munch" wrote:
>
> This was my first thought as well. So I rewrote TestPChar() to use dynamically
> resized C strings, the realloc way.

That's an improvement. The code is still not equivalent. When you work with
char*, you use char* exclusively. When you use std::string, you are still mixing
in char*.

I believe that this involves only a pointer assignment:

char* p = "foo";

I believe this involves a little more work:

std::string s = "foo";

So perhaps try a TestString like this:

void TestString()
{
    static const string sHello = "Hello";
    static const string sWorld = " world";
    static const string sHow = "! How'ya doin'?";
    static const string sWhat = "Hello world! What's up?";
    static const string sDoh = "Doh!";

    string a, b, c;

    a.reserve (128);
    b.reserve (128);
    c.reserve (128);

    clock_t begin, end;

    begin = clock ();

    for (int i = 0; i < 1000000; ++i)
    {
        a = sHello;
        b = sWorld;
        c = sHow;

        a += b;
        a += c;

        c = sWhat;
        if (c != a)
            c = sDoh;
    }

    end = clock ();

    printf ("TestString elapsed time: %2.1fl\n",
            static_cast <double> ((end - begin) / CLOCKS_PER_SEC));
}

--

Marc Lepage
Software Developer
Molecular Mining Corporation
http://www.molecularmining.com/

Carl Barron

Nov 25, 1999

Andy Philpotts <And...@127.0.0.1> wrote:

> In article <1e1p7f6.1uq...@buf-ny1-16.ix.netcom.com>,
> cbar...@ix.netcom.com says...
> > Eric Tetz <et...@accolade.com> wrote:
> >
> >
> > if I use strings member functions assign() and append() I do get
> > std::string to be twice as fast as C strings. If I use your original
> > code I get C strins to be 6.5 times faster than std::string. The loop in
> > TestString becomes in my faster version.
> >
> > for(int i=0;i<100000;++i)
> > {
> > a.assign("Hello");
> > b.assign(" world");
> > c.assign("! How are you doin'?");
> > a.append(b);
> > a.append(c);
> > c.assign("Hello world! What's up?");
> > if(c!=a)
> > c.assign("Dah!");
> > }
> >
>
> This seems well strange, it suggests that
>

Seemed strange to me as well. I am guessing that assign(char *) is doing
an optimized memcpy if the data will fit and the operator versions do not,
or they create unneeded temporaries and copy back. I have to check what
the standard says about assign/append. It all really depends on the
compiler implementation. Compiled with CodeWarrior 5.2 [4.0 string
files as noted, temp bug fix]. The strange results are interesting...

> a = "Hello" is very different to a.assign("Hello") and/or that
> a.append(b) is very different to a += b.
>
> Why should that be so, if assign and append are so damn fast why not
> define the = and += as inlines based on the fast versions...
>
> I think what we are dealing with here (in general with std::string vs
> char*) is a combination of poor implementation and dubious compiler
> optimization.
>
> I would further observe that the performance aspects of code can (in many
> of the cases I meet in real life) be quite superfluous compared to the
> time to develop and debug. In this respect I find that std::string
> clearly wins the day for all but very competant C programmers (a rare
> breed these days).
>

How much a code segment should be optimized for a specific compiler
depends on how the code performs in the real application. If the code
is a bottleneck, change it if possible; if not, don't bother. Only
profiling will make it certain where a complex bottleneck is....

Howard Hinnant

Nov 25, 1999

As long as you guys are going to do these tests, could you please make
sure they are as similar as possible?

Thanks,
-Howard

> if (c != a)
> c = "Doh!";

> if (strcmp (c, a) == 0)
> strcpy (c, "Doh!");
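
To spell out the mismatch in the two quoted tests: the string version assigns "Doh!" when the strings differ, while the char* version assigns it when they are equal. A minimal sketch of the matching pair of tests (differs_string and differs_cstr are hypothetical helper names, not from the posted code):

```cpp
#include <cstring>
#include <string>

// For std::string, a != b is true when the contents differ.
bool differs_string(const std::string& a, const std::string& b)
{
    return a != b;
}

// The C-string test with the same meaning: strcmp returns nonzero when the
// contents differ, so the comparison must be != 0 rather than == 0.
bool differs_cstr(const char* a, const char* b)
{
    return std::strcmp(a, b) != 0;
}
```

Since the two strings in the benchmark always differ, the std::string loop performs the extra "Doh!" assignment on every iteration and the char* loop never performs the extra strcpy.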

Axel Schmitz-Tewes

Nov 25, 1999

Carl Barron wrote:

> ...

>
> My actual results were [clock_ticks, not seconds..]
> starting...
> 559:using operator =,+=
> 555:using assign,append
> 179:using C strings
> 228:modified workings
> done!!!
>

You have done some very fine tuning. Take a look at my results:

MSVC6:
release:

starting...
3805:using operator =,+=
3826:using assign,append
2583:using C strings
1793:modified workings
done!!!

Watcom 11 release:

starting...
4567:using operator =,+=
4576:using assign,append
2053:using C strings
771:modified workings
done!!!

:-)

regards,
axel

Alexei Zakharov

Nov 25, 1999, 3:00:00 AM

Bill Wade <bill...@stoner.com> wrote in message
news:81er18$e...@library1.airnews.net...
...

> For assignment from std::string to std::string you can compute lengths in
> O(1) time so assignment can be about as fast as strcpy. However string to
> string copies don't need to look for a terminating NULL so on some
> platforms the actual copy can be faster (copy four or eight bytes at a
> time).

I don't agree. If you have two strings with the same allocator, copying
will never occur. Only the reference counter is incremented (until it
reaches some big number). Such an assignment is not just as fast as strcpy
but much faster.

Thomas Feuster

Nov 25, 1999, 3:00:00 AM

Hi Carl,

I suggest a change to your copy() and add() functions like this:

void test_four::copy(const char *in,std::string &out)
{
    out.reserve( strlen( in ) ); // make result big enough

    std::string::iterator it = out.begin();
    while(*in)
    {
        *it = *in++;
        ++it;
    }
}

void test_four::add(std::string &a,std::string &b)
{
    a.reserve( a.size() + b.size() ); // make result big enough

    std::string::iterator i=a.end(),j=b.begin();
    while(j!=b.end())
    {
        *i = *j;
        ++i;
        ++j;
    }
}

Otherwise test_four fails if you don't call reserve(128) in the
beginning.
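
One caveat about the suggestion above: reserve() raises capacity() but leaves size() unchanged, so writing through begin() or end() afterwards still touches elements that do not exist. A minimal sketch of a well-defined variant follows, using resize() instead. The free function copy_resized is hypothetical (the test_four class itself is not shown in the thread), and no claim is made about how it would affect the timings.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical free-function variant of test_four::copy. resize() changes
// size(), so every iterator write below lands on an existing element.
void copy_resized(const char* in, std::string& out)
{
    out.resize(std::strlen(in));   // size() now equals strlen(in)
    std::string::iterator it = out.begin();
    while (*in)
    {
        *it = *in++;
        ++it;
    }
}

// Shows that reserve() alone leaves the string empty.
void check_reserve_vs_resize()
{
    std::string s;
    s.reserve(128);
    assert(s.empty());             // capacity grew, size did not
    assert(s.capacity() >= 128);

    copy_resized("Hello", s);
    assert(s == "Hello");
    assert(s.size() == 5);
}
```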

Fortunately, this changes my numbers only slightly (MSVC 5.0 release):

Original test
starting...
5358:using operator =,+=
5117:using assign,append
2784:using C strings <= MSVC crap, debug version takes 1382
1222:modified workings
done!!!

Modified code
starting...
5388:using operator =,+=
5128:using assign,append
2764:using C strings <= MSVC crap, debug version takes 1382
1862:modified workings
done!!!

Thomas

Carl Barron wrote:
[snip]

> My actual results were [clock_ticks, not seconds..]
> starting...
> 559:using operator =,+=
> 555:using assign,append
> 179:using C strings
> 228:modified workings
> done!!!

[snip]

> void test_four::copy(const char *in,std::string &out)
> {
> std::string::iterator it = out.begin();
> while(*in)
> {
> *it = *in++;
> ++it;
> }
> }
>
> void test_four::add(std::string &a,std::string &b)
> {
> std::string::iterator i=a.end(),j=b.begin();
> while(j!=b.end())
> {
> *i = *j;
> ++i;
> ++j;
> }
> }

--
Dr. Thomas Feuster mailto:thomas....@sdm.de
sd&m AG http://www.sdm.de
software design & management
Thomas-Dehler-Str. 27, 81737 Muenchen, Germany
Tel +49 89 63812-816, Fax -490

John Hickin

Nov 25, 1999, 3:00:00 AM

"Daniel T." wrote:
>
> In article <383AC9DD...@nortelnetworks.com>, "John Hickin"
> <hic...@nortelnetworks.com> wrote:

> >
> >Compiler: VC6
> >
> >Target    char*   string
> >Debug:    1.01    25.01
> >Release:  3.01    3.01
> >
> >Can anybody make sense of this?
>
> Chances are that in debug mode, no inlining is taking place...

That explains the ratio 25/3 for string, which didn't mystify me.
The 1/3 ratio, Debug/Release, for char* is totally unexpected. It makes
one think of bugs.

Anders J. Munch

Nov 26, 1999, 3:00:00 AM

Marc Lepage wrote in message <383C0699...@molecularmining.com>...

>I believe that this involves only a pointer assignment:
>
> char* p = "foo";
>
>I believe this involves a little more work:
>
> std::string s = "foo";
>
>So perhaps try a TestString like this:

[snipped]

Tried it. 50% _slower_ than the original TestString (BCB4).

- Anders
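
The difference Marc Lepage describes in the quoted text can be made concrete with a small sketch: the pointer merely aliases the literal, while std::string makes its own copy of the characters. The function name below is made up for illustration, and const char* is used because current C++ no longer allows a string literal to initialize a plain char*.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical demonstration: returns true if the std::string holds its own
// copy of the characters rather than an alias of the literal.
bool string_owns_its_characters()
{
    const char* p = "foo";   // pointer assignment only: p aliases the literal
    std::string s = p;       // measures the length and copies the characters

    s[0] = 'g';              // modifies the copy, not the literal
    return s == "goo" && std::strcmp(p, "foo") == 0;
}

void check_string_owns_its_characters()
{
    assert(string_owns_its_characters());
}
```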

Anders J. Munch

Nov 26, 1999, 3:00:00 AM

Pierre Baillargeon wrote in message <383BFE08...@artquest.net>...

>This is still not equivalent, although it shows a real-life difference
>that can occur in real code. The basic problems are:
>
>- some compilers "know" the string C functions and replace them with
>inline assembly code (Visual C++ 6.0 does it in Release mode).

I wouldn't use the word "problem". I'm thinking more along the lines of
"advantage", "benefit", or "good work with C strings, compiler
implementors; now see if you can do the same for std::string" :-)

Besides, I checked the assembly generated, and none of the string
functions were inlined.

>
>- Your strings are constant and visible in the compilation unit.

I tried assigning the string constants to variables, and gave everything
external linkage. Didn't make much of a difference.

Carl Barron

Nov 26, 1999, 3:00:00 AM

Howard Hinnant <hinnant@anti-spam_metrowerks.com> wrote:

> As long as you guys are going to do these tests, could you please make
> sure they are as similar as possible?
>
> Thanks,
> -Howard
>

> > if (c != a)
> > c = "Doh!";
>

> > if (strcmp (c, a) == 0)
> > strcpy (c, "Doh!");
>

OK, if all three if(c != a) tests are converted to if(c == a), the results are:
starting...
488:using operator =,+=
488:using assign,append
166:using C strings
251:modified workings
done!!!

[Max inlining setting 8, global optimizing 5 = max], the same settings as
the other test. The results are similar in order of speed.

Timur Aydin

Nov 26, 1999, 3:00:00 AM

On 22 Nov 1999 23:01:08 -0500, Eric Tetz <et...@accolade.com> wrote:

>I just started reading "Effective C++" by Scott Meyers. In the introduction he
>says, "As for raw char*-based strings, you shouldn't use those antique
>throwbacks unless you have a VERY good reason. Well-implemented string types
>can now be superior to char*s in virtually every way - including efficiency."
>
>How I want to believe this! I would love to use std::string instead of naked
>character buffers. However, every informal benchmark I've ever done (I do one
>each time I get an update of my compiler) shows std::string to be VASTLY slower
>than C-style strings. The last test I did, in preparation for this post, showed
>std::string as 8 times slower than C-style strings. This just doesn't make any
>sense - even if std::string was just a wrapper for C strings, I should do better
>than that.
>

IMHO, the fairest comparison between std::strings and char*s would
be to use each of them independently in a real-world
algorithm/application.

A good example would be a parser that processes a text file with
script commands. This involves varying text line lengths, substring
search and extraction and a variable number of tokens.

My feeling is that overall, the std::string implementation will be
shorter to develop, easier to maintain, easier to extend with new
features and reasonably efficient.
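
As a rough illustration of the parser example mentioned above, here is a minimal sketch of the token-extraction step written with std::string. The command syntax (whitespace-separated tokens) and the function name tokenize are assumptions; the thread does not define a script format.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits one script line into whitespace-separated tokens.
// The "script command" format here is hypothetical.
std::vector<std::string> tokenize(const std::string& line)
{
    const char* const delims = " \t";
    std::vector<std::string> tokens;

    std::string::size_type start = line.find_first_not_of(delims);
    while (start != std::string::npos)
    {
        std::string::size_type stop = line.find_first_of(delims, start);
        // substr clamps the count, so stop == npos takes the rest of the line.
        tokens.push_back(line.substr(start, stop - start));
        start = line.find_first_not_of(delims, stop);
    }
    return tokens;
}

// Usage: lines of any length and any number of tokens need no buffer sizing.
void check_tokenize()
{
    std::vector<std::string> t = tokenize("move  player\t10 20");
    assert(t.size() == 4);
    assert(t[0] == "move" && t[3] == "20");
}
```

An equivalent char* version would have to choose buffer sizes for the line and for each token up front, or manage them by hand, which is where most of the extra development effort would go.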

Andy Glew

Nov 26, 1999, 3:00:00 AM

> > >Compiler: VC6
> > >
> > >Target    char*   string
> > >Debug:    1.01    25.01
> > >Release:  3.01    3.01
> > >
> > >Can anybody make sense of this?
>

> The 1/3 ratio, Debug/Release, for char* is totally unexpected. It makes
> one think of bugs.

Guessing:

"Debug" char* may use the x86 instruction REP MOVSB,
while "release" may use "optimised" code to do the string operations:
perhaps REP MOVSD; REP MOVSB, or a hand-coded loop.

Such optimizations often win on one processor, say a P5 (Pentium),
but lose on the next, say a P6 (Pentium Pro, II, III, etc.). In
particular, on P6s REP MOVSx and REP STOSx have been optimized to make
large copies fast (with a crossover point around 64 bytes), so a single
REP MOVSB is usually faster than the so-called optimized sequence
REP MOVSD; REP MOVSB.

Or, exactly the opposite could be happening: maybe the debug code
uses a naive copy loop, and the release code an inlined REP MOVSB.

Or^2, the debug code may not be inlined, while the release code may be.
Although inlining is usually a good idea, I have seen many instances where
inlining actually hurts performance, by increasing the amount of code that has
to fit in the instruction cache and be trained in the branch predictor.

J.Barfurth

Nov 26, 1999, 3:00:00 AM

Alexei Zakharov <A.S.Za...@inp.nsk.su> wrote in message
81j6o1$mql$1...@ntlg.sibnet.ru...

>
> Bill Wade <bill...@stoner.com> wrote in message
> news:81er18$e...@library1.airnews.net...
> ...
> > For assignment from std::string to std::string you can compute lengths
> > in O(1) time so assignment can be about as fast as strcpy. However string
> > to string copies don't need to look for a terminating NULL so on some
> > platforms the actual copy can be faster (copy four or eight bytes at a
> > time).
>
> I don't agree. If you have two strings with the same allocator, copying
> will never occur. Only the reference counter increments (until it reaches
> some big number). Such assignment performs not as fast but much faster
> than strcpy.

An implementation of std::string need not be reference-counted. As has been
explored in this group before, it would probably perform better without
reference counting if it is supposed to be thread-safe.
Also, you can easily force a copy:

string s1("Hello world");

string::iterator i1 = s1.begin(); // force subsequent operations to copy

string s2(s1);
string s3; s3 = s1;

*i1 = 'J'; // must not change s2,s3
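
The snippet above can be turned into a small self-checking sketch. The function name is made up for illustration; the assertions state the value semantics that any conforming std::string has to provide, whether or not it shares buffers internally.

```cpp
#include <cassert>
#include <string>

// Copies must stay independent of later writes through a non-const
// iterator into the source string.
void check_copies_are_independent()
{
    std::string s1("Hello world");
    std::string::iterator i1 = s1.begin();  // taken before the copies are made

    std::string s2(s1);   // copy construction
    std::string s3;
    s3 = s1;              // copy assignment

    *i1 = 'J';            // write through the earlier iterator

    assert(s1 == "Jello world");
    assert(s2 == "Hello world");
    assert(s3 == "Hello world");
}
```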

-- Jörg Barfurth

J.Barfurth

Nov 26, 1999, 3:00:00 AM

Carl Barron <cbar...@ix.netcom.com> wrote in message
1e1r8vs.9e3...@buf-ny3-42.ix.netcom.com...

> Andy Philpotts <And...@127.0.0.1> wrote:
>
> > In article <1e1p7f6.1uq...@buf-ny1-16.ix.netcom.com>,
> > cbar...@ix.netcom.com says...
> > > Eric Tetz <et...@accolade.com> wrote:

> void test_four::copy(const char *in,std::string &out)
> {
> std::string::iterator it = out.begin();
> while(*in)
> {
> *it = *in++;
> ++it;
> }
> }

Yields undefined behaviour if out.size() < strlen(in).
Otherwise out.size() will not be adjusted to match strlen(in).

> void test_four::add(std::string &a,std::string &b)
> {
> std::string::iterator i=a.end(),j=b.begin();
> while(j!=b.end())
> {
> *i = *j;
> ++i;
> ++j;
> }
> }

Yields undefined behaviour if !b.empty().
Also a.size() won't be updated to reflect the new size.

The remedy of resizing a or out before the copy loop incurs the overhead
you wanted to avoid:
In the first case, a call to strlen(in).
In both cases, a resize(), leading to a possible reallocation and
unnecessary initialization of the extra elements before the real values
are copied in.
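
For comparison, here is a minimal sketch of well-defined replacements that skip the separate resize() step: assign() and append() write the characters and update size() in one call. The free functions copy_assign and add_append are hypothetical stand-ins for test_four::copy and test_four::add. Whether this is any faster is implementation-dependent; the numbers posted in this thread show assign/append running about level with operator= and operator+=.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for test_four::copy.
void copy_assign(const char* in, std::string& out)
{
    out.assign(in);   // still has to measure the length of `in` once
}

// Hypothetical stand-in for test_four::add.
void add_append(std::string& a, const std::string& b)
{
    a.append(b);      // b.size() is already known, so no scan of b is needed
}

// Shows that size() is kept in step with the contents.
void check_assign_append()
{
    std::string a;
    copy_assign("Hello", a);
    add_append(a, " world");
    assert(a == "Hello world");
    assert(a.size() == 11);
}
```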

-- Jörg Barfurth

Carl Barron

Nov 26, 1999, 3:00:00 AM

Thomas Feuster <Thomas....@sdm.de> wrote:

> Hi Carl,
>
> I suggest a change to your copy() and add() functions like this:
>
> void test_four::copy(const char *in,std::string &out)
> {
> out.reserve( strlen( in ) ); // make result big enough
>
> std::string::iterator it = out.begin();
> while(*in)
> {
> *it = *in++;
> ++it;
> }
> }
>
> etc.

Well, I only wrote test_four to use the same assumptions as
test_three, the C string tester. Adding more assumptions will probably
reduce test_four to test_one or test_two, the std::string function usage.
Using essentially the C string routines' assumptions, the results should
be nearly identical, especially if you can remove the inlining of
assembler code for strcpy(), etc. that many compilers provide for the C
string code...

Ed Holloman

Nov 30, 1999, 3:00:00 AM

In article <1e1p7f6.1uq...@buf-ny1-16.ix.netcom.com>, Carl
Barron <cbar...@ix.netcom.com> wrote:

> Eric Tetz <et...@accolade.com> wrote:
>
> > I just started reading "Effective C++" by Scott Meyers. In the
> > introduction he says, "As for raw char*-based strings, you shouldn't use
> > those antique throwbacks unless you have a VERY good reason.
> > Well-implemented string types can now be superior to char*s in virtually
> > every way - including efficiency."
> >
> > How I want to believe this! I would love to use std::string instead of
> > naked character buffers. However, every informal benchmark I've ever done
> > (I do one each time I get an update of my compiler) shows std::string to
> > be VASTLY slower than C-style strings. The last test I did, in preparation
> > for this post, showed std::string as 8 times slower than C-style strings.
> > This just doesn't make any sense - even if std::string was just a wrapper
> > for C strings, I should do better than that.
> >
> > At this point I'm assuming my test code is hopelessly naive... can somebody
> > show me the error of my ways and restore my faith in std::string (and
> > thereby resolve the mystery of Dr. Meyers' comment)?
> >

> [code snipped]
>
> If I use string's member functions assign() and append() I do get
> std::string to be twice as fast as C strings. If I use your original
> code I get C strings to be 6.5 times faster than std::string. The loop in
> TestString becomes, in my faster version:
>
> for(int i=0;i<100000;++i) // 100,000
> {
> a.assign("Hello");
> b.assign(" world");
> c.assign("! How are you doin'?");
> a.append(b);
> a.append(c);
> c.assign("Hello world! What's up?");
> if(c!=a)
> c.assign("Dah!");
> }

> for (int i = 0; i < 1000000; ++i) // 1,000,000
> {
> a = "Hello";
> b = " world";
> c = "! How'ya doin'?";
> a += b;
> a += c;
> c = "Hello world! What's up?";
> if (c != a)
> c = "Doh!";
> }

I'm just guessing here, but could the dramatic difference in time be
due to the difference in the number of times each loop iterates? :-)
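
One way to rule this out is to drive both loops from a single shared count. The sketch below is an assumption about how such a harness could look, not code from the thread: the names kIterations, time_string_loop and time_cstr_loop are made up, and the comparison in the C version is written as != 0 so that both loops test the same condition.

```cpp
#include <cassert>
#include <cstring>
#include <ctime>
#include <string>

// One shared count, so neither loop runs ten times longer than the other.
const int kIterations = 1000000;

// std::string version; returns elapsed CPU seconds, final c in `result`.
double time_string_loop(int iterations, std::string& result)
{
    assert(iterations >= 0);
    std::string a, b, c;
    const std::clock_t begin = std::clock();
    for (int i = 0; i < iterations; ++i)
    {
        a = "Hello";
        b = " world";
        c = "! How'ya doin'?";
        a += b;
        a += c;
        c = "Hello world! What's up?";
        if (c != a)
            c = "Doh!";
    }
    const std::clock_t end = std::clock();
    result = c;
    // Cast before dividing so fractions of a second are not truncated.
    return static_cast<double>(end - begin) / CLOCKS_PER_SEC;
}

// C string version of the same work, with the same comparison.
double time_cstr_loop(int iterations, std::string& result)
{
    assert(iterations >= 0);
    char a[128] = "";
    char b[128] = "";
    char c[128] = "";
    const std::clock_t begin = std::clock();
    for (int i = 0; i < iterations; ++i)
    {
        std::strcpy(a, "Hello");
        std::strcpy(b, " world");
        std::strcpy(c, "! How'ya doin'?");
        std::strcat(a, b);
        std::strcat(a, c);
        std::strcpy(c, "Hello world! What's up?");
        if (std::strcmp(c, a) != 0)
            std::strcpy(c, "Doh!");
    }
    const std::clock_t end = std::clock();
    result = c;
    return static_cast<double>(end - begin) / CLOCKS_PER_SEC;
}
```

Calling both functions with kIterations and printing the two return values gives a like-for-like pair of timings; the result parameters make it easy to confirm that both loops ended in the same state.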

Regards,

Ed Holloman

Marcus, Aviva and Rhiannon

Dec 1, 1999, 3:00:00 AM

Hi...

Axel Schmitz-Tewes wrote:
> You have done some very fine tuning. Take a look at my results:
>
> MSVC6:
> release:
>
> starting...
> 3805:using operator =,+=
> 3826:using assign,append
> 2583:using C strings
> 1793:modified workings
> done!!!
>
> Watcom 11 release:
>
> starting...
> 4567:using operator =,+=
> 4576:using assign,append
> 2053:using C strings
> 771:modified workings
> done!!!
>

Approaching it as a mystery to be explored, how does the behaviour
change when larger strings and buffers are used?

> regards,
> axel

yours, Marcus.

ps. Sorry not to reply to the original post, but I missed the start
of the thread.
--
Marcus, Aviva and Rhiannon at the 'Appo Site'
Marcus: mar...@appo.demon.co.uk
Aviva: av...@appo.demon.co.uk
Rhiannon: rhia...@appo.demon.co.uk

Christopher Eltschka

Dec 1, 1999, 3:00:00 AM

Axel Schmitz-Tewes wrote:
>
> I can try your code with my compilers and see a big difference in the runtime of
> the released or debug versions.
>
>                     TestPChar   TestString
> Watcom 11
>   debug version:    2.01        17.03
>   release version:  2.01        5.01
>
> MSVC6
>   debug version:    2.01        32.01
>   release version:  2.01        3.01
>
> I have used only the standard switches.
>
> It looks like doing some optimizing switches by hand could give
> you a good feeling using the string class, but AFAIK faster than
> C-string handling seems hopeless.

I guess it depends:

{
    std::string a, b;
    a.reserve(1000);
    a = "Some very long string (but not completely filling out "
        "the reservation, so appending b does not need a "
        "reallocation to occur.";
    b = "Some more text";
    a += b;
}

vs.

{
    char a[1000], b[15];
    strcpy(a, "Some very long ... to occur");
    strcpy(b, "Some more text");
    strcat(a, b);
}

Here, I'd expect strcat to be much slower, since it must scan
the complete string a first just to determine where to append
b, while std::string::operator+= just inspects its begin and
size members (or even just the end member, if stored as a begin/end
pair).
That is, strcat(a, b) is O(strlen(a)+strlen(b)), while
a+=b for pre-reserved memory can be just O(b.size()).

Even worse, if reallocation is necessary, a naive C version
will first call strlen(a) to find out the size to allocate,
then call strcpy to copy a, and then strcat to add b.
That is, it will iterate through a _three_ times.
And even an optimized version (which uses the result of
strlen to directly strcpy b to its correct place) must
iterate through a two times, once for strlen and once for
strcpy. OTOH, std::string will iterate just once, since
it has the length stored separately.

Of course you can do that in C as well, but I guess it's
not done that frequently, given that you then have to
maintain _two_ variables (which must remain in sync),
and no third-party function will support that (i.e. you'll
be forced to re-determine the length after each library call
which might have changed it).
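
The "do that in C as well" approach described above can be sketched as a buffer that carries its own length. Everything here (CountedBuf, counted_init, counted_append) is a hypothetical illustration, not code from the thread; it shows both the O(n) append and the bookkeeping burden of keeping two variables in sync.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// A C-style buffer that stores its length next to the characters.
struct CountedBuf
{
    char        data[1000];
    std::size_t len;   // invariant: len == strlen(data)
};

void counted_init(CountedBuf& buf)
{
    buf.data[0] = '\0';
    buf.len = 0;
}

// Appends n characters from src without rescanning buf.data.
// Returns false and leaves buf unchanged if the result would not fit.
bool counted_append(CountedBuf& buf, const char* src, std::size_t n)
{
    assert(buf.len < sizeof buf.data);
    if (n >= sizeof buf.data - buf.len)   // keep room for the terminator
        return false;
    std::memcpy(buf.data + buf.len, src, n);
    buf.len += n;
    buf.data[buf.len] = '\0';
    return true;
}
```

Any code that writes to data directly, including strcat or a third-party function, breaks the invariant on len, which is the maintenance cost the paragraph above points out.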

ash...@my-deja.com

Dec 4, 1999, 3:00:00 AM

Optimizing for speed (/O2) and favouring speed (/Ot), I get the
following:
starting...
2303:using operator =,+=
2323:using assign,append
2043:using C strings
771:modified workings
done!!!

Note how C strings are pretty close to using operator = and using
assign.
And the modified workings beat them all.

ash

Axel Schmitz-Tewes wrote:
>
> Carl Barron wrote:
[snipped]

Mike Nordell

Mar 23, 2000, 3:00:00 AM

Eric Tetz <et...@accolade.com> wrote:

>I just started reading "Effective C++" by Scott Meyers. In the introduction he
>says, "As for raw char*-based strings, you shouldn't use those antique
>throwbacks unless you have a VERY good reason. Well-implemented string types
>can now be superior to char*s in virtually every way - including efficiency."
>
>How I want to believe this! I would love to use std::string instead of naked
>character buffers. However, every informal benchmark I've ever done (I do one
>each time I get an update of my compiler) shows std::string to be VASTLY slower
>than C-style strings. The last test I did, in preparation for this post, showed
>std::string as 8 times slower than C-style strings. This just doesn't make any
>sense - even if std::string was just a wrapper for C strings, I should do better
>than that.

[snipped string vs. char[] operations timing test]

On my compiler (MSVC 6) there were no timing differences at all between
these two.

Is it perhaps that you performed your timings without optimization,
where memcpy would be expected to be *much* faster than un-optimized
std::string operations?

--
Mike
'Sweden' => 'se' if you reply by mail.