std::string::assign range

Christopher Pisz

Mar 5, 2015, 3:10:53 PM


I know there exists a way to copy a filestream into a string in one
line, using iterators, because I've seen it before on this newsgroup.
I looked up the documentation, but can't seem to get the syntax right.

My attempt:

std::ifstream("test.txt");
if( !file )
{
// Error
}

std::string textToParse;

textToParse.assign(std::istreambuf_iterator<char>(file),
std::istream_iterator<char>());
Foo(textToParse);


A colleague of mine mentioned that this is a much more inefficient way
of doing it, so I wanted to test for funsies compared to:

std::ifstream("test.txt");
if( !file )
{
// Error
}

std::stringstream textToParse;
textToParse << file.rdbuf();
Foo(textToParse.str());



What's the syntax for the assign-range?
Is there any reason one way should be better than the other?



--
I have chosen to troll filter/ignore all subthreads containing the
words: "Rick C. Hodgins", "Flibble", and "Islam"
So, I won't be able to see or respond to any such messages
---

Christopher Pisz

Mar 5, 2015, 3:16:30 PM
On 3/5/2015 2:10 PM, Christopher Pisz wrote:
>
>
> I know there exists a way to copy a filestream into a string in one
> line, using iterators, because I've seen it before on this newsgroup.
> I looked up the documentation, but can't seem to get the syntax right.
>
> My attempt:
>
> std::ifstream("test.txt");
> if( !file )
> {
> // Error
> }
>
> std::string textToParse;
>
> textToParse.assign(std::istreambuf_iterator<char>(file),
> std::istream_iterator<char>());
> Foo(textToParse);
>
>
> A colleague of mine mentioned that this is a much more inefficient way
> of doing it, so I wanted to test for funsies compared to:
>
> std::ifstream("test.txt");
> if( !file )
> {
> // Error
> }
>
> std::stringstream textToParse;
> textToParse << file.rdbuf();
> Foo(textToParse.str());
>
>
>
> What's the syntax for the assign-range?
> Is there any reason one way should be better than the other?
>
>
>


Whoops, let's name the ifstream 'file'...
I'm working on a compilable example, but can't get the assign-range right.
Sorry.
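
For reference, a minimal sketch of the assign-range form, assuming the goal is to read the raw characters of the stream. The point it illustrates is that both arguments must be the same iterator type, so the end iterator is a default-constructed std::istreambuf_iterator<char>, not a std::istream_iterator<char>. The helper names (slurp, slurp_self_check) and the in-memory stream are illustrative and not part of the original post.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: reads everything left in `in` into a string using
// the assign(first, last) overload. Both iterators are
// std::istreambuf_iterator<char>; the default-constructed one marks
// end-of-stream.
std::string slurp(std::istream& in)
{
    std::string text;
    text.assign(std::istreambuf_iterator<char>(in),
                std::istreambuf_iterator<char>());
    return text;
}

// Self-check on an in-memory stream standing in for test.txt.
// istreambuf_iterator reads unformatted characters, so whitespace survives.
inline void slurp_self_check()
{
    std::istringstream in("a b\n c");
    assert(slurp(in) == "a b\n c");
}
```

With a file, the call would look like `std::ifstream file("test.txt"); std::string textToParse = slurp(file);` after checking that the stream opened.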

Ben Bacarisse

Mar 5, 2015, 4:35:44 PM
Christopher Pisz <nos...@notanaddress.com> writes:

> I know there exists a way to copy a filestream into a string in one
> line, using iterators, because I've seen it before on this newsgroup.
> I looked up the documentation, but can't seem to get the syntax right.
>
> My attempt:
>
> std::ifstream("test.txt");
> if( !file )
> {
> // Error
> }
>
> std::string textToParse;
>
> textToParse.assign(std::istreambuf_iterator<char>(file),
> std::istream_iterator<char>());

Did you mean something like this:

file >> std::noskipws;
std::copy(std::istream_iterator<char>(file),
std::istream_iterator<char>(),
std::inserter(textToParse, textToParse.begin()));

?

<snip>
--
Ben.

Christopher Pisz

Mar 5, 2015, 5:18:06 PM
Indeed!

Full listing (make your own timer and exception classes):

// Shared Includes
#include "Exception.h"
#include "PerformanceTimer.h"

// Standard Includes
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <algorithm> // std::copy
#include <iterator>  // std::istream_iterator, std::inserter

//--------------------------------------------------------------------------------------------------
void Method1()
{
std::ifstream file("test.txt");
if( !file )
{
// Error
throw Shared::Exception(__FILE__, __LINE__, "Cannot open test file");
}

std::string textToParse;

file >> std::noskipws;
std::copy(std::istream_iterator<char>(file),
std::istream_iterator<char>(),
std::inserter(textToParse, textToParse.begin()));

file.close();
}

//--------------------------------------------------------------------------------------------------
void Method2()
{
std::ifstream file("test.txt");
if( !file )
{
// Error
throw Shared::Exception(__FILE__, __LINE__, "Cannot open test file");
}

std::stringstream textToParse;
textToParse << file.rdbuf();

file.close();
}

//--------------------------------------------------------------------------------------------------
int main()
{
Shared::PerformanceTimer timer;
Method1();
std::cout << "Method 1 :" << timer.Stop() << std::endl;

timer.Start();
Method2();
std::cout << "Method 2 :" << timer.Stop() << std::endl;

}

Output:
Method 1 :0.283209
Method 2 :0.0216563


That's quite a difference! What's going on under the hood with the
iterator method?

Victor Bazarov

Mar 5, 2015, 5:26:28 PM
I don't see any proof of the equality of the result of two different
methods...

Inserting into a text string, one character at a time, at the beginning,
most likely involves too many reallocations and too many copy
operations. Have you tried profiling your program?

V
--
I do not respond to top-posted replies, please don't ask

Victor Bazarov

Mar 5, 2015, 5:47:26 PM
OK, I take it back. I obviously don't know how 'std::inserter' works.

> Have you tried profiling your program?

Still, the only way to know for sure why the iterator method is slower
is to profile it.
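
On the std::inserter point: the insert iterator stores a position and advances it past each element it inserts, so copying through std::inserter(s, s.begin()) into an empty string appends in order instead of repeatedly inserting at the front. A small sketch of that behaviour, with hypothetical function names chosen for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>

// Copies `source` into a fresh string through std::inserter. After each
// insertion the iterator moves past the new element, so the characters
// arrive in their original order.
std::string copy_with_inserter(const std::string& source)
{
    std::string out;
    std::copy(source.begin(), source.end(),
              std::inserter(out, out.begin()));
    return out;
}

// Same copy through std::back_inserter, which calls push_back directly.
std::string copy_with_back_inserter(const std::string& source)
{
    std::string out;
    std::copy(source.begin(), source.end(), std::back_inserter(out));
    return out;
}

// Both produce the same, unreversed, result.
inline void inserter_self_check()
{
    assert(copy_with_inserter("abc") == "abc");
    assert(copy_with_inserter("abc") == copy_with_back_inserter("abc"));
}
```

So the inserter itself does not cause front insertions here; any remaining cost would have to come from elsewhere, such as per-character stream extraction or string growth.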

Christopher Pisz

Mar 5, 2015, 6:15:01 PM
I am not sure how to go about profiling it any deeper.

I'd have to restart my campaign to research and purchase a native C++
profiler. The last one ended in failure.


Supposedly the Windows SDK comes with Xperf, according to Google, and that
can be used. I don't know where to start, but it's worth learning at this
point.

Christopher Pisz

Mar 5, 2015, 6:26:12 PM
On 3/5/2015 5:19 PM, Stefan Ram wrote:
> Christopher Pisz <nos...@notanaddress.com> writes:
>> I know there exists a way to copy a filestream into a string in one
>> line
>
> If you take the required #include lines into account, it's
> not possible, because every #include line must be written
> on its own line.
>
> Otherwise, it's simple, because the rest of the program
> can always be written in a single line (assuming it does
> not include preprocessor directives).
>
>

You a funny guy!

Statement. One statement.

Christopher Pisz

Mar 5, 2015, 8:11:52 PM
So, I got Xperf to work after watching a few hours of videos and working
through a lot of out-of-date information.

It reports that
Method 1 has 44 heap allocations for a size of 217,357.
Method 2 has 26 heap allocations for a size of 194,437.

I rely on my high precision timer for the time of execution for both.

Method 1 :0.283209
Method 2 :0.0216563

I can't really find anything about istream_iterator or std::inserter
though. I like the code in method 1, it feels cleaner. I want to
understand if there is another step I am missing or something else I can
do similar.

It is fairly often I'd want to copy the contents of one stream to a
string or to another stream.

Paavo Helde

Mar 6, 2015, 12:55:59 AM
Christopher Pisz <nos...@notanaddress.com> wrote in
news:mdakl6$v6s$1...@dont-email.me:
> Output:
> Method 1 :0.283209
> Method 2 :0.0216563
>
>
> That's quite a difference! What's going on under the hood with the
> iterator method?

The stream classes are meant for reading and writing *formatted* text at
high level (streaming a sequence of C++ objects of arbitrary classes to
and from a file). The current locale is taken into account on every step,
end-of-line conversions done, etc. The iterator works in this layer even
if it actually does no transformations. I would not be surprised if there
were a couple of virtual function calls involved with reading each
character.

OTOH, by using rdbuf() you actually say that you do not want all that
formatting and translation and want just to get the file content as it is
on the disk. Naturally this is a lot faster.
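
A small sketch of the formatted/unformatted distinction, using in-memory streams so it runs without a file. It assumes nothing beyond the standard library; the function names are made up for this example. It also shows why the methods in this thread need not produce equal strings: std::istream_iterator<char> goes through operator>>, which skips whitespace unless std::noskipws is set, while rdbuf() copies the characters as they are.

```cpp
#include <algorithm>
#include <cassert>
#include <ios>
#include <iterator>
#include <sstream>
#include <string>

// Formatted layer: istream_iterator<char> extracts with operator>>.
// Whitespace is dropped unless noskipws has been set on the stream.
std::string read_formatted(const std::string& content, bool keep_whitespace)
{
    std::istringstream in(content);
    if (keep_whitespace)
    {
        in >> std::noskipws;
    }
    std::string out;
    std::copy(std::istream_iterator<char>(in),
              std::istream_iterator<char>(),
              std::back_inserter(out));
    return out;
}

// Unformatted layer: stream the underlying buffer directly.
std::string read_unformatted(const std::string& content)
{
    std::istringstream in(content);
    std::ostringstream out;
    out << in.rdbuf();
    return out.str();
}

inline void layers_self_check()
{
    assert(read_formatted("a b\nc", false) == "abc");    // whitespace skipped
    assert(read_formatted("a b\nc", true) == "a b\nc");  // noskipws keeps it
    assert(read_unformatted("a b\nc") == "a b\nc");      // raw copy
}
```

When comparing timings, it is worth checking first that the variants being timed return the same string.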

Still, I would expect a difference of at most 3-4 times; the stream
abstractions are not that bad. It looks like you are testing non-optimized
code. Also, times like 0.02 seconds are too close to the OS process
scheduling granularity, and the effects of OS file caching may show
up. I would put some loops in main() to get larger times, and the very
first runs of both methods should be ignored.
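
One way to sketch that advice, assuming a C++11 compiler with <chrono>: run the work a few times untimed so the file cache is warm, then time a loop and report the average. The name average_seconds and its parameters are hypothetical, not taken from any timer class used in this thread.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: returns the average wall-clock seconds per call of
// `work` over `iterations` timed runs, after `warmup` untimed runs.
double average_seconds(const std::function<void()>& work,
                       unsigned iterations,
                       unsigned warmup = 2)
{
    assert(iterations > 0);

    // Untimed runs: let the OS cache the file and the allocator settle.
    for (unsigned i = 0; i < warmup; ++i)
    {
        work();
    }

    const auto start = std::chrono::steady_clock::now();
    for (unsigned i = 0; i < iterations; ++i)
    {
        work();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return elapsed.count() / iterations;
}
```

A call such as `average_seconds([] { Method1(); }, 100)` would then replace a single timed invocation. Build with optimization enabled, and have the timed function return or otherwise use its result so the work cannot be optimized away.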

hth
Paavo

Juha Nieminen

Mar 6, 2015, 3:20:21 AM
Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
> std::copy(std::istream_iterator<char>(file),
> std::istream_iterator<char>(),
> std::inserter(textToParse, textToParse.begin()));

What's the problem with

std::string textToParse(std::istream_iterator<char>(file),
std::istream_iterator<char>());

--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---

Chris Vine

Mar 6, 2015, 8:29:33 AM
On 6 Mar 2015 12:17:09 GMT
r...@zedat.fu-berlin.de (Stefan Ram) wrote:
> Christopher Pisz <nos...@notanaddress.com> writes:
> >On 3/5/2015 5:19 PM, Stefan Ram wrote:
> >>Christopher Pisz <nos...@notanaddress.com> writes:
> >>>I know there exists a way to copy a filestream into a string in one
> >>>line
> >>it's simple, because the rest of the program can always be written
> >>in a single line
> >Statement. One statement.
>
> Whenever one has multiple statements,
> one can use a compound statement (block)
> to merge them into a single statement.
> So, the requirement of »one statement«
> really is no requirement.

Well at least Christopher does not take the approach of answering a
question where the intent is obvious but the wording slightly amiss
with a pedantically correct but completely useless answer.

I think it is generally better to be helpful than to try to be "clever".

Chris


Chris Vine

Mar 6, 2015, 9:29:05 AM
On 6 Mar 2015 13:59:11 GMT
r...@zedat.fu-berlin.de (Stefan Ram) wrote:
> Chris Vine <chris@cvine--nospam--.freeserve.co.uk> writes:
> >I think it is generally better to be helpful than to try to be
> >"clever".
>
> When one does not know the meaning of words like
> »line« or »statement«, it is better to learn it
> (when posting into a technical programming newsgroup),
> than to accuse others.

Then let me meet pedantry with pedantry. Christopher did not "accuse
others". So far as there was an accusation (which seems a somewhat
paranoid view of the world on your part) it came from me. On the other
hand, I did not use the words 'line' or 'statement', the failure to
understand which you criticize as forfeiting the right to criticize.

You knew full well that when Christopher referred to "statement" he
meant a non-compound statement, or possibly "expression".

I am afraid you have "previous" on this one.

Chris

Victor Bazarov

Mar 6, 2015, 10:00:24 AM
I took your code and changed it a bit so I could time it on Windows. It
didn't show the ~14-fold difference from your example, only about
1.5-fold (the iterator method taking longer by ~50%). Run your test
after building it in release (with optimization) and ensure that the
text file has been placed in the file cache (by running it a couple of
times and only noting the last run). Or swap the methods (call method 2
before method 1), just to be sure.

The operation is very quick on Windows with shorter files, so profiling
here doesn't make much sense. I would venture to point out, though, that
the iterator method probably makes more function calls.

And, after all, it's QoI of the library, of course.

Christopher Pisz

Mar 6, 2015, 11:54:42 AM
Only that I didn't know about it; that looks better.
So, taking your suggestions in combination with Victor's, here is my
current listing:


// Shared Includes
#include "Exception.h"
#include "PerformanceTimer.h"

// Standard Includes
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <cstdio>
#include <algorithm> // std::copy
#include <iterator>  // std::istream_iterator, std::inserter

const unsigned int NUM_ITERATIONS = 100;

//--------------------------------------------------------------------------------------------------
void Method1()
{
std::ifstream file("test.txt");
if( !file )
{
// Error
throw Shared::Exception(__FILE__, __LINE__, "Cannot open test file");
}

std::string textToParse(std::istream_iterator<char>(file),
std::istream_iterator<char>());

file.close();
}

//--------------------------------------------------------------------------------------------------
void Method2()
{
std::ifstream file("test.txt");
if( !file )
{
// Error
throw Shared::Exception(__FILE__, __LINE__, "Cannot open test file");
}

std::string textToParse;

file >> std::noskipws;
std::copy(std::istream_iterator<char>(file),
std::istream_iterator<char>(),
std::inserter(textToParse, textToParse.begin()));

file.close();
}

//--------------------------------------------------------------------------------------------------
void Method3()
{
std::ifstream file("test.txt");
if( !file )
{
// Error
throw Shared::Exception(__FILE__, __LINE__, "Cannot open test file");
}

std::stringstream textToParse;
textToParse << file.rdbuf();

file.close();
}

//--------------------------------------------------------------------------------------------------
int main()
{
Method1();
Method2();
Method3();

Shared::PerformanceTimer timer;
for(unsigned count = 0; count < NUM_ITERATIONS; ++count)
{
Method1();
}
std::cout << "Method 1 :" << timer.Stop() << std::endl;

timer.Start();
for(unsigned count = 0; count < NUM_ITERATIONS; ++count)
{
Method2();
}
std::cout << "Method 2 :" << timer.Stop() << std::endl;

timer.Start();
for(unsigned count = 0; count < NUM_ITERATIONS; ++count)
{
Method3();
}
std::cout << "Method 3 :" << timer.Stop() << std::endl;
}


and the output is:
Method 1 :0.012716
Method 2 :0.361421
Method 3 :0.141371

Christopher Pisz

Mar 6, 2015, 12:06:57 PM
I take it back, something is fruity with Juha's suggestion. I see a
warning "warning C4930: 'std::string
textToParse(std::istream_iterator<_Ty>,std::istream_iterator<_Ty>
(__cdecl *)(void))': prototyped function not called (was a variable
definition intended?)" and cannot seem to use the string afterward
without compiler errors that claim it isn't a compatible type. I don't
follow. Using msvc 2012.

Victor Bazarov

Mar 6, 2015, 2:56:56 PM
Change that line to read

std::string textToParse(std::istream_iterator<char>{file},
std::istream_iterator<char>{});

(note the curly braces), and don't use a pre-C++11 compiler :-)

Actually, I'm not sure it's going to work with VC++ 2012. I used 2013
and got this result:

Method 1 :498549
Method 2 :305819
Method 3 :110364

(with an 18K file; those are the ticks reported by the Windows
QueryPerformanceCounter)

How big is your file?

Another note: make sure the optimizer does not throw away the result of
Method1. It's quite possible that, since you're not returning it
anywhere, the optimizer might change the code to never create the object
in the first place. Think of returning the string from those functions
(as in 'std::string Method1(...)').

Here is my (corrected) code:
//--------------------------------------------------------------------------------------------------
// Standard Includes
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <cstdio>
#include <algorithm> // std::copy
#include <iterator>  // std::istream_iterator, std::inserter
#include <Windows.h>

const unsigned int NUM_ITERATIONS = 100;

//--------------------------------------------------------------------------------------------------
std::string Method1()
{
std::ifstream file("test.txt");
if (!file)
{
// Error
throw 1;
}

std::string textToParse(std::istream_iterator<char>{file},
std::istream_iterator<char>{});

file.close();

return textToParse;
}

//--------------------------------------------------------------------------------------------------
std::string Method2()
{
std::ifstream file("test.txt");
if (!file)
{
// Error
throw 22;
}

std::string textToParse;

file >> std::noskipws;
std::copy(std::istream_iterator<char>(file),
std::istream_iterator<char>(),
std::inserter(textToParse, textToParse.begin()));

file.close();

return textToParse;
}

//--------------------------------------------------------------------------------------------------
std::string Method3()
{
std::ifstream file("test.txt");
if (!file)
{
// Error
throw 333;
}

std::stringstream textToParse;
textToParse << file.rdbuf();

file.close();

return textToParse.str();
}

//--------------------------------------------------------------------------------------------------
int main()
{
Method1();
Method2();
Method3();

LARGE_INTEGER t0, t1;
QueryPerformanceCounter(&t0);
for (unsigned count = 0; count < NUM_ITERATIONS; ++count)
{
Method1();
}
QueryPerformanceCounter(&t1);
std::cout << "Method 1 :" << t1.QuadPart - t0.QuadPart << std::endl;

QueryPerformanceCounter(&t0);
for (unsigned count = 0; count < NUM_ITERATIONS; ++count)
{
Method2();
}
QueryPerformanceCounter(&t1);
std::cout << "Method 2 :" << t1.QuadPart - t0.QuadPart << std::endl;

QueryPerformanceCounter(&t0);
for (unsigned count = 0; count < NUM_ITERATIONS; ++count)
{
Method3();
}
QueryPerformanceCounter(&t1);
std::cout << "Method 3 :" << t1.QuadPart - t0.QuadPart << std::endl;
}
//--------------------------------------------------------------------------------------------------

Luca Risolia

Mar 6, 2015, 4:32:44 PM
Il 06/03/2015 18:06, Christopher Pisz ha scritto:
>>> std::string textToParse(std::istream_iterator<char>(file),
>>> std::istream_iterator<char>());

> I take it back, something is fruity with Juha's suggestion. I see a
> warning "warning C4930: 'std::string
> textToParse(std::istream_iterator<_Ty>,std::istream_iterator<_Ty>
> (__cdecl *)(void))': prototyped function not called (was a variable
> definition intended?)" and cannot seem to use the string afterward
> without compiler errors that claim it isn't a compatible type. I don't
> follow.

According to the language rules "std::istream_iterator<char>()" is a
declaration of "function taking no arguments returning
istream_iterator<char>". To force the compiler to treat the construct as
an expression surround it with extra parentheses:

std::string
textToParse(std::istream_iterator<char>(file),(std::istream_iterator<char>()));

Victor Bazarov

Mar 6, 2015, 5:09:21 PM
And to expand on that, the sequence of tokens
std::istream_iterator<char>(file)
can be also treated as a declaration of an object called 'file' of type
'istream_iterator<char>', the parentheses are superfluous and ignored.
Thus the entire original declaration of 'textToParse' is a declaration
of a function that takes two arguments and returns a std::string. The
name of the first argument ("file") in that case is ignored.
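
The two workarounds mentioned in this thread, extra parentheses and C++11 braces, can be put side by side in a short sketch. It uses an in-memory stream so it runs without test.txt, and it sets std::noskipws so that whitespace is kept; the function names are hypothetical.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Parenthesizing one argument makes it an expression, which it cannot be
// in a parameter list, so the line declares a variable, not a function.
std::string read_with_parentheses(std::istream& in)
{
    in >> std::noskipws;
    std::string text((std::istream_iterator<char>(in)),
                     std::istream_iterator<char>());
    return text;
}

// C++11 brace initialization cannot be read as a parameter declaration
// either, so no extra parentheses are needed.
std::string read_with_braces(std::istream& in)
{
    in >> std::noskipws;
    std::string text(std::istream_iterator<char>{in},
                     std::istream_iterator<char>{});
    return text;
}

inline void vexing_parse_self_check()
{
    std::istringstream a("a b\n");
    std::istringstream b("a b\n");
    assert(read_with_parentheses(a) == "a b\n");
    assert(read_with_braces(b) == "a b\n");
}
```

Without either change, the declaration compiles as a function prototype, which matches the C4930 warning quoted earlier.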