The failures of iostreams

Jason McKesson

Nov 17, 2012, 2:36:34 PM
to std-pr...@isocpp.org
The Iostreams library in C++ has a problem. We have real, reasonable, legitimate C++ professionals, who like C++ and use modern C++ idioms, telling people to not use iostreams. This is not due to differing ideas on C++ or C-in-classes-style development, but the simple practical realities of the situation.

This kind of thing is indicative of a real problem in iostreams. In order to eventually solve that problem, we must first identify exactly what the problems are. This discussion should be focused on exactly that: identifying the problems with the library. Once we know what the real problems are, we can be certain that any new system that is proposed addresses them.

Note that this is about problems within iostreams. This is not about a list of things you wish it could do. This is about what iostreams actually tries to do but fails at in some way. So stuff like async file IO doesn’t go here, since iostreams doesn’t try to provide that.

Feel free to add to this list other flaws you see in iostreams. Or if you think that some of them are not real flaws, feel free to explain why.

Performance

This is the big one, generally the #1 reason why people suggest using C-standard file IO rather than iostreams.

Oftentimes, when people defend iostreams performance, they will say something to the effect of, “iostreams does far more than C-standard file IO.” And that’s true. With iostreams, you have an extensible mechanism for writing any type directly to a stream. You can “easily” write new streambuf’s that will allow you to (via runtime polymorphism) be able to work with existing code, thus allowing you to leverage your file IO for other forms of IO. You could even use a network pipe as an input or output stream.

There’s one real problem with this logic, and it is exactly why people suggest C-standard file IO. Iostreams violates a fundamental precept of C++: pay only for what you use.

Consider this suite of benchmarks. This code doesn’t do file IO; it writes directly to a string. All it’s doing is measuring the time it takes to append 4 characters to a string, many times over. It uses a `char[]` as a useful control. It also tests the use of `vector<char>` (presumably `basic_string` would have similar results). Therefore, this is a solid test for the efficiency of the iostreams codebase itself.

Obviously there will be some efficiency loss. But consider the numbers in the results.

The ostringstream is more than a full order of magnitude slower than the control. It’s almost 100x in some cases. Note that it’s not using << to write to the stream; it’s using `ostream::write()`.

Note that the vector<char> implementations are fairly comparable to the control, usually taking around 1x-4x as long. So clearly this is something in ostringstream.

Now, you might say that one could use the stringbuf directly. And that was done. While it does improve performance over the ostringstream case substantially (generally taking half to a quarter of the time), it’s still over 10x slower than the control or most vector<char> implementations.
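For readers who want to try the comparison themselves, here is a minimal sketch of that kind of measurement. It is not the benchmark suite referred to above; the function names and the `time_us` helper are made up for illustration, and any numbers it produces depend on the compiler, standard library, and optimization flags.

```cpp
#include <chrono>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Append `count` 4-byte chunks through ostream::write() and return the bytes written.
std::size_t append_with_ostringstream(std::size_t count) {
    std::ostringstream out;
    for (std::size_t i = 0; i < count; ++i) {
        out.write("abcd", 4);
    }
    return out.str().size();
}

// Append the same chunks to a vector<char>.
std::size_t append_with_vector(std::size_t count) {
    std::vector<char> out;
    const char chunk[4] = {'a', 'b', 'c', 'd'};
    for (std::size_t i = 0; i < count; ++i) {
        out.insert(out.end(), chunk, chunk + 4);
    }
    return out.size();
}

// Hypothetical helper: time one call of fn(count) in microseconds.
template <typename Fn>
long long time_us(Fn fn, std::size_t count) {
    const auto start = std::chrono::steady_clock::now();
    volatile std::size_t sink = fn(count);  // keep the result observable
    (void)sink;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `time_us(append_with_ostringstream, n)` and `time_us(append_with_vector, n)` with a large `n` gives a rough ratio between the two approaches.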

Why? The stringbuf operations ought to be a thin wrapper over std::string. After all, that’s what was asked for.

Where does this inefficiency come from? I haven’t done any extensive profiling analysis, but my educated guesses are from two places: virtual function overhead and an interface that does too much.

ostringstream is supposed to be able to be used as an ostream for runtime-polymorphism. But here’s where the C++ maxim comes into play. Runtime-polymorphism is not being used here. Every function call should be able to be statically dispatched. And it is, but all of the virtual machinery comes from within ostringstream.

This problem seems to come mostly from the fact that basic_ostream, which does most of the leg-work for ostringstream, has no specific knowledge of its stream type. Therefore it's always a virtual call. And it may be doing many such virtual calls.

You can achieve the same runtime polymorphism (being able to overload operator<< for any stream) by using a static set of stream classes, tightly coupled to their specific streambufs, and a single “anystream” type that those streams can be converted into. It would use std::function-style type erasure to remember the original type and feed function calls to it. It would use a single function call to initiate each write operation, rather than what appears to be many virtual calls within each write.
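A rough sketch of that idea, under the assumption that a concrete stream exposes a non-virtual `write` and that an "anystream" wraps it with std::function-style type erasure. The names `string_sink`, `any_sink`, and `write_greeting` are hypothetical and only illustrate the shape of such a design.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// A concrete sink with no virtual functions: writes go straight into a std::string.
class string_sink {
public:
    void write(const char* data, std::size_t size) { buffer_.append(data, size); }
    const std::string& str() const { return buffer_; }

private:
    std::string buffer_;
};

// Hypothetical type-erased handle. It remembers how to write to the original
// sink and performs one indirect call per write operation.
class any_sink {
public:
    template <typename Sink>
    explicit any_sink(Sink& sink)
        : write_([&sink](const char* data, std::size_t size) { sink.write(data, size); }) {}

    void write(const char* data, std::size_t size) { write_(data, size); }

private:
    std::function<void(const char*, std::size_t)> write_;
};

// Code that must accept "any stream" takes any_sink; code that knows the
// concrete type calls string_sink::write directly and can be inlined.
inline void write_greeting(any_sink out) { out.write("hello", 5); }
```

The wrapped sink is captured by reference, so it has to outlive the `any_sink` that refers to it.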

Then, there’s the fact that streambuf itself is overdesigned. stringbuf ought to be a simple interface wrapper around a std::string, but it’s not. It’s a complex thing. It has locale support of all things. Why? Isn’t that something that should be handled at the stream level?

This API has no way to get a low-level interface to a file/string/whatever. There’s no way to just open a filebuf and blast the file into some memory, or to shove some memory out of a filebuf. It will always employ the locale machinery even if you didn’t ask for it. It will always make these internal virtual calls, even if they are completely statically dispatched.

With iostreams, you are paying for a lot of stuff that you don’t frequently use. At the stream level, it makes sense that you’re paying for certain machinery (though again, some way to say that you’re not using some of it would be nice). At the buffer level, it does not, since that is the lowest level you’re allowed to use.

Utility

While performance is the big issue, it’s not the only one.

The biggest selling point for iostreams is the ability to extend its formatted writing functionality. You can overload operator<< for various types and simply use them. You can’t do that with fprintf. And thanks to ADL, it will work just fine for classes in namespaces. You can create new streambuf types and even streams if you like. All relatively easily.

Here’s the problem, and it is admittedly one that is subjective: printf is really nice syntax.

It’s very compact, for one. Once you understand the basic syntax of it, it’s very easy to see what’s going on. Especially for complex formatting. Just consider the physical size difference between these two:
snprintf(..., "0x%08x", integer);
stream << "0x" << std::right << std::hex << std::setw(8) << iVal << std::endl;
It may take a bit longer to become used to the printf version, but this is something you can easily look up in a reference.
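As a self-contained illustration of the two styles, the sketch below wraps each in a function (the function names are made up for this example). Note that matching the zero padding of "%08x" on the stream side also requires std::setfill('0'), since std::setw on its own pads with spaces.

```cpp
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// printf-style: one pattern string describes prefix, base, width, and fill.
std::string hex_with_snprintf(unsigned int value) {
    char buffer[16];
    std::snprintf(buffer, sizeof buffer, "0x%08x", value);
    return buffer;
}

// iostream-style: each property is a separate manipulator. std::setfill('0')
// is needed for zero padding; std::setw alone pads with spaces.
std::string hex_with_iostream(unsigned int value) {
    std::ostringstream out;
    out << "0x" << std::hex << std::setfill('0') << std::setw(8) << value;
    return out.str();
}
```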

Plus, it makes it much easier to do translations on formatted strings. You can look the pattern string up in a table that changes from language to language. This is rather more difficult in iostreams, though not impossible. Granted, pattern changes may not be enough, as some languages have different subject/verb/object grammars that would require reshuffling patterns around. However, there are printf-style systems that do allow for reshuffling, whereas no such mechanism exists for iostream-style.
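To make the reshuffling point concrete, here is a small sketch of positional placeholders. `format_positional` is a hypothetical helper written for this example, with a "%1%"-style syntax loosely modeled on what Boost.Format accepts; it is not Boost.Format itself and handles only single-digit positions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: replaces %1% .. %9% in `pattern` with the matching
// entry of `args`. Anything else is copied through unchanged.
std::string format_positional(const std::string& pattern,
                              const std::vector<std::string>& args) {
    std::string result;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const bool is_placeholder =
            pattern[i] == '%' && i + 2 < pattern.size() &&
            pattern[i + 1] >= '1' && pattern[i + 1] <= '9' &&
            pattern[i + 2] == '%';
        if (is_placeholder) {
            const std::size_t index = static_cast<std::size_t>(pattern[i + 1] - '1');
            if (index < args.size()) {
                result += args[index];
            }
            i += 2;  // skip the digit and the closing '%'
        } else {
            result += pattern[i];
        }
    }
    return result;
}
```

A translation table can then hold "%1% opened %2%" for one language and "%2% was opened by %1%" for another, while the calling code passes the same arguments in the same order.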

C++ used the << method because the alternatives were less flexible. Boost.Format and other systems show that C++03 did not really have to use this mechanism to achieve the extensibility features that iostreams provide.

What do you think? Are there other issues in iostreams that need to be mentioned?

Nevin Liber

Nov 17, 2012, 3:03:09 PM
to std-pr...@isocpp.org
On 17 November 2012 13:36, Jason McKesson <jmck...@gmail.com> wrote:
C++ used the << method because the alternatives were less flexible. Boost.Format and other systems show that C++03 did not really have to use this mechanism to achieve the extensibility features that iostreams provide.

Boost.Format came out in 2002.  C++03 (which is basically C++98) was standardized in the 90s.  Short of building a time machine, I fail to see how Boost.Format showed C++03 anything. 
 
What do you think? Are there other issues in iostreams that need to be mentioned?

Not really, no.  Ragging on iostreams is easy, and has been done plenty of times already.  Coming up with a proposal to replace it is hard and time consuming.  I don't see any proposal here.  Are you looking to write one?
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

Loïc Joly

Nov 17, 2012, 3:08:19 PM
to std-pr...@isocpp.org, Jason McKesson
On 17/11/2012 20:36, Jason McKesson wrote:
> The Iostreams library in C++ has a problem. We have real, reasonable,
> legitimate C++ professional, who like C++ and use modern C++ idioms,
> telling people to not use iostreams. This is not due to differing
> ideas on C++ or C-in-classes-style development, but the simple
> practical realities of the situation.
>

There are mostly two points where I disagree with your analysis:
- Performance: If performance really matters, granted, I will not use
iostream, but I will not use C I/O facilities either. I will use
platform specific API that can deliver maximum performance.

- Usability: I find printf format really hard to use (and very error
prone). It's another language, and an obscure one. I genuinely have no
idea what 0x%08x meant in your message. I was not even sure if it
expected one argument or several. But this is not my main point. My main
point is that your comparison is unfair: Most of the time, when doing
I/O, I don't care about format (when I care, then I use a UI library
such as Qt, or I generate HTML, or LaTeX, or whatever, but I don't use
iostream). And in this case, iostream are not more verbose:

os << "Line " << line << ": Error(" << code << "): " << msg;
printf("Line %??: Error(%??): %??", line, code, msg);

The difference is not that big, even when using only basic types (and,
as you said, the difference is in the other direction when dealing with
user defined types).

For me, the biggest issue I have with iostream is localisation, and the
possibility to have a whole sentence in one block, and to be able to
swap arguments. And boost format really helps here.

--
Loïc

Nicol Bolas

Nov 17, 2012, 3:13:07 PM
to std-pr...@isocpp.org


On Saturday, November 17, 2012 12:03:52 PM UTC-8, Nevin ":-)" Liber wrote:
On 17 November 2012 13:36, Jason McKesson <jmck...@gmail.com> wrote:
C++ used the << method because the alternatives were less flexible. Boost.Format and other systems show that C++03 did not really have to use this mechanism to achieve the extensibility features that iostreams provide.

Boost.Format came out in 2002.  C++03 (which is basically C++98) was standardized in the 90s.  Short of building a time machine, I fail to see how Boost.Format showed C++03 anything. 

My point being that Boost.Format was possible, so it could have been done. That is, we didn't need variadic templates or other C++11 features to be able to have this functionality.
 
What do you think? Are there other issues in iostreams that need to be mentioned?

Not really, no.  Ragging on iostreams is easy, and has been done plenty of times already.  Coming up with a proposal to replace it is hard and time consuming.  I don't see any proposal here.  Are you looking to write one?

Did you read the intro section of the post, where I state that writing a proposal first requires collecting the problems? You're kinda missing the point here. You have to figure out what went wrong before you can fix it. Otherwise, you're likely to create more problems by missing something important.
 

Nicol Bolas

Nov 17, 2012, 3:50:26 PM
to std-pr...@isocpp.org, Jason McKesson


On Saturday, November 17, 2012 12:08:20 PM UTC-8, Loïc Joly wrote:
On 17/11/2012 20:36, Jason McKesson wrote:
> The Iostreams library in C++ has a problem. We have real, reasonable,
> legitimate C++ professional, who like C++ and use modern C++ idioms,
> telling people to not use iostreams. This is not due to differing
> ideas on C++ or C-in-classes-style development, but the simple
> practical realities of the situation.
>

There are mostly two points where I disagree with your analysis:
- Performance: If performance really matters, granted, I will not use
iostream, but I will not use C I/O facilities either. I will use
platform specific API that can deliver maximum performance.

I would consider this something of a non sequitur. Yes, one can always run to the OS facilities if one wants maximum performance. That is not an excuse for iostream's performance, however (and the fact that you do so is indicative of the exact problem I state).

There's a big difference between "maximum performance", "reasonable performance", and "iostreams performance". The difference between vector<char> and writing to a char[] is "reasonable performance." It's an abstraction, but it's a tight one that can work out well if your compiler is good. The difference between iostreams (especially stringbuf) and vector<char> is utterly inexcusable. There is no reason for such a massive performance difference to exist between those cases.

I again remind you of the C++ maxim: pay only for what you use. You shouldn't have to leave performance on the table unless you're doing something that requires that loss of performance. C-standard file IO offers reasonable performance relative to the OS facilities; why shouldn't iostreams? Isn't that what one should expect from standard library facilities, to offer a wrapper around the OS that is reasonably thin?

You don't see people ditching operator new just to get reasonable allocation performance. Even if they want to write their own allocation system based on the OS specifics, they'll still hook it into operator new.

However, you rarely see people write a file IO system built on OS specifics and then build a streambuf-derived class to use it with iostreams. There's a reason for that.

Iostreams should be something that people want to use for platform-neutral development. That's my point, and its performance makes people want to use other things.

- Usability: I find printf format really hard to use (and very error
prone). It's another language, and an obscure one. I genuinely have no
idea what 0x%08x meant in your message. I was not even sure if it
expected one argument or several. But this is not my main point. My main
point is that your comparison is unfair: Most of the time, when doing
I/O, I don't care about format

That's nice that you don't have to. Some people do, a lot. Their use cases should not be ignored.

My comparison came from actual use. There are plenty of times when I have needed to look at a 32-bit integer output as a hexadecimal number. And iostreams makes that incredibly difficult, while printf makes it incredibly easy.
 
(when I care, then I use a UI library
such as Qt, or I generate HTML, or LaTeX, or whatever, but I don't use
iostream)

Isn't that indicative of a failure in iostreams? That if you need to write hexadecimal numbers, you bring in Qt/HTML/LaTeX (I really don't know what LaTeX is doing there), rather than using standard library features. Remember: we're not talking about visual formatting; this is pure text stuff. This is "I want the integer to be hexadecimal" or "I want the float to only have 2 decimal digits."

You shouldn't have to run screaming to Qt whenever you want to do that in a reasonable way.
 
. And in this case, iostream are not more verbose:

os << "Line " << line << ": Error(" << code << "): " << msg;
printf("Line %??: Error(%??): %??", line, code, msg);

The difference is not that big, even when using only basic types (and,
as you said, the difference is in the other direction when dealing with
user defined types).

For me, the biggest issue I have with iostream is localisation, and the
possibility to have a whole sentence in one block, and to be able to
swap arguments. And boost format really helps here.

--
Loïc

Loïc Joly

Nov 17, 2012, 4:32:22 PM
to std-pr...@isocpp.org, Nicol Bolas
On 17/11/2012 21:50, Nicol Bolas wrote:
>
>
> (when I care, then I use a UI library
> such as Qt, or I generate HTML, or LaTeX, or whatever, but I don't
> use
> iostream)
>
>
> Isn't that indicative of a failure in iostreams? That if you need to
> write hexadecimal numbers, you bring in Qt/HTML/LaTeX (I really don't
> know what LaTeX is doing there), rather than using standard library
> features. Remember: we're not talking about visual formatting; this is
> pure text stuff. This is "I want the integer to be hexadecimal" or "I
> want the float to only have 2 decimal digits."
>
I may have been misunderstood here. What I was saying is that if I want
visual formatting, I will anyway use other libraries than iostream. And
if I don't want visual formatting, but pure text, then I usually don't
care if floats have 2, 6 or 12 decimal digits.

There is another point where I believe iostreams are weak, it's
encoding. There is the codecvt facet that can be used, but I find it not
really easy to use. Moreover, I'd like to open a file and let the system
automatically detect its format (using BOM, or maybe other heuristics)
and allow me to directly read from it into my internal format.

--
Loïc



Václav Zeman

Nov 17, 2012, 5:34:03 PM
to std-pr...@isocpp.org
On 11/17/2012 08:36 PM, Jason McKesson wrote:
The Iostreams library in C++ has a problem. We have real, reasonable, legitimate C++ professional, who like C++ and use modern C++ idioms, telling people to not use iostreams. This is not due to differing ideas on C++ or C-in-classes-style development, but the simple practical realities of the situation.

This kind of thing is indicative of a real problem in iostreams. In order to eventually solve that problem, we must first identify exactly what the problems are. This discussion should be focused on exactly that: identifying the problems with the library. Once we know what the real problems are, we can be certain that any new system that is proposed addresses them.

Note that this is about problems within iostreams. This is not about a list of things you wish it could do. This is about what iostreams actually tries to do but fails at in some way. So stuff like async file IO doesn’t go here, since iostreams doesn’t try to provide that.

Feel free to add to this list other flaws you see in iostreams. Or if you think that some of them are not real flaws, feel free to explain why.
[...]

What do you think? Are there other issues in iostreams that need to be mentioned?

First, I do not consider myself a C++ IO streams expert, rather an advanced user. I agree that the current C++ IO streams have some problems.

Performance
I have never needed so much performance that I had to avoid C++ IO streams to get it. Thus, I do not consider performance an issue with the current IO streams, except for std::stringstream et al. I think that it is a failure in design that getting the string out of the stringstream is by value. Second, the only easy way to reset the stream is to call 'stream.str("")' or 'stream.str(std::string())'. There should be some sort of 'clear()'-like member function.
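For reference, a short sketch of the reset idiom being described. The free function `reset` and the `format_twice` demo are names invented for this example; the existing member `clear()` only resets the stream's error state and does not empty the buffer, which is why both calls appear.

```cpp
#include <sstream>
#include <string>

// Reuse one stringstream for several messages.
// str(...) replaces the buffer contents; clear() resets the error/eof flags,
// which str(...) does not touch.
void reset(std::stringstream& stream) {
    stream.str(std::string());
    stream.clear();
}

std::string format_twice() {
    std::stringstream stream;
    stream << "first";
    std::string copy = stream.str();  // returns a copy of the buffer
    reset(stream);
    stream << "second";
    return copy + "," + stream.str();
}
```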

Problematic cases
Here are some use cases and experiences where I think the current C++ IO streams are lacking or failing.

Recently, I have decided that I wanted to read (on Windows with MSVC) UTF-16 or UTF-32 text files using wchar_t variants of file IO streams. Now, to get that with C++11 I have to imbue the streams with one of codecvt_utf{16,32} facets. So far that's ok and understandable. What I consider a failure in design is that to actually get it working, I have to open files in binary mode. Opening the file in binary mode means that the stream will stop translating DOS/*NIX EOLs. Clearly, IMHO, the EOLs and encoding are two separate issues, or should be. Maybe locale should also have some sort of EOL facet to do this?

The second problem I consider important is that writing one's own streambufs is exceptionally hard. This seems to be because both the semantics and the names of streambuf's member functions are bizarre.
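To give a sense of what is involved, here is about the smallest output-only streambuf one can write, as a sketch. The class name `string_collector` and the `collect_demo` function are hypothetical. Even in this case the author has to know that `overflow` and `xsputn` are the hooks to override and what their return values mean.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Minimal output-only streambuf that appends everything to a std::string.
// No put area is set up, so every character reaches overflow() or xsputn().
class string_collector : public std::streambuf {
public:
    const std::string& collected() const { return collected_; }

protected:
    // Called for single characters when there is no room in the put area.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            collected_.push_back(traits_type::to_char_type(ch));
        }
        return traits_type::not_eof(ch);
    }

    // Called for bulk writes such as ostream::write().
    std::streamsize xsputn(const char* data, std::streamsize count) override {
        collected_.append(data, static_cast<std::size_t>(count));
        return count;
    }

private:
    std::string collected_;
};

std::string collect_demo() {
    string_collector buffer;
    std::ostream out(&buffer);
    out << "value=" << 42;
    out.put('!');
    return buffer.collected();
}
```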

Possible solution?
On a few occasions, I have used Boost.IOStreams. Their abstractions and categories of streams are richer than what standard C++ IO streams offer, and they have worked well enough for me, certainly better than raw streams, in some situations. The 'stream' and 'stream_buffer' class templates are especially useful. Implementing your own stream and stream_buffer on top of the Device concept using these two templates is rather easy. A filtering stream with a chain of filters is another very useful concept.

If Boost.IOStreams is not directly usable for adoption as a standard library, then at least it can serve as an example of a successful library, IMHO, from which anybody who would like to improve the existing C++ IO streams should learn.

If nothing else could be accepted from the library, just the stream and the stream_buffer classes alone (with the necessary support classes/code) would be a huge improvement to standard C++ IO streams.

HTH,

--
VZ

Beman Dawes

Nov 17, 2012, 6:03:25 PM
to std-pr...@isocpp.org
On Sat, Nov 17, 2012 at 2:36 PM, Jason McKesson <jmck...@gmail.com> wrote:

> ...
> The Iostreams library in C++ has a problem.

Um... I suspect most of the LWG believes iostreams has far more than
one problem.

> What do you think? Are there other issues in iostreams that need to be
> mentioned?

You might want to ask Herb Sutter for his list of problems with
iostreams. IIRC, there are eight or ten issues on his list, and he
believes a C++11 version of Boost.Format, or something similar, would
solve a lot of them. But best to ask him directly.

This mailing list is a good place to float an idea about your library,
as mentioned in http://isocpp.org/std/submit-a-proposal

But the assumption was that you had an existing library you wanted to
float for possible standardization, not just a wish-list and some
ideas about a possible future library.

As has been noted many times by many LWG members, the problem with
libraries that don't exist yet is that they are inevitably presented
as far superior to existing libraries for the problem domain. And if
someone raises an issue with the not-yet-existing library, the
response is often that the issue will be easy to fix. So of course
everyone would love to have this wondrous library for the standard!
But only if it ever gets implemented, documented, used, refined, and
matures into something useful, and someone writes an actual proposal
document.

--Beman

Martinho Fernandes

Nov 17, 2012, 6:20:12 PM
to std-pr...@isocpp.org
On Sat, Nov 17, 2012 at 11:34 PM, Václav Zeman <vhai...@gmail.com> wrote:
On 11/17/2012 08:36 PM, Jason McKesson wrote:
I think that it is a failure in design that getting the string out of the stringstream is by value.

I think getting the string by value is the correct design. What I think is missing is to make str() have lvalue and rvalue ref-qualified overloads so you can get it out of a temporary stringstream with a move, or even write std::move(some_stringstream).str() and "move a string out" by stealing the buffer from the underlying stringbuf.
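A sketch of what such ref-qualified overloads could look like, using a hypothetical `string_builder` class as a stand-in. This is not std::stringstream and makes no claim about how a standard library would implement it.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a string stream; not std::stringstream.
class string_builder {
public:
    string_builder& append(const std::string& text) {
        buffer_ += text;
        return *this;
    }

    // Called on lvalues: the builder stays usable, so the buffer is copied.
    std::string str() const& { return buffer_; }

    // Called on rvalues: the builder is expiring, so the buffer is moved out.
    std::string str() && { return std::move(buffer_); }

    bool empty() const { return buffer_.empty(); }

private:
    std::string buffer_;
};
```

With this, `builder.str()` copies while `std::move(builder).str()` hands over the internal buffer without a copy.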

Martinho

Nicol Bolas

Nov 17, 2012, 6:21:22 PM
to std-pr...@isocpp.org, bda...@acm.org

The main purpose of this thread is to collect a list of legitimate grievances towards iostreams. That way, when someone writes or submits a proposal, we can check it against the list and know how well it's doing. Even better, if I (or anyone reading this) were inclined to write such a library and a proposal, it would help guide my interface to know what the major issues that need resolving are.

Tony V E

Nov 17, 2012, 7:56:10 PM
to std-pr...@isocpp.org
I think beyond just a list of problems, you need a list of features / uses. I know what iostreams does today, but is that what we really want in a new class?

I think maybe it should be split into separate classes.

Tony

VinceRev

Nov 17, 2012, 9:06:23 PM
to std-pr...@isocpp.org
I agree with your 2 main points: the problem of performance and number formatting. Concerning the format, I think that having the choice between both syntaxes in C++ streams would be great, because printf formatting is sometimes far easier to use to print numbers to std::cout or to text files. Concerning performance, we clearly have a problem of virtual calls here. I work with supercomputers, and I often need to write hundreds of files of several GB each. Consequently, I've run some benchmarks and compared the following cases:
- the standard solution using a loop of write()/read() and varying the size of the internal buffer with pubsetbuf
- another one, where I "manually" put the data in a large memory buffer and, when the buffer is full, call the write()/read() function with this buffer as the parameter

... and the second technique is in general 10x faster than the first one (see the attached plot).

I don't have any elegant solution to provide, but the fact is that the write() and read() functions have a substantial overhead....


benchmark.png

Julien Nitard

Nov 17, 2012, 9:28:01 PM
to std-pr...@isocpp.org
Hi All,

This SO question may be of interest to understand the frustration of some users with iostream:


Regards,

Julien

Bjorn Reese

Nov 18, 2012, 6:12:27 AM
to std-pr...@isocpp.org
On 2012-11-17 20:36, Jason McKesson wrote:

> What do you think? Are there other issues in iostreams that need to be
> mentioned?

Perhaps peripheral, but std::cout (and std::cerr) are objects, so you
cannot use them for debug printing from the destructors of global
objects.

Having said that, I also think that we should consider the virtues of
iostream-style. How would you create something like Boost.Serialization
using a printf-style?

Arthur Tchaikovsky

Nov 18, 2012, 8:20:12 AM
to std-pr...@isocpp.org
It’s very compact, for one. Once you understand the basic syntax of it, it’s very easy to see what’s going on. Especially for complex formatting. Just consider the physical size difference between these two:
snprintf(..., "0x%08x", integer);
stream << "0x" << std::right << std::hex << std::setw(8) << iVal << std::endl;
It may take a bit longer to become used to the printf version, but this is something you can easily look up in a reference.

a) ever heard of "type safety"?
b) What a warped logic. I remember hell unleashed on my proposal to unify class declaration rules, just to cite a few:
"Oh, no, another rule to learn", "We don't need it because we do not see point in it etc",
and here what do I see as an argument? 

"It may take a bit longer to become used to the printf version, but this is something you can easily look up in a reference."
a) I am not interested in things that may take a bit longer if I have already things that are safe and easy to use
b) I am not interested in looking up something as simple and rudimentary as this in a reference.

We are supposed to make C++ easier. C++ cannot become a language where every single smallest thing is so complicated that it must be looked up in a reference.

Anyway, the point is that you simply don't know what you're talking about when you say that snprintf is a better option than cout.

Arthur Tchaikovsky

Nov 18, 2012, 8:38:08 AM
to std-pr...@isocpp.org
It’s very compact, for one. Once you understand the basic syntax of it, it’s very easy to see what’s going on. Especially for complex formatting. Just consider the physical size difference between these two:
snprintf(..., "0x%08x", integer);
stream << "0x" << std::right << std::hex << std::setw(8) << iVal << std::endl;
It may take a bit longer to become used to the printf version, but this is something you can easily look up in a reference.

Again, logic of a person for whom recursion is as easy to understand and use as iteration.

On Saturday, 17 November 2012 19:36:37 UTC, Nicol Bolas wrote:

Martinho Fernandes

Nov 18, 2012, 8:42:33 AM
to std-pr...@isocpp.org
On Sun, Nov 18, 2012 at 2:38 PM, Arthur Tchaikovsky <atch...@gmail.com> wrote:
It’s very compact, for one. Once you understand the basic syntax of it, it’s very easy to see what’s going on. Especially for complex formatting. Just consider the physical size difference between these two:
snprintf(..., "0x%08x", integer);
stream << "0x" << std::right << std::hex << std::setw(8) << iVal << std::endl;
It may take a bit longer to become used to the printf version, but this is something you can easily look up in a reference.

Again, logic of a person for whom recursion is as easy to understand and use as iteration.

Your most recent replies have been getting somewhat inflammatory. I think you should take a break.

Martinho

J. Daniel Garcia

Nov 18, 2012, 8:59:14 AM
to std-pr...@isocpp.org
While I do not share the inflammatory style, I think we should clearly make a separation of concerns here. If I understood correctly (and that might not be the case), we have 2 different issues here:

+ Performance issue: iostreams are slow. This seems to be relevant only for large files.
+ Usability issue: The current interface is very convenient for simple cases, although there are some complaints about complex cases.

Is this an accurate summary?

Nicol Bolas

Nov 18, 2012, 12:21:34 PM
to std-pr...@isocpp.org


On Sunday, November 18, 2012 5:20:13 AM UTC-8, Arthur Tchaikovsky wrote:
It’s very compact, for one. Once you understand the basic syntax of it, it’s very easy to see what’s going on. Especially for complex formatting. Just consider the physical size difference between these two:
snprintf(..., "0x%08x", integer);
stream << "0x" << std::right << std::hex << std::setw(8) << iVal << std::endl;
It may take a bit longer to become used to the printf version, but this is something you can easily look up in a reference.

a) ever heard of "type safety"?

Yes. Which Boost.Format provides quite nicely while still using printf-style syntax.

Jens Maurer

Nov 18, 2012, 3:19:41 PM
to std-pr...@isocpp.org
On 11/18/2012 12:12 PM, Bjorn Reese wrote:
> Perhaps peripheral, but std::cout (and std::cerr) are objects, so you
> cannot use them for debug printing from the destructors of global
> objects.

That's not quite accurate, see 27.4.1p2:

"The objects are not destroyed during program execution."

plus footnote:

"294) Constructors and destructors for static objects can access these
objects to read input from stdin or write output to stdout or stderr."

Jens

Brendon Costa

Nov 18, 2012, 8:04:21 PM
to std-pr...@isocpp.org
I am not an expert on these things but just want to add my two cents in case it is helpful.

I have found in the workplaces I have been at that people generally prefer to use printf-style string formatting over the ostream style, despite the significant issues that come with using printf in particular (programs that crash on incorrect usage come to mind).

There have been a number of reasons for this:

1) Performance
This is the big one, particularly for log messages. 

2) A preference on how people like to read strings
This is subjective but again I think that most people I have spoken to about this prefer to read strings where the code does not get inserted in the middle of reading the textual string (not the best code fragment but gives an example):

printf("Client %s, failed to connect at address: %s for reason: %s\n", c->name, c->address, strerror(errno));

instead of:

std::cout << "Client " << c->name << ", failed to connect at address: " << c->address << " for reason: " << strerror(errno) << std::endl;

The reasoning is that you don't have to mentally "parse" through the code to read what the message is saying using the printf style of formatting. The % symbols are "less intrusive" than inserting code in the middle of the text string.

3) Simplicity of formatting in certain cases
Good examples of this have already been mentioned. The two that come up a lot are printing a hex integer or prefixing things with 0's to a specific width. 

4) Inconsistencies in the stream interface
One example is formatting flags. Some apply only to the next item, while others persist on the stream. I have seen std::setprecision() used a few times by people who did not expect it to last past the next item, and it ended up changing the precision for everything that followed.

5) Dynamic memory allocation
The one other item that seems to have been relatively important (or at least perceived to be important) in the past is dynamic allocation of memory. It should be possible to use a custom streambuf on a pre-allocated buffer, but in general people simply seem to fall back to snprintf() for its simplicity. Again, this was common in logging: where a subsystem (like syslog) accepts a maximum-sized buffer, we would allocate a pool of objects of that size and simply construct messages into those buffers (truncating as necessary). I don't think this is currently easily supported by iostreams, but then I don't know if it should be (which is why it is last on my list).
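
Items 3 to 5 can be made concrete with a short sketch. None of this code is from the original post: the function names and the fixed_buffer class are invented for illustration, and the streambuf shown is only one possible way to write into a pre-allocated buffer with truncation.

```cpp
#include <cstddef>
#include <cstdio>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Item 3, printf style: one conversion specifier covers hex, width, and zero fill.
std::string hex_with_snprintf(unsigned value) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%08x", value);
    return buf;
}

// Item 3, iostream style: three manipulators are needed for the same output.
std::string hex_with_stream(unsigned value) {
    std::ostringstream out;
    out << std::hex << std::setw(8) << std::setfill('0') << value;
    return out.str();
}

// Item 4: setw affects only the next insertion, while std::fixed and
// setprecision stay set on the stream until something changes them.
std::string sticky_flags_demo() {
    std::ostringstream out;
    out << std::setw(5) << 1 << '|' << 2 << '|';
    out << std::fixed << std::setprecision(1) << 3.14159 << '|' << 2.71828;
    return out.str();   // "    1|2|3.1|2.7"
}

// Item 5 (hypothetical class): a streambuf whose put area is caller-owned
// storage. Characters that do not fit are discarded, which truncates the
// message without putting the stream into a failed state.
class fixed_buffer : public std::streambuf {
public:
    fixed_buffer(char* data, std::size_t size) { setp(data, data + size); }
    // Number of bytes actually stored in the caller's buffer.
    std::size_t length() const { return static_cast<std::size_t>(pptr() - pbase()); }
protected:
    // Called when the put area is full: claim success and drop the character.
    int_type overflow(int_type ch) override { return traits_type::not_eof(ch); }
};

// Formats a log line into `storage` and returns how many bytes were stored.
std::size_t format_log_line(char* storage, std::size_t size,
                            const char* client, int code) {
    fixed_buffer buf(storage, size);
    std::ostream out(&buf);
    out << "Client " << client << " failed with code " << code;
    return buf.length();
}
```

The last function behaves much like snprintf() with a size limit, which may be part of why people reach for snprintf() instead: the equivalent iostreams setup needs a custom class first.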


Now Boost.Format solves most of these issues IMO except possibly the performance issue. This may have changed since it was measured or the measurements done may have been incorrect:





Bjorn Reese

Nov 19, 2012, 11:01:22 AM11/19/12
to std-pr...@isocpp.org
I suppose I have found a bug in the compiler (or standard C++ library)
then. I just checked C++98, and it also contains the passages you quote.

Olaf van der Spek

Nov 19, 2012, 6:51:12 PM11/19/12
to std-pr...@isocpp.org
On Saturday, November 17, 2012 8:36:37 PM UTC+1, Nicol Bolas wrote:
What do you think? Are there other issues in iostreams that need to be mentioned?

I think a C++11 variant of printf / boost::format should be standardized to deal with the utility issue.
I think a lower-level interface should be provided for binary (unbuffered and maybe async) IO. 
This wouldn't fix iostreams, but it'd avoid it for a number of use cases.


Olaf

Arthur Tchaikovsky

Nov 20, 2012, 5:23:26 AM11/20/12
to std-pr...@isocpp.org
> Your most recent replies have been getting somewhat inflammatory. I think you should take a break.

Fair enough, but interestingly, you didn't say anything to the guy who claimed that my suggestion is idiotic. I believe that you should either apply the rules (of correct manners etc.) to everyone, and I am more than happy with that, or not apply them at all. Telling just one guy (me) to ease off while saying nothing to another guy who I believe presented far worse behavior than I did (calling someone's suggestion "idiotic") is simply not fair. I would like you to note that I wasn't the first guy who posted "somewhat" inflammatory posts. Some people here are passive aggressive, and this is bad too, yet you don't mind them doing so. And also, please note that I didn't use any offensive words, like commenting on someone's suggestion as "idiotic", for example.

Nicol Bolas

Nov 20, 2012, 12:01:16 PM11/20/12
to std-pr...@isocpp.org


On Tuesday, November 20, 2012 2:23:26 AM UTC-8, Arthur Tchaikovsky wrote:
> > Your most recent replies have been getting somewhat inflammatory. I think you should take a break.
>
> Fair enough, but interestingly, you didn't say anything to the guy who claimed that my suggestion is idiotic. I believe that you should either apply the rules (of correct manners etc.) to everyone, and I am more than happy with that, or not apply them at all. Telling just one guy (me) to ease off while saying nothing to another guy who I believe presented far worse behavior than I did (calling someone's suggestion "idiotic") is simply not fair. I would like you to note that I wasn't the first guy who posted "somewhat" inflammatory posts. Some people here are passive aggressive, and this is bad too, yet you don't mind them doing so. And also, please note that I didn't use any offensive words, like commenting on someone's suggestion as "idiotic", for example.

"idiotic" is not an offensive word. More importantly, he called your suggestion idiotic, which is very different from calling you idiotic. Attacks against your suggestion are going to happen; that's what this discussion forum is about. Attacking you as a person is what we wouldn't allow; attacking a suggestion is perfectly reasonable.

Plus, the "idiotic" comment came after an extended period of discussion where you continued to use the same reasoning over and over, without showing the slightest sense that you understood the opposing argument. Nor did you display any recognition or understanding of the simple fact that the standard doesn't cover what you were talking about. Given the substance of the discussion, I think it was a perfectly reasonable assessment of your suggestion.

DeadMG

Nov 20, 2012, 12:39:44 PM11/20/12
to std-pr...@isocpp.org
I think that a replacement should focus on just I/O. Let the Unicode proposal propose text formatting replacements.

ma...@lysator.liu.se

Nov 23, 2012, 2:11:42 AM11/23/12
to std-pr...@isocpp.org
The performance problem of iostreams is the locale support.
If you remove the locale support then everything can be nicely inlined into nothingness and run in circles around printf. Remember that printf does parse the format string every time it runs so there is a pretty big wiggle room if that is what you wish to beat.

> There’s one real problem with this logic, and it is exactly why people
> suggest C-standard file IO. Iostreams violates a fundamental precept of
> C++: pay only for what you use.

Yes. See above.

> Consider this suite of benchmarks. This code doesn’t do file IO; it writes
> directly to a string. All it’s doing is measuring the time it takes to append
> 4-characters to a string. A lot. It uses a `char[]` as a useful control. It also
> tests the use of `vector<char>` (presumably `basic_string` would have
> similar results). Therefore, this is a solid test for the efficiency of the
> iostreams codebase itself.
>
> Obviously there will be some efficiency loss. But consider the numbers in
> the results.


I did download the tests and ran them using g++ -O2 <filename>.cpp
My g++ is g++-4.7.2 on linux.
All tests run in about the same time save for 'putting binary data into a vector<char> using back_inserter' which took about 6x the times of the rest and, contradicting your analysis, 'putting binary data directly into stringbuf' which took about half the time of the rest.
If I were to remove the -O2 flag, telling the compiler to not optimize the code, then my test results show some similarity to yours (Worst case 15x) but who compiles benchmarks without optimization?

/MF

Nicol Bolas

Nov 23, 2012, 3:47:19 AM11/23/12
to std-pr...@isocpp.org, ma...@lysator.liu.se


On Thursday, November 22, 2012 11:11:43 PM UTC-8, ma...@lysator.liu.se wrote:
> The performance problem of iostreams is the locale support.
> If you remove the locale support then everything can be nicely inlined into nothingness and run in circles around printf.
 
I'm curious as to how this "inlined into nothingness" thing works when most of iostreams' interface, particularly all of the overloads of types, is based on virtual calls. Non-statically-determinable virtual calls.
 
> Remember that printf does parse the format string every time it runs so there is a pretty big wiggle room if that is what you wish to beat.
>
> > There’s one real problem with this logic, and it is exactly why people
> > suggest C-standard file IO. Iostreams violates a fundamental precept of
> > C++: pay only for what you use.
>
> Yes. See above.
>
> > Consider this suite of benchmarks. This code doesn’t do file IO; it writes
> > directly to a string. All it’s doing is measuring the time it takes to append
> > 4-characters to a string. A lot. It uses a `char[]` as a useful control. It also
> > tests the use of `vector<char>` (presumably `basic_string` would have
> > similar results). Therefore, this is a solid test for the efficiency of the
> > iostreams codebase itself.
> >
> > Obviously there will be some efficiency loss. But consider the numbers in
> > the results.
>
> I did download the tests and ran them using g++ -O2 <filename>.cpp
> My g++ is g++-4.7.2 on linux.
> All tests run in about the same time save for 'putting binary data into a vector<char> using back_inserter' which took about 6x the times of the rest and, contradicting your analysis, 'putting binary data directly into stringbuf' which took about half the time of the rest.
> If I were to remove the -O2 flag, telling the compiler to not optimize the code, then my test results show some similarity to yours (Worst case 15x) but who compiles benchmarks without optimization?

As stated on the page, the benchmarks were compiled with -O3 on g++ 4.3.4. The difference is therefore more likely due to more aggressive optimizations and/or a better standard library implementation in your newer compiler. Also, were you compiling as C++11 or as C++03?
 


Olaf van der Spek

Nov 23, 2012, 5:16:19 AM11/23/12
to std-pr...@isocpp.org, ma...@lysator.liu.se
On Friday, November 23, 2012 8:11:43 AM UTC+1, ma...@lysator.liu.se wrote:

> The performance problem of iostreams is the locale support.
> If you remove the locale support then everything can be nicely inlined into nothingness and run in circles around printf. Remember that printf does parse the format string every time it runs so there is a pretty big wiggle room if that is what you wish to beat.

If the format argument is known at compile time, you could parse it at compile time and gain type safety as a bonus.

Is anyone actually using locales?
When writing to files I don't want the output to be affected by locales.

Arthur Tchaikovsky

Nov 23, 2012, 5:24:53 AM11/23/12
to std-pr...@isocpp.org
"idiotic" is not an offensive word

Your suggestion that "idiotic" isn't an offensive word is idiotic and ignorant. No offense though. I'm not calling you idiotic, just your suggestion.

Arthur Tchaikovsky

Nov 23, 2012, 12:33:23 PM11/23/12
to std-pr...@isocpp.org
After all, iteration is more natural to C++ than recursion (Alexandrescu,Modern C++ Design Generic Programming and Design Patterns Applied, chapter 3.5)

One more proof that your logic is flawed (oops, not flawed, idiotic, as you don't find this word offensive), that you're rude, that you're not interested in listening to others' opinions, etc.


On Tuesday, 20 November 2012 17:01:16 UTC, Nicol Bolas wrote:

Xeo

Nov 23, 2012, 12:37:53 PM11/23/12
to std-pr...@isocpp.org
Please take a leave, cool your head down, and come back when you're ready for professional discussions again. You're just sounding childish right now, bringing unrelated topics into the discussion.

Ville Voutilainen

Nov 23, 2012, 12:37:58 PM11/23/12
to std-pr...@isocpp.org
On 23 November 2012 19:33, Arthur Tchaikovsky <atch...@gmail.com> wrote:
> After all, iteration is more natural to C++ than recursion
> (Alexandrescu,Modern C++ Design Generic Programming and Design Patterns
> Applied, chapter 3.5)
> One more prove that your logic is flawed (oopss, not flawed, idiotic as you
> don't find this word offending), that you're rude, that you're not
> interested in listening in others opinions etc. etc.

Please do explain what this response has to do with "the failures of iostreams"?
Or with std-proposals? Well, actually, please *don't* explain that, I don't think
we want to hear.

Nicol Bolas

Nov 23, 2012, 12:49:37 PM11/23/12
to std-pr...@isocpp.org, ma...@lysator.liu.se

I think locales would be important for things like formatting currency, dates, times, etc. It could have its place in the formatting part of the API. The reason locales aren't often used is because... they're terrible. And unreliable. If we had Boost.Locale-style locales, then there'd be a better chance of them seeing use.

The problem with iostreams is that locales are part of the streambuf, not merely the formatting stream. The streambuf should be about basic "byte" input/output to/from a stream, not locale-specific constructs.

Arthur Tchaikovsky

Nov 24, 2012, 3:51:12 AM11/24/12
to std-pr...@isocpp.org
"idiotic" is not an offensive word. More importantly, he called your suggestion idiotic, which is very different from calling you idiotic. Attacks against your suggestion are going to happen; that's what this discussion forum is about. Attacking you as a person is what we wouldn't allow; attacking a suggestion is perfectly reasonable.

Plus, the "idiotic" comment came after an extended period of discussion where you continued to use the same reasoning over and over, without showing the slightest sense that you understood the opposing argument. Nor did you display any recognition or understanding of the simple fact that the standard doesn't cover what you were talking about. Given the substance of the discussion, I think it was a perfectly reasonable assessment of your suggestion.

Please do explain what this response has to do with "the failures of iostreams"?

My response has as much to do with the failures of iostreams as his reply to me, which I've cited above. If there are rules, the rules should be obeyed by everyone, and applied to everyone. Ville, why didn't you ask Nicol the same question you've asked me? Why is it OK for him to behave like a smart as*, and when I reply to his post and explain how idiotic his suggestion is, it is me who is the bad guy? If you could note, it is not me who starts being offensive, be it passive or active. You, for example, with your ice cream, are the best example of passive rudeness and lack of basic manners. If someone tells you something, no matter how wrong he is, if you have basic manners you don't tell him that you would rather go and get some ice cream than listen to him. Yet you said exactly this. And yet, when someone calls my suggestion idiotic, neither you nor anyone else reacts? You didn't say to Nicol, for example, that he shouldn't post such an idiotic reply (the one cited above); it was OK with you. Why? I'm not an aggressive person, nor a person who is looking for any kind of trouble, but when I come across boorish behavior (you, Nicol, and the "idiotic" guy) I feel that I have to defend myself. That's all.

ri...@longbowgames.com

Nov 24, 2012, 7:51:16 AM11/24/12
to std-pr...@isocpp.org, ma...@lysator.liu.se
I'm just going to ignore the posts which are... let's call them 'off-topic'.


On Friday, November 23, 2012 12:49:38 PM UTC-5, Nicol Bolas wrote:
The problem with iostreams is that locales are part of the streambuf, not merely the formatting stream. The streambuf should be about basic "byte" input/output to/from a stream, not locale-specific constructs.

There's an argument to be made that locale formatting shouldn't be done in a stream at all, but rather be a collection of string operations.

DeadMG

Nov 24, 2012, 7:55:14 AM11/24/12
to std-pr...@isocpp.org, ma...@lysator.liu.se, ri...@longbowgames.com
A very important argument, IYAM. There's no reason to couple I/O and string formatting. I/O should serve the purpose of "Writing bytes to an external source", and that's all.

Olaf van der Spek

Nov 24, 2012, 8:03:46 AM11/24/12
to std-pr...@isocpp.org
Where should newline translation be done?


--
Olaf

ri...@longbowgames.com

Nov 24, 2012, 8:16:48 AM11/24/12
to std-pr...@isocpp.org
On Saturday, November 24, 2012 8:03:48 AM UTC-5, Olaf van der Spek wrote:
Where should newline translation be done?

I would argue that newline translation is a serialization issue, not a localization issue, and so is within the realm of I/O.  Same goes with BOMs and byte order if we're dealing with Unicode streams. Number formats and padding, on the other hand, are harder to justify.

DeadMG

Nov 24, 2012, 8:16:55 AM11/24/12
to std-pr...@isocpp.org
Nowhere, or at least, if the user wants to do it, he should do it himself. I can see a potential argument for having a newline constant for different platforms, but when dealing with input, the various kinds of newline are really the user's problem. It's not like "Ignore \r and use \n" is a complex thing to do.

The I/O library reads the bytes. If you want to change them or whatever, that's your problem.
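
For reference, the "Ignore \r and use \n" approach mentioned above can be sketched in a few lines. This is only an illustration, and strip_carriage_returns is an invented name. Note that it would merge the lines of a file that uses a bare \r as its only line terminator.

```cpp
#include <algorithm>
#include <string>

// Remove every '\r' so that "\r\n" input reads as plain '\n'-terminated lines.
std::string strip_carriage_returns(std::string text) {
    text.erase(std::remove(text.begin(), text.end(), '\r'), text.end());
    return text;
}
```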

Olaf van der Spek

Nov 24, 2012, 8:19:40 AM11/24/12
to std-pr...@isocpp.org
On Sat, Nov 24, 2012 at 2:16 PM, DeadMG <wolfei...@gmail.com> wrote:
> Nowhere- or at least, if the user wants to do it, he should do it himself. I
> can see a potential argument for having a newline constant for different
> plats but when dealing with input, the various kinds of newline are really
> the user's problem. It's not like "Ignore \r and use \n" is a complex thing
> to do.

Are newlines guaranteed to be \r and \n?

--
Olaf

DeadMG

Nov 24, 2012, 8:38:43 AM11/24/12
to std-pr...@isocpp.org
No, but they are the only conventions of note. In any case, it's still outside the remit of IO. It serves the data- how you interpret it is your problem.

ri...@longbowgames.com

Nov 24, 2012, 9:17:06 AM11/24/12
to std-pr...@isocpp.org
Whether it's in the 'basic' iostream or layered somewhere on top, surely the standard library should support reading and writing .txt files.

However, even before Unicode, back when the only thing you had to worry about when reading/writing text files was how newlines were encoded, the standard library was already doing a pretty bad job, since it's fairly difficult to choose exactly which kind of newline you want to output.

It appears to me that there are three things we're dealing with here: raw I/O, I/O file format, and natural language localization. Locales currently couple the last two, and IOStreams currently couple all three.

My personal feeling is that we should have classes for reading/writing raw I/O, classes built on top of that for reading/writing text files (rather than the existing ios_base::binary solution), and the natural language localization should be string operations rather than I/O operations.

DeadMG

Nov 24, 2012, 10:13:59 AM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com
The primary problem with that is that "Text file" isn't actually a well-defined platform-independent concept because of the newlines.

I don't have a problem with the idea of a text stream or something for simple uses, but it's not a part of the core- it's a wrapper on a stream of bytes. As for localization, that's definitely a billion miles outside the remit of IO.

ri...@longbowgames.com

Nov 24, 2012, 12:38:22 PM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com
On Saturday, November 24, 2012 10:13:59 AM UTC-5, DeadMG wrote:
The primary problem with that is that "Text file" isn't actually a well-defined platform-independent concept because of the newlines.

Let's make it well defined :)

Ideally, an "oTextStream" would let you set the newline sequence, which encoding to use (minimally UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE), and whether or not to include a BOM.  The newline sequence should *not* be decided by the platform, since it's not uncommon to want to write Unix-style text files in a Windows app, for instance.

An "iTextStream" should attempt to determine the encoding based on the BOM, or default to UTF-8 if no BOM is present. The user should also be able to explicitly say which encoding to use and whether or not to parse BOMs.
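
The BOM-based detection described for "iTextStream" could be sketched as follows. The Encoding enum, BomResult, and detect_bom are hypothetical names, and the function only looks at the leading bytes; it does not validate the rest of the data.

```cpp
#include <cstddef>
#include <string>

enum class Encoding { Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

struct BomResult {
    Encoding encoding;     // detected encoding, or the UTF-8 default
    std::size_t bom_size;  // number of leading bytes to skip
};

BomResult detect_bom(const std::string& bytes) {
    auto starts_with = [&](const char* bom, std::size_t n) {
        return bytes.size() >= n && bytes.compare(0, n, bom, n) == 0;
    };
    // UTF-32LE must be tested before UTF-16LE because its BOM begins with FF FE.
    if (starts_with("\xFF\xFE\x00\x00", 4)) return {Encoding::Utf32LE, 4};
    if (starts_with("\x00\x00\xFE\xFF", 4)) return {Encoding::Utf32BE, 4};
    if (starts_with("\xEF\xBB\xBF", 3))     return {Encoding::Utf8, 3};
    if (starts_with("\xFF\xFE", 2))         return {Encoding::Utf16LE, 2};
    if (starts_with("\xFE\xFF", 2))         return {Encoding::Utf16BE, 2};
    return {Encoding::Utf8, 0};  // no BOM: fall back to UTF-8
}
```

One known ambiguity in this sketch: UTF-16LE data that starts with a BOM followed by a U+0000 character has the same first four bytes as a UTF-32LE BOM, so an explicit override (as suggested above) would still be needed.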

Let's talk more about the IOStream library as a whole. It would be really nice if it chained, like Boost.Iostreams. This would make the library flexible enough to support things like sockets, compression, and encryption.

I would be tempted to stay with an inheritance design so that to chain all you need to do is inherit from an iStream or oStream and hold a unique_ptr to the next iStream or oStream. In the case of something like iTextStream, this would give you enough flexibility to be constructed from a unique_ptr or just a filename. The filename version would really just be for convenience, but it would make it easier to teach new users how to read text files.
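
The inheritance-based option could look roughly like this. It is only a sketch: IStream, IMemoryStream, IUppercaseStream, and read_all are hypothetical names, the in-memory source stands in for a file, and the uppercase filter stands in for real work such as decompression.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal interface: read up to n bytes, return how many were read.
struct IStream {
    virtual ~IStream() = default;
    virtual std::size_t read(char* out, std::size_t n) = 0;
};

// A source that serves bytes from an in-memory string.
class IMemoryStream : public IStream {
public:
    explicit IMemoryStream(std::string data) : data_(std::move(data)) {}
    std::size_t read(char* out, std::size_t n) override {
        std::size_t count = data_.copy(out, n, pos_);
        pos_ += count;
        return count;
    }
private:
    std::string data_;
    std::size_t pos_ = 0;
};

// A filter: owns the next stream in the chain and transforms what it reads.
class IUppercaseStream : public IStream {
public:
    explicit IUppercaseStream(std::unique_ptr<IStream> next) : next_(std::move(next)) {}
    std::size_t read(char* out, std::size_t n) override {
        std::size_t count = next_->read(out, n);
        for (std::size_t i = 0; i < count; ++i)
            out[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(out[i])));
        return count;
    }
private:
    std::unique_ptr<IStream> next_;
};

// Drains any stream in small chunks; works on a whole chain through the base class.
std::string read_all(IStream& in) {
    std::string result;
    char chunk[4];
    while (std::size_t n = in.read(chunk, sizeof chunk))
        result.append(chunk, n);
    return result;
}
```

Because each filter owns the stream below it, handing the outermost object to someone else hands over the whole chain, at the cost of one virtual call per read.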

Some people in this thread are worried about the cost of virtual calls, so another option is to base the chaining on templates instead of inheritance. That would complicate the interface, and it would require that you either template-ize anything that deals with streams or to wrap your streams in some sort of stream_ref class, but it would probably be faster and more flexible.

The third option is to do it just like Boost.Iostreams, where you use inheritance but you only store a reference to your chained streams instead of taking ownership of them. This has the advantage of making it easier to adjust a filter after it's bound, but it means the user is responsible for the lifetime of each stream in the chain, which gets really annoying when you want to give a stream to an object, since it means you have to manually make sure the streams don't die before the object that's using them does. It would also preclude things like a text stream accepting a file path, since that would require the text stream to be able to optionally create its own source/sink.

Nicol Bolas

Nov 24, 2012, 12:51:26 PM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com

This is getting kind of off-topic (this is about finding out where iostreams went wrong, not how to fix it), but the way I envisioned text files was that they were filters that would be used on top of binary files as sources/sinks. You wouldn't need a separate sink for them. They scan text for a character; if they find '\n', they convert it into the platform-specific equivalent. BOMs would work more or less the same way, except that they only do the insertion once: the first time someone tries to write something. After that, they're inert.
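
A filter of this kind might be sketched as below. The text_filter class is hypothetical, a std::string stands in for the binary sink, and the newline sequence is passed in as a parameter (an assumption made here so the sketch does not depend on the platform).

```cpp
#include <string>
#include <utility>

// Hypothetical output filter layered on top of a binary sink.
class text_filter {
public:
    text_filter(std::string& sink, std::string newline, bool write_bom)
        : sink_(sink), newline_(std::move(newline)), bom_pending_(write_bom) {}

    void write(const std::string& text) {
        if (bom_pending_) {              // the BOM goes out once, on the first write
            sink_ += "\xEF\xBB\xBF";     // UTF-8 byte order mark
            bom_pending_ = false;
        }
        for (char c : text) {
            if (c == '\n') sink_ += newline_;  // translate each newline
            else sink_ += c;
        }
    }
private:
    std::string& sink_;
    std::string newline_;
    bool bom_pending_;
};
```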

DeadMG

Nov 24, 2012, 1:36:15 PM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com
We have iterators for that. Hell, you can do that right now.

template<typename Iterator, typename Out> void encrypt(Iterator begin, Iterator end, Out o) { ... }
encrypt(begin, end, std::ostream_iterator<char>(stream));

> set the newline sequence,

And if I want to support multiple? Does that mean I have to be precognitive?

> stay with an inheritance design

So we can continue to feel the pain of multiple inheritance? Getting rid of inheritance is a big part of the objective.

See, here's where you're going wrong. You're treating streams like iterators. They're not. Streams do not implement any functionality, at all, ever, except reading and writing bytes from external sources. They do not implement compression, or encryption. You do not compose them. They implement one specific function, and that's it. We already have iterators (ranges if we're lucky) and functions for this. The best model for a stream is as a function object. Then, for writing a range, you could do something as simple as std::for_each(begin, end, std::ref(stream));
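
The "stream as a function object" model can be sketched like this. byte_sink and write_range are hypothetical names, and the sink appends to a string here where a real one would write to a file or socket.

```cpp
#include <algorithm>
#include <functional>
#include <string>

// Hypothetical sink: a "stream" whose whole interface is operator()(byte).
class byte_sink {
public:
    void operator()(char byte) { written_.push_back(byte); }
    const std::string& written() const { return written_; }
private:
    std::string written_;
};

// Writing a range is then an ordinary algorithm call. std::ref keeps
// for_each from copying the sink, so the caller's object receives the bytes.
template <typename Iterator>
void write_range(byte_sink& sink, Iterator first, Iterator last) {
    std::for_each(first, last, std::ref(sink));
}
```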

> You wouldn't need a separate sink for them.

I agree, text files are really just about encoding data. A quick wrapper or iterator would be fine.

> convert it into the platform-specific equivalent

No. Then, you cannot write newlines which are for another platform because you're interoperating with it (say, a file to be sent over a network) or because some other application won't play well with these or something like that. The Standard should certainly expose a platform-specific newline constant, but when reading or writing them, it should be the user's choice as to what to do.

It also occurs to me that input iterators and output iterators are very silly.

Jean-Marc Bourguet

Nov 24, 2012, 2:07:08 PM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com
On Saturday, November 24, 2012 4:13:59 PM UTC+1, DeadMG wrote:
The primary problem with that is that "Text file" isn't actually a well-defined platform-independent concept because of the newlines.

All platforms have a notion of text file. One needs to have a C++ notion which is abstract enough that it can be used with the platform notion, not a C++ notion which is specified in such a way that there are platforms which may not implement C++ text files using their notion of text file.

Historically, operating systems have used notions of files which are far more than just a stream of bytes. They may have stream-oriented files, sequence-of-record files, key-accessed record files, and so on; lines in text files were numbered in some operating systems.

I'm not sure if that variety is still relevant, but C and C++ IO were designed to handle it (for instance, spaces before end of line may disappear when rereading a text file, and NUL characters may appear at the end of binary files). Before designing a replacement which is unable to handle them, I'd suggest making sure that they are no longer relevant (start by looking at z/OS) and bringing in people who know the IO models of the OSes you want to support early enough that you don't have to restart your design because it is not portable enough.

Yours,

--
Jean-Marc Bourguet

Nicol Bolas

Nov 24, 2012, 2:25:15 PM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com

Why do we have to support those?

I know it sounds silly to say, but iostream will continue to exist. Just as fopen does. If you're working in such a system and need those specific kinds of translations, I would suggest that the new system simply be able to use an iostreambuf as a sink/source.


ri...@longbowgames.com

Nov 24, 2012, 2:43:03 PM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com
On Saturday, November 24, 2012 1:36:15 PM UTC-5, DeadMG wrote:
We have iterators for that. Hell, you can do that right now.

Let's say I have a compressed log file that I want to read one line at a time. With chaining streams, you could do this:

ITextStream stream(make_unique<ICompressedStream>(make_unique<IFileStream>("log.gz")));
while(s = getline(stream))
  // Do something


Try doing that with iterators. The only way you could do it is by loading the entire file into memory, or by using those 'silly' input iterators.

> > set the newline sequence,
>
> And if I want to support multiple? Does that mean I have to be precognitive?

You'll notice I only suggested setting the newline sequence for output streams. For input streams, a sane default would be to swallow \r characters, unless the user is expecting a certain sequence.

> So we can continue to feel the pain of multiple inheritance?

std::iostream is the only part of the existing library that uses multiple inheritance, and with a filter design I'm not 100% convinced that multiple inheritance is necessary. Even so, I've never experienced any pain using std::iostream; not that I use it often.

On Saturday, November 24, 2012 2:07:08 PM UTC-5, Jean-Marc Bourguet wrote:
I'm not sure if that variety is still relevant but C and C++ IO were designed to handle them (for instance, spaces before end of line may disappear when rereading a text file, NUL characters may appear at end of file for binary files). Before designing a replacement which is unable to handle them, I'd suggest to be sure that they are no more relevant (start by looking at z/OS) and to bring people aware of the IO models of the OS you want to support early enough that you don't have to restart your design as not portable enough.

I wouldn't be against allowing vendors to offer their own encoding as an option, however, the beauty of the filter design is that people can write their own plaintext filter if they really care.


On Saturday, November 24, 2012 12:51:26 PM UTC-5, Nicol Bolas wrote:
this is about finding out where iostreams went wrong, not how to fix it

You have no idea how much restraint I'm exercising by not giving a snarky reply ;)

Okay, I'll 'answer in the form of a question', as it were. Here's my list:
* Not designed with filters in mind.
* Newline format is defined by the platform rather than the programmer.
* No support for UTF.
* Because binary mode is set with ios_base::binary instead of with a different type, passing streams as parameters is unsafe.
* Locales conflate encoding with localization.
* You have no idea what you get with a locale, and defining your own is not trivial.
* Since localization and formatting is tied to streams, you can't localize or format a value to a string without going through a stream.

And a new one:
* Not designed with non-blocking streams in mind. This is necessary for network sockets, but would also be nice to have for stdio.

Jean-Marc Bourguet

Nov 24, 2012, 3:14:10 PM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com
On Saturday, November 24, 2012 8:25:15 PM UTC+1, Nicol Bolas wrote:

On Saturday, November 24, 2012 11:07:08 AM UTC-8, Jean-Marc Bourguet wrote:
I'm not sure if that variety is still relevant but C and C++ IO were designed to handle them (for instance, spaces before end of line may disappear when rereading a text file, NUL characters may appear at end of file for binary files). Before designing a replacement which is unable to handle them, I'd suggest to be sure that they are no more relevant (start by looking at z/OS) and to bring people aware of the IO models of the OS you want to support early enough that you don't have to restart your design as not portable enough.


Why do we have to support those?

Personally, I don't care. I consider the platforms I'm sure would have had problems to be no longer relevant, if they ever were. The most relevant platform I know of which could have problems is z/OS, but I don't know enough about it to be sure.

But if this ends up in a formal proposal, and if my understanding of the committee dynamic is right, it'll be confronted by people who think in the other direction and will ask why we should make a standard only partially implementable on platforms which were previously supported. Especially if the platforms are still relevant, but possibly even if they aren't. See what happened with the proposal to remove trigraphs. It's better to have a design which doesn't have foreseeable objections, or at least to be prepared to answer them.

Yours,

--
Jean-Marc Bourguet

DeadMG

Nov 24, 2012, 4:07:39 PM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com
> Try doing that with iterators. The only way you could do it is by loading the entire file into memory, or by using those 'silly' input iterators.

I agree that input iterators are really just functions in disguise, but they do work and not badly either. There is no reason why an input-iterator based solution could not work just fine. More relevantly, an input-iterator based solution would actually be remotely generic- I could decompress a file I had already loaded into memory, for example.

There's really no need for a filter, source, sink design in IO, because we already have iterators and they already model those concepts and it would be remotely compatible with existing code.

> You'll notice I only suggested setting the newline sequence for output streams. For input streams, a sane default would be to swallow \r characters, unless the user is expecting a certain sequence.

You have no idea what the user is expecting. Only they know that. 

ri...@longbowgames.com

Nov 24, 2012, 5:41:15 PM11/24/12
to std-pr...@isocpp.org, ri...@longbowgames.com
On Saturday, November 24, 2012 4:07:40 PM UTC-5, DeadMG wrote:
I agree that input iterators are really just functions in disguise, but they do work and not badly either. There is no reason why an input-iterator based solution could not work just fine. More relevantly, an input-iterator based solution would actually be remotely generic- I could decompress a file I had already loaded into memory, for example.

Ah, input iterators aren't so silly now, are they?

You can always adapt an input iterator to a stream or vice versa, but iterators are awkward in this case for three reasons:

1) The end of an input_iterator is a wasteful hack.
2) You can't differentiate between 'no data' and 'end of data'.
3) Iterators don't take ownership, so you have additional lifetime management.

Here's what the code would look like with iterators:

ifstream fin("log.gz");
istream_iterator<char> it1Start(fin), it1End;
ICompressorIterator it2Start(it1Start), it2End(it1End);
IPlaintextIterator it3Start(it2Start), it3End(it2End);

while(s = getline(it3Start, it3End))
  // Do something


And because iterators don't take ownership of the thing they're iterating (at least not idiomatically), you can't give it3Start/End to an object without ensuring that fin, it1Start/End, and it2Start/End all outlive the object in question.

It's much easier to go the other way:

ITextStream stream(make_unique<ICompressedStream>(make_unique<IFileStream>("log.gz")));
copy(input_stream(stream), input_stream(), output_stream(some_buffer));
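To make the ownership point concrete, here is a minimal sketch of a stream chain in which each stage owns the stage below it. Everything in it is an assumption made for illustration: InputStream, StringSource, RunLengthDecoder, open_encoded and read_all are hypothetical names, the string source stands in for a file, and the run-length decoder stands in for real decompression.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical byte-source interface (an assumption for this sketch).
struct InputStream {
    virtual ~InputStream() = default;
    // Stores the next byte in `out`; returns false once the data is exhausted.
    virtual bool read(char& out) = 0;
};

// Innermost stage: serves bytes from a string, standing in for a file.
class StringSource : public InputStream {
public:
    explicit StringSource(std::string data) : data_(std::move(data)) {}
    bool read(char& out) override {
        if (pos_ >= data_.size()) return false;
        out = data_[pos_++];
        return true;
    }
private:
    std::string data_;
    std::size_t pos_ = 0;
};

// Filter stage: owns its upstream and expands "<digit><byte>" pairs,
// a toy stand-in for real decompression.
class RunLengthDecoder : public InputStream {
public:
    explicit RunLengthDecoder(std::unique_ptr<InputStream> upstream)
        : upstream_(std::move(upstream)) {}
    bool read(char& out) override {
        while (remaining_ == 0) {
            char count = 0;
            char symbol = 0;
            if (!upstream_->read(count) || !upstream_->read(symbol)) return false;
            if (count < '0' || count > '9') return false;  // malformed input ends the stream
            remaining_ = count - '0';
            symbol_ = symbol;
        }
        --remaining_;
        out = symbol_;
        return true;
    }
private:
    std::unique_ptr<InputStream> upstream_;
    int remaining_ = 0;
    char symbol_ = 0;
};

// The whole chain is returned as one object; no local variable has to outlive it.
std::unique_ptr<InputStream> open_encoded(std::string encoded) {
    return std::make_unique<RunLengthDecoder>(
        std::make_unique<StringSource>(std::move(encoded)));
}

// Drains a stream into a string.
std::string read_all(InputStream& in) {
    std::string result;
    char c = 0;
    while (in.read(c)) result.push_back(c);
    return result;
}
```

The caller of open_encoded holds a single handle and destroying it tears down every stage. The price in this sketch is one heap allocation per stage and a virtual call per byte.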

Like many things, the situation gets a lot better if you use ranges instead of iterators. Why? It's not because streams are only useful for 'reading and writing to external sources'. It's because one-directional ranges and streams are effectively the same thing.

Now, completely abolishing streams and using input/output ranges is an interesting idea. It's mostly just a naming issue, but assuming the standard library adopted range-based algorithms, it would make things more consistent and interoperable. It would look like this:

auto range = make_iplaintext_range(make_icompressed_range(make_ifile_range("log.gz")));
while(s = getline(range))
  // Do something


Not bad. Unfortunately, a function expecting a range like that would look like this:

void foo(IPlaintextRange<ICompressedRange<IFileRange>> range);

So you would either have to templatize anything that uses file streams, or make wrapper objects for ranges. Not the end of the world.
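As a rough sketch of the "wrapper objects" option, a type-erased wrapper lets a non-template function accept any nested range type. The names here (StringRange, UppercaseRange, AnyCharRange, drain) and the bool next(char&) protocol are assumptions made up for this example, not an existing library interface.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical concrete range: yields the characters of a string.
class StringRange {
public:
    explicit StringRange(std::string data) : data_(std::move(data)) {}
    bool next(char& out) {
        if (pos_ >= data_.size()) return false;
        out = data_[pos_++];
        return true;
    }
private:
    std::string data_;
    std::size_t pos_ = 0;
};

// Hypothetical adaptor: its type nests the type of the range it wraps.
template <class Inner>
class UppercaseRange {
public:
    explicit UppercaseRange(Inner inner) : inner_(std::move(inner)) {}
    bool next(char& out) {
        if (!inner_.next(out)) return false;
        out = static_cast<char>(std::toupper(static_cast<unsigned char>(out)));
        return true;
    }
private:
    Inner inner_;
};

// Type-erased wrapper: hides the nested type behind one concrete class.
class AnyCharRange {
public:
    template <class Range>
    AnyCharRange(Range range)
        : next_([r = std::move(range)](char& out) mutable { return r.next(out); }) {}
    bool next(char& out) { return next_(out); }
private:
    std::function<bool(char&)> next_;
};

// A plain, non-template function that works with any wrapped range.
std::string drain(AnyCharRange range) {
    std::string result;
    char c = 0;
    while (range.next(c)) result.push_back(c);
    return result;
}
```

For example, drain(UppercaseRange<StringRange>(StringRange("log"))) compiles without drain knowing the nested type. The wrapper reintroduces an indirect call per element, which is the cost the templated version avoids.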
 
You'll notice I only suggested setting the newline sequence for output streams. For input streams, a sane default would be to swallow \r characters, unless the user is expecting a certain sequence.

You have no idea what the user is expecting. Only they know that. 
 
Oh for heaven's sake, are you seriously taking me to task for suggesting that users who use line endings other than \n, \n\r, or \r\n, would have to stoop so low as to override a default option?

DeadMG

Nov 25, 2012, 10:23:58 AM
to std-pr...@isocpp.org, ri...@longbowgames.com
Oh for heaven's sake, are you seriously taking me to task for suggesting that users who use line endings other than \n, \n\r, or \r\n, would have to stoop so low as to override a default option?

Well, yes. I'm saying that the stream should not eat data unless explicitly asked for.

 1) The end of an input_iterator is a wasteful hack.

It works, and it's compatible with other things. It's ironic for you to call input iterators wasteful when you'd be re-inventing a massive amount of existing functionality, especially when it would not meaningfully interact with what we have now. That is wasteful.

 2) You can't differentiate between 'no data' and 'end of data'.

I don't see the difference. In either case, there is no more data to be had.

3) Iterators don't take ownership, so you have additional lifetime management.

Really depends on the iterator. There's absolutely no reason you can't write an owning iterator. It would be unusual, but perfectly feasible. In fact, iterator adaptors often own the iterator they are adapting.
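A minimal sketch of an adaptor that owns the iterator it adapts might look like the following. UppercaseIterator and uppercase are hypothetical names, and upper-casing is only a stand-in for a real transformation such as decompression.

```cpp
#include <cctype>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Input-iterator adaptor that stores (owns) the adapted iterator by value.
template <class Iter>
class UppercaseIterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = char;
    using difference_type = std::ptrdiff_t;
    using pointer = const char*;
    using reference = char;

    UppercaseIterator() = default;
    explicit UppercaseIterator(Iter inner) : inner_(inner) {}

    // Transform each element on access.
    char operator*() const {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(*inner_)));
    }
    UppercaseIterator& operator++() { ++inner_; return *this; }
    UppercaseIterator operator++(int) {
        UppercaseIterator old(*this);
        ++inner_;
        return old;
    }
    friend bool operator==(const UppercaseIterator& a, const UppercaseIterator& b) {
        return a.inner_ == b.inner_;
    }
    friend bool operator!=(const UppercaseIterator& a, const UppercaseIterator& b) {
        return !(a == b);
    }
private:
    Iter inner_{};
};

// Helper so the adaptor type is deduced at the call site.
template <class Iter>
UppercaseIterator<Iter> uppercase(Iter it) {
    return UppercaseIterator<Iter>(it);
}

// Example use with the standard stream-buffer iterators.
std::string read_uppercased(std::istream& in) {
    return std::string(uppercase(std::istreambuf_iterator<char>(in)),
                       uppercase(std::istreambuf_iterator<char>()));
}
```

Note that the adaptor owns only the inner iterator. In this sketch the stream behind std::istreambuf_iterator still has to outlive the adaptor.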

ifstream fin("log.gz");
auto begin = plaintext(decompress(fin.begin()));
auto end = plaintext(decompress(fin.end()));
while(s = getline(begin, end)) {
    ...
}

Unlike your solution, this does not have the potential to require multiple inheritance, nor dynamic allocation, nor virtual calls, and it works well with the rest of the Standard library. Of course, ranges make this quite a bit simpler, and so does range-based for. You could do

for(string s : getline(plaintext(decompress(ifstream("log.gz"))))) {
}

It's because one-directional ranges and streams are effectively the same thing.

Is exactly what I've been saying. It's really a bad idea to have one completely separate interface for X, and then have another completely separate interface for X but more generic. Not only are you duplicating X, there's no reason to use X but less generic. 

ri...@longbowgames.com

Nov 25, 2012, 10:55:50 AM
to std-pr...@isocpp.org, ri...@longbowgames.com
Okay, you misunderstood some of the stuff I was saying about iterators, but I'm pretty sure it's moot. We both agree that ranges are a superior solution to iterators, and, while our confidence levels differ, we both think ranges have the potential for making good streams, so we can stop talking about iterators now, right?

First, a couple things I do want to respond to:


On Sunday, November 25, 2012 10:23:59 AM UTC-5, DeadMG wrote:
Well, yes. I'm saying that the stream should not eat data unless explicitly asked for.

By using a plaintext stream you're already asking for translation. If you want all the carriage returns, you probably want a binary stream. If you want a stream that handles UTF translation and doesn't handle newline translation, then you're certainly in the minority, and overriding a default isn't the end of the world.
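A small sketch of that default, with the override spelled out. The CarriageReturns option and the read_text function are hypothetical names invented for this example.

```cpp
#include <string>

// Hypothetical policy for how a plaintext reader treats '\r'.
enum class CarriageReturns { Swallow, Keep };

// Returns the text as a plaintext reader would deliver it.
// By default every '\r' is dropped; Keep passes the bytes through untouched.
std::string read_text(const std::string& raw,
                      CarriageReturns policy = CarriageReturns::Swallow) {
    if (policy == CarriageReturns::Keep) return raw;
    std::string out;
    out.reserve(raw.size());
    for (char c : raw) {
        if (c != '\r') out.push_back(c);
    }
    return out;
}
```

With the default, "\r\n" and "\n\r" both come out as "\n". A lone '\r' is dropped entirely, so input that uses bare carriage returns as line endings is exactly the case where the caller would need to pass CarriageReturns::Keep.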

 2) You can't differentiate between 'no data' and 'end of data'.

I don't see the difference. In either case, there is no more data to be had.

Think about non-blocking streams, like network sockets. There's a difference between reaching the end of the stream and the rest of the stream not being ready. This is one of the things that ranges typically don't deal with.

If we're talking about replacing streams with file ranges, I think it would be worth considering giving a ready() function to all input ranges, and a flush() function to all output ranges. This can even be important for something like compressing and decompressing a file, since you might have to read/write a large amount of data before the next block is ready.
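As a sketch of what that could look like, here is an input range that reports three states instead of two. ReadStatus, ChunkedRange, push and close are hypothetical names, and the in-memory buffer stands in for a socket or a decompressor that produces data in blocks.

```cpp
#include <deque>
#include <string>

// Three outcomes, so 'no data yet' and 'end of data' are distinguishable.
enum class ReadStatus { Ok, NotReady, EndOfData };

// Hypothetical non-blocking input range fed by a producer.
class ChunkedRange {
public:
    // Producer side: more data has arrived.
    void push(const std::string& chunk) {
        buffer_.insert(buffer_.end(), chunk.begin(), chunk.end());
    }
    // Producer side: no more data will ever arrive.
    void close() { closed_ = true; }

    // True when next() will return Ok or EndOfData rather than NotReady.
    bool ready() const { return !buffer_.empty() || closed_; }

    // Consumer side: never blocks.
    ReadStatus next(char& out) {
        if (!buffer_.empty()) {
            out = buffer_.front();
            buffer_.pop_front();
            return ReadStatus::Ok;
        }
        return closed_ ? ReadStatus::EndOfData : ReadStatus::NotReady;
    }
private:
    std::deque<char> buffer_;
    bool closed_ = false;
};
```

A plain begin/end iterator pair has only one "stop" condition, so a consumer of this range would have to treat NotReady as the end. The explicit status is what lets it come back later instead.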

DeadMG

Nov 25, 2012, 11:13:45 AM