
Defining an enum and some strings in parallel


Andrei Alexandrescu (See Website for Email)

Feb 1, 2005, 4:23:56 AM
Hello,


I'm not very versed in the C preprocessor, but people have done pretty
cool things with the Boost preprocessor library, so I thought I'd ask
how to achieve a particular task.

I need to define at the same time an enum:

enum PartOfSpeech {
coordinatingConjunction,
cardinalNumber,
determiner,
...
};

and a vector of strings that maps strings to that enum:

const char* partOfSpeechEncodings = {
"CC",
"CD",
"DT",
...
};

In the interest of maintainability, I'd like the strings to be close to
the enum values, so I'd essentially need something like:

START_DEFINE_PART_OF_SPEECH()
MAKE_ENTRY(coordinatingConjunction, "CC"),
MAKE_ENTRY(cardinalNumber, "CD"),
MAKE_ENTRY(determiner, "DT"),
END_DEFINE_PART_OF_SPEECH()

I guess there are facilities in the Boost PP library that would allow me
to do that. I'd, however, prefer to see if there's any palatable
standalone solution.


Andrei

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

anton muhin

Feb 1, 2005, 2:57:41 PM
Hello, Andrei!

In similar situations I usually use a slightly different trick. It
goes like this:

1) A file with the names and strings is introduced, as in the example below:

STRUCT(coordinateConjunction, "CC"),
STRUCT(cardinalNumber, "CD"),
....

Let's name this file pos.h

2) Then we include this file with different definitions for STRUCT, e.g.:

#define STRUCT(name, str) name
enum PartOfSpeech {
#include "pos.h"
};
#undef STRUCT

#define STRUCT(name, str) str
const char* const partOfSpeechEncodings[] = {
#include "pos.h"
};
#undef STRUCT

Since each item needs a trailing comma, the commas are put in pos.h as well.

Hope this helps,
with the best regards,
anton.

Andrei Alexandrescu (See Website for Email)

Feb 1, 2005, 3:31:24 PM
> const char* partOfSpeechEncodings = {

I meant:

const char* partOfSpeechEncodings[] = {

kevin...@motioneng.com

Feb 1, 2005, 3:33:06 PM
I've had a similar problem in the past. What I did was include the
enumeration values and strings in a separate file, just as you have with
the START_DEFINE_PART_OF_SPEECH() and END_DEFINE_PART_OF_SPEECH()
macros. Let's say that's in a header called "internal_speech.h".

Then I would create another header file, say "speech.h" with the
following code:

#define END_DEFINE_PART_OF_SPEECH() };
#define START_DEFINE_PART_OF_SPEECH() enum PartOfSpeech {
#define MAKE_ENTRY(value, text) value,
#include "internal_speech.h"
#undef START_DEFINE_PART_OF_SPEECH
#undef MAKE_ENTRY

#define START_DEFINE_PART_OF_SPEECH() const char* partOfSpeechEncodings[] = {
#define MAKE_ENTRY(value, text) text,
#include "internal_speech.h"
#undef START_DEFINE_PART_OF_SPEECH
#undef MAKE_ENTRY
#undef END_DEFINE_PART_OF_SPEECH

This is the general idea I've used. In my previous project, I needed
to do the same with multiple enumerations, so I generalized the macros a
bit to accept an enumeration name and also to create a function that
returns the string corresponding to a particular enumeration value. I'm
sure you can see how to generalize from here. If you have any questions
about my implementation, let me know.
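Kevin's generalization isn't shown in the post, but a plausible single-file sketch of it might look as follows. All of the names here (PART_OF_SPEECH_LIST, DEFINE_ENUM_WITH_NAMES, the _to_string suffix) are invented for illustration, not his actual macros:

```cpp
#include <cassert>
#include <cstring>

// One list macro per enumeration; the generator macro takes the
// enumeration's name and produces the enum plus a lookup function.
#define PART_OF_SPEECH_LIST(ENTRY) \
    ENTRY(coordinatingConjunction, "CC") \
    ENTRY(cardinalNumber, "CD") \
    ENTRY(determiner, "DT")

#define AS_ENUM(value, text) value,
#define AS_TEXT(value, text) text,

#define DEFINE_ENUM_WITH_NAMES(EnumName, LIST)                \
    enum EnumName { LIST(AS_ENUM) };                          \
    inline const char* EnumName##_to_string(EnumName e) {     \
        static const char* const table[] = { LIST(AS_TEXT) }; \
        return table[e];                                      \
    }

DEFINE_ENUM_WITH_NAMES(PartOfSpeech, PART_OF_SPEECH_LIST)
```

The same DEFINE_ENUM_WITH_NAMES then serves every enumeration that provides its own list macro.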

- Kevin Hall

jaap van der Weide

Feb 1, 2005, 4:25:26 PM

Andrei Alexandrescu (See Website for Email) wrote:
>
> I need to define at the same time an enum:
>
> enum PartOfSpeech {
> coordinatingConjunction,
> cardinalNumber,
> determiner,
> ...
> };
>
> and a vector of strings that maps strings to that enum:
>
> const char* partOfSpeechEncodings = {
> "CC",
> "CD",
> "DT",
> ...
> };
>
> In the interest of maintainability, I'd like the strings to be close to
> the enum values, so I'd essentially need something like:
>
> START_DEFINE_PART_OF_SPEECH()
> MAKE_ENTRY(coordinatingConjunction, "CC"),
> MAKE_ENTRY(cardinalNumber, "CD"),
> MAKE_ENTRY(determiner, "DT"),
> END_DEFINE_PART_OF_SPEECH()
>
>


The following should work


#define DEFINE_PART_OF_SPEECH(MAKE_ENTRY) \
MAKE_ENTRY(coordinatingConjunction, "CC") \
MAKE_ENTRY(cardinalNumber, "CD") \
MAKE_ENTRY(determiner, "DT")
#define MAKE_ENUM_ENTRY(enumeration,text) enumeration,
enum PartOfSpeech {
DEFINE_PART_OF_SPEECH(MAKE_ENUM_ENTRY)
};


#define MAKE_TEXT_ENTRY(enumeration,text) text,
const char* partOfSpeechEncodings[] = {
DEFINE_PART_OF_SPEECH(MAKE_TEXT_ENTRY)
};


Apart from allowing you to keep related things
together, it has some other useful applications.

The one I use most often is creating an array of
enumerator names to go along with the
enumeration. (Handy for error messages.)

#define MAKE_ENUM_TEXT_ENTRY(enumeration,text) #enumeration,
const char* enumText[] = {
DEFINE_PART_OF_SPEECH(MAKE_ENUM_TEXT_ENTRY)
};
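Jaap's three expansions can be checked end to end; a minimal self-contained sketch, using the same macro names as above, is:

```cpp
#include <cassert>
#include <cstring>

// The master list: one MAKE_ENTRY per (enumerator, encoding) pair.
#define DEFINE_PART_OF_SPEECH(MAKE_ENTRY) \
    MAKE_ENTRY(coordinatingConjunction, "CC") \
    MAKE_ENTRY(cardinalNumber, "CD") \
    MAKE_ENTRY(determiner, "DT")

// Pass 1: the enumerators.
#define MAKE_ENUM_ENTRY(enumeration, text) enumeration,
enum PartOfSpeech { DEFINE_PART_OF_SPEECH(MAKE_ENUM_ENTRY) };
#undef MAKE_ENUM_ENTRY

// Pass 2: the encodings, guaranteed to be in the same order.
#define MAKE_TEXT_ENTRY(enumeration, text) text,
const char* const partOfSpeechEncodings[] = { DEFINE_PART_OF_SPEECH(MAKE_TEXT_ENTRY) };
#undef MAKE_TEXT_ENTRY

// Pass 3: stringized enumerator names, per Jaap's error-message idea.
#define MAKE_NAME_ENTRY(enumeration, text) #enumeration,
const char* const enumText[] = { DEFINE_PART_OF_SPEECH(MAKE_NAME_ENTRY) };
#undef MAKE_NAME_ENTRY
```

Because every table is generated from the one list, the entries can never get out of step with the enum.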

I have also found it useful in writing wrappers around
APIs. (Mine had about 70 getter- and setter-type functions.)
It tends to make the code somewhat hard to read
for the uninitiated, but it more than made up for that
by reducing editing errors.

Jaap

Paul Mensonides

Feb 1, 2005, 4:32:55 PM
Andrei Alexandrescu (See Website for Email) wrote:

> START_DEFINE_PART_OF_SPEECH()
> MAKE_ENTRY(coordinatingConjunction, "CC"),
> MAKE_ENTRY(cardinalNumber, "CD"),
> MAKE_ENTRY(determiner, "DT"),
> END_DEFINE_PART_OF_SPEECH()

First, the easy way...

// -----

#include <boost/preprocessor/punctuation/comma_if.hpp>
#include <boost/preprocessor/seq/for_each_i.hpp>
#include <boost/preprocessor/tuple/elem.hpp>

#define DEFINE_PART_OF_SPEECH(seq) \
enum PartOfSpeech { \
BOOST_PP_SEQ_FOR_EACH_I( \
DEFINE_PART_OF_SPEECH_II, 0, seq \
) \
}; \
const char* const partOfSpeechEncodings[] = { \
BOOST_PP_SEQ_FOR_EACH_I( \
DEFINE_PART_OF_SPEECH_II, 1, seq \
) \
}; \
/**/
#define DEFINE_PART_OF_SPEECH_II(r, aux, i, pair) \
BOOST_PP_COMMA_IF(i) BOOST_PP_TUPLE_ELEM(2, aux, pair) \
/**/

DEFINE_PART_OF_SPEECH(
((coordinatingConjunction, "CC"))
((cardinalNumber, "CD"))
((determiner, "DT"))
// ...
)

// -----

Remember your fancy assert article with John Torjo? The same general kind
of technique can add a little syntactic sugar to the above:

// -----

#include <boost/preprocessor/cat.hpp>
#include <boost/preprocessor/punctuation/comma_if.hpp>
#include <boost/preprocessor/seq/for_each_i.hpp>
#include <boost/preprocessor/tuple/elem.hpp>

#define BINARY_SEQ_TO_SEQ(bseq) \
BOOST_PP_CAT(BINARY_SEQ_TO_SEQ_A bseq, 0xEND) \
/**/
#define BINARY_SEQ_TO_SEQ_A(a, b) ((a, b)) BINARY_SEQ_TO_SEQ_B
#define BINARY_SEQ_TO_SEQ_B(a, b) ((a, b)) BINARY_SEQ_TO_SEQ_A
#define BINARY_SEQ_TO_SEQ_A0xEND
#define BINARY_SEQ_TO_SEQ_B0xEND

// BINARY_SEQ_TO_SEQ( (a, b)(c, d) ) => ((a, b))((c, d))

#define DEFINE_PART_OF_SPEECH(bseq) \
DEFINE_PART_OF_SPEECH_I(BINARY_SEQ_TO_SEQ(bseq)) \
/**/
#define DEFINE_PART_OF_SPEECH_I(seq) \
enum PartOfSpeech { \
BOOST_PP_SEQ_FOR_EACH_I( \
DEFINE_PART_OF_SPEECH_II, 0, seq \
) \
}; \
const char* const partOfSpeechEncodings[] = { \
BOOST_PP_SEQ_FOR_EACH_I( \
DEFINE_PART_OF_SPEECH_II, 1, seq \
) \
}; \
/**/
#define DEFINE_PART_OF_SPEECH_II(r, aux, i, pair) \
BOOST_PP_COMMA_IF(i) BOOST_PP_TUPLE_ELEM(2, aux, pair) \
/**/

DEFINE_PART_OF_SPEECH(
(coordinatingConjunction, "CC")
(cardinalNumber, "CD")
(determiner, "DT")
// ...
)

// -----

> I guess there are facilities in the Boost PP library that would allow
> me to do that. I'd, however, prefer to see if there's any palatable
> standalone solution.

Second, the harder way... which depends on what compiler(s) you are targeting.
Here's an ad-hoc, standalone solution:

// -----

// general facilities...

#define CAT(a, b) PRIMITIVE_CAT(a, b)
#define PRIMITIVE_CAT(a, b) a ## b

#define WHEN(bit) PRIMITIVE_CAT(WHEN_, bit)
#define WHEN_0(expr)
#define WHEN_1(expr) expr

#define COMMA() ,

#define BINARY_SEQ_ENCODE(bseq) \
CAT(BINARY_SEQ_ENCODE_A bseq, 0xEnd)(0, ~, ~) \
/**/
#define BINARY_SEQ_ENCODE_A(a, b) (1, a, b)() BINARY_SEQ_ENCODE_B
#define BINARY_SEQ_ENCODE_B(a, b) (1, a, b)() BINARY_SEQ_ENCODE_A
#define BINARY_SEQ_ENCODE_A0xEnd
#define BINARY_SEQ_ENCODE_B0xEnd

// specific facilities...

#define DPOS(bseq) DPOS_II(BINARY_SEQ_ENCODE(bseq))
#define DPOS_II(enc) \
DPOS_III( \
enum PartOfSpeech { DPOS_A enc }; \
const char* const partOfSpeechEncodings[] = { DPOS_X enc }; \
) \
/**/
#define DPOS_III(x) x // hack

#define DPOS_A(flag, id, str) WHEN(flag)(id DPOS_B_ID)
#define DPOS_B(flag, id, str) WHEN(flag)(COMMA() id DPOS_B_ID)
#define DPOS_B_ID() DPOS_B

#define DPOS_X(flag, id, str) WHEN(flag)(str DPOS_Y_ID)
#define DPOS_Y(flag, id, str) WHEN(flag)(COMMA() str DPOS_Y_ID)
#define DPOS_Y_ID() DPOS_Y

DPOS(
(coordinatingConjunction, "CC")
(cardinalNumber, "CD")
(determiner, "DT")
)

// -----

(Note: DPOS => DEFINE_PART_OF_SPEECH)
(Note: DPOS_III is only necessary as a VC workaround.)

The "difficult" part about generating a comma-separated list from a sequence is
how to get rid of the sequential-iteration macro name that comes out the tail
end. E.g.

#define A(x) x B
#define B(x) , x C
#define C(x) , x B

A(1)(2)(3) => 1, 2, 3 B
^ here

Because of the generated commas, you can't just concatenate it off, as in...

#define CAT(a, b) PRIMITIVE_CAT(a, b)
#define PRIMITIVE_CAT(a, b) a ## b

#define A(x) x B
#define B(x) x A
#define A0
#define B0

CAT( A(1)(2)(3), 0 ) => 1 2 3

The commas prevent that kind of thing from working because they interfere with
the "arguments" to PRIMITIVE_CAT. Thus, the implementation above transforms the
input sequence from (e.g.) "(a, b)(c, d)(e, f)" to "(1, a, b)()(1, c, d)()(1, e,
f)()(0, ~, ~)". This encoding is generated similarly to the (immediately) above
snippet--which is not a problem because it doesn't generate open commas. For
each pair of values (i.e. each "binary element"), it adds a flag that says
whether or not we're at the end. It also adds the nullary parentheses which
just reduces duplication, e.g.

#define A(x) complex_expression_in_terms_of(x) A_ID
#define A_ID() A

instead of

#define A(x) complex_expression_in_terms_of(x) B
#define B(x) complex_expression_in_terms_of(x) A

Given the resulting special-form sequence, we can simply not generate the
trailing macro name when the flag is 0. (comment: This is an issue related to
the fundamental blurring between "output" and "return value" in macro expansion.
That lack of distinction can be beneficial and, at times, detrimental.)

Note that more of the above is generalizable than what is actually generalized
here--especially with a conformant preprocessor (read: not most preprocessors
and definitely not VC). Things get even more general with variadics. For
example, with Chaos in C99 mode (i.e. w/variadic support):

// -----

#include <chaos/preprocessor/lambda/ops.h>
#include <chaos/preprocessor/punctuation/comma_if.h>
#include <chaos/preprocessor/seq/auto/for_each_i.h>

#define DEFINE_PART_OF_SPEECH(seq) \
enum PartOfSpeech { \
CHAOS_PP_SEQ_AUTO_FOR_EACH_I( \
CHAOS_PP_COMMA_IF_(CHAOS_PP_ARG(1)) CHAOS_PP_ARG(2), \
seq \
) \
}; \
const char* const partOfSpeechEncodings[] = { \
CHAOS_PP_SEQ_AUTO_FOR_EACH_I( \
CHAOS_PP_COMMA_IF_(CHAOS_PP_ARG(1)) CHAOS_PP_ARG(3), \
seq \
) \
}; \
/**/

DEFINE_PART_OF_SPEECH(
(coordinatingConjunction, "CC")
(cardinalNumber, "CD")
(determiner, "DT")
)

// -----

Note that this entire concept (enumerators -> strings) can be generalized:

// -----

#include <chaos/preprocessor/lambda/ops.h>
#include <chaos/preprocessor/punctuation/comma_if.h>
#include <chaos/preprocessor/seq/auto/for_each_i.h>
#include <chaos/preprocessor/stringize.h>

#define DEFINE_ENUM(name, seq) \
enum name { \
CHAOS_PP_SEQ_AUTO_FOR_EACH_I( \
CHAOS_PP_COMMA_IF_(CHAOS_PP_ARG(1)) CHAOS_PP_ARG(2), \
seq \
) \
}; \
inline const char* to_string(name e) { \
static const char* const table[] = { \
CHAOS_PP_SEQ_AUTO_FOR_EACH_I( \
CHAOS_PP_COMMA_IF_(CHAOS_PP_ARG(1)) CHAOS_PP_ARG(3), \
seq, \
CHAOS_PP_STRINGIZE_(CHAOS_PP_ARG(1)) \
) \
}; \
return table[e]; \
} \
/**/

DEFINE_ENUM( color, (red)(green)(blue) )

DEFINE_ENUM(
PartOfSpeech,
(coordinatingConjunction, "CC")
(cardinalNumber, "CD")
(determiner, "DT")
)

// -----

(Note that you could easily define the IO operators instead of (or in addition
to) 'to_string'.)
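As a sketch of that note, a stream inserter can be layered directly on such a to_string. The to_string below is hand-written in the shape DEFINE_ENUM would generate (an assumption, since the generated code isn't shown):

```cpp
#include <cassert>
#include <sstream>
#include <string>

enum color { red, green, blue };

// A to_string of the shape the DEFINE_ENUM macro above generates.
inline const char* to_string(color e) {
    static const char* const table[] = { "red", "green", "blue" };
    return table[e];
}

// The IO operator mentioned in the note, defined in terms of to_string.
inline std::ostream& operator<<(std::ostream& os, color e) {
    return os << to_string(e);
}
```

operator>> could be added the same way, by a linear search of the generated table.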

Lastly, all of the local macros should be undefined and all of the
general-purpose macros should be prefixed with a library name.
DEFINE_PART_OF_SPEECH is basically a one-time use macro, so it (and its helpers)
should be #undef'd after that use. Macros like DEFINE_ENUM (in the
land-of-make-believe where C++ has variadic macros) should be prefixed.

Regards,
Paul Mensonides

witt...@hotmail.com

Feb 1, 2005, 4:35:11 PM
Wouldn't it be good enough to use a map, like below? It keeps keys and
values nicely together:
#include <iostream>
#include <map>
#include <string>

using namespace std;

int main()
{
    map<int, string> m;

    enum PartOfSpeech
    {
        coordinatingConjunction,
        cardinalNumber,
        determiner
    };

    m[coordinatingConjunction] = "CC";
    m[cardinalNumber] = "CD";
    m[determiner] = "DT";

    for (map<int, string>::iterator it = m.begin(); it != m.end(); ++it)
    {
        cout << (*it).first << '\t' << (*it).second << endl;
    }

    return 0;
}

Andrei Alexandrescu (See Website for Email)

Feb 1, 2005, 8:42:46 PM
witt...@hotmail.com wrote:
> Wouldn't it be good enough to use a map, like below - it keeps keys and
> values nicely together

It doesn't keep them together, because you need to mention the
enumerator names twice.

Andrei

Andrei Alexandrescu (See Website For Email)

Feb 1, 2005, 8:41:40 PM
anton muhin wrote:
> In similar situations I usually use a little bit different trick. It
> goes like that:
>
> 1) The file with names and strings is introduced like an example below:
>
> STRUCT(coordinateConjunction, "CC"),
> STRUCT(cardinalNumber, "CD"),
> ....
>
> Let's name this file pos.h
>
> 2) Then we include this file with different definitions for STRUCT, e.g.:
>
> #define STRUCT(name, str) name
> enum PartOfSpeech {
> #include "pos.h"
> };
> #undef STRUCT
>
> #define STRUCT(name, str) str
> const char* const partOfSpeechEncodings[] = {
> #include "pos.h"
> };
> #undef STRUCT

Thanks. In the meantime I had settled for a similar approach that
doesn't use an extra file, but has me type a zillion "\"s.

#define DEFINE_POSS \
POS(coordinatingConjunction, "CC") \
/* and, but, nor, or, yet ("cheap yet good"), plus, minus, for ("because") */ \
POS(cardinalNumber, "CD") \
POS(article, "DT") \
/* (determiner): a(n), every, no, the, another, any, some, each, \
   either ("either way"), neither ("neither decision"), that, these, \
   this, those, all (when NOT preceding a determiner or possessive \
   pronoun -- "all roads"), both (idem -- "both times") */ \
POS(existential, "EX") \
/* there ("there was a party starting", "there ensued a melee") */

namespace Pos {
#define POS(e, s) e,

enum pos { DEFINE_POSS };

#undef POS
#define POS(e, s) s,

const char* const posEncodings[] = { DEFINE_POSS };

#undef POS
}

As always, there's no perfect solution, but certainly a "good enough" one.
Thanks!

Andrei

msalters

Feb 2, 2005, 3:07:44 PM

anton muhin wrote:
> Andrei Alexandrescu (See Website for Email) wrote:
> > I'm not very versed in the C preprocessor but people have done pretty
> > cool things with the Boost preprocessor library, so I thought I'd ask
> > how to achieve a particular task.
> >
> > I need to define at the same time an enum:
[...]
> > and a vector of strings that maps strings to that enum:
> >
> > In the interest of maintainability, I'd like the strings to be close to
> > the enum values, so I'd essentially need something like:
> >
> > START_DEFINE_PART_OF_SPEECH()
> > MAKE_ENTRY(coordinatingConjunction, "CC"),
> > MAKE_ENTRY(cardinalNumber, "CD"),
> > MAKE_ENTRY(determiner, "DT"),
> > END_DEFINE_PART_OF_SPEECH()

> In similar situations I usually use a slightly different trick. It
> goes like this:
>
> 1) A file with the names and strings is introduced, as in the example
> below:
>
> STRUCT(coordinateConjunction, "CC"),
> STRUCT(cardinalNumber, "CD"),
> ....
>
> Let's name this file pos.h
>
> 2) Then we include this file with different definitions for STRUCT, e.g.:
>
> #define STRUCT(name, str) name
> enum PartOfSpeech {
> #include "pos.h"
> };
> #undef STRUCT
>
> #define STRUCT(name, str) str
> const char* const partOfSpeechEncodings[] = {
> #include "pos.h"
> };
> #undef STRUCT

Do you actually need two files? I'd say you could easily include
pos.h from itself, so you'd get:

// pos.h
#ifndef STRUCT
// first pass
#define STRUCT(name, str) name
enum PartOfSpeech {
// recurse once
#include "pos.h"
};
#undef STRUCT

#define STRUCT(name, str) str
const char* const partOfSpeechEncodings[] = {
// recurse twice
#include "pos.h"
};
#undef STRUCT
#else
// second and third pass
STRUCT(coordinateConjunction, "CC"),
STRUCT(cardinalNumber, "CD"),
#endif // STRUCT

ka...@gabi-soft.fr

Feb 2, 2005, 3:13:51 PM
Andrei Alexandrescu (See Website for Email) wrote:

What's wrong with a trivial preprocessor written in AWK or
something similar? Given input like:

! coordinatingConjunction CC
! cardinalNumber CD
! ...

it's relatively trivial to generate both an enum and an
initialized std::map using AWK.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Victor Bazarov

Feb 3, 2005, 5:46:56 AM
ka...@gabi-soft.fr wrote:
> Andrei Alexandrescu (See Website for Email) wrote:
>>I'm not very versed in the C preprocessor but people have done
>>pretty cool things with the Boost preprocessor library, so I
>>thought I'd ask on how to achieve a particular task.
>
> [...]

>
> What's wrong with a trivial preprocessor written in AWK or
> something similar? Given input like:
>
> ! coordinatingConjunction CC
> ! cardinalNumber CD
> ! ...
>
> it's relatively trivial to generate both an enum and an
> initialized std::map using AWK.

"Trivial" becomes "problematic" when you deal with platforms on which
there is no 'awk' or 'something similar' pre-packaged. Besides, when
your code is supposed to be compiled on some other guy's system the
necessity to have any other tools but the compiler can easily become
an unsurmountable obstacle.

Just a thought...

V

Andrei Alexandrescu (See Website for Email)

Feb 3, 2005, 10:12:38 AM
> Do you actually need two files? I'd say you could easily include
> pos.h from pos.h so you'd get
>
> // pos.h
> #ifndef STRUCT
> // first pass
> #define STRUCT(name, str) name
> enum PartOfSpeech {
> // recurse once
> #include "pos.h"
> };
> #undef STRUCT
>
> #define STRUCT(name, str) str
> const char* const partOfSpeechEncodings[] = {
> // recurse twice
> #include "pos.h"
> };
> #undef STRUCT
> #else
> // second and third pass
> STRUCT(coordinateConjunction, "CC"),
> STRUCT(cardinalNumber, "CD"),
> #endif // STRUCT

Oh boy! This is such a funny hack. But I must say I kind of prefer
James' suggestion a little: just use a little awk script and get over
it. Still, recursive self-inclusion -- point taken :o).

Andrei

Anders J. Munch

Feb 3, 2005, 10:20:17 AM
Andrei Alexandrescu wrote:
> I need to define at the same time an enum:
[...]

> and a vector of strings that maps strings to that enum:

Here's what I do:

enum PartOfSpeech {
coordinatingConjunction,
cardinalNumber,
determiner,

PartOfSpeech_count
};


const char* partOfSpeechEncodings[] = {
"CC",
"CD",
"DT"

};
CT_ASSERT(LENGTHOF(partOfSpeechEncodings) == PartOfSpeech_count);

Here CT_ASSERT is a compile-time assertion macro, and LENGTHOF is just
#define LENGTHOF(arr) (sizeof(arr)/sizeof((arr)[0]))

This doesn't safeguard against just any change to the enum, all this
does is ensure that when entries are added to the enum, you remember
to add the corresponding entries to the string table.

But then that's pretty much all you need. Extending an enum is a much
more frequent use case than any other enum change.
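A C++98-style CT_ASSERT can be built from an array type whose size turns negative when the condition fails; this particular definition is one common sketch, an assumption rather than necessarily Anders' own:

```cpp
#include <cassert>
#include <cstring>

// Compile-time assertion: an array of size -1 is ill-formed, so a
// false condition fails the build with an error on this line.
#define CT_ASSERT(cond) typedef char ct_assert_failed[(cond) ? 1 : -1]
#define LENGTHOF(arr) (sizeof(arr) / sizeof((arr)[0]))

enum PartOfSpeech {
    coordinatingConjunction,
    cardinalNumber,
    determiner,

    PartOfSpeech_count  // sentinel: always last
};

const char* const partOfSpeechEncodings[] = {
    "CC",
    "CD",
    "DT"
};
CT_ASSERT(LENGTHOF(partOfSpeechEncodings) == PartOfSpeech_count);
```

Removing one string (or adding an enumerator without its string) makes the CT_ASSERT line fail to compile.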

- Anders

Maciej Sobczak

Feb 3, 2005, 4:51:48 PM
Hi,

Victor Bazarov wrote:

> "Trivial" becomes "problematic" when you deal with platforms on which
> there is no 'awk' or 'something similar' pre-packaged. Besides, when
> your code is supposed to be compiled on some other guy's system the
> necessity to have any other tools but the compiler can easily become
> an unsurmountable obstacle.

So what's the problem in writing your own "tool" for doing this, in
standard C++, and plugging it as a step in the compilation process?

In the Open Source world, it is not uncommon to see this happening, even
on a *much* larger scale. I remember compiling OpenOffice.org on my
machine, where the preliminary build step was to compile... a specific
version of the g++ compiler, which was then fired up to compile the actual code.

When you compare these two things, writing a handful of lines in
standard C++ (if there's no awk or something similar on the target
platform) for a small tool that will help you with the rest sounds like
a piece of cake. :)
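For instance, a handful of lines of standard C++ suffice for James' `! name STR` input format. A sketch follows; the function name and the exact output shape are illustrative, not from any post in the thread:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads lines of the form "! enumeratorName STR" and returns the text
// of a parallel enum + string table, ready to be written to a header.
std::string generate(std::istream& in) {
    std::string bang, name, code, names, codes, line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        if (fields >> bang >> name >> code && bang == "!") {
            names += "    " + name + ",\n";
            codes += "    \"" + code + "\",\n";
        }
    }
    return "enum PartOfSpeech {\n" + names + "};\n"
           "const char* const partOfSpeechEncodings[] = {\n" + codes + "};\n";
}
```

Plugged in as a pre-build step, this removes the dependency on awk or any other external tool.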

Having said that, I prefer X-macros (the idea of #including the same
file twice or more with varying active macro definitions) for things
like enums and strings. It results in less clutter than the known
alternatives.

--
Maciej Sobczak : http://www.msobczak.com/
Programming : http://www.msobczak.com/prog/

Nicola Musatti

Feb 3, 2005, 5:08:18 PM
That's cheating! ;-)

Cheers,
Nicola Musatti

wka...@yahoo.com

Feb 3, 2005, 5:07:13 PM

Variations on this technique can be used in a variety of situations.
For example:

#include <iostream>
#include <string>

using namespace std;

#undef DATA_MEM
#undef X
#undef S

struct A
{
#define DATA_MEM \
X(string, name) S \
X(int, age) S \
X(char, sex)

#define X(TYPE, ID) TYPE ID;
#define S
DATA_MEM
#undef X
#undef S

#define X(TYPE, ID) F_##ID
#define S ,
enum Fields { DATA_MEM };
#undef X
#undef S
};

istream & operator >> (istream &is, A &a)
{
#define X(TYPE, ID) is >> a.ID;
#define S
DATA_MEM
#undef X
#undef S

return(is);
}

ostream & operator << (ostream &os, const A &a)
{
#define X(TYPE, ID) os << a.ID << endl;
#define S
DATA_MEM
#undef X
#undef S

return(os);
}

or:

http://www.geocities.com/wkaras/itemlist/itemlist.html

Andrei Alexandrescu (See Website for Email)

Feb 4, 2005, 6:28:46 AM
Anders J. Munch wrote:
> Here's what I do:
>
> enum PartOfSpeech {
> coordinatingConjunction,
> cardinalNumber,
> determiner,
> PartOfSpeech_count
> };
> const char* partOfSpeechEncodings[] = {
> "CC",
> "CD",
> "DT"
> };
> CT_ASSERT(LENGTHOF(partOfSpeechEncodings) == PartOfSpeech_count);

I don't need that. The same confusion seems to persist in an email
exchange on how the same thing could be done in Ada. The maintenance
requirement is: I need to keep each enumerator together with the string
that's associated with it.

By the way, I ended up using Michiel's solution (recursive inclusion),
which is as cool as a bug's ear. I even added include guards (which I
then disable temporarily), so now it all works smoothly.

Oh, and one more thing: I also need a pos.cpp file so that the strings
aren't duplicated across compilation units. That's easily achieved by
having pos.cpp #define POS appropriately and then include pos.h.
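That header/source split can be sketched in one translation unit, with the would-be file boundaries marked by comments. The macro names follow Andrei's post; the extern declaration is the assumed glue between the two files:

```cpp
#include <cassert>
#include <cstring>

// --- what pos.h would contain: the list, the enum, a declaration ---
#define DEFINE_POSS \
    POS(coordinatingConjunction, "CC") \
    POS(cardinalNumber, "CD") \
    POS(article, "DT")

namespace Pos {
#define POS(e, s) e,
    enum pos { DEFINE_POSS };
#undef POS
    extern const char* const posEncodings[];  // declaration only
}

// --- what pos.cpp would contain: the single definition of the table ---
namespace Pos {
#define POS(e, s) s,
    const char* const posEncodings[] = { DEFINE_POSS };
#undef POS
}
```

Every other translation unit includes only the pos.h part, so the string literals are emitted once.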

Thanks to all!


Andrei

Andrei Alexandrescu (See Website for Email)

Feb 5, 2005, 12:10:54 PM
> So what's the problem in writing your own "tool" for doing this, in
> standard C++, and plugging it as a step in the compilation process?
>
> In the Open Source world, it is not uncommon to see this happening, even
> on *much* larger scale. I remember having compiled OpenOffice.org on my
> machine, where the preliminary build step was to compile... a specific
> version of g++ compiler, which was then fired to compile the actual code.
>
> When you compare these two things, writing a handful of lines in
> standard C++ (if there's no awk or something similar on the target
> platform) for a small tool that will help you with the rest sounds like
> a piece of cake. :)

That brings up an interesting point.

Honestly, the prospect of having to write the C++ code that reads
cin according to some simple yet nontrivial grammar is just not
appealing to me. Somehow I never got to like iostreams. I have
Langer's IOStreams book (it's almost as large as the entire Perl
book!) but I never got around to reading it. Whenever I think of
reading it, my thoughts end up veering towards the three lines of Perl
that would do the same as a long sequence of convoluted, impenetrable
IOStreams code... sigh.


Andrei

Maciej Sobczak

Feb 5, 2005, 8:46:27 PM
Hi,

Andrei Alexandrescu (See Website for Email) wrote:

> That brings up an interesting point.
>
> Honest, the spectrum of having to write the C++ code that reads the
> cin according to some simple yet nontrivial grammar is just not
> appealing to me.

Then you get the apprentice doing it? :)

> Somehow I never got to like iostreams. I have
> Langer's IOStreams book (it's almost as large as the entire Perl
> book!) but I never got to read it.

You should! If not out of curiosity, then just for education
completeness. ;)

> Whenever I think of reading it, my
> thoughts end up veering towards the three lines of Perl that will do
> the same as a long sequence of convoluted, impenetrable IOstreams
> code... sigh.

Well, that was James' idea a few posts ago: use AWK or 'something
similar'. In fact any scripting engine could do, with the standard shell
as one of the most obvious candidates (I do not have Perl on my machine,
for example).

The problem that was signalled was: what if there's no such tool?
Well, if we want to deliver C++ source, then presumably there is at
least an environment for compiling C++ programs.

And if the only thing you have is a hammer... :)


Andrei Alexandrescu (See Website for Email)

Feb 6, 2005, 7:14:44 AM
Maciej Sobczak wrote:
>>Somehow I never got to like iostreams. I have
>>Langer's IOStreams book (it's almost as large as the entire Perl
>>book!) but I never got to read it.
>
> You should! If not out of curiosity, then just for education
> completeness. ;)

But this is the fundamental question. Are the iostreams a flawed design
that is not worth learning in depth (and instead replaced with a good
design)? Or are iostreams an elegant design that deserves in-depth
analysis? Coz in the former case, one better invests time in writing
some little framework on top of FILE* instead of learning a convoluted
library.

Vote here. Are iostreams hot? Or not? Why, or why not?


Andrei

Sjoerd A. Schreuder

Feb 6, 2005, 7:15:40 AM
Maciej Sobczak wrote:
> Victor Bazarov wrote:
>
>>"Trivial" becomes "problematic" when you deal with platforms on which
>>there is no 'awk' or 'something similar' pre-packaged. Besides, when
>>your code is supposed to be compiled on some other guy's system the
>>necessity to have any other tools but the compiler can easily become
>>an unsurmountable obstacle.
>
> So what's the problem in writing your own "tool" for doing this, in
> standard C++, and plugging it as a step in the compilation process?

Can't we make it part of C++? Some sort of compile-time code
execution, where the generated output is fed into the compiler.

This code would look cool, wouldn't it?

#include <generator>

__generator enum_strings(gstream &in, gstream &out)
{
parser::identifier id;
parser::string s;
char c, d;

out << "enum {";
while((in >> id >> c >> s >> d))
{
out << id << ',';
}
out << "};";

in.restart(); // Should be possible somehow.

out << "const char* partOfSpeechEncodings = {";
while((in >> id >> c >> s >> d))
{
out << s << ',';
}
out << "};";
}

enum_strings(
coordinatingConjunction, "CC",
cardinalNumber, "CD",
determiner, "DT");

But I guess I'll be dreaming for a long time...

Sjoerd

Maciej Sobczak

Feb 7, 2005, 4:04:41 AM
Hi,

Andrei Alexandrescu (See Website for Email) wrote:

>>You should! If not out of curiosity, then just for education
>>completeness. ;)
>
> But this is the fundamental question. Are the iostreams a flawed design
> that is not worth learning in depth (and instead replaced with a good
> design)? Or are iostreams an elegant design that deserves in-depth
> analysis? Coz in the former case, one better invests time in writing
> some little framework on top of FILE* instead of learning a convoluted
> library.

Convoluted? Ever heard about Loki? ;) ;) ;)

OK, switching back to serious mode.
If IOStreams are good, then they are worth learning - that's for sure.
If they aren't, then they are worth learning as well, at least to learn
their weak points - this is the prerequisite before designing something
better, otherwise you will never know if that new stuff is actually
better or not.

"Those who do not know history are condemned to repeat it."

> Vote here. Are iostreams hot? Or not? Why, or why not?

OK, my personal vote, subject to subjectivity, follows.

For me, IOStreams are an example of a very good *intended* design.
I find them very OO (just look up 1. the diagram of the stream class
hierarchy and 2. the diagram of the stream object structure), where OO
is used well. This is worth underlining, because today's trend is to
promote template stuff - I still think that the way IOStreams are
designed shows a good hand behind it.

Now, the explanation of the word "intended" above.
I think IOStreams are broken when it comes to their description in
the standard. Once you start digging into details, you are likely to
confuse and frustrate yourself. Once frustrated, you may easily say that
the library itself is bad.
This is where the Langer and Kreft book comes in handy.

Anyway - it will take *a lot* of work to design something demonstrably
better, and that effort is today better invested elsewhere, like on
DB or GUI stuff.


Ivan Vecerina

Feb 7, 2005, 4:07:21 AM
"Andrei Alexandrescu (See Website for Email)"
<SeeWebsit...@moderncppdesign.com> wrote in message
news:4205A014...@moderncppdesign.com...

> But this is the fundamental question. Are the iostreams a flawed design
> that is not worth learning in depth (and instead replaced with a good
> design)? Or are iostreams an elegant design that deserves in-depth
> analysis? Coz in the former case, one better invests time in writing
> some little framework on top of FILE* instead of learning a convoluted
> library.
>
> Vote here. Are iostreams hot? Or not? Why, or why not?

I'm not a full expert on iostreams, but I'd say 'not'.
Design flaws (or trade-offs, if one prefers) that annoy me include:
- not great for anything but text I/O: the streambuf interface does
not differentiate between input/output/in+out streams, so you
don't get compile-time type safety.
- formatting numeric-to-text conversions remains annoyingly verbose.
I'd probably prefer a built-in default formatting,
plus client-side formatter objects for any non-standard conversions
(this has other downsides, but would better fit my current usage).
- excessive flexibility in a single blob: locales, stream positioning,
configurable error handling (exceptions or not), etc. Implementors
seem to have trouble making this efficient for the most common case.


What I use instead in most cases:

When input/output sources need to be abstracted, I use a simple (i.e.
no 'seek' functionality) pair of interfaces for buffered input and
output, with adapters to stream_buf and other common sources/sinks.

Nowadays, on platforms where I do file i/o, I tend to use OS-specific
memory-mapping calls, because this wonderfully simplifies code while
typically providing faster run-time performance - what more can I ask
for?


Where I still use standard streams:

For simple text output (program or debug logs, etc), I do use C++
streams because, well, these are not crucial components in my
applications, and they are standard & cross-platform.
(once upon a time I wrote my own formatting and logging framework,
BkgConsole, because alternatives were sluggish on MacOS 7-9 ;)

For text input, in all but the most trivial cases, I tend to pick a
parser-generator framework or tool ( flex or boost::spirit ).


--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

ka...@gabi-soft.fr

unread,
Feb 7, 2005, 8:24:36 PM2/7/05
to
Andrei Alexandrescu (See Website for Email) wrote:
> > So what's the problem in writing your own "tool" for doing
> > this, in standard C++, and plugging it as a step in the
> > compilation process?

> > In the Open Source world, it is not uncommon to see this
> > happening, even on *much* larger scale. I remember having
> > compiled OpenOffice.org on my machine, where the preliminary
> > build step was to compile... a specific version of g++
> > compiler, which was then fired to compile the actual code.

> > When you compare these two things, writing a handful of
> > lines in standard C++ (if there's no awk or something
> > similar on the target platform) for a small tool that will
> > help you with the rest sounds like a piece of cake. :)

> That brings up an interesting point.

> Honest, the spectrum of having to write the C++ code that
> reads the cin according to some simple yet nontrivial grammar
> is just not appealing to me.

Well, I probably wouldn't use istream for the parsing, either.
Just getline, then either boost::regex or my own FieldArray
class, std::map, and such tools. (For the current example,
FieldArray is just right. It's a bit limited, but it was
designed to do with a string exactly what AWK and Perl do with
their input.)
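FieldArray itself is not shown in the thread, but the AWK-style splitting it describes can be approximated in a few lines with an istringstream; this sketch is only a stand-in for such a class:

```cpp
#include <sstream>
#include <string>
#include <vector>

// AWK-like field splitting: break a line on whitespace, the way
// $1, $2, ... work in AWK. (A stand-in for the FieldArray class
// mentioned above, which is not shown in the thread.)
std::vector<std::string> fields(const std::string& line)
{
    std::istringstream in(line);
    std::vector<std::string> result;
    std::string field;
    while (in >> field)          // operator>> skips runs of whitespace
        result.push_back(field);
    return result;
}
```

Combined with std::getline for the outer loop, this gives the line-at-a-time, field-at-a-time processing model of AWK and Perl.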

> Somehow I never got to like iostreams. I have Langer's
> IOStreams book (it's almost as large as the entire Perl book!)
> but I never got to read it. Whenever I think of reading it, my
> thoughts end up veering towards the three lines of Perl that
> will do the same as a long sequence of convoluted,
> impenetrable IOstreams code... sigh.

They're worth learning, if only because they are standard, so
you are guaranteed to find them everywhere. The original
design, by Schwartz, was actually pretty good; even today, it's
a good example of mixing function (actually operator)
overloading and dynamic polymorphism. The committee didn't
improve them, but if you're careful about imbuing the right
locales in the right places, they're still very usable.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

unread,
Feb 8, 2005, 6:40:59 AM2/8/05
to
Victor Bazarov wrote:
> ka...@gabi-soft.fr wrote:
> > Andrei Alexandrescu (See Website for Email) wrote:
> >>I'm not very versed in the C preprocessor but people have
> >>done pretty cool things with the Boost preprocessor library,
> >>so I thought I'd ask on how to achieve a particular task.

> > [...]

> > What's wrong with a trivial preprocessor written in AWK or
> > something similar? Given input like:

> > ! coordinatingConjunction CC
> > ! cardinalNumber CD
> > ! ...

> > it's relatively trivial to generate both an enum and an
> > initialized std::map using AWK.

> "Trivial" becomes "problematic" when you deal with platforms
> on which there is no 'awk' or 'something similar'
> pre-packaged. Besides, when your code is supposed to be
> compiled on some other guy's system the necessity to have any
> other tools but the compiler can easily become an
> unsurmountable obstacle.

> Just a thought...

Obviously, anything which modifies your tool chain needs special
consideration. I've worked in places where there were
"standard" makefiles, which couldn't be modified. (And
obviously, these makefiles did not make provisions for use AWK
or anything else to generate code.) Still, given the amount of
time you win, it seems worth looking at. (And of course, there
are free implementations of awk, so any problems you will have
will be strictly political, and not technical.)

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Paul Mensonides

unread,
Feb 8, 2005, 12:09:38 PM2/8/05
to
Andrei Alexandrescu (See Website for Email) wrote:

> By the way, I ended up using Michiel's solution (recursive inclusion)
> which is as cool as bug's ears. I even added include guards (which
then I disable temporarily) so now it all works smoothly.

A word of warning... I've done quite a bit of experimentation with recursive
inclusion. As an example, the Boost pp-lib has a mechanism called "file
iteration" that iteratively includes a file. The Boost pp-lib also has a
mechanism that allows a sort of numeric assignment...

#define BOOST_PP_VALUE 5
#include BOOST_PP_ASSIGN_SLOT(1)

BOOST_PP_SLOT(1) // 5

#define BOOST_PP_VALUE \
BOOST_PP_SLOT(1) * BOOST_PP_SLOT(1) \
/**/
#include BOOST_PP_ASSIGN_SLOT(1)

BOOST_PP_SLOT(1) // 25

Given this ability, very interesting things can be done with recursive
inclusion--including expanding the range of file-iteration to billions of
iterations (obviously overkill). E.g. you could "hand-roll" a recursive
repetition process like this:

// typelist.hpp

#if !IS_LOOPING

#ifndef TYPELIST_HPP
#define TYPELIST_HPP "typelist.hpp"

#include <boost/preprocessor/slot/slot.hpp>
#include <boost/preprocessor/repetition/enum_params.hpp>
#include <boost/preprocessor/repetition/enum_shifted_params.hpp>

#ifndef MAX_ARITY
#define MAX_ARITY 15
#endif

struct nil;

template<class T, class U> struct cons {
typedef T head;
typedef U tail;
};

template<int> struct typelist;

template<> struct typelist<1> {
template<class T0> struct args {
typedef cons<T0, nil> type;
};
};

#define BOOST_PP_VALUE 2
??=include BOOST_PP_ASSIGN_SLOT(1)


#define n BOOST_PP_SLOT(1)
#define IS_LOOPING 1

??=include TYPELIST_HPP

#undef n
#undef IS_LOOPING

#endif

#else

#if n < MAX_ARITY

template<> struct typelist<n> {
template<BOOST_PP_ENUM_PARAMS(n, class T)> struct args {
typedef cons<
T0,
typename
typelist<n - 1>::args<BOOST_PP_ENUM_SHIFTED_PARAMS(n, T)>::type
> type;
};
};

#define BOOST_PP_VALUE n + 1
??=include BOOST_PP_ASSIGN_SLOT(1)

??=include TYPELIST_HPP

#endif

#endif

Take a look at gcc's preprocessing output (or even VC's). Then take a look at
EDG's...

The problem is that recursive inclusion isn't very portable. Some
implementations will "helpfully" bail when a file directly (or indirectly)
includes itself more than a relatively small number of times. (I think that it
was 8 or 9 on EDG.) In other words, it is detecting that a file is including
itself regardless of inclusion-guards (or the lack thereof) and aborting far
sooner than the normal maximum include depth.

Moral of the story... There are some very interesting things that can be done
with recursive inclusion, but, unfortunately, the technique is not portable.

Regards,
Paul Mensonides

Derek Ledbetter

unread,
Feb 8, 2005, 7:35:15 PM2/8/05
to
On 2005-02-06 04:15:40 -0800, "Sjoerd A. Schreuder"
<sa_sch...@wanadoo.nl> said:

> Can't we make it part of C++? Some sort of compile-time code
> execution, where the generated output is fed into the compiler.
>
> This code would look cool, won't it?
>
> #include <generator>
>
> __generator enum_strings(gstream &in, gstream &out)
> {
> parser::identifier id;
> parser::string s;
> char c, d;
>
> out << "enum {";
> while((in >> id >> c >> s >> d))
> {
> out << id << ',';
> }
> out << "};";
>
> in.restart(); // Should be possible somehow.
>
> out << "const char* partOfSpeechEncodings = {";
> while((in >> id >> c >> s >> d))
> {
> out << s << ',';
> }
> out << "};";
> }
>
> enum_strings(
> coordinatingConjunction, "CC",
> cardinalNumber, "CD",
> determiner, "DT");
>
> But I guess I'll be dreaming for a long time...
>
> Sjoerd

I would like a way to generate all sorts of declarations and code, but
I don't like this method because it knows nothing of the syntax or
semantics of C++. The input and output are just text, and the
programmer has to parse it by hand. But templates are already functions
that the compiler runs to generate definitions of code and types. If
templates had a string type, you could do something like this:

partOfSpeechList =
{
{ "coordinatingConjunction", "CC" },
{ "cardinalNumber", "CD" },
{ "determiner", "DT" },
};

enum PartOfSpeech {
foreach(e in partOfSpeechList)
{
identifier(e[0]) // declare a value whose name is this string
}
};

char const*const partOfSpeechEncodings[] = {
foreach(e in partOfSpeechList)
{
literal(e[1]) // convert the compile-time string to a string literal
}
};

(The syntax isn't real. I just want to show what I'd like to do.)

It looks like Daveed Vandevoorde is already working on this kind of thing:
http://www.vandevoorde.com/Daveed/News/Archives/000015.html
He calls it "Metacode". Another such project is called OpenC++.

Andrei Alexandrescu (See Website for Email)

unread,
Feb 8, 2005, 8:24:38 PM2/8/05
to
ka...@gabi-soft.fr wrote:
> Andrei Alexandrescu (See Website for Email) wrote:
>>Honest, the spectrum of having to write the C++ code that
>>reads the cin according to some simple yet nontrivial grammar
>>is just not appealing to me.
>
> Well, I probably wouldn't use istream for the parsing, either.
> Just getline, then either boost::regex or my own FieldArray
> class, std::map, and such tools. (For the current example,
> FieldArray is just right. It's a bit limited, but it was
> designed to do with a string exactly what AWK and Perl do with
> their input.)

That's quite what I'm doing now, except that I use cin.get() to
arduously read one character at a time.

>>Somehow I never got to like iostreams. I have Langer's
>>IOStreams book (it's almost as large as the entire Perl book!)
>>but I never got to read it. Whenever I think of reading it, my
>>thoughts end up veering towards the three lines of Perl that
>>will do the same as a long sequence of convoluted,
>>impenetrable IOstreams code... sigh.
>
> They're worth learning, if only because they are standard, so
> you are guaranteed to find them everywhere. The original
> design, by Schwartz, was actually pretty good; even today, it's
> a good example of mixing function (actually operator)
> overloading and dynamic polymorphism. The committee didn't
> improve them, but if you're careful about imbuing the right
> locales in the right places, they're still very usable.

But wait. They are supposed to do some useful string parsing that goes
beyond converting numbers to characters and back, aren't they? I mean,
why would I learn them if all I need to know to process strings is
cin.read(), cin.get(), or cin.getline() (and with the latter I already
feel risqué there) and then do everything on my own because the
iostreams (which, again, are supposed to be good at formatting) know
virtually zilch about formatting?


Andrei

Ivan Vecerina

unread,
Feb 9, 2005, 5:58:01 AM2/9/05
to
"Derek Ledbetter" <der...@serve.com> wrote in message
news:2005020803370527590%derekl@servecom...

> It looks like Daveed Vandevoorde is already working on this kind of thing:
> http://www.vandevoorde.com/Daveed/News/Archives/000015.html
> He calls it "Metacode". Another such project is called OpenC++.
Yes, the metafunctions would be nice.

Since you also pointed to OpenC++, I think that "Pivot" is also worth
a mention:
http://iap.cs.tamu.edu/IAP04 (it's a PDF).
http://www-unix.mcs.anl.gov/workshops/DSLOpt/Talks/DosReis.pdf
http://lcgapp.cern.ch/project/architecture/XTI_accu.pdf
A project that appears to involve Bjarne Stroustrup and Gabriel Dos Reis,
but seems to remain relatively "confidential" (in that not much publicity
is being made about it). It looks like Gaby is working on a first
implementation (within GCC I guess...)


jto...@yahoo.com

unread,
Feb 10, 2005, 3:04:33 PM2/10/05
to

Andrei Alexandrescu (See Website for Email) wrote:
> Maciej Sobczak wrote:
> >>Somehow I never got to like iostreams. I have
> >>Langer's IOStreams book (it's almost as large as the entire Perl
> >>book!) but I never got to read it.
> >
> > You should! If not out of curiosity, then just for education
> > completeness. ;)
>
> But this is the fundamental question. Are the iostreams a flawed design
> that is not worth learning in depth (and instead replaced with a good
> design)? Or are iostreams an elegant design that deserves in-depth
> analysis? Coz in the former case, one better invests time in writing
> some little framework on top of FILE* instead of learning a convoluted
> library.
>
> Vote here. Are iostreams hot? Or not? Why, or why not?
>

I've done quite a lot of stream programming, and I'd say "not".
- seekability is pretty poor
- they are split into streams and stream buffers, but what for?
Basically, if you change the buffer of a stream (for example, set a
file's buf to point to another stringstream's buf), you simply need to
be careful later on.
- stream buffers are kept by raw pointers, so they can be leaked
- formatting is very poorly implemented. Some IO manipulators only work
for the next written item, and some work for the remainder of the
stream. Basically, if I need formatting, I simply use boost::format.
Quick: how do you write a float, formatted like "xx.xxxx"? What about
two floats?
- for custom formatting for some specific class, you need to go deep
(iword/pword or facets)
- I'm not sure if you can improve on their state flags (rdstate()).
That is, can you add your own flags?
- not to mention the streambuf's virtual functions. To override
them, you need quite advanced iostream knowledge. No wonder Jonathan
Turkanis wrote a library to ease creation of stream classes :)
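For what it is worth, the iostream answer to the "xx.xxxx" float question is std::fixed plus std::setprecision; the sketch below wraps it in a helper to show how verbose the spelling of printf's "%.4f" becomes:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Fixed-point formatting with four digits after the decimal point:
// the iostream spelling of printf's "%.4f".
std::string format_fixed4(double x)
{
    std::ostringstream out;
    out << std::fixed << std::setprecision(4) << x;
    return out.str();
}
```

On a shared stream, std::fixed and std::setprecision persist until reset, which is exactly the "some manipulators work for the remainder of the stream" complaint above.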


Best,
John

--
John Torjo, Contributing editor, C/C++ Users Journal
-- "Win32 GUI Generics" -- generics & GUI do mix, after all
-- http://www.torjo.com/win32gui/ -v1.6.3 (Resource Splitter)
-- http://www.torjo.com/cb/ - Click, Build, Run!

jto...@yahoo.com

unread,
Feb 10, 2005, 3:04:11 PM2/10/05
to

> Now, the explanation of the "intended" word above.
> I think that IOStreams are broken when it comes to their description in
> the standard. Once you start digging into details, you are likely to
> confuse and frustrate yourself. Once frustrated, you may easily say that
> the library itself is bad.
> This is when the Langer and Kreft book comes handy.

True... But the fact that there is a book dedicated just to this
subject, makes me wonder about the design of iostreams...

>
> Anyway - it will take *a lot* of work to design something demonstrably
> better and that lot of work should be today invested elsewhere, like on
> DB or GUI stuff.
>

Indeed so. I should know ;) Anyway, it would make a pretty cool
challenge to come up with a better iostream design. If no one tries it, I
might take a shot at it in a few months.

Best,
John

--
John Torjo, Contributing editor, C/C++ Users Journal
-- "Win32 GUI Generics" -- generics & GUI do mix, after all
-- http://www.torjo.com/win32gui/ -v1.6.3 (Resource Splitter)
-- http://www.torjo.com/cb/ - Click, Build, Run!

ka...@gabi-soft.fr

unread,
Feb 12, 2005, 8:43:13 AM2/12/05
to
jto...@yahoo.com wrote:
> Andrei Alexandrescu (See Website for Email) wrote:
> > Maciej Sobczak wrote:
> > >>Somehow I never got to like iostreams. I have Langer's
> > >>IOStreams book (it's almost as large as the entire Perl
> > >>book!) but I never got to read it.

> > > You should! If not out of curiosity, then just for
> > > education completeness. ;)

> > But this is the fundamental question. Are the iostreams a
> > flawed design that is not worth learning in depth (and
> > instead replaced with a good design)? Or are iostreams an
> > elegant design that deserves in-depth analysis? Coz in the
> > former case, one better invests time in writing some little
> > framework on top of FILE* instead of learning a convoluted
> > library.

> > Vote here. Are iostreams hot? Or not? Why, or why not?

> I've done quite a lot of stream programming, and I'd say "not".
> - seekability is pretty poor

That's not really what they were designed for, either.

> - they are split into streams and stream buffers, but what for?

Because there are two separate concerns. The split allows two
orthogonal ways to extend -- formatting new types, and creating
new sinks and sources.

This split is probably the best part of iostream.

> Basically, if you change the buffer of a stream (for
> example,set a file's buf to point to another stringstream's
> buf), you simply need to be careful later on.

But you typically don't change the buffer of a stream. You
create a new stream with a new buffer.

> - stream buffers are kept by raw pointers, so they can be leaked

And what should be used to keep them?

> - formatting is very poorly implemented. Some IO manipulators
> only work for the next written item, and some work for the
> remainder of the stream. Basically, if I need formatting, I
> simply use boost::format. Quick: how do you write a float,
> formatted like "xx.xxxx"? What about two floats?

You'll note that both boost::format and my GB_Format base
themselves on ostream. So it can't be doing everything wrong.

I developed GB_Format because I needed the positional
parameters, for internationalization. I don't quite see how
these could be integrated into the ostream model. Other than
that, there are distinct advantages in the manipulator
approach. I think that boost::format allows the use of
manipulators, and if so, that would be a decided advantage; how
to format the data is a characteristic of the semantics of the
data, and NOT of the text surrounding the data. (Once you've
had to track down a bug because the translator accidentally
changed a %d into a %s, you'll know what I mean.)

> - for custom formatting for some specific class, you need to
> go deep (iword/pword or facets)

Not necessarily. I've only rarely used iword/pword, for
example, and I've written a lot of custom formatters. I'd guess
that for 90% or more of the user classes, the only formatting
option which is relevant is the width.

> - I'm not sure if you can improve on their state flags
> (rdstate()). That is, can you add your own flags?

Error reporting is a weak spot.

> - not to say about the streambuf's virtual functions. To
> overwrite them, you need quite advanced iostream knowledge.

Absolutely not. There's really nothing simpler.

> No wonder Jonathan Turkanis wrote a library to ease creation
> of stream classes :)

I haven't looked at it, but I imagine that the most important
thing it would do would be to provide the convenience classes
which derive from istream and ostream. It's hard to get simpler
than overriding overflow and underflow.
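As a rough illustration of how little is needed, here is a minimal output streambuf that overrides only overflow() to capture written characters into a string (a sketch in the spirit of this post, not Turkanis's library):

```cpp
#include <ostream>
#include <streambuf>
#include <string>

// A deliberately minimal output streambuf: no buffer is set up, so
// every character written to the attached ostream funnels through
// overflow(), which appends it to a string.
class StringSink : public std::streambuf {
public:
    std::string text;
protected:
    virtual int_type overflow(int_type c) {
        if (c != traits_type::eof())
            text += traits_type::to_char_type(c);
        return c;
    }
};
```

Attach it with `std::ostream out(&sink);` and everything inserted into `out` ends up in `sink.text`, with the full formatting machinery still available.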

All in all, I rather like the iostream design. I think a bit
too much has been grafted onto it, and that the changes wrought
by the standard, although probably necessary, are not
particularly well done. But the basic principle works well, up
to a point, and I've not seen any alternatives which do better.

--
James Kanze GABI Software

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

unread,
Feb 12, 2005, 8:41:43 AM2/12/05
to
Andrei Alexandrescu (See Website for Email) wrote:
> ka...@gabi-soft.fr wrote:
> > Andrei Alexandrescu (See Website for Email) wrote:

> But wait. They are supposed to do some useful string parsing
> that goes beyond converting numbers to characters and back,
> aren't they?

Not that I know of. The name of the class is istream, not
iparsestream.

> I mean, why would I learn them if all I need to know to
> process strings is cin.read(), cin.get(), or cin.getline()
> (and with the latter I already feel risqué there) and then do
> everything on my own because the iostreams (which, again, are
> supposed to be good at formatting) know virtually zilch about
> formatting?

The output formatting is fairly usable. I've yet to see any
stream input (C++ or otherwise) that was in any way really
usable for input parsing. Parsing is just a bit too complicated
to tie up with the streaming functionality (at least if you want
reasonable error recovery and error messages).

I think the role of iostream is to provide a minimal base set of
functionalities. Anything more, and you have to write your own
code. (I tend to use regular expressions a lot in my parsing.
Do you really think that istream should be able to handle things
with regular expressions?)
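The getline-then-regex style mentioned here can be sketched as follows, using std::regex (which postdates this thread; boost::regex was the contemporary equivalent):

```cpp
#include <regex>
#include <string>

// Parse one "name = value" line: read whole lines with getline first,
// then apply a regex to the line. The grammar here is invented for
// illustration.
bool parse_assignment(const std::string& line,
                      std::string& name, std::string& value)
{
    static const std::regex re("\\s*(\\w+)\\s*=\\s*(\\S+)\\s*");
    std::smatch m;
    if (!std::regex_match(line, m, re))
        return false;
    name = m[1];
    value = m[2];
    return true;
}
```

The point of the idiom is separation: the stream only delivers lines, and all grammar knowledge lives in the pattern, where a failed match can produce a precise error message.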

--
James Kanze GABI Software

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Roland Pibinger

unread,
Feb 12, 2005, 9:38:06 PM2/12/05
to
On 10 Feb 2005 15:04:11 -0500, jto...@yahoo.com wrote:

>Anyway, it would make a pretty cool
>challenge to come with a better iostream design. If noone tries it, I
>might take a shot at it in a few months.

Please no pretty cool designs any more! Just something usable like,
ahem, Java Streams, Reader/Writer, and Filters!

Best wishes,
Roland Pibinger

jto...@yahoo.com

unread,
Feb 14, 2005, 10:41:33 AM2/14/05
to

> > - they are split into streams and stream buffers, but what for?
>
> Because there are two separate concerns. The split allows two
> orthogonal ways to extend -- formatting new types, and creating
> new sinks and sources.
>
> This split is probably the best part of iostream.

I do like it myself, but it's pretty unsafe. What use is a stringstream
with a filebuf?

>
> > Basically, if you change the buffer of a stream (for
> > example,set a file's buf to point to another stringstream's
> > buf), you simply need to be careful later on.
>
> But you typically don't change the buffer of a stream. You
> create a new stream with a new buffer.

We typically don't do a lot of stuff, but it does not mean we shouldn't
deliver more careful interfaces to our classes.

>
> > - formatting is very poorly implemented. Some IO manipulators
> > only work for the next written item, and some work for the
> > remainder of the stream. Basically, if I need formatting, I
> > simply use boost::format. Quick: how do you write a float,
> > formatted like "xx.xxxx"? What about two floats?
>
> You'll note that both boost::format and my GB_Format base
> themselves on ostream. So it can't be doing everything wrong.

Yes, I love std::stringstream myself.

As a side-note, the logging lib I've developed is based on
std::ostream. So I'm aware of iostream advantages.

>
> > - for custom formatting for some specific class, you need to
> > go deep (iword/pword or facets)
>
> Not necessarily. I've only rarely used iword/pword, for
> example, and I've written a lot of custom formatters. I'd guess
> that for 90% or more of the user classes, the only formatting
> option which is relevant is the width.

Not necessarily. When writing arrays/collections, you can have other
formatting, for instance. A few months ago, for the review of Output
Formatters library, these issues came up (the fact that the stream
should hold the formatting flags, not the writer object itself).

> > No wonder Jonathan Turkanis wrote a library to ease creation
> > of stream classes :)
>
> I haven't looked at it, but I imagine that the most important
> thing it would do would be to provide the convenience classes
> which derive from istream and ostream. It's hard to get simpler
> than overriding overflow and underflow.

I wouldn't bet on it. Jonathan?

>
> All in all, I rather like the iostream design. I think a bit
> too much has been grafted onto it, and that the changes wrought
> by the standard, although probably necessary, are not
> particularly well done. But the basic principle works well, up
> to a point, and I've not seen any alternatives which do better.
>

Yes, I do agree that the basic principle works well.

Best,
John

--
John Torjo, Contributing editor, C/C++ Users Journal
-- "Win32 GUI Generics" -- generics & GUI do mix, after all
-- http://www.torjo.com/win32gui/ -v1.6.3 (Resource Splitter)
-- http://www.torjo.com/cb/ - Click, Build, Run!

jto...@yahoo.com

unread,
Feb 14, 2005, 4:13:11 PM2/14/05
to

Roland Pibinger wrote:
> On 10 Feb 2005 15:04:11 -0500, jto...@yahoo.com wrote:
>
> >Anyway, it would make a pretty cool
> >challenge to come with a better iostream design. If noone tries it, I
> >might take a shot at it in a few months.
>
> Please no pretty cool designs any more! Just something usable like,
> ahem, Java Streams, Reader/Writer, and Filters!
>

I'm sorry, you probably misunderstood me. I said "cool challenge", not
"cool design". What I wanted was a better design than iostreams. I did
not mean a "cool design".

Meanwhile, you can check out Jonathan's lib - it's a very good one!

Best,
John

--
John Torjo, Contributing editor, C/C++ Users Journal
-- "Win32 GUI Generics" -- generics & GUI do mix, after all
-- http://www.torjo.com/win32gui/ -v1.6.3 (Resource Splitter)
-- http://www.torjo.com/cb/ - Click, Build, Run!

Andrei Alexandrescu (See Website for Email)

unread,
Feb 14, 2005, 4:15:57 PM2/14/05
to
jto...@yahoo.com wrote:
>>All in all, I rather like the iostream design. I think a bit
>>too much has been grafted onto it, and that the changes wrought
>>by the standard, although probably necessary, are not
>>particularly well done. But the basic principle works well, up
>>to a point, and I've not seen any alternatives which do better.
>>
>
> Yes, I do agree that the basic principle works well.

How in the world? Are we living on the same planet here? C++ IOStreams
easily rate the worst I/O package of all languages I know.

With IOStreams, I can't easily say something as simple as "Skip all
whitespace, "[", "]", or "===". And that's not regular expressions! No,
Sir. I got to call get() and then check if it's whitespace, "[", etc.
etc. etc.

So how can you like the design if it does some things that are rarely
needed (wow! formatting Euros automatically! gotta love that), while
being totally inapt at doing the most trivial tasks in I/O: the simplest
string parsing. And I'm not even asking for regular expressions.


Andrei

jto...@yahoo.com

unread,
Feb 15, 2005, 5:05:50 AM2/15/05
to

Andrei Alexandrescu (See Website for Email) wrote:
> jto...@yahoo.com wrote:
> >>All in all, I rather like the iostream design. I think a bit
> >>too much has been grafted onto it, and that the changes wrought
> >>by the standard, although probably necessary, are not
> >>particularly well done. But the basic principle works well, up
> >>to a point, and I've not seen any alternatives which do better.
> >>
> >
> > Yes, I do agree that the basic principle works well.
>
> How in the world? Are we living on the same planet here? C++
IOStreams
> easily rate the worst I/O package of all languages I know.
>
> With IOStreams, I can't easily say something as simple as "Skip all
> whitespace, "[", "]", or "===". And that's not regular expressions!
No,
> Sir. I got to call get() and then check if it's whitespace, "[", etc.

> etc. etc.

First of all, I don't recall needing this in the past.
I do recall needing regex several times, but that's another story.

Second, what I meant by "the basic principle works well", is that even
with this design, I have successfully used iostreams for (not so
advanced) I/O.

Knowing that iostreams are not good at too advanced I/O, I have adapted
to it. So, the syntax of files/streams I read from/write to is very
simple (for instance, a record on a line, or so). I rarely need
std::setw, std::setfill and other manipulators.

There still are a few idioms that work rather well.

// reading
std::string line;
while ( std::getline(in, line) )
...; // process line

// writing
for ( iterator b = ..., e = ...; b != e; ++b)
out << *b << std::endl;

Best,
John


--
John Torjo, Contributing editor, C/C++ Users Journal
-- "Win32 GUI Generics" -- generics & GUI do mix, after all
-- http://www.torjo.com/win32gui/ -v1.6.3 (Resource Splitter)
-- http://www.torjo.com/cb/ - Click, Build, Run!

ka...@gabi-soft.fr

unread,
Feb 15, 2005, 5:36:04 PM2/15/05
to
jto...@yahoo.com wrote:
> > > - they are split into streams and stream buffers, but what for?

> > Because there are two separate concerns. The split allows
> > two orthogonal ways to extend -- formatting new types, and
> > creating new sinks and sources.

> > This split is probably the best part of iostream.

> I do like it myself, but it's pretty unsafe. What use is a
> stringstream with a filebuf?

That, I don't know, but it's possible that I might start with a
filestream, and use one of my filtering streambuf's with it. Or
someone might -- I learned iostream's with the classical
iostream, where you couldn't change the streambuf once it was
set, and so I don't tend to think in those terms.

If you do happen to do it, where is the problem? Generally
speaking, the derived classes, like stringstream, are only their
special types during construction. Once the object is
constructed, with the (in this case) stringbuf, it acts exactly
like any other iostream. If you change the streambuf, it uses
the new one.

> > > Basically, if you change the buffer of a stream (for
> > > example,set a file's buf to point to another
> > > stringstream's buf), you simply need to be careful later
> > > on.

> > But you typically don't change the buffer of a stream. You
> > create a new stream with a new buffer.

> We typically don't do a lot of stuff, but it does not mean we
> shouldn't deliver more careful interfaces to our classes.

Well, I'm not totally convinced of the necessity of the
non-const rdbuf() myself. As I said, having gotten used to
iostream in pre-standard days, I never feel the need to use it.
But if it's useful to some people, I don't see any reason to
deprive them of it.

With regards to "be careful later on", I think the poster was
thinking of cout and cin, where you really do have to restore
the original streambuf before global destructors are called. Or
at least ensure that the streambuf you've set is never
destructed. But that's a general lifetime of object issue, and
not particular to iostreams. Any time you furnish an object to
others to use, you must ensure that the object's lifetime is
sufficient.
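The cout case can be sketched like this: swap another streambuf in with rdbuf(), use the stream, and restore the original before anything else (including global destructors) touches it:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Redirect std::cout into a stringbuf, write through it, then restore
// the original streambuf before the stream is used anywhere else.
std::string capture_cout()
{
    std::ostringstream capture;
    std::streambuf* old = std::cout.rdbuf(capture.rdbuf()); // swap in
    std::cout << "hello from cout";
    std::cout.rdbuf(old);  // restore: essential before cout is reused
    return capture.str();
}
```

Forgetting the restore step is the "be careful later on" hazard: cout would keep pointing at a stringbuf whose lifetime has ended.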

> > > - formatting is very poorly implemented. Some IO
> > > manipulators only work for the next written item, and some
> > > work for the remainder of the stream. Basically, if I need
> > > formatting, I simply use boost::format. Quick: how do you
> > > write a float, formatted like "xx.xxxx"? What about two
> > > floats?

> > You'll note that both boost::format and my GB_Format base
> > themselves on ostream. So it can't be doing everything
> > wrong.

> Yes, I love std::stringstream myself.

> As a side-note, the logging lib I've developed is based on
> std::ostream. So I'm aware of iostream advantages.

> > > - for custom formatting for some specific class, you need
> > > to go deep (iword/pword or facets)

> > Not necessarily. I've only rarely used iword/pword, for
> > example, and I've written a lot of custom formatters. I'd
> > guess that for 90% or more of the user classes, the only
> > formatting option which is relevant is the width.

> Not necessary. When writing arrays/collections, you can have
> other formatting, for instance. A few months ago, for the
> review of Output Formatters library, these issues came up (the
> fact that the stream should hold the formatting flags, not the
> writer object itself).

Most of the time, when I want a different format, it is a
completely different format. In that case, my general solution
is to create a wrapper object.

I suspect that there are exceptions, but I've not encountered
them.

> > > No wonder Jonathan Turkanis wrote a library to ease
> > > creation of stream classes :)

> > I haven't looked at it, but I imagine that the most
> > important thing it would do would be to provide the
> > convenience classes which derive from istream and ostream.
> > It's hard to get simpler than overriding overflow and
> > underflow.

> I wouldn't bet on it. Jonathan?

I'll have to look at it. But the last time I wrote a streambuf,
it took me exactly five lines of code. Including the class
definition. OK -- I'll admit that it was a very special case,
but between ten and fifteen lines seems about par for the
course.

Having said that -- anytime you're deriving, there's a lot of
boilerplate code. I use a template for most of my filtering
streambufs, for example. (On the third hand, well over half the
code in the template is because I want to support both classical
and standard streams, including all of the widespread
pre-standard idioms, and I only want to count on the written
guarantees of the pre-standard streams. Thus, an input
streambuf will override overflow(), to do what the standard
guarantees that streambuf::overflow does, because the
documentation I had of the pre-standard streams didn't give any
guarantee here.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Andrei Alexandrescu (See Website for Email)

unread,
Feb 15, 2005, 6:43:51 PM2/15/05
to
jto...@yahoo.com wrote:
> First of all, I don't recall needing this in the past.

But such things are needed when you parse something as banal as an ini
file.

> Second, what I meant by "the basic principle works well", is that even
> with this design, I have successfully used iostreams for (not so
> advanced) I/O.
>
> Knowing that iostreams are not good at too advanced I/O, I have adapted
> to it. So, the syntax of files/streams I read from/write to is very
> simple (for instance, a record on a line, or so). I rarely need
> std::setw, std::setfill and other manipulators.
>
> There still are a few idioms that work rather well.
>
> // reading
> std::string line;
> while ( std::getline(in, line) )
> ...; // process line
>
> // writing
> for ( iterator b = ..., e = ...; b != e; ++b)
> out << *b << std::endl;

But that's not "not so advanced". It's "just read the characters
without any formatting and do everything yourself". You gotta admit
that that's a far cry below what all serious and many not-so-serious
languages offer for I/O.

So, yes: the basic principle works well. That basic principle is a
function that reads characters from a file :o).

Anyway, I appreciate this communication. Before, I was doing those
things and was feeling like a fool and an ignoramus who's missing out
on great facilities; now I keep on doing the same things and I'm
feeling hip and cool.


Andrei

Roland Pibinger

unread,
Feb 15, 2005, 6:54:45 PM2/15/05
to
On 14 Feb 2005 16:13:11 -0500, jto...@yahoo.com wrote:

>
>Roland Pibinger wrote:
>> On 10 Feb 2005 15:04:11 -0500, jto...@yahoo.com wrote:
>>
>> >Anyway, it would make a pretty cool
>> >challenge to come with a better iostream design. If noone tries it,
>I
>> >might take a shot at it in a few months.
>>
>> Please no pretty cool designs any more! Just something usable like,
>> ahem, Java Streams, Reader/Writer, and Filters!
>>
>
>I'm sorry, you probably misunderstood me. I said "cool challenge", not
>"cool design". What I wanted was a better design than iostreams. I did
>not mean a "cool design".
>
>Meanwhile, you can check out Jonathan's lib - it's a very good one!

You mean the "Policy-based stream buffer template with an interface
similar to std::basic_filebuf"? Pretty cool design. Too cool for me.

Best regards,
Roland Pibinger

Jonathan Turkanis

unread,
Feb 16, 2005, 5:56:52 AM2/16/05
to
Roland Pibinger wrote:
> On 14 Feb 2005 16:13:11 -0500, jto...@yahoo.com wrote:

>> I'm sorry, you probably misunderstood me. I said "cool challenge",
>> not "cool design". What I wanted was a better design than iostreams.
>> I did not mean a "cool design".
>>
>> Meanwhile, you can check out Jonathan's lib - it's a very good one!
>
> You mean the "Policy-based stream buffer template with an interface
> similar to std::basic_filebuf"? Pretty cool design. Too cool for me.

Take a look at some examples before you dismiss it as too complicated; the
library is designed to be very easy to use. Everything fancy (and really there's
not much) is done behind the scenes.

The policy classes just contain one or more read/write/seek functions which
describe in a very straightforward way how to access data using the underlying
device -- e.g., file, network connection or in-memory array. It's much more
straightforward, IMO, than overriding underflow, overflow, pbackfail, ... .
Plus, you get buffering and the ability to put back characters for free.

Maybe calling it 'policy-based' gives the impression that the design is
convoluted. I hope not. After all, I didn't realize it was policy-based until
after I wrote it ;-)

BTW, the tutorial is here:

http://www.kangaroologic.com/iostreams/libs/iostreams/doc/?path=3

(It's slightly out-of-date: the library is even simpler now :-)

Best Regards,
Jonathan

Jonathan Turkanis

unread,
Feb 16, 2005, 5:58:29 AM2/16/05
to
ka...@gabi-soft.fr wrote:
> jto...@yahoo.com wrote:

>>>> No wonder Jonathan Turkanis wrote a library to ease
>>>> creation of stream classes :)
>
>>> I haven't looked at it, but I imagine that the most
>>> important thing it would do would be to provide the
>>> convenience classes which derive from istream and ostream.
>>> It's hard to get simpler than overriding overflow and
>>> underflow.
>
>> I wouldn't bet on it. Jonathan?
>
> I'll have to look at it. But the last time I wrote a streambuf,
> it took me exactly five lines of code. Including the class
> definition. OK -- I'll admit that it was a very special case,
> but between ten and fifteen lines seems about par for the
> course.

As I said in another message which hasn't posted, this is basically what Dietmar
Kuehl said about the library. However, it wasn't designed for people like you
and Dietmar ;-) Many people find the interface for customizing basic_streambuf
quite complicated.

{You need a little more patience. Moderation takes time. -mod}

Even so, I'd say ten to fifteen lines for a streambuf definition is unusually
short, if you provide buffering and override xsgetn and/or xsputn instead of
just overflow/underflow. My generic implementation of xsgetn with a putback
buffer is about 20 lines long by itself.

> Having said that -- anytime you're deriving, there's a lot of
> boilerplate code. I use a template for most of my filtering
> streambufs, for example.

Right -- eliminating boilerplate is one of the central aims of the library,
particularly since buffer manipulation is very error prone. In addition to
buffering, handling seeking in read/write streams can be tricky.

> (On the third hand, well over half the
> code in the template is because I want to support both classical
> and standard streams, including all of the widespread
> pre-standard idioms, and I only want to count on the written
> guarantees of the pre-standard streams. Thus, an input
> streambuf will override overflow(), to do what the standard
> guarantees that streambuf::overflow does, because the
> documentation I had of the pre-standard streams didn't give any
> guarantee here.)

At first I was against supporting classic iostreams, since Boost is about
promoting and extending standard C++. However, I recently decided to try adding
support for the iostreams libraries used by GCC 2.9x, and I'm still holding my
breath, but it looks like all the narrow-stream tests are passing. That's really
the only platform with classic iostreams that's relevant to Boost, though it
would be interesting to see what happens on other platforms.

By the way, the library documentation is here:

http://www.kangaroologic.com/iostreams/

It's no longer in sync with the Boost CVS version, but there haven't been any
big interface changes.

Best Regards,
Jonathan

Jonathan Turkanis

unread,
Feb 16, 2005, 5:58:07 AM2/16/05
to
ka...@gabi-soft.fr wrote:
> jto...@yahoo.com wrote:

>> No wonder Jonathan Turkanis wrote a library to ease creation
>> of stream classes :)
>
> I haven't looked at it, but I imagine that the most important
> thing it would do would be to provide the convenience classes
> which derive from istream and ostream. It's hard to get simpler
> than overriding overflow and underflow.

That's basically what Dietmar Kuehl said when he reviewed the library --
something along the lines of "it's already so simple to write a streambuf, who
needs a library?" For many people, including me, the protected virtual interface
of basic_streambuf is a bit obscure. For instance, overflow and pbackfail take
an int_type which have to be checked against eof. It's far easier to define a
class which a implements a subset of read/write/seek, and then generate a stream
or stream buffer via a typedef. E.g.,

struct MyDevice : boost::iostreams::sink {
void write(const char* s, std::streamsize n)
{
// write to underlying device
}
};

typedef boost::iostreams::streambuf_facade<MyDevice> MyStreambuf;

The streams and stream buffers generated in this manner are accessed with an
open/is_open/close interface similar to std::filebuf.

In addition to this (slight) streamlining of the interface, you get automatic
buffering and the ability to put back characters without having to use any of the
ten or so buffer management functions. The library also makes it easy to add a
layer of code-conversion or line-ending conversion to any stream or stream
buffer.

There's also a filtering framework based on your (and, I gather, Dietmar's)
filtering stream buffers. E.g.,

filtering_istream in;
in.push(tab_expanding_filter(4));
in.push(gzip_decompressor());
in.push(std::cin);
// read decompressed data from standard input with tabs expanded.
// (usual caveats about reading binary data apply)

Finally, the fact that access to the underlying device is separated from the
buffering component allows the device abstractions to be used in situations
where stream buffers are unsuitable. For example, on some platforms
std::streamoff is a 32-bit long even though the OS can handle 64-bit seeks. The
Boost Iostreams library's file_descriptor component can then be used directly
when std::filebuf would be inadequate.

There's a brief tutorial here, which is basically up-to-date:

http://www.kangaroologic.com/iostreams/libs/iostreams/doc/?path=3

Best Regards,
Jonathan

ka...@gabi-soft.fr

unread,
Feb 16, 2005, 6:03:41 PM2/16/05
to
Jonathan Turkanis wrote:
> ka...@gabi-soft.fr wrote:
> > jto...@yahoo.com wrote:

> >>>> No wonder Jonathan Turkanis wrote a library to ease
> >>>> creation of stream classes :)

> >>> I haven't looked at it, but I imagine that the most
> >>> important thing it would do would be to provide the
> >>> convience classes which derive from istream and ostream.
> >>> It's hard to get simpler than overriding overflow and
> >>> underflow.

> >> I wouldn't bet on it. Jonathan?

> > I'll have to look at it. But the last time I wrote a
> > streambuf, it took me exactly five lines of code. Including
> > the class definition. OK -- I'll admit that it was a very
> > special case, but between ten and fifteen lines seems about
> > par for the course.

> As I said in another message which hasn't posted, this is
>> basically what Dietmar Kuehl said about the library.
> However, it wasn't designed for people like you and Dietmar
> ;-) Many people find the interface for customizing
> basic_streambuf quite complicated.

Everything is complicated the first time you do it:-). I don't
think it's that complicated, and I think that it is something
that the average programmer could learn without too much
effort. However...

In one of your other postings, you posted a link to an
introduction to your templates. In addition, I've been looking
at some of my own code. And I'm revising my opinion somewhat.
I still think that the basic streambuf interface is not all that
complicated, and is well designed for what it does. It is a
very, very generic interface, however, which is designed to
literally allow the derived class a maximum of freedom, to do
anything it wishes. In practice, there are a couple of
"standard" patterns which are used: data sinking, data sourcing,
filtering, etc. Obviously, no one of these patterns needs all
of the degrees of freedom provided. And providing templates
which manage the degrees of freedom not needed is a good idea.
While a competent programmer should have no problem learning how
to derive correctly directly from streambuf, and I wouldn't ever
qualify it as a lot of work, or difficult, using a template for
one of the standard patterns will still save a significant
amount of needless effort. (If I look back... I consider
myself a competent programmer, at least with regards to
streambuf's, and yet it's been years since I've directly derived
from streambuf to implement a filtering streambuf -- I wrote my
own templates to do it a long time ago. So regardless of what
I've been saying, I've been doing what you are saying.)

> Even so, I'd say ten to fifteen lines for a streambuf
> definition is unusually short, if you provide buffering and
> override xsgetn and/or xsputn instead of just
> overflow/underflow. My generic implementation of xsgetn with a
> putback buffer is about 20 lines long by itself.

True. But I'd say that most streambuf definitions don't really
need buffering. But of course I'm prejudiced; most of my
streambuf's are filtering streambuf's, where you normally want
to avoid buffering. (The buffering is taking place in the main
streambuf, and I find it very useful to keep the main streambuf
synchronized with the filter -- I often insert and remove
filters in the middle of a file.)

> > Having said that -- anytime you're deriving, there's a lot
> > of boilerplate code. I use a template for most of my
> > filtering streambufs, for example.

> Right -- eliminating boilerplate is one of the central aims of
> the library, particularly since buffer manipulation is very
> error prone. In addition to buffering, handling seeking in
> read/write streams can be tricky.

If you want to support everything: a bidirectional streambuf
with buffering which supports seeking, you do begin to have a
lot of work. Nothing conceptually difficult, of course, but a
lot of details to get right. Or wrong, if you're not
ultra-careful.

> > (On the third hand, well over half the code in the template
> > is because I want to support both classical and standard
> > streams, including all of the widespread pre-standard
> > idioms, and I only want to count on the written guarantees
> > of the pre-standard streams. Thus, an input streambuf will
> > override overflow(), to do what the standard guarantees that
> > streambuf::overflow does, because the documentation I had of
> > the pre-standard streams didn't give any guarantee here.)

> At first I was against supporting classic iostreams, since
> Boost is about promoting and extending standard C++. However,
> I recently decided to try adding support for the iostreams
> libraries used by GCC 2.9x, and I'm still holding my breath,
> but it looks like all the narrow-stream tests are
> passing. That's really the only platform with classic
> iostreams that's relevant to Boost, though it would be
> interesting to see what happens on other platforms.

> By the way, the library documentation is here:

> http://www.kangaroologic.com/iostreams/

> It's no longer in sync with the Boost CVS version, but there
> haven't been any big interface changes.

I've only given it a quick glance, but it looks very, very
good. (If only Boost would work with Sun CC:-(. As it is, I
even have trouble with my simple templates, which don't attempt
to template on the character type of the stream.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

unread,
Feb 16, 2005, 6:03:19 PM2/16/05
to
Jonathan Turkanis wrote:

> There's also a filtering framework based on your (and, I
> gather, Dietmar's) filtering stream buffers. E.g.,

Very definitely also Dietmar's. I published the articles, and
for many years, have had examples and even a template available
on the network, but the original idea came out of discussions
between Dietmar and myself in comp.lang.c++ (before the
appearance of the moderated group, even). I am incapable now of
saying who first thought of what, but I rather suspect that it
was synergy -- neither of us would have come up with the idea
alone. I know at any rate that without Dietmar's comments, I
wouldn't have come up with the idea myself.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

unread,
Feb 16, 2005, 6:08:02 PM2/16/05
to
Jonathan Turkanis wrote:

> The policy classes just contain one or more read/write/seek
> functions which describe in a very straightforward way how to
> access data using the underlying device -- e.g., file, network
> connection or in-memory array. It's much more straightforward,
> IMO, than overriding underflow, overflow, pbackfail, ... .
> Plus, you get buffering and the ability to put back characters
> for free.

> Maybe calling it 'policy-based' gives the impression that the
> design is convoluted. I hope not. After all, I didn't realize
> it was policy-based until after I wrote it ;-)

A bit like Mr. Jourdain, no doubt.

I find a lot of the best ideas I've encountered are somewhat
like that. Generally used as ad hoc solutions before someone
recognized the pattern and gave it a name. I suppose the
Extractor and Insertor template parameters to my filtering
streambuf's are also "policies". It just never occurred to me to
qualify them as such. Nor to define the technique I was using
as a pattern.

Thus, about a year ago, when upgrading machines, I stumbled on
code I'd written around 1990, and discovered that I'd been using
traits without even knowing it.

There's a tendency to minimize the invention in such cases.
IMHO, this is wrong; recognizing a number of ad hoc solutions as
being one, and extracting the underlying pattern, is a stroke of
genius, and should be recognized as such. As it is, because of
the tendency to minimize such discoveries, the discoverer often
adds a lot of complications, just to impress us. Whereas the
brilliance in such ideas as traits or policies is precisely
their simplicity; the fact that any John Doe programmer can
understand them, and the fact that he has probably been using
them already without knowing it (or recognizing the pattern).

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jonathan Turkanis

unread,
Feb 19, 2005, 11:40:27 PM2/19/05
to
ka...@gabi-soft.fr wrote:
> Jonathan Turkanis wrote:

>> As I said in another message which hasn't posted, this is
>> basically what Dietmar Kuehl's said about the library.
>> However, it wasn't designed for people like you and Dietmar
>> ;-) Many people find the interface for customizing
>> basic_streambuf quite complicated.
>
> Everything is complicated the first time you do it:-).

Of course ;-)

> I don't
> think it's that complicated, and I think that it is something
> that the average programmer could learn without too much
> effort.

Let me put it this way. Each time I implement overflow I have to check the
specification to determine what to do if the argument is EOF, but I never have
the slightest doubt about what ostream::put does. Of course, this is partly
because I use put frequently and implement overflow only occasionally, but I also
think it's partly because of a non-intuitive interface.

> However...

> In one of your other postings, you posted a link to an
> introduction to your templates. In addition, I've been looking
> at some of my own code. And I'm revising my opinion somewhat.
> I still think that the basic streambuf interface is not all that
> complicated, and is well designed for what it does. It is a
> very, very generic interface, however, which is designed to
> literally allow the derived class a maximum of freedom, to do
> anything it wishes. In practice, there are a couple of
> "standard" patterns which are used: data sinking, data sourcing,
> filtering, etc. Obviously, no one of these patterns needs all
> of the degrees of freedom provided. And providing templates
> which manage the degrees of freedom not needed is a good idea.
> While a competent programmer should have no problem learning how
> to derive correctly directly from streambuf, and I wouldn't ever
> qualify it as a lot of work, or difficult, using a template for
> one of the standard patterns will still save a significant
> amount of needless effort.

Well said. Probably my current motivational discussion for the library ("the
protected virtual interface is confusing ...") is a bit superficial. Something
like the above would be a good idea.

>> Even so, I'd say ten to fifteen lines for a streambuf
>> definition is unusually short, if you provide buffering and
>> override xsgetn and/or xsputn instead of just
>> overflow/underflow. My generic implementation of xsgetn with a
>> putback buffer is about 20 lines long by itself.
>
> True. But I'd say that most streambuf definitions don't really
> need buffering. But of course I'm prejudiced; most of my
> streambuf's are filtering streambuf's, where you normally want
> to avoid buffering. (The buffering is taking place in the main
> streambuf, and I find it very useful to keep the main streambuf
> synchronized with the filter -- I often insert and remove
> filters in the middle of a file.)

In the Boost library buffering can be specified at the time a filter is added to
a chain. Often you want no buffering, but sometimes, e.g. with compression
filters, buffering is important. I plan to add a mechanism for filter writers to
specify default buffering options.

>>> (On the third hand, well over half the code in the template
>>> is because I want to support both classical and standard
>>> streams, including all of the widespread pre-standard
>>> idioms, and I only want to count on the written guarantees
>>> of the pre-standard streams. Thus, an input streambuf will
>>> override overflow(), to do what the standard guarantees that
>>> streambuf::overflow does, because the documentation I had of
>>> the pre-standard streams didn't give any guarantee here.)

I forgot to mention that this same rationale applies even when dealing with
standard iostreams, since implementations differ in subtle ways.

>> At first I was against supporting classic iostreams, since
>> Boost is about promoting and extending standard C++. However,
>> I recently decided to try adding support for the iostreams
>> libraries used by GCC 2.9x, and I'm still holding my breath,
>> but it looks like all the narrow-stream tests are
>> passing. That's really the only platform with classic
>> iostreams that's relevant to Boost, though it would be
>> interesting to see what happens on other platforms.
>
>> By the way, the library documentation is here:
>
>> http://www.kangaroologic.com/iostreams/
>
>> It's no longer in sync with the Boost CVS version, but there
>> haven't been any big interface changes.
>
> I've only given it a quick glance, but it looks very, very
> good.

Thank you very much!

> (If only Boost would work with Sun CC:-(. As it is, I
> even have trouble with my simple templates, which don't attempt
> to template on the character type of the stream.)

Jonathan
