Guru of the Week #85: Style Case Study #3: Construction Unions

Natale Fietta

Aug 12, 2002, 11:17:17 AM

On 9 Aug 2002 14:52:38 -0400, "Anthony Williams"
<ant...@nortelnetworks.com> wrote:

>The //else //case and //switch comments are distracting, and add
>nothing.

I respectfully disagree; in my opinion they make the code easier to read
because they permit faster matching of closing braces.

Regards,
Natale Fietta

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]

John Potter

Aug 13, 2002, 11:25:12 AM

On 12 Aug 2002 11:17:17 -0400, ros...@iperbole.bologna.it (Natale
Fietta) wrote:

> On 9 Aug 2002 14:52:38 -0400, "Anthony Williams"
> <ant...@nortelnetworks.com> wrote:

> >The //else //case and //switch comments are distracting, and add
> >nothing.

> I respectfully disagree, in my opinion they facilitate reading because
> permit a faster matching of closing braces.

My counter opinion.

Matching braces is a distraction, not reading. Well written code would
be more readable with the braces removed, see Python. Braces are for
dumb compilers not smart people. The comments go away when the braces
go away showing their non-utility.

John

Geoff Field

Aug 14, 2002, 8:43:30 AM

"John Potter" <jpo...@falcon.lhup.edu> wrote in message
news:3d5803bb...@newstest2.earthlink.net...

> On 12 Aug 2002 11:17:17 -0400, ros...@iperbole.bologna.it (Natale
> Fietta) wrote:
>
> > On 9 Aug 2002 14:52:38 -0400, "Anthony Williams"
> > <ant...@nortelnetworks.com> wrote:
>
> > >The //else //case and //switch comments are distracting, and add
> > >nothing.
>
> > I respectfully disagree, in my opinion they facilitate reading because
> > permit a faster matching of closing braces.

Modern editors (and even some older ones) allow brace matching using
a single keystroke (or maybe a double stroke), but I still personally prefer
to see the closing comments on any set of braces more than about 10
or 20 lines apart, particularly if they're nested at all.

> My counter opinion.
>
> Matching braces is a distraction, not reading. Well written code would
> be more readable with the braces removed, see Python. Braces are for
> dumb compilers not smart people. The comments go away when the braces
> go away showing their non-utility.

As someone who has spent a *lot* of time maintaining code - both my own
and others' - I can't let this counter-opinion go uncommented. Yes,
well-written code should be readable with or without braces, but often
they serve to make divisions/sections of code clearer. Maintaining code
that has been written with clear sections (and adequate comments) is much
less of a chore than having to trawl through some other sod's code (or my
own some time later) where they've played fast and furious with operator
precedence just to show that they can.

Geoff

--
Geoff Field, Professional geek, amateur stage-levelling gauge.
Spamtraps: geoff...@suespammers.org, geoff...@hotmail.com or
geoff...@great-atuin.co.uk; Real Email: gcfield at optusnet dot com dot
au
The suespammers.org mail server is located in California; do not send
unsolicited bulk or commercial e-mail to my suespammers.org address

Francis Glassborow

Aug 14, 2002, 5:15:20 PM

In article <10292860...@cswreg.cos.agilent.com>, Geoff Field
<geoff...@suespammers.org> writes

> > My counter opinion.
> >
> > Matching braces is a distraction, not reading. Well written code
> > would be more readable with the braces removed, see Python. Braces
> > are for dumb compilers not smart people. The comments go away when
> > the braces go away showing their non-utility.
>
>As someone who has spent a *lot* of time maintaining code - both my own
>and others, I can't let this counter-opinion go uncommented. Yes,
>well-written code should be readable with or without braces, but often
>they serve to make divisions/sections of code clearer. Maintaining
>code that has been written with clear sections (and adequate comments)
>is much less of a chore than having to trawl through some other sod's
>code (or my own some time later) where they've played fast and furious
>with operator precedence just to show that they can.

I think you miss the point. In Python indenting is a syntactic element.
In C++ braces are used to delimit scopes but there is no requirement for
indenting so to be safe we use comments to indicate where we think we
are. Personally I would prefer to have a code beautifier that simply
enforces an appropriate indentation policy where once again the need for
comments goes away.

--
Francis Glassborow ACCU
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Natale Fietta

Aug 15, 2002, 10:00:29 AM

On 13 Aug 2002 11:25:12 -0400, jpo...@falcon.lhup.edu (John Potter)
wrote:

>Matching braces is a distraction, not reading. Well written code would
>be more readable with the braces removed, see Python. Braces are for
>dumb compilers not smart people. The comments go away when the braces
>go away showing their non-utility.

I don't know Python. If it is possible to write good-quality, easily
readable C++ code without braces, I would love to see an example.

Regards,
Natale Fietta

Andrei Iltchenko

Aug 15, 2002, 6:58:29 PM

"Herb Sutter" <hsu...@gotw.ca> wrote in message
news:94r5lus2qovpeuauj...@4ax.com...

> GotW #85: Style Case Study #3: Construction Unions
>
> Difficulty: 4 / 10
> ______________________________________________________________________
>
>
> JG Questions
> ------------
>
> 1. What are unions, and what purpose do they serve?

A union is a means of introducing a multiply-aliased region of
storage.

> 2. What kinds of types cannot be used as members of unions? Why do
> these limitations exist? Explain.

Given that members of a class (union) are data members, member
functions, nested types, and enumerations, any C++ type may be a member
of a union. This absence of restrictions can be explained by the lack of
side-effects caused by a nested type declaration (a typedef declaration,
for instance) on the value of an object of the nested type's enclosing
union type.

On the other hand, there exist restrictions on the types of non-static
data members of a union, but that is not what the question asks.

> Guru Questions
> --------------
>
> 3. The article in [1] cites the motivating case of writing a scripting
> language: Say that you want your language to support a single type
> for variables that at various times can hold an integer, a string,
> or a list. Creating a union { int i; list<int> l; string s; }
> doesn't work for the reasons given above. The following code
> presents a workaround that attempts to support allowing any type to
> participate in a union. For a more detailed explanation, see the
> original article.
>
> Critique this code and identify:
>
> a) Mechanical errors, such as invalid syntax or nonportable
> conventions.
>
> b) Stylistic improvements that would improve code clarity,
> reusability, and maintainability.
>
> #include <list>
> #include <string>
> #include <iostream>

None of the entities defined in the header <iostream> is used in the
code below. Needlessly including this header may carry a performance
penalty, since its inclusion may result in the construction of the
objects 'std::cin, std::wcin, etc.' corresponding to the standard
byte/wide oriented streams.

> using namespace std;
>
> typedef list<int> LIST;
> typedef string STRING;

The original article also contained the following macro definition:
#define max(a, b) ((a) > (b) ? (a) : (b))

> struct MYUNION {
> MYUNION() : currtype( NONE ) {}
> ~MYUNION() {cleanup();}
>
> enum uniontype {NONE,_INT,_LIST,_STRING};

The above enum definition is an example of undefined behavior, since any
name that starts with an underscore followed by an upper-case letter is
reserved to the implementation. The author of the article is advised to
gain some familiarity with the contents of section 17.4 "Library wide
requirements" of the C++ Standard.

> uniontype currtype;

This one is pretty serious. Since the data member 'currtype' is public,
the user of 'MYUNION' might inadvertently change its value from say
'_INT' to '_LIST' avoiding a call to either 'getint', 'getlist', or
'getstring', a scenario which is bound to result in subsequent undefined
behavior. The member 'currtype' ought to have been declared either
'protected' or 'private'.
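
A minimal sketch of the two fixes just described, and not the article's
code: the enumerators avoid the reserved underscore-plus-capital
spelling, and the discriminator is private behind a read-only accessor.
The names Tagged, Kind and kind() are invented for illustration, and
only the int alternative is modelled.

```cpp
#include <cassert>

// Hypothetical names throughout; only the int alternative is modelled.
class Tagged {
public:
    enum Kind { None, Int };   // no underscore followed by a capital letter

    Tagged() : kind_(None), i_(0) {}

    Kind kind() const { return kind_; }   // read-only view of the tag

    // Makes int the active alternative (resetting the value) if it is not already.
    int& getint() {
        if (kind_ != Int) {
            i_ = 0;
            kind_ = Int;
        }
        return i_;
    }

private:
    Kind kind_;   // callers cannot desynchronise the tag from the stored value
    int  i_;
};

// Exercises the class: the tag changes only as a side effect of getint().
inline bool tag_tracks_value() {
    Tagged t;
    assert(t.kind() == Tagged::None);
    t.getint() = 42;
    return t.kind() == Tagged::Int && t.getint() == 42;
}
```

With the tag private, the only way to change it is through an accessor
that also fixes up the stored object, which is the invariant the public
'currtype' fails to protect.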

> inline int& getint();
> inline LIST& getlist();
> inline STRING& getstring();
>
> protected:
> union {
> int i;
> unsigned char buff[max(sizeof(LIST),sizeof(STRING))];
> } U;
>
> void cleanup();
> };
>
> inline int& MYUNION::getint()
> {
> if( currtype==_INT ) {
> return U.i;
> } else {
> cleanup();
> currtype=_INT;
> return U.i;
> } // else
> }
>
> inline LIST& MYUNION::getlist()
> {
> if( currtype==_LIST ) {
> return *(reinterpret_cast<LIST*>(U.buff));

Style note, the parentheses around the 'reinterpret_cast' expression are
redundant.

> } else {
> cleanup();
> LIST* ptype = new(U.buff) LIST();

1. If the standard header <new> had been included, the new-expression
above would have assuredly been resolved to the following placement
global allocation function: void* operator new(std::size_t, void*)
throw(). However, since this allocation function is not implicitly
declared by the implementation, overload resolution may well not be able
to find a viable function for the implicit call 'new(U.buff) LIST()'
rendering the program ill-formed.

2. There's no guarantee that the alignment requirements of the LIST
object being created are not stricter than those of the array object
designated by the expression 'U.buff'. If that is the case, the
new-expression will result in undefined behavior.
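
Both points can be sketched as follows, assuming a C++11 or later
compiler (alignas and alignof were not available when the article was
written). Storage, List and list_lives_in_buffer are hypothetical names;
this illustrates the idea and is not a drop-in repair of MYUNION.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <new>      // declares the placement form of operator new
#include <string>

typedef std::list<int> List;   // stands in for the article's LIST

// The buffer is sized for the larger type and aligned for the stricter one.
struct Storage {
    alignas(List) alignas(std::string)
    unsigned char buff[sizeof(List) > sizeof(std::string) ? sizeof(List)
                                                          : sizeof(std::string)];
};

inline bool list_lives_in_buffer() {
    Storage s;
    // The buffer address is a multiple of List's alignment requirement.
    assert(reinterpret_cast<std::uintptr_t>(s.buff) % alignof(List) == 0);
    List* p = new (s.buff) List();   // placement new, declared by <new>
    p->push_back(7);
    const bool ok = (p->size() == 1 && p->front() == 7);
    p->~List();                      // end the lifetime explicitly
    return ok;
}
```

When several alignas specifiers appear on one declaration the strictest
one applies, so the same buffer can safely hold either alternative.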

> currtype=_LIST;
> return *ptype;
> } // else
> }
>
> inline STRING& MYUNION::getstring()
> {
> if( currtype==_STRING) {
> return *(reinterpret_cast<STRING*>(U.buff));

Again, the parentheses around 'reinterpret_cast' are superfluous.

> } else {
> cleanup();
> STRING* ptype = new(U.buff) STRING();

Again, the above declaration-statement may render the program ill-formed
and result in undefined behavior should the alignment requirements of
the STRING object being created be more stringent than those of the array
'U.buff'.

> currtype=_STRING;
> return *ptype;
> } // else
> }
>
> void MYUNION::cleanup()
> {
> switch( currtype ) {
> case _LIST: {
> LIST& ptype = getlist();
> ptype.~LIST();

The explicit destructor invocation above may result in undefined
behavior. Consider this sequence of statements:

void foo()
{
    MYUNION myunion;
    LIST & mylist = myunion.getlist();
    mylist.~LIST(); // The naive user ends the lifetime of the LIST object.
    STRING & mystring = myunion.getstring(); // A second attempt to end
                                             // the lifetime of the LIST object...
}

> break;
> } // case
> case _STRING: {
> STRING& ptype = getstring();
> ptype.~STRING();

Same as above, the statement may result in undefined behavior.

> break;
> } // case
> default: break;
> } // switch
> currtype=NONE;
> }
>
> (For an idea of the kinds of things I'm looking for, see also Style
> Case Study #1 and Style Case Study #2.)

In my opinion the biggest drawback of the code is not its being fraught
with fragments resulting in undefined behavior but its lack of
usability. Suppose I had a problem at hand that called for me having,
say, 3 different union-like types: union { int i; list<int,
my_allocator<int> > l; string s; }, union { int i; std::list<double,
my_allocator<double> > l; std::string s; double d; }, and union {
std::deque<std::string,my_allocator<double> > d;
std::map<unsigned,std::wstring> ws; }. The author's code would be
absolutely useless in addressing it.

I also found the author's coding style of uppercasing the names of
introduced types quite unconventional.

All in all, in my opinion this was an article that should never have
made it into a C++ magazine read by over 40 thousand developers.

> 4. Show a better way to achieve a generalized variant type, and
> comment on any tradeoffs you encounter.

As far as I know this problem was extensively treated in Andrei
Alexandrescu's series "Discriminated Unions", with the trade-offs being
discussed there too.
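
For readers with a C++17 compiler, which postdates this thread, the
standard library's std::variant covers the same ground. The following is
a minimal sketch under that assumption; the alias Value and the function
name are invented for illustration.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <variant>

// Hypothetical alias: an empty state plus the three alternatives from the question.
using Value = std::variant<std::monostate, int, std::list<int>, std::string>;

inline bool variant_switches_alternatives() {
    Value v;                                   // starts out holding monostate
    bool ok = std::holds_alternative<std::monostate>(v);

    v = 42;                                    // now holds an int
    ok = ok && std::get<int>(v) == 42;

    v = std::string("hello");                  // the int is replaced by a string
    ok = ok && std::holds_alternative<std::string>(v)
            && std::get_if<int>(&v) == nullptr;

    v = std::list<int>{1, 2, 3};               // the string is destroyed, a list is built
    ok = ok && std::get<std::list<int>>(v).size() == 3;
    return ok;
}
```

The tradeoffs are the familiar ones: the object is at least as large as
its largest alternative plus a discriminator, and asking for the wrong
alternative with std::get throws std::bad_variant_access, whereas the
article's getters silently switch the active type.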

Regards,

Andrei Iltchenko
Brainbench MVP for C++
http://www.brainbench.com

Maciej Sinilo

Aug 15, 2002, 7:04:32 PM

In article from 15 Aug 2002 10:00:29 -0400 Natale Fietta says...

> On 13 Aug 2002 11:25:12 -0400, jpo...@falcon.lhup.edu (John Potter)
> wrote:
>
> >Matching braces is a distraction, not reading. Well written code
> >would be more readable with the braces removed, see Python. Braces
> >are for dumb compilers not smart people. The comments go away when
> >the braces go away showing their non-utility.
>
> I dont know Python, if it is possible to write good quality, easily
> readable C++ code without braces i would love to see some example.

As John said -- braces are for the compilers. In C++ it's impossible to
write without them. In Python there are no braces, because the compiler
identifies the blocks by indentation only (good idea, IMHO, forces
beginner programmers to structure the code ;).

--
Maciej Sinilo

Gennaro Prota

Aug 16, 2002, 12:00:02 AM

On 13 Aug 2002 11:25:12 -0400, jpo...@falcon.lhup.edu (John Potter)
wrote:

>Matching braces is a distraction, not reading. Well written code would
>be more readable with the braces removed, see Python. Braces are for
>dumb compilers not smart people.

In such matters there's always someone who disagrees, and could be
offended by your comment about smart people :-)

http://debian.acm.ndsu.nodak.edu/doc/python/examples/Tools/scripts/pindent.py

Genny [super partes].

Geoff Field

Aug 16, 2002, 8:43:05 AM

"Francis Glassborow" <francis.g...@ntlworld.com> wrote in message
news:AS8VQpDI...@robinton.demon.co.uk...

> In article <10292860...@cswreg.cos.agilent.com>, Geoff Field
> <geoff...@suespammers.org> writes

[snip]

> >As someone who has spent a *lot* of time maintaining code - both my own
> >and others, I can't let this counter-opinion go uncommented. Yes,
> >well-written code should be readable with or without braces, but often
> >they serve to make divisions/sections of code clearer. Maintaining
> >code that has been written with clear sections (and adequate comments)
> >is much less of a chore than having to trawl through some other sod's
> >code (or my own some time later) where they've played fast and furious
> >with operator precedence just to show that they can.
>
> I think you miss the point. In Python indenting is a syntactic element.

Maybe so, and I've seen other languages where indentation is syntactic
as well, but this *is* a C++ discussion group, as you of all people well
know.

> In C++ braces are used to delimit scopes but there is no requirement for
> indenting so to be safe we use comments to indicate where we think we
> are.

Yes, I tend to do the same thing myself _as well as_ using indentation
in a manner that seems sensible.

> Personally I would prefer to have a code beautifier that simply
> enforces an appropriate indentation policy where once again the need for
> comments goes away.

I've used such beautifiers in the past, but once I've found out what a
company's policies are w.r.t. code formatting, I tend not to need them.

As for the *need* for comments - this is very much a style issue that has
in the past caused religious wars. My belief is that where the scoped
blocks are long (for whatever value of "long") and/or there's lots of
indentation, it (a) doesn't hurt and (b) adds value for future
maintainers. If a comment adds value, I say go for it.

Geoff

John Potter

Aug 16, 2002, 11:38:30 AM

On 15 Aug 2002 10:00:29 -0400, ros...@iperbole.bologna.it (Natale
Fietta) wrote:

> On 13 Aug 2002 11:25:12 -0400, jpo...@falcon.lhup.edu (John Potter)
> wrote:
>
> >Matching braces is a distraction, not reading. Well written code
> >would be more readable with the braces removed, see Python. Braces
> >are for dumb compilers not smart people. The comments go away when
> >the braces go away showing their non-utility.
>
> I dont know Python, if it is possible to write good quality, easily
> readable C++ code without braces i would love to see some example.

You can't write C++ without braces; however, they and comments on them
add nothing to readability. They do make good fodder for religious wars
about where to put them. :)

template <class BI>
void dumbSort (BI b, BI e) {
    if (b != e) {
        BI p(b);
        for (++ p; p != e; ++ p) {
            if (*p < *b) {
                for (BI t(p); t != b; ) {
                    BI u(t --);
                    swap(*t, *u); }}
            else {
                BI t(p);
                for (BI u(t --); *u < *t; u = t --) {
                    swap(*t, *u); }}}}}
int aFive () {
    return 5; }

QED

John

Anthony Williams

Aug 16, 2002, 12:55:35 PM

"Andrei Iltchenko" <iltc...@yahoo.com> writes:

> "Herb Sutter" <hsu...@gotw.ca> wrote in message
> news:94r5lus2qovpeuauj...@4ax.com...

> > LIST* ptype = new(U.buff) LIST();
>
> 1. If the standard header <new> had been included, the new-expression
> above would have assuredly been resolved to the following placement
> global allocation function: void* operator new(std::size_t, void*)
> throw(). However, since this allocation function is not implicitly
> declared by the implementation, overload resolution may well not be
> able to find a viable function for the implicit call 'new(U.buff)
> LIST()' rendering the program ill-formed.

Just to expand on that:

Such a placement new is _never_ valid without including the <new>
header, irrespective of any other declarations, since 5.3.4p11 specifies
that overload resolution is used to find the operator function to call,
and the placement new form is not declared implicitly.

However, other standard library headers _may_ include <new>, so it _may_
work OK.

Even if <new> is included, if someone defines

::operator new(std::size_t,unsigned char*)

then overload resolution will use this in preference.
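
That last point can be demonstrated with a short sketch. The overload
below is a hypothetical user-supplied one written purely for
illustration (C++11 is assumed for noexcept and alignas); the flag
custom_new_used records whether overload resolution picked it.

```cpp
#include <cassert>
#include <cstddef>
#include <new>   // the standard placement form, operator new(std::size_t, void*)

static bool custom_new_used = false;

// Hypothetical user-supplied placement overload taking unsigned char*.
void* operator new(std::size_t, unsigned char* where) noexcept {
    custom_new_used = true;
    return where;
}

// Matching placement delete; only called if a constructor throws.
void operator delete(void*, unsigned char*) noexcept {}

inline bool exact_match_wins() {
    alignas(int) unsigned char buf[sizeof(int)];
    custom_new_used = false;
    // buf decays to unsigned char*: an exact match for the overload above,
    // whereas the standard form needs a pointer conversion to void*.
    int* p = new (buf) int(5);
    return custom_new_used && *p == 5;
}
```

Casting the buffer to void* at the call site, as in
new (static_cast<void*>(buf)) int(5), makes the standard form the exact
match instead, since the unsigned char* overload is then not viable.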

> The explicit destructor invocation above may result in undefined
> behavior. Consider this sequence of statements:
>
> void foo()
> {
>     MYUNION myunion;
>     LIST & mylist = myunion.getlist();
>     mylist.~LIST(); // The naive user ends the lifetime of the LIST object.
>     STRING & mystring = myunion.getstring(); // A second attempt to end
>                                              // the lifetime of the LIST object...
> }

The same could be said for anything that returns a reference, such as
vector::operator[], so I don't think this is a valid complaint.

> All in all, in my opinion this was an article that should never have
> made it into a C++ magazine read by over 40 thousand developers.

Agreed.

Anthony

Anthony Williams

Aug 16, 2002, 1:41:11 PM

"Geoff Field" <geoff...@suespammers.org> writes:

> As for the *need* for comments - this is very much a style issue that
> has in the past caused religious wars. My belief is that where the
> scoped blocks are long (for whatever value of "long") and/or there's
> lots of indentation, it
> (a) doesn't hurt and (b) adds value for future maintainers. If a
> comment adds value, I say go for it.

Comments identifying matching braces don't add any value, and make more
work for maintainers in the case that they modify the condition --- if a
block ceases to be an "else" but becomes an "if" in its own right, then
the comment needs to be changed, too.

If code isn't indented neatly to show the block structure, it is badly
formatted, and should be reformatted.

If the start of a block isn't obvious, with nicely-indented code, then
the block is probably *too* long, and needs refactoring.

Anthony

tj bandrowsky

Aug 17, 2002, 7:03:28 AM

>(good idea, IMHO, forces
> beginner programmers to structure the code ;).

Isn't the idea that a language should "force" a programmer to do
something the kind of thinking that gave us Pascal?

Tom Plunket

Aug 17, 2002, 7:05:41 AM

Geoff Field wrote:

> As for the *need* for comments - this is very much a style issue
> that has in the past caused religious wars. My belief is that
> where the scoped blocks are long (for whatever value of "long")
> and/or there's lots of indentation, it (a) doesn't hurt and (b)
> adds value for future maintainers. If a comment adds value, I
> say go for it.

A comment in this case means that the function is too long. Save
some time, turn that section of code into a function, and the
name of the function is the only documentation that you need.
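
A small sketch of that refactoring, with hypothetical names chosen for
illustration: the inner loop that would otherwise earn a "// for"
comment on its closing brace becomes a named helper.

```cpp
#include <cassert>

// Hypothetical helper: the nested trial-division loop gets a name
// instead of a comment on its closing brace.
inline bool is_odd_prime(int n) {
    if (n < 3 || n % 2 == 0) return false;
    for (int d = 3; d * d <= n; d += 2) {
        if (n % d == 0) return false;
    }
    return true;
}

// Counts the primes strictly below limit; 2 is handled up front
// because it is the only even prime.
inline int count_primes_below(int limit) {
    assert(limit >= 0);
    int count = (limit > 2) ? 1 : 0;
    for (int n = 3; n < limit; n += 2) {
        if (is_odd_prime(n)) ++count;
    }
    return count;
}
```

Each function now has one level of looping and fits on a screen, so
there is no distant opening brace left to annotate.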

-tom!

Gennaro Prota

Aug 17, 2002, 5:24:03 PM

On 17 Aug 2002 07:05:41 -0400, Tom Plunket <to...@fancy.org> wrote:

>Geoff Field wrote:
>
> > As for the *need* for comments - this is very much a style issue
> > that has in the past caused religious wars. My belief is that
> > where the scoped blocks are long (for whatever value of "long")
> > and/or there's lots of indentation, it (a) doesn't hurt and (b)
> > adds value for future maintainers. If a comment adds value, I
> > say go for it.
>
>A comment in this case means that the function is too long. Save
>some time, turn that section of code into a function, and the
>name of the function is the only documentation that you need.

Usually I prefer not to intervene in such religious/personal taste
wars (de gustibus non est disputandum). Anyhow, in C++ braces are not
used only for blocks: are you in the habit of turning sections of member
declarations into macros in order to keep your class definitions "short"?
It's simple: there are things that you should never do, things that
you should always do, and things that are useful only in some cases. I
would never employ a programmer who doesn't acknowledge this third
category. (If I ask someone whether it is correct to derive from a
class without a virtual destructor and he simply answers "no" I don't
consider him a good programmer)

Genny.

Raoul Gough

Aug 17, 2002, 6:50:31 PM

"Maciej Sinilo" <msi...@NOSPAM.kki.net.pl (dot)> wrote in message
news:MPG.17c5efa1d...@news.tpi.pl...

> In article from 15 Aug 2002 10:00:29 -0400 Natale Fietta says...
> > On 13 Aug 2002 11:25:12 -0400, jpo...@falcon.lhup.edu (John Potter)
> > wrote:
> >
> > >Matching braces is a distraction, not reading. Well written code
> > >would be more readable with the braces removed, see Python. Braces
> > >are for dumb compilers not smart people. The comments go away when
> > >the braces go away showing their non-utility.
> >
> > I dont know Python, if it is possible to write good quality, easily
> > readable C++ code without braces i would love to see some example.
> As John said -- braces are for the compilers. In C++ it's impossible
> to write without them. In Python there are no braces, because the
> compiler identifies the blocks by indentation only (good idea, IMHO,
> forces beginner programmers to structure the code ;).

The syntax is different, but the dumb compiler still needs to be told
(via the language syntax) what scope each expression is in. It's still
possible to put a statement in the wrong scope in Python, and it's also
possible to confuse the scope of a statement when reading code. The
chances of this are less in well written code in either language, but
once a code block gets bigger than a screenful, you have the same
problems in Python as in C++. This is when the comments start helping,
it's just that there's nowhere obvious to put them in Python code :-)

Regards,
Raoul Gough.

John Potter

Aug 18, 2002, 7:32:17 AM

On 17 Aug 2002 07:03:28 -0400, tban...@unitedsoftworks.com (tj
bandrowsky) wrote:

> >(good idea, IMHO, forces
> > beginner programmers to structure the code ;).

> Isn't the idea that a language should "force" a programmer to do
> something the kind of thinking that gave us Pascal?

No. Pascal was designed as a minimal language to allow doing things.

Python forces a programmer to indent just as C forces a programmer to
use braces. We may debate which is the more useful.

John

Natale Fietta

Aug 18, 2002, 10:48:38 AM

On 15 Aug 2002 19:04:32 -0400, Maciej Sinilo
<msi...@NOSPAM.kki.net.pl (dot)> wrote:

>As John said -- braces are for the compilers. In C++ it's impossible to
>write without them. In Python there are no braces, because the compiler
>identifies the blocks by indentation only (good idea, IMHO, forces
>beginner programmers to structure the code ;).

Interesting, but I dislike it. I think some closing token is needed to
read any non-toy code.
(Please note: I do not want to start a religious war; this is only my
humble and personal opinion.)

Look at this example: (hypothetical C-like syntax without braces)

if (a)
    for (int i ...)
        if (b)
            for (int j ...)
                //a page full of code
        SomeFunction()

For me this is a nightmare to read (and also to write): to know where
SomeFunction is called I must count whitespace!

Compare it to the same code with some closing syntax (a different closing
token for each language construct is morally equivalent to
commented closing braces):

if (a)
    for (int i ...)
        if (b)
            for (int j ...)
                //a page full of code
            end_for
        end_if
        SomeFunction()
    end_for
end_if

In this code I know immediately where SomeFunction is called,
without the need to count anything.

Regards,
Natale Fietta

Geoff Field

Aug 19, 2002, 6:44:38 AM

"Anthony Williams" <ant...@nortelnetworks.com> wrote in message
news:r8gzuf...@nortelnetworks.com...

> "Geoff Field" <geoff...@suespammers.org> writes:
>
> > As for the *need* for comments - this is very much a style issue that
> > has in the past caused religious wars. My belief is that where the
> > scoped blocks are long (for whatever value of "long") and/or there's
> > lots of indentation, it
> > (a) doesn't hurt and (b) adds value for future maintainers. If a
> > comment adds value, I say go for it.
>
> Comments identifying matching braces don't add any value, and make more
> work for maintainers in the case that they modify the condition --- if a
> block ceases to be an "else" but becomes an "if" in its own right, then
> the comment needs to be changed, too.

Agreed, and this is part of the discipline of programming - if changes
cause the code to differ from the comment, the comment needs to be
updated.

> If code isn't indented neatly to show the block structure, it is badly
> formatted, and should be reformatted.

Agreed again.

> If the start of a block isn't obvious, with nicely-indented code, then
> the block is probably *too* long, and needs refactoring.

Agreed yet again. It's not always easy to refactor in the time that
marketing people have caused one to have available, but refactoring
is often (OK, usually) the best option in the long run.

Tom Plunket has also made this point, and I agree with him too. In
really low-end systems, sometimes the long monolithic block is hard
to avoid for efficiency reasons, but not many people use C++ in really
low-end systems (I'm talking 8-bitters with Ks rather than Ms of
memory).

Gennaro Prota's point about definitions getting long is also a good one.
Of course, there *is* always the option of sub-classing or simply making
use of inheritance to split the definition into meaningful blocks. Horses
for courses, and there's always a way of making code neater according
to one's own current definition of neatness.

There's also always the problem of Time To Market pressures. Do you
think killing marketing people who impose such pressures could be
called justifiable homicide? ;-)

Geoff


Gennaro Prota

Aug 19, 2002, 6:47:58 AM

On 18 Aug 2002 07:32:17 -0400, jpo...@falcon.lhup.edu (John Potter)
wrote:

>Python forces a programmer to indent just as C forces a programmer to
>use braces. We may debate which is the more useful.

Or admit that the question "which is more useful" is ill-defined. These
sorts of questions (Which is best, Java or C++? Which is faster? Which
safer? What is the best sort algorithm? Are virtual calls slow? Is
multiple inheritance ok? Is switch/case faster than if/else? Do
pointers favour memory leaks? Are exceptions slow?) and the
certainties that result from their corresponding wrong answers are
IMHO a big cause of ruin of the software industry. Many people want
yes/no answers, even if incorrect. And many vendors know this fact (he
who has ears to hear...)

Genny.

John Potter

Aug 19, 2002, 6:48:38 AM

On 18 Aug 2002 10:48:38 -0400, ros...@iperbole.bologna.it (Natale
Fietta) wrote:

> Interesting, but i dislike it. i think some closing token is needed to
> read any non-toy code.

Standard rules of thumb: The average human is incapable of digesting
logic with more than three levels of nesting or functions which do not
fit on the screen (24 lines).

Non-toy code should be written with these rules in mind.

John

Peter Dimov

Aug 19, 2002, 7:26:54 PM

jpo...@falcon.lhup.edu (John Potter) wrote in message news:<3d5bdb2f...@news.earthlink.net>...

Counter:

{
T t;
}
{
T u;
}

vs

{
T t;
T u;
}

John Potter

Aug 20, 2002, 7:49:39 AM

On 19 Aug 2002 19:26:54 -0400, pdi...@mmltd.net (Peter Dimov) wrote:

> Counter:

> {
> T t;
> }
> {
> T u;
> }

> vs

> {
> T t;
> T u;
> }

Sorry, I'm dense. I see that with braces you can declare unusable
variables in different scopes or the same scope. Without braces, it
seems that is not possible. What was your point? How does that apply
to useful programs?

John

Matthew

Aug 21, 2002, 6:29:25 AM

John Potter wrote:
> On 19 Aug 2002 19:26:54 -0400, pdi...@mmltd.net (Peter Dimov) wrote:
>
> > Counter:
>
> > {
> > T t;
> > }
> > {
> > T u;
> > }
>
> > vs
>
> > {
> > T t;
> > T u;
> > }
>
> Sorry, I'm dense. I see that with braces you can declare unusable
> variables in different scopes or the same scope. Without braces, it
> seems that is not possible. What was your point? How does that apply
> to useful programs?
>
> John

Those variables are not "unusable". Their constructors may do some
important work. For example:

foo()
{
    {
        Lock t;
        doSubFoo1();
    }
    {
        Lock u;
        doSubFoo2();
    }
}

Peter Dimov

Aug 21, 2002, 6:30:51 AM

jpo...@falcon.lhup.edu (John Potter) wrote in message news:<3d61c3f0...@news.earthlink.net>...

> On 19 Aug 2002 19:26:54 -0400, pdi...@mmltd.net (Peter Dimov) wrote:
>
> > Counter:
>
> > {
> > T t;
> > }
> > {
> > T u;
> > }
>
> > vs
>
> > {
> > T t;
> > T u;
> > }
>
> Sorry, I'm dense. I see that with braces you can declare unusable
> variables in different scopes or the same scope. Without braces, it
> seems that is not possible. What was your point? How does that apply
> to useful programs?

The first example executes the constructor of t, the destructor of t,
the constructor of u, the destructor of u. The second has a different
execution order. Without braces, the two examples appear identical.
Useful programs sometimes do interesting things in
constructors/destructors.
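
The difference in execution order can be made visible with a small
sketch. The type T and the trace string below are hypothetical stand-ins
written for illustration; each constructor appends '+' and each
destructor '-' followed by the object's name.

```cpp
#include <cassert>
#include <string>

static std::string trace;   // records events in the order they happen

// Hypothetical stand-in for a type whose constructor/destructor do real work.
struct T {
    char id;
    explicit T(char c) : id(c) { trace += '+'; trace += id; }
    ~T() { trace += '-'; trace += id; }
};

inline std::string separate_scopes() {
    trace.clear();
    {
        T t('t');
    }
    {
        T u('u');
    }
    return trace;
}

inline std::string single_scope() {
    trace.clear();
    {
        T t('t');
        T u('u');
    }
    return trace;
}

inline void check_orders() {
    assert(separate_scopes() == "+t-t+u-u");   // t is gone before u exists
    assert(single_scope()    == "+t+u-u-t");   // both alive, destroyed in reverse
}
```

With the braces stripped the two bodies would read as the same two
declarations, yet only the second keeps both objects alive at once.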

Allan W

Aug 23, 2002, 5:31:31 AM

tban...@unitedsoftworks.com (tj bandrowsky) wrote in message news:<52e031b9.0208...@posting.google.com>...

> >(good idea, IMHO, forces
> > beginner programmers to structure the code ;).
>
> Isn't the idea that a language should "force" a programmer to do
> something the kind of thinking that gave us Pascal?

That's oversimplified at best.

You could say the same thing about C++ constructors. In fact, I know
some programmers that resist using C++ specifically because when you
create an object, "it automatically calls this function you might not
even know about, and there's nothing you can do about it." (Their
sentiments, not mine.)

Allan W

Aug 23, 2002, 5:38:01 AM

jpo...@falcon.lhup.edu (John Potter) wrote

> Standard rules of thumb: The average human is incapable of digesting
> logic with more than three levels of nesting or functions which do not
> fit on the screen (24 lines).

Get a bigger screen! (Or a smaller font.) 24 lines is very short indeed.

Besides, I disagree with that sentiment. Ease of parsing a function
depends not just on the length of the function, but also on a number
of other factors: the complexity, the naming conventions, the
indentation, and so on. Consider:

int func(int);
long func1() {
long ll=func(2),l1,lI=1;while(1000>lI+=1){
for(l1=3;l1*l1<=lI;l1+=2)
if (!(lI%l1)){l1=0;break;
}if(l1)ll+=func(int(lI));
}return ll;
}

Unless I mistyped while mangling this function, it is well-formed.

It is also very short, at 7 lines (not including the prototype). That
is NOT sufficient to make it readable!

    int func(int prime);
    long func2()
    {
        long res =
            func(2);

        for (int poss=3;
             poss<1000;
             poss += 2)
        {

            for (int tst=3;
                 tst*tst<=poss;
                 tst+=2)
            {

                if (!(poss%tst))
                {
                    goto np;
                }
            }

            res +=
                func(poss);

        np: ;
        }
        return res;
    }

func2() is 28 lines long, not including the prototype. According to
your blanket statement, the average human should be incapable of
digesting func2()! Some of that increased length is artificial --
putting for() loops on three lines, for instance (though this is
standard practice in some shops!). Still, if length were the only
factor, func2() ought to be 4 times harder to read than func1().

I submit that in fact func2() is actually easier to read than func1().

I think it's easy to tell what func2() does in a few seconds, despite
the complete lack of comments, and despite the fact that it uses the
oft-dreaded "goto" statement -- while func1() would require quite a
bit longer for most people to parse. (In fact, they accomplish the
same thing, although the logic is slightly different.)

    int func(int prime);

    // Function calls "func" with every prime from 2 to 1000,
    // and sums the results.
    long func3()
    {
        // Initialize the function result by calling func with
        // the first prime value. Note that func() returns an
        // int, but the accumulated value could exceed INT_MAX.
        long result = func(2);

        // This version of func3() doesn't have a table of primes.
        // We already called func with the only even-numbered prime,
        // so we need to generate all the odd numbers from 3 to 1000.
        for (int possibleprime=3; possibleprime<1000; possibleprime += 2)
        {
            // Need to check if possibleprime is, in fact, prime.
            // We already know it isn't divisible by 2.
            // For any factor N, there is another factor M, such
            // that one of them is <= the square root of possibleprime.
            // Therefore, we only need odd numbers less than (or
            // equal to!) the square root.
            for (int test=3; test*test<=possibleprime; test+=2)
            {
                // If possibleprime is divisible by test,
                // then by definition it is not prime.
                if (!(possibleprime%test))
                {
                    goto notprime;
                }
            }

            // We tested every odd number up to the square root,
            // and didn't find a hit. It must be prime, so go
            // ahead and call func.
            result += func(possibleprime);
        notprime: ;
        }

        // Now that we've accumulated the value, return it.
        return result;
    }

At 38 lines, this is a LOT longer than your 24-line screen!
This version ought to be the hardest yet to digest -- is it?

Francis Glassborow

Aug 23, 2002, 2:03:34 PM

In article <23b84d65.02082...@posting.google.com>, Allan W
<All...@my-dejanews.com> writes

>Get a bigger screen! (Or a smaller font.) 24 lines is very short indeed.
>
>Besides, I disagree with that sentiment. Ease of parsing a function
>depends not just on the length of the function, but also on a number
>of other factors: the complexity, the naming conventions, the
>indentation, and so on. Consider:
>
> int func(int);
> long func1() {
> long ll=func(2),l1,lI=1;while(1000>lI+=1){
> for(l1=3;l1*l1<=lI;l1+=2)
> if (!(lI%l1)){l1=0;break;
> }if(l1)ll+=func(int(lI));
> }return ll;
> }

Brevity for the sake of brevity and as the only metric is, I think,
something which we can all agree is poor programming.

>
>Unless I mistyped while mangling this function, it is well-formed.
>
>It is also very short, at 7 lines (not including the prototype). That
>is NOT sufficient to make it readable!
>
> int func(int prime);

Here is where the first 'error' occurs. That function declaration is
malformed for human beings. What does it do? Does the argument have to
be a prime? Or is that just true for this usage?

> long func2()
Completely useless function name (again). What is this intended to do?

> {
> long res =
> func(2);
It isn't a result, but a total or accumulation so call it that, and
splitting between lines is just plain obfuscation.

>
> for (int poss=3;
> poss<1000;
> poss += 2)

again I would argue with the variable name, because the key information
is that it is an odd number so write:

for(int oddnumber =3; oddnumber<1000; oddnumber += 2)

> {
>
> for (int tst=3;
> tst*tst<=poss;
> tst+=2)
> {
>
> if (!(poss%tst))
> {
> goto np;
> }
> }

And that should be factored out as:
if(isprime(oddnumber)) total += func(oddnumber);

>
> res +=
> func(poss);
>
> np:
> }
> return res;
> }
>

I submit that the following is very much more readable than any of your
three versions:
    #include <stdexcept>

    int identity_op(int i){return i;}
    int (*process)(int) = identity_op;
    // use a function pointer to handle what process
    // you want to apply to your primes
    bool isprime(int);
    bool iseven(int);

    long accumulate_processed_primes(int first=2, int last=1000){
        if (first<2) throw std::range_error("first < 2");
        if (first>last) throw std::range_error("first > last");
        long total = process(first);
        int next = first+1;
        if (iseven(next)) ++next;
        for (int oddnumber = next; oddnumber<=last; oddnumber+=2){
            if (isprime(oddnumber)) total += process(oddnumber);
        }
        return total;
    }

Is this not more readable? (as well as doing more). The iseven() can be
written as an inline function:
    bool iseven(int i){
        return !(i%2);
    }

Which is simple enough so that many compilers will inline it even if you
do not explicitly ask for it to be inlined.

If I wanted to make the code short, there are several things I could do
to shorten the above. I do not, I want to make it readable. As it
happens, I think it also results in making the compiler's job fairly
easy. More polish could be applied, but eventually we hit the point of
diminishing returns. I think the above is fairly close to that point.

Note that there is no clever or advanced C++ in the above. Also, I did
not set out to write short code, only readable code. It just happens
that a property of most readable code is that it is also short. The
inverse is definitely not true. However it is true that most verbose
code is, IMO, unreadable.

--
Francis Glassborow ACCU
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation


John Potter

Aug 23, 2002, 2:04:45 PM

On 23 Aug 2002 05:38:01 -0400, All...@my-dejanews.com (Allan W)
wrote:

> jpo...@falcon.lhup.edu (John Potter) wrote

> > Standard rules of thumb: The average human is incapable of digesting
> > logic with more than three levels of nesting or functions which do not
> > fit on the screen (24 lines).

> Get a bigger screen! (Or a smaller font.) 24 lines is very short indeed.

The rule of thumb is about 25 lines max regardless of screen size.
Sorry for the confusion. Just like the average person can determine a
set size to only five without counting. Ten lines or less is easy.
Up to about 20 to 25 is possible. More than that becomes very
difficult.

> Besides, I disagree with that sentiment. Ease of parsing a function
> depends not just on the length of the function, but also on a number
> of other factors: the complexity, the naming conventions, the
> indentation, and so on.

You seem to have missed the context. Given a function with otherwise
good style, are comments on ending braces useful or a distraction?
The claim was that they are useful because of long heavily nested
functions. Keep the size and nesting down and that claim goes away.
I note that you did not use them.

I found all of your examples hard to read. The first, as intended.
The last, because of comments (usually wrong anyway) hiding the code.
I find almost all inline comments distracting. The second, because of
excessive vertical whitespace needlessly increasing its size.
Reformatting, even keeping your three-line fors, cut it to fifteen
lines, at which point it was reasonable in spite of the rare logic.
This one proves the point to me better than anything. YMMV.

John

Raoul Gough

Aug 23, 2002, 4:26:20 PM

"Allan W" <All...@my-dejanews.com> wrote in message
news:23b84d65.02082...@posting.google.com...
[snip]

> for (int poss=3;
> poss<1000;
> poss += 2)

[snip]

> Some of that increased length is artificial --
> putting for() loops on three lines, for instance (though this is
> standard practice in some shops!).

Three-line for loops make a lot of sense to me, particularly when the
type name is already pretty long:

    for (std::vector<int>::const_iterator iter = longVectorName.begin()
        ; iter != longVectorName.end()
        ; ++iter)
    {
        //...
    }

I'm not really sure why I put the semicolon at the start of the next
line. I guess I got into this habit when splitting function parameter
lists and longish expressions:

    int foo (LongTypeName const &p1
        , EvenLongerTypeName const &p2
        , MoreOfTheSame &p3)
    {
        int result
            = blahblah
            + lotsOfLongNamesHere;
        // ...
    }

Maybe it makes the punctuation character clearer, because it follows the
leading white space. Hmmmm... I bet not many people would write code
like this:

    std::cout <<
        "x: " <<
        x <<
        ", y: " <<
        y <<
        "\n";

I also seem to remember a conjecture that this style makes editing a
parameter list easier (for instance adding, removing or re-ordering
parameters). I'm not really sure if that's true, though.

Regards,
Raoul Gough.

tj bandrowsky

Aug 25, 2002, 2:54:02 PM

> > Isn't the idea that a language should "force" a programmer to do
> > something the kind of thinking that gave us Pascal?
>
> That's oversimplified at best.
>
> You could say the same thing about C++ constructors. In fact, I know
> some programmers that resist using C++ specifically because when you
> create an object, "it automatically calls this function you might not
> even know about, and there's nothing you can do about it." (Their
> sentiments, not mine.)
>

Well no, that's not the same thing. Forcing a particular style of
indentation is a layout or a cosmetic issue and really does nothing to
change the characteristics of your code. I would think that for any
language you are always going to "get calls to something you don't
know about and yet there is nothing you can do about it." It just goes
with the territory... I mean, look at what printf used to do (send
characters to a tty), versus now (send characters to a tty simulator
running on top of a windowing abstraction layer on top of a drawing
abstraction layer on top of a device driver on top of a generic
graphics accelerator). You could probably make the claim, given the
complexity of most operating systems - even open source ones - that you
really don't know what your program, even in a "low level" language
like C++, is doing at all.

James Kanze

Aug 26, 2002, 10:37:28 AM

All...@my-dejanews.com (Allan W) wrote in message
news:<23b84d65.02082...@posting.google.com>...

> jpo...@falcon.lhup.edu (John Potter) wrote
> > Standard rules of thumb: The average human is incapable of digesting
> > logic with more than three levels of nesting or functions which do
> > not fit on the screen (24 lines).

> Get a bigger screen! (Or a smaller font.) 24 lines is very short
> indeed.

True. But then, my average function length is only about 7 lines (not
including headers and outermost braces).

> Besides, I disagree with that sentiment. Ease of parsing a function
> depends not just on the length of the function, but also on a number
> of other factors: the complexity, the naming conventions, the
> indentation, and so on.

Ease of parsing a function depends on many things. Length, in itself,
is probably one of the lesser factors. On the other hand, excessive
length is often a symptom that the function is doing more than one
thing, or is too complex, which is definitely a bad sign. Given this,
I'm willing to accept that, *with exceptions*, shortness is a necessary
(but far from sufficient) criterion. (Note that the exceptions are
important. A switch with 256 cases will definitely not fit in one
screenful, but breaking it up into several functions will probably hurt
comprehension.)

> Consider:

> int func(int);
> long func1() {
> long ll=func(2),l1,lI=1;while(1000>lI+=1){
> for(l1=3;l1*l1<=lI;l1+=2)
> if (!(lI%l1)){l1=0;break;
> }if(l1)ll+=func(int(lI));
> }return ll;
> }

All of the programs submitted to the IOCCC are quite short:-). It is
important that the brevity be real, and not achieved by "cheating".

[I've cut the rest. Because, of course, the point is well taken.
If short were the only criterion, we should all be programming in
APL:-).]

--
James Kanze mailto:jka...@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

Kevin Cline

Aug 26, 2002, 12:40:34 PM

All...@my-dejanews.com (Allan W) wrote in message
news:<23b84d65.02082...@posting.google.com>...

> jpo...@falcon.lhup.edu (John Potter) wrote
> > Standard rules of thumb: The average human is incapable of digesting
> > logic with more than three levels of nesting or functions which do not
> > fit on the screen (24 lines).
>
> Get a bigger screen! (Or a smaller font.) 24 lines is very short indeed.

No, really it's rather a lot.

> func2() is 28 lines long, not including the prototype...

>
> I submit that in fact func2() is actually easier to read than func1().
>
> I think it's easy to tell what func2() does in a few seconds, despite
> the complete lack of comments, and despite the fact that it uses the
> oft-dreaded "goto" statement -- while func1() would require quite a
> bit longer for most people to parse. (In fact, they accomplish the
> same thing, although the logic is slightly different.)
>
> int func(int prime);
>
> // Function calls "func" with every prime from 2 to 1000,
> // and sums the results.
> long func3()
> {

> ... same as func 2, but with extensive comments

> At 38 lines, this is a LOT longer than your 24-line screen!
> This version ought to be the hardest yet to digest -- is it?

Both functions are too long, and should be decomposed:

    #include <cassert>

    int func(int prime);

    bool is_prime(int i) // test primality for odd i >= 3
    {
        assert(i >= 3 && i % 2);
        for (int divisor = 3; divisor * divisor <= i; divisor += 2)
        {
            if (i % divisor == 0) return false;
        }
        return true;
    }

    long sum_of_func_over_primes_under_1000() {
        long sum = func(2);
        for (int i = 3; i < 1000; i += 2) {
            if (is_prime(i)) sum += func(i);
        }
        return sum;
    }

Also, using long to prevent overflow probably won't help. Most often
both int and long are 32 bits.
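
A hedged sketch of one way around that, assuming a compiler that provides long long (standard since C++11, a common extension before that). The func() body, long_is_wider_than_int() and the limit parameter are illustrative and not from the thread:

```cpp
#include <limits>

// Hypothetical stand-in for the thread's func(); here it returns its argument.
int func(int prime) { return prime; }

// True when long can hold strictly larger values than int on this platform.
// Where both are 32 bits this returns false, and switching the accumulator
// from int to long buys nothing.
bool long_is_wider_than_int() {
    return std::numeric_limits<long>::max() > std::numeric_limits<int>::max();
}

// Trial division by odd divisors; valid for odd i >= 3.
bool is_prime(int i) {
    for (int divisor = 3; divisor * divisor <= i; divisor += 2) {
        if (i % divisor == 0) return false;
    }
    return true;
}

// Accumulate in long long, which is guaranteed to be at least 64 bits wide.
long long sum_of_func_over_primes_under(int limit) {
    if (limit <= 2) return 0;
    long long sum = func(2);
    for (int i = 3; i < limit; i += 2) {
        if (is_prime(i)) sum += func(i);
    }
    return sum;
}
```

Whether the wider accumulator matters depends on how large func()'s results can get; the sketch only removes the dependence on the platform's width of long.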

Allan W

Aug 27, 2002, 6:45:41 AM

Francis Glassborow <francis.g...@ntlworld.com> wrote

> Allan W <All...@my-dejanews.com> writes
> >Get a bigger screen! (Or a smaller font.) 24 lines is very short indeed.
> >
> >Besides, I disagree with that sentiment. Ease of parsing a function
> >depends not just on the length of the function, but also on a number
> >of other factors: the complexity, the naming conventions, the
> >indentation, and so on. Consider:

[code snipped]

> Brevity for the sake of brevity and as the only metric is, I think,
> something which we can all agree is poor programming.

Yes

> I submit that the following is very much more readable than any of your
> three versions:

[snip]

> Is this not more readable? (as well as doing more).

The "Doing more" also makes it slower, I expect (haven't measured it).
But you've made my point: the length of the function is not any
indicator of how easily it can be "digested" (John Potter's word).

> Note that there is no clever or advanced C++ in the above. Also, I did
> not set out to write short code, only readable code. It just happens
> that a property of most readable code is that it is also short. The
> inverse is definitely not true. However it is true that most verbose
> code is, IMO, unreadable.

I think we agree here as well. You can make

order.total = order.subtotal + order.tax + order.shipping
- order.discount;
customer.balance += order.total;

turn into a six-page function, if you're intentionally trying to be
obtuse. Verbosity for the sake of verbosity is even worse than brevity
for the sake of brevity -- but they're both pretty bad.

Francis Glassborow

Aug 27, 2002, 9:36:17 AM

In article <23b84d65.02082...@posting.google.com>, Allan W
<All...@my-dejanews.com> writes

>The "Doing more" also makes it slower, I expect (haven't measured it).
>But you've made my point: the length of the function is not any
>indicator of how easily it can be "digested" (John Potter's word).

Actually, if I used some slightly heavier C++ (such as function objects)
I think my version would run as fast as any of yours. One feature of
good C++ style is that even with quite extensive refactoring and high
levels of abstraction the compiler can generate high quality code. BTW,
IMO, this is one of the major advantages that C++ has over C.
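
A rough sketch of what the function-object variant might look like. Identity, Square, is_odd_prime and the simplified signature are assumptions for illustration, not code from the thread. Because the callable's type is a template parameter, the compiler knows the exact call target and can typically inline it, which a call through a function pointer makes harder; no timings were measured here.

```cpp
// Hypothetical function objects standing in for the thread's process().
struct Identity {
    int operator()(int i) const { return i; }
};

struct Square {
    long operator()(int i) const { return static_cast<long>(i) * i; }
};

// Trial division by odd divisors; valid for odd i >= 3.
inline bool is_odd_prime(int i) {
    for (int divisor = 3; divisor * divisor <= i; divisor += 2) {
        if (i % divisor == 0) return false;
    }
    return true;
}

// Applies process to every prime from 2 through last and sums the results.
template <typename Process>
long accumulate_processed_primes(Process process, int last = 1000) {
    if (last < 2) return 0;
    long total = process(2);
    for (int oddnumber = 3; oddnumber <= last; oddnumber += 2) {
        if (is_odd_prime(oddnumber)) total += process(oddnumber);
    }
    return total;
}
```

A call such as accumulate_processed_primes(Square(), 10) sums the squares of 2, 3, 5 and 7.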

Now, whether you like it or not, long functions are harder to digest
than short ones written to the same quality. I contend that if you write
for readability you will find that you write short functions. Just as if
you write for readability in English you will write mostly short
sentences (equivalent to program statements) and short paragraphs (sort
of equivalent to functions). Also note that usually books with
shortish (but not too short) chapters (TUs?) are easier to read than
ones with hundreds of pages to each chapter. (Are they also easier to
write? Perhaps that is why the fourth volume of 'The Art of Computer
Programming' is still unpublished. For the uninitiated, that volume is
scheduled to be in three parts even though it consists of only two
chapters.)

--
Francis Glassborow ACCU
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
