This isn't trying to be negative but I have a question about the 'coding
style' of the VC++ STL. When I look at the source for most of the templates
I find that most of the variable/template parameters have (IMHO) very cryptic
names such as _Ty, _Ax, _Tptr (I am not sure if these are the best choices
to describe what I mean but hopefully I am somewhat clear :) ). The other
thing that I find is that there is not much whitespace used to make
everything more readable.
Is this actually the 'real' coding style that is used? By 'real' - I mean is
the code slightly obscured intentionally for protecting copyrights since
templates have to include the source?
Personally I find it almost impossible to be able to read/understand what
the code is actually trying to do most of the time and I have a hard time
convincing myself that even the developer(s) who work on this don't find
this style more difficult than perhaps some other styles. Is this just me?
Thanks
Better those names than long and descriptive
ones that would just occupy more string space
in the symbol table. Keep in mind that many
of the identifiers belong to the set reserved
to the implementation.
> The other
> thing that I find is that there is not much whitespace used to make
> everything more readable.
Hint: It is not supposed to be readable except
on a frequent basis by the compiler. While not
so critical these days, reducing traffic thru
the compiler's scanner is worth something.
> Is this actually the 'real' coding style that is used? By 'real' - I mean is
> the code slightly obscured intentionally for protecting copyrights since
> templates have to include the source?
I doubt that the source looks that way. But since
the documentation for the library is pretty good,
and it is all I should be relying upon to use the
library, I really don't care what its coding style
happens to be as long as it works.
The copyright is protected equally well, whether
the code is easy to read or not, by laws and folks
with guns and jails.
Somebody from Dinkumware will have to speak to
the obscuration issue. I would be quite surprised
if the 'real' source does not have plenty of
comments, more whitespace, more human-oriented
identifiers, and lots of conditionals to adapt
to different platforms. There is no reason for
the library vendor to give away their master
source just so Microsoft can use one incarnation
of it as part of the VC library.
> Personally I find it almost impossible to be able to read/understand what
> the code is actually trying to do most of the time and I have a hard time
> convincing myself that even the developer(s) who work on this don't find
> this style more difficult than perhaps some other styles. Is this just me?
Why in the world do you feel compelled to read that
code to figure out what it "is trying to do"? The
documentation lays out what it is supposed to do
quite clearly. Do you imagine the authors intend
to do something else that will become apparent from
the code?
--
-Larry Brasfield
(address munged, s/sn/h/ to reply)
Well, that's what I get for answering without reading the rest of the
message. The internal code does have extensive conditionalization to
support various compilers, but other than that, what you see is what's
there. We run it through a stream editor that removes the conditional
code to produce the VC++ version, but that doesn't change names,
comments, or whitespace. Readability is largely a matter of what you're
used to.
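As a rough illustration of the kind of pass described here, the following is a minimal sketch of a filter that resolves conditional blocks and copies every other line through untouched. It is not Dinkumware's tool, whose details are not given in this thread. The names strip_conditionals and Frame are hypothetical, and the sketch assumes only #ifdef, #ifndef, #else and #endif occur.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not Dinkumware's actual editing pass.
// Keeps the branches of #ifdef/#ifndef/#else/#endif selected by `defined`
// and drops the directive lines themselves. Names, comments, and spacing
// of the surviving lines are copied through unchanged.
// Assumptions: '#' is immediately followed by the keyword, and #if/#elif
// expressions do not occur.
struct Frame {
    bool parent_active;  // was the enclosing region being emitted?
    bool condition;      // is the current branch of this block selected?
};

inline std::string strip_conditionals(const std::string& source,
                                      const std::set<std::string>& defined)
{
    std::istringstream in(source);
    std::ostringstream out;
    std::vector<Frame> frames;  // one entry per open conditional block
    std::string line;

    while (std::getline(in, line)) {
        // A line is emitted only if every enclosing branch is selected.
        const bool active = frames.empty() ||
            (frames.back().parent_active && frames.back().condition);

        std::istringstream words(line);
        std::string directive, name;
        words >> directive >> name;

        if (directive == "#ifdef" || directive == "#ifndef") {
            const bool is_defined = defined.count(name) != 0;
            Frame frame = { active,
                            directive == "#ifdef" ? is_defined : !is_defined };
            frames.push_back(frame);
        } else if (directive == "#else") {
            assert(!frames.empty());  // unbalanced input is a caller error here
            frames.back().condition = !frames.back().condition;
        } else if (directive == "#endif") {
            assert(!frames.empty());
            frames.pop_back();
        } else if (active) {
            out << line << '\n';
        }
    }
    return out.str();
}
```

Feeding it a header with one macro name in `defined` yields the single-platform variant; identifiers and layout inside the kept branches come out exactly as they went in.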
--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
It does.
If a developer considers saving a few bytes in a file or a symbol table
more important than readability/maintainability, then I believe they
have major problems when it comes to software design (IMHO of course) :)
> Keep in mind that many
> of the identifiers belong to the set reserved
> to the implementation.
I fail to see how the identifiers being in the implementation namespace is
relevant (other than the leading underscores). It does not explain why there
are identifiers such as _R, _N, _E, _C, etc. (after looking at one STL file,
<string>, I am somewhat amused that there is almost a single letter
identifier for every letter of the alphabet.)
> > The other
> > thing that I find is that there is not much whitespace used to make
> > everything more readable.
>
> Hint: It is not supposed to be readable except
> on a frequent basis by the compiler. While not
> so critical these days, reducing traffic thru
> the compiler's scanner is worth something.
Have you never loaded up the source for a library you use in order to debug
or for expanding your understanding for how something works? I have to
somewhat agree that the STL should meet the specifications of the C++
standard but sometimes that is not always the case and it helps to be able
to understand the code to figure out what is wrong.
> The copyright is protected equally well, whether
> the code is easy to read or not, by laws and folks
> with guns and jails.
Sure, and this is why I was curious whether the code is 'unmodified' or not.
Surely the people who have to maintain the code have just as much difficulty
understanding the code sometimes? I suppose if you look at the same code
every day it doesn't matter as much since you have that extra level of
familiarity.
> Somebody from Dinkumware will have to speak to
> the obscuration issue. I would be quite surprised
> if the 'real' source does not have plenty of
> comments, more whitespace, more human-oriented
> identifiers, and lots of conditionals to adapt
> to different platforms.
Based on the other response from Pete Becker - I guess you are surprised? I
know I am a bit.
> Why in the world do you feel compelled to read that
> code to figure out what it "is trying to do"? The
> documentation lays out what it is supposed to do
> quite clearly.
See above.
> Do you imagine the authors intend
> to do something else that will become apparent from
> the code?
Do you always have all your code work 100% as intended? :)
>> I doubt that the source looks that way.
>> Pete Becker wrote:
>> It does.
Hi Pete..
I cannot believe that this is coming from you. Are you sure that the
same P. J. Plauger who wrote the book 'The Elements of Programming
Style', which lists "write clearly" as one of its 77 rules, also
wrote the STL code (which is cryptic to the core)?
Thanks, Brian, for popping the question that I have been meaning
to ask myself for a long time.
Cheers
Check Abdoul
-----------------
"Pete Becker" <peteb...@acm.org> wrote in message
news:3D727664...@acm.org...
Cryptic is often in the eye of the beholder. The names are short. They
are consistent, and they're quite readable.
... so tell us, what do they mean in the eyes of the beholder - since
they clearly mean very little to most of us who occasionally wander
into this entangled forest ? :)
Dave
--
MVP VC++ FAQ: http://www.mvps.org/vcfaq
template <class _Ty, class _A = allocator<_Ty> > class vector
_Ty is the type that the vector holds and _A is the allocator that the
vector uses.
--
Truth,
James Curran [MVP]
www.NJTheater.com (Professional)
www.NovelTheory.com (Personal)
MVP = Where your high-priced consultant goes for free answers
"Brian Ross" <brian....@rogers.com> wrote in message
news:#iT2qDgUCHA.4092@tkmsftngp11...
I understand that's the reason for having underscores. But what about all
those single letter identifiers? Why not variable names such as _Count,
_NumElements, _NewBuffer? (I am just trying to give an example, sometimes
even these names could be chosen better). I don't see how the user being
able to #define a new meaning for identifiers forces them to use
short-cryptic one or two letter identifiers?
It doesn't.
What about the following code - taken from <string>? I have no idea what
this function does - I just took it at random because it had several
identifiers that, IMHO, are very cryptic.
How can "_E, _Tr, _A, _X, _D, _St, _I, _Myis, or _C" for identifiers be
'easy to understand'?
-----------
template<class _E, class _Tr, class _A> inline
basic_istream<_E, _Tr>& __cdecl getline(basic_istream<_E, _Tr>& _I,
basic_string<_E, _Tr, _A>& _X, const _E _D)
{typedef basic_istream<_E, _Tr> _Myis;
ios_base::iostate _St = ios_base::goodbit;
bool _Chg = false;
_X.erase();
const _Myis::sentry _Ok(_I, true);
if (_Ok)
{_TRY_IO_BEGIN
_Tr::int_type _C = _I.rdbuf()->sgetc();
for (; ; _C = _I.rdbuf()->snextc())
if (_Tr::eq_int_type(_Tr::eof(), _C))
{_St |= ios_base::eofbit;
break; }
else if (_Tr::eq((_E)_C, _D))
{_Chg = true;
_I.rdbuf()->snextc();
break; }
else if (_X.max_size() <= _X.size())
{_St |= ios_base::failbit;
break; }
else
_X += _Tr::to_char_type(_C), _Chg = true;
_CATCH_IO_(_I); }
if (!_Chg)
_St |= ios_base::failbit;
_I.setstate(_St);
return (_I); }
-------------
That makes it much harder to understand the code.
> - I just took it at random because it had several
> identifiers that, IMHO, are very cryptic.
Cryptic is often in the eye of the beholder.
>
> How can "_E, _Tr, _A, _X, _D, _St, _I, _Myis, or _C" for identifiers be
> 'easy to understand'?
I didn't say they were easy to understand. I said they are quite
readable.
> template<class _E, class _Tr, class _A> inline
> basic_istream<_E, _Tr>& __cdecl getline(basic_istream<_E, _Tr>& _I,
> basic_string<_E, _Tr, _A>& _X, const _E _D)
> {typedef basic_istream<_E, _Tr> _Myis;
> ios_base::iostate _St = ios_base::goodbit;
You begin by knowing what the function is supposed to do, and then you
read it. _E is the element type, _Tr is the traits type, _A is the
allocator, _X is an argument of type basic_string<_E, _Tr, _A>&, _D is
an argument of type const _E, _St is a local variable of type
ios_base::iostate, _I is an argument of type basic_istream<_E, _Tr>&,
_Myis is a local variable of type basic_istream<_E, _Tr>, _C (defined
further down) is a local variable of type _Tr::int_type.
This code is not meant to be a tutorial. It works, and those of us who
work with it know what it's doing and why. It apparently does not
satisfy your coding conventions, but it satisfies ours.
Yes, and if you look in the standard document itself, the code there is just
in the same style:
23.2.3.3 Class template stack
namespace std {
  template <class T, class Container = deque<T> >
  class stack {
  public:
    typedef typename Container::value_type value_type;
    typedef typename Container::size_type  size_type;
    typedef Container                      container_type;
  protected:
    Container c;
  public:
    explicit stack(const Container& = Container());
    bool empty() const              { return c.empty(); }
    size_type size() const          { return c.size(); }
    value_type& top()               { return c.back(); }
    const value_type& top() const   { return c.back(); }
    void push(const value_type& x)  { c.push_back(x); }
    void pop()                      { c.pop_back(); }
  };
}
Here, the standard even requires the stack to have a member named "c"! (As
it is protected, it is accessible to derived classes.)
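To make the point about "c" concrete, here is a small sketch of user code that relies on that exact name. The class inspectable_stack and its member from_bottom are hypothetical; only std::stack and its protected member c come from the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>

// Hypothetical adaptor for illustration only.
template <class T>
class inspectable_stack : public std::stack<T> {
public:
    // Element `index` counted from the bottom of the stack (0 is the oldest).
    const T& from_bottom(std::size_t index) const
    {
        // `c` is the protected underlying container (std::deque<T> by default).
        assert(index < this->c.size());
        return this->c[index];
    }
};
```

Because derived classes like this one can exist, an implementation is not free to rename the member to something longer or prettier.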
So, the names shown are T, x, and Container. As these names are not
available to the implementor, you might try _T, _X, and _Container.
Now, surprise, Microsoft has already used _T for some nasty macro, so try
_Ty for "type".
Not too bad for someone used to standardese, in my opinion.
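The macro problem can be shown in a few lines. This is a sketch only: the macro Count and the function count_equal are hypothetical, and the underscore-capital names are written in user code purely for illustration. Such names are reserved, so only the implementation may legally declare them, even though common compilers accept the code.

```cpp
#include <cstddef>

// An ordinary name that user code may legally turn into a macro
// before including any library header (hypothetical example).
#define Count 1000

namespace sketch {

// Written the way a library header has to be written: every name it
// introduces for its own use starts with an underscore and a capital.
// Had the local been spelled `Count`, the macro above would have turned
// its declaration into `std::size_t 1000 = 0;`.
template <class _Ty>
std::size_t count_equal(const _Ty* _First, const _Ty* _Last, const _Ty& _Val)
{
    std::size_t _Count = 0;
    for (; _First != _Last; ++_First)
        if (*_First == _Val)
            ++_Count;
    return _Count;
}

}  // namespace sketch
```

Note that the reserved spelling constrains only the prefix. A name such as _Count is as safe from user macros as _C, so the choice between the two is a style decision.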
Bo Persson
bo...@telia.com
>> Use variable names that mean something.
>> Choose variable names that won't be confused.
(How does naming something _E, _Tr, _A, _X, _D, _St, or _I mean
anything?)
>> Format a program to help the reader understand it.
According to Pete, this is not a requirement anymore.
Pete Wrote:
"This code is not meant to be a tutorial. It works, and those of us who
work with it know what it's doing and why. It apparently does not
satisfy your coding conventions, but it satisfies ours."
>> Indent to show the logical structure of a program.
Oh Yaah.. The STL code is nicely indented to understand.
>> Write clearly - don't sacrifice clarity for "efficiency."
Brian wrote :
> > What about the following code - taken from <string>? I have no idea what
> > this function does
Pete's reply :
>> That makes it much harder to understand the code.
So, as I understand it, the STL code is purposefully made ( maybe by
running it through a stream editor ) to look the way it looks.
Thanks for the clarifications Pete.
Cheers
Check Abdoul
------------------
"Pete Becker" <peteb...@acm.org> wrote in message
news:3D7384A8...@acm.org...
First, a bit of clarification. The headers in .NET have more whitespace,
more comments, and longer names (though not that much longer). The same
is true of the V4.0 code now available for other platforms at our web
site. You might be interested to know the two major reasons why we made
these changes a couple of years ago:
1) Microsoft asked us to. Partly, they passed on requests from customers
that the code be more readable. Partly, they wanted us to avoid some
minor bugs in their debugger, which seems to get confused if you put
too much stuff on one line.
2) We hit too many conflicts with names reserved by headers in other
systems. _T was the only problem in VC++, but many other single-character
names of this form are defined as macros in various C headers. (Which
goes to show that we're not alone in favoring short names.)
The names were originally kept short, and the format tight, for three
reasons that are important to me, at least:
1) When I wrote the first draft of most of this code, computers were still
slow enough that the time spent reading headers was nontrivial. I got
comments both from reviewers and customers to the effect that terse headers
were preferred over verbose ones, if they made compiles go faster. This
constraint no longer applies, and we've taken that into account with
recent revisions.
2) I've published much of the early code in various books. That sets tight
constraints on line width and file length.
3) I've never felt that long identifier names are more readable than short
ones, and I've often been surrounded by better coders than me (such as
Kernighan, Ritchie, and Thompson) who felt the same. YMMV.
Finally, I underscore what Pete Becker has said several times. We don't
feel an obligation to produce tutorial code in our library headers. Some
are damned difficult to write correctly, and would take many a treatise
to explain to the casual reviewer. We write to be readable for *ourselves*
as maintainers. This is not an attempt to obfuscate, but simple expediency
given our size and our ambitious development plans.
> The following rules are taken from 'The Elements of Programming
> Style' book.
>
> >> Use variable names that mean something.
> >> Choose variable names that won't be confused.
>
> (How does naming something _E, _Tr, _A, _X, _D, _St, or _I mean
> anything?)
We're pretty uniform in our choice of short names. They mean enough to us.
> >> Format a program to help the reader understand it.
>
> According to Pete, this is not a requirement anymore.
>
> Pete Wrote:
> "This code is not meant to be a tutorial. It works, and those of us who
> work with it know what it's doing and why. It apparently does not
> satisfy your coding conventions, but it satisfies ours."
You have a different notion of who the ``reader'' is.
> >> Indent to show the logical structure of a program.
>
> Oh Yaah.. The STL code is nicely indented to understand.
Actually, it is. And it's very religiously indented, unlike practically
all other commercial (or free) code I've run across. You just happen not
to like the style.
> >> Write clearly - don't sacrifice clarity for "efficiency."
We didn't, except to the minor degree cited above.
> Brian wrote :
> > > What about the following code - taken from <string>? I have no idea what
> > > this function does
>
> Pete's reply :
>
> >> That makes it much harder to understand the code.
>
> So, as I understand it, the STL code is purposefully made ( maybe by
> running it through a stream editor ) to look the way it looks.
>
> Thanks for the clarifications Pete.
You're actively misquoting Pete again. In fact, our editing pass makes the
code *more* readable, by scrubbing unneeded ifdef logic and macro names,
and by adding whitespace.
If you still believe I've disregarded the style rules I helped promulgate
a quarter century ago, feel free. My notion of good style continues to
evolve, but I feel that I have stuck by the same basic principles I first
articulated with Kernighan. If you find the code less readable than you'd
like, you have my sympathy, if not my contrition. I've explained a few of
the numerous forces that shape the product, for good and/or for ill. But
if you just want to continue exercising your sarcastic wit, I feel no need
to serve as a foil.
P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
First of all, let me apologize if my reply was sarcastic. I
never meant it to be.
I was just trying to understand the reason behind the short
names chosen for the identifiers. Your reply explains it better.
Sorry for making my statement without looking at the latest
headers in .NET.
Once again I apologize to you and Pete, if my reply was
offensive.
Sorry.
Cheers
Check Abdoul
------------------
"P.J. Plauger" <p...@dinkumware.com> wrote in message
news:3d73e0b5$0$3775$724e...@reader2.ash.ops.us.uu.net...
This is all well and good. Unfortunately the names you use in your
template sources aren't always very obvious to the end
user/programmer, and they do get reproduced by the compiler in
error messages arising from syntactic (and other) errors in
template instantiations.
I don't suggest for a moment that your sourcecode ought to be a
tutorial on template programming, or on the STL, but it would be
nice to see you help to produce more readable error messages. I do
agree that the names in your newer releases are an improvement on
the older ones.
I do understand that code that leads to approachable error messages
from one compiler will not necessarily have the same effect with
another, though.
> We're pretty uniform in our choice of short names. They mean
> enough to us.
A short glossary for the uninitiated would perhaps be beneficial.
Cheers,
Daniel.
Readability is in the eye of the beholder. Everyone has their own coding
style.
> Is this actually the 'real' coding style that is used? By 'real' - I mean is
> the code slightly obscured intentionally for protecting copyrights since
> templates have to include the source?
It is the 'real' coding style. No intentional obfuscation.
> Personally I find it almost impossible to be able to read/understand what
> the code is actually trying to do most of the time and I have a hard time
> convincing myself that even the developer(s) who work on this don't find
> this style more difficult than perhaps some other styles. Is this just me?
The indentation doesn't fit my style, and there's no simple visible way to
separate class declarations, but other than that I find the code easily
readable. Besides, the code wasn't meant to be a coding style lesson, it
was meant to be compiled into your project. The Dinkum STL comes with a
great set of documentation that covers everything you need to know about how
the library works.
As in most other realms, such conclusions deserve
a balance of competing values. For something that
will be compiled jillions of times, more weight on
performance is due. I figured the source was in a
more readable form and what we see is the output
of some machine processing. There is no need to
implicate anybody's design methodology. In the
case of Plauger and Becker, I would be especially
hesitant to draw such inferences.
> > Keep in mind that many
> > of the identifiers belong to the set reserved
> > to the implementation.
>
> I fail to see how the identifiers being in the implementation namespace is
> relevant (other than the leading underscores).
It adds another reason for an identifier replacement
step in the release process. Since they do not do
that, I will grant it now appears to be irrelevant.
I had thought such a step was needed to deal with
collision problems in the variety of platforms
supported by Dinkumware's offering.
> It does not explain why there
> are identifiers such as _R, _N, _E, _C, etc. (after looking at one STL file,
> <string>, I am somewhat amused that there is almost a single letter
> identifier for every letter of the alphabet.)
OK.
> > > The other
> > > thing that I find is that there is not much whitespace used to make
> > > everything more readable.
> >
> > Hint: It is not supposed to be readable except
> > on a frequent basis by the compiler. While not
> > so critical these days, reducing traffic thru
> > the compiler's scanner is worth something.
>
> Have you never loaded up the source for a library you use in order to debug
> or for expanding your understanding for how something works?
Yes, on many occasions. But I never thought the
source should conform to my tastes just because
I wanted to look at it. All the more so when it
works as advertised.
> I have to
> somewhat agree that the STL should meet the specifications of the C++
> standard but sometimes that is not always the case and it helps to be able
> to understand the code to figure out what is wrong.
By the time I have to look at that source, I
have reached the phase of desperate measures.
I have found that, once it compiles, the C++
library works as documented. I have looked at
the source sometimes in connection with thread
safety concerns.
...
> > Somebody from Dinkumware will have to speak to
> > the obscuration issue. I would be quite surprised
> > if the 'real' source does not have plenty of
> > comments, more whitespace, more human-oriented
> > identifiers, and lots of conditionals to adapt
> > to different platforms.
>
> Based on the other response from Pete Becker - I guess you are surprised? I
> know I am a bit.
Yes and no. Quite surprised at his and his boss's
revelation that the identifiers you wonder about are
from the actual source and that it is indented the
same way we see the shipped code. My presumption
about 'lots of conditionals' is based on knowing
they sell their library for other platforms and
there are no platform dependent conditionals to
be found in the VC library from Dinkumware.
...
> > Do you imagine the authors intend
> > to do something else that will become apparent from
> > the code?
>
> Do you always have all your code work 100% as intended? :)
No, but when it does not, I generally read my code
rather than some library implementation. If I
suspect a library bug, I put traps on calls going
into it and coming out and compare input/results
with the documentation. In most shops I've been
in, libraries are fixed by the vendor, not the
customer.
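The "traps on calls" approach described above might look something like the sketch below: capture what goes into the library, make the call, and compare what comes out with what the documentation promises. std::sort is only a stand-in for whichever call is under suspicion, and the helper name sort_and_check is hypothetical. It assumes a C++11 library for std::is_sorted and std::is_permutation.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical trap around std::sort: calls the library, then checks the two
// things its documentation promises: the range is ordered, and it is a
// rearrangement of the original elements.
template <class T>
bool sort_and_check(std::vector<T>& values)
{
    const std::vector<T> before(values);       // input, captured going in
    std::sort(values.begin(), values.end());   // the library call under suspicion
    return std::is_sorted(values.begin(), values.end()) &&
           std::is_permutation(values.begin(), values.end(), before.begin());
}
```

A check like this needs only the documented contract, which is the point being made: it never requires reading the library's source.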
For what it's worth, I actually find the Dinkumware source code quite
readable. The abbreviations they use are pretty obvious from context.
Tom
_Ty is a data element template parameter
_Tptr is a pointer to the data element type
_Ax is an allocator
The names are logical, if terse.
I think their use of white-space is excessive; I would never write:
for (; _First != _Last; ++_First)
I personally would write:
for(; _First!=_Last; ++_First)
My eyes pick up the logical cohesion better that way.
Some people can't stand it (perhaps when I'm older and need glasses,
I'll feel the same way...)
O n c e t h e w h i t e s p a c e i s u n i f o r m , i t n o
l o n g e r c o m m u n i c a t e s i n f o r m a t i o n t o y o
u r C e r e b r a l C o r t e x .
Well, if you read that enough times, it starts to become rather clear.
Amazing how fast the brain adapts, really.
Oncethewhitespaceisremoved,itdoesn'tcommunicateinformationtotheCerebralCortexeither.
--
Truth,
James Curran
www.NovelTheory.com (Personal)
www.NJTheater.com (Professional)
www.aurora-inc.com (Day job)
>"Shannon Barber" <shannon...@myrealbox.com> wrote in message
>news:de001473.02090...@posting.google.com...
>> O n c e t h e w h i t e s p a c e i s u n i f o r m , i t n o
>> l o n g e r c o m m u n i c a t e s i n f o r m a t i o n t o y o
>> u r C e r e b r a l C o r t e x .
>>
>
>Oncethewhitespaceisremoved,itdoesn'tcommunicateinformationtotheCerebralCortexeither.
_Yes_it_does.