When laid out in black and white like that, these rules are quite clear.
However, let's consider this: Let's say you're appointed as the
Portability Advisor for a multi-national company that makes billions of
dollars each year. They pay you $500,000 a year, they have you working 30
hours a week and they give you 60 days paid holiday leave per year. They
don't block newsgroups, and their firewall only blocks the most
offensive of sites. They even get a Santa for the kids at the Christmas
party.
Your job is to screen the code that other programmers in the company
write. Every couple of days there's a fresh upload of code to the
network drive, and your job is to scan thru the code and point out and
alter anything that's not portable. Of course tho, you're given a
context in which to judge the code, for instance:
a) This code must run on everything from a hedge-trimmer to an iPod, to
a Playstation 3,
b) This code must run on all the well-known desktop PCs
Depending on the context, you judge some code harsher than others. For
instance, in context B, you might allow assumptions that there's an even
number of bits in a byte, also that integer types don't contain padding.
While in context A, you might fire that code right back if you see such
assumptions.
So... it's Thursday morning, you sit down to your desk with a hot cup of
tea and a fig-roll bar, you check your mail. You surf the web for a
couple of minutes, perhaps check the latest scores in the election, or
look up where you can get a new electric-window switch for your car
since it mysteriously stopped working this morning.
You get down to it. You open up the network drive and navigate to James
Weir's source file. Its context is "run on anything". You're looking
thru it and you come to the following section of code:
typedef union ConfigFlags {
unsigned entire; /* Write to all bytes
char unsigned bytes[sizeof(unsigned)]; at once or access
} ConfigFlags; them individually
*/
int IsRemoteAdminEnabled(ConfigFlags const cf)
{
return cf.bytes[3] & 0x3u;
}
You look at this code and you think, "Hmm, this chap plans to write to
'entire' and then subsequently read individual bytes by using the
'bytes' member of the structure". You have a second suspicion that
perhaps James might have made assumptions about the size of "unsigned",
but inspecting the code you find that he hasn't.
Now, the question is, in the real world, at 10:13am on a sunny Thursday
morning, sitting at your desk with a hot cup of tea, munching away on a
fig-roll bar getting small crumbs between the keys on the keyboard, are
you really going to reject this code?
You're sitting there 100% aware that the Standard explicitly forbids you
to write to member A of a union and then read from member B, but how
much do you care?
Later on in the code, you come to:
double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);
Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants. Again, you know that the
Standard in all its marvelous rigidity doesn't want you to dereference
that pointer, but are you bothered? Are you, as the Portability Advisor,
going to reject this code?
What I'm trying to get across is, that, while we may discuss in black
and white what the Standard permits and what it forbids... are we really
going to be so obtuse as to reject this code in the real world? Are we
really going to reject some code for a reason that we see as stupidly
restrictive in the language's definition?
Perhaps it might be useful to point out what exactly can go wrong when
we're treading on a particular rule. In both these cases I've mentioned,
I don't think anything can go wrong, not naturally anyway. What _can_
cause problems tho is aspects of the compiler:
1) Over-zealous with its optimisation
2) Deliberately putting in checks (such as terminating the program when
it thinks you're going to access memory out-of-bounds).
The first thought I think comes to everyone's mind when we're talking
about these unnecessarily rigid rules, is that the Standard just needs
to be neatly amended. But, of course, it's C we're talking about, where
the current standard is from 1989 and where still not too many people
are paying attention to the 1999 standard that came out nine years ago.
So, I wonder, what can we do? If there was a consensus between many of
the world's most skilled and experienced C programmers that a certain
rule in the Standard were unnecessarily rigid, would it not be worth the
compiler vendors' while to listen? Here at comp.lang.c, there are,
without exaggeration, some of the world's best C programmers. Instead of
contacting each and every compiler vendor to let them know that we'd
prefer to optimise-away assignments to union members, would it be
convenient, both for the programmers and the compiler vendors, to have a
single place to go to to read what the world's best programmers think?
Should we have a webpage that lists the common coding techniques that
skilled programmers use, but which are officially forbidden or "a grey
area" in the Standard?
Two such rules I myself would put on the list are:
1) Accessing different union members
2) De-referencing a pointer to an array
--
Tomás Ó hÉilidhe
> typedef union ConfigFlags {
> unsigned entire; /* Write to all bytes
> char unsigned bytes[sizeof(unsigned)]; at once or access
> } ConfigFlags; them individually
I meant for that to come out as:
typedef union ConfigFlags {
unsigned entire;
char unsigned bytes[sizeof(unsigned)];
} ConfigFlags;
--
Tomás Ó hÉilidhe
However as portability expert your job is to be strict. For instance I
thought that, surely, slash slash comments were standard by now. No, my MPI
compiler won't accept them.
--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
If I haven't been reading comp.lang.c in the last few days, I spend a
few moments wondering what the heck that code is trying to do. Then I
step through it, and once I figure out what it does, I wonder why the
author wrote it that way, especially when there's a clearer and
unambiguously legal way to do the same thing:
double const *const pend = tangents + 5;
Of course 5 is a magic number, so either a named constant should be
used both for the array length and for the offset, or a macro should
be used to compute the length. For example:
#include <stdio.h>
#define ARRAY_LENGTH(a) (sizeof (a) / sizeof *(a))
int main(void)
{
double tangents[] = { 1.2, 2.3, 3.4, 4.5, 5.6 };
double const *const pbegin = tangents; /* just for symmetry */
double const *const pend = tangents + ARRAY_LENGTH(tangents);
double const *iter;
for (iter = pbegin; iter < pend; iter ++) {
printf("%g\n", *iter);
}
return 0;
}
Note that a very similar approach can be used when we don't have a
declared array object, but just a pointer to its first element and its
length. There's no good way to apply the ``*(&tangents + 1)''
approach if you don't have the array object itself.
#include <stdio.h>
#define ARRAY_LENGTH(a) (sizeof (a) / sizeof *(a))
void show_elements(double *array, size_t count)
{
double const *const pbegin = array;
double const *const pend = array + count;
double const *iter;
for (iter = pbegin; iter < pend; iter ++) {
printf("%g\n", *iter);
}
}
int main(void)
{
double tangents[] = { 1.2, 2.3, 3.4, 4.5, 5.6 };
show_elements(tangents, ARRAY_LENGTH(tangents));
return 0;
}
Even leaving portability concerns aside, I find the latter approach
easier to read, easier to use, and easier to think about.
--
Keith Thompson (The_Other_Keith) <ks...@mib.org>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
<snip>
> You get down to it. You open up the network drive and navigate to James
> Weir's source file. Its context is "run on anything". You're looking
> thru it and you come to the following section of code:
>
> typedef union ConfigFlags {
> unsigned entire; /* Write to all bytes
> char unsigned bytes[sizeof(unsigned)]; at once or access
> } ConfigFlags; them individually
> */
>
> int IsRemoteAdminEnabled(ConfigFlags const cf)
> {
> return cf.bytes[3] & 0x3u;
> }
This won't even run on MS-DOS, let alone "anything".
> You look at this code and you think, "Hmm, this chap plans to write to
> 'entire' and then subsequently read individual bytes by using the
> 'bytes' member of the structure".
No, I look at this code and I think, "someone has assumed that unsigned
ints are at least four bytes wide", which isn't true on typical MS-DOS
systems (yes, they're still used, believe it or not), isn't true on
various DSPs, and came within a gnat's whisker of being true on at least
one Cray.
> You have a second suspicion that
> perhaps James might have made assumptions about the size of "unsigned",
> but inspecting the code you find that he hasn't.
Yes, he has.
>
> Now, the question is, in the real world, at 10:13am on a sunny Thursday
> morning, sitting at your desk with a hot cup of tea, munching away on a
> fig-roll bar getting small crumbs between the keys on the keyboard, are
> you really going to reject this code?
Absolutely, yes.
> You're sitting there 100% aware that the Standard explicitly forbids you
> to write to member A of a union and then read from member B, but how
> much do you care?
No, I'm sitting there 100% aware that unsigned ints need not be four bytes
wide.
> Later on in the code, you come to:
>
> double tangents[5];
> ...
> double *p = tangents;
> double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants.
No, I sit here thinking "why is he being so dumb as to dereference an
object that does not exist?".
> Again, you know that the
> Standard in all its marvelous rigidity doesn't want you to dereference
> that pointer, but are you bothered? Are you, as the Portability Advisor,
> going to reject this code?
Yes, of course. That's what they pay me for, right? "your job is to scan
thru the code and point out and alter anything that's not portable" - and
my yardstick for portability is the C Standard.
> What I'm trying to get across is, that, while we may discuss in black
> and white what the Standard permits and what it forbids... are we really
> going to be so obtuse as to reject this code in the real world?
No, we're really going to be so acute as to reject this code in the real
world.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
<snip>
> No, I look at this code and I think, "someone has assumed that unsigned
> ints are at least four bytes wide", which isn't true on typical MS-DOS
> systems (yes, they're still used, believe it or not), isn't true on
> various DSPs, and came within a gnat's whisker of
not
> being true on at least one Cray.
(The Cray implementation in question very nearly defined CHAR_BIT as 64,
but they changed their mind, apparently quite late in the game. Had they
not changed their mind, sizeof(unsigned) would have been 1 on that
implementation.)
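For anyone who wants to see what their own implementation does, here is a
minimal sketch of the check a reviewer could run before trusting an index
like bytes[3]. The function names are made up for illustration; only
CHAR_BIT and sizeof come from the language itself.

```c
#include <limits.h>
#include <stddef.h>

/* Storage size of unsigned int in bits. This counts any padding bits,
   so it is an upper bound on the number of value bits. */
size_t unsigned_storage_bits(void)
{
    return sizeof(unsigned) * CHAR_BIT;
}

/* Nonzero if bytes[index] would lie inside an unsigned int
   on the implementation this is compiled for. */
int unsigned_has_byte(size_t index)
{
    return index < sizeof(unsigned);
}
```

On an implementation where sizeof(unsigned) is 2 (or 1), unsigned_has_byte(3)
is zero, which is exactly the case where cf.bytes[3] reads outside the array.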
Depends on which standard you choose. According to the C one,
#include <fcntl.h>
is undefined behavior. As a portability advisor to real people working
in real trenches, you have to allow for extensions.
What we want to aim for is not avoiding extensions, but rather /
knowing/ what the documented extensions are and using them
deliberately.
> You're sitting there 100% aware that the Standard explicitly forbids you
> to write to member A of a union and then read from member B, but how
> much do you care?
You can write to member A and read through member B, if member B is a
character type that treats the thing as an array of bytes.
I would simply remove the member "entire" from the union, and change
it to a struct. Then see what breaks when you recompile the program,
and fix those occurrences. The struct itself gives you the entire
thing. You can define objects of that struct type, pass them to
functions, return them, assign them, etc.
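As a rough illustration of the character-type access mentioned above, the
bytes of an unsigned can be copied out and back without any union at all.
This is only a sketch; the two function names are invented here, and the
order of the bytes in the buffer is whatever the implementation uses.

```c
#include <string.h>

/* Copy the object representation of value into out.
   The byte order in out is implementation-defined. */
void unsigned_to_bytes(unsigned value, unsigned char out[sizeof(unsigned)])
{
    memcpy(out, &value, sizeof value);
}

/* Rebuild an unsigned from bytes produced by unsigned_to_bytes
   on the same implementation. */
unsigned bytes_to_unsigned(const unsigned char in[sizeof(unsigned)])
{
    unsigned value;
    memcpy(&value, in, sizeof value);
    return value;
}
```

The round trip is reliable on one implementation; which element of the
buffer holds which part of the value is still not portable.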
> Later on in the code, you come to:
>
>   double tangents[5];
>   ...
>   double *p = tangents;
>   double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants.
This can simply be edited to:
double *pend = tangents + sizeof tangents / sizeof tangents[0];
Or maybe the size should be a manifest constant somewhere:
double tangents[vector_size];
const double *const pend = &tangents[vector_size];
*(&tangents + 1) stops being neat once you get out of programming
puberty. Stuff like that looked neat when I was learning C for the
first time. It's not neat; it's just cryptic B. S.
99% of the C programmers out there have probably never seen an array
type manipulated as an array type---addresses being taken to make
pointer-to-array types, etc. Their heads will do a double, triple or
even quadruple ``take'' when they see that expression.
> Are you, as the Portability Advisor, going to reject this code?
Of course! There is no need for it to be doing what it's trying to do
by that means. There is no need to rely on an undocumented extension
of behavior to get the desired effect.
> What I'm trying to get across is, that, while we may discuss in black
> and white what the Standard permits and what it forbids... are we really
> going to be so obtuse as to reject this code in the real world?
Absolutely. At /least/ that obtuse, if not way more.
And don't forget that we have a $500K salary as portability advisor,
and so we must scramble for every little thing that can add a line or
two to our weekly status report, and that can make us look very sharp
and justify our job position in the eyes of senior management.
There is always that!
> Are we
> really going to reject some code for a reason that we see as stupidly
> restrictive in the language's definition?
It's not stupidly restrictive when you can rewrite the expression in a
way that will not surprise most of the programmers out there, and that
doesn't break any rules.
A restriction is something that actually gets in your way; it makes
something impossible to do at all, or maybe only with an
unsatisfactory workaround.
> Perhaps it might be useful to point out what exactly can go wrong when
> we're treading on a particular rule. In both these cases I've mentioned,
> I don't think anything can go wrong, not naturally anyway. What _can_
> cause problems tho is aspects of the compiler:
> 1) Over-zealous with its optimisation
> 2) Deliberately putting in checks (such as terminating the program when
> it thinks you're going to access memory out-of-bounds).
There is
3) Wasting people's time when they have to scratch their heads about
what *(&array + 1) actually means and whether or not it's right.
> Should we have a webpage that lists the common coding techniques that
> skilled programmers use, but which are officially forbidden or "a grey
> area" in the Standard?
>
> Two such rules I myself would put on the list are:
> 1) Accessing different union members
Definitely not; count me out from your webpage.
The purpose of a union is to save space in implementing polymorphism.
If you store member X, you read member X.
Type punning is inherently nonportable. It's not enough to say that
type punning is allowed through members of a union, but undefined
elsewhere. To define its behavior, it's not enough to simply permit
some action. The outcome of the action must be specified. And you
cannot do that because it's totally nonportable.
At best you could say that if a member Y is accessed after member X is
stored, then there shall be no aliasing problem: Y will be
reconstituted out of the bits that were actually stored through X.
However, that's not anywhere nearly complete a definition of behavior
to be practically useful. That kind of thing belongs in the
architecture-specific pages of a compiler reference manual, not in the
language.
> 2) De-referencing a pointer to an array
But that is allowed. I think you mean, dereferencing a pointer to one
element past the end of an array-of-array object, where it's not
pointing to any array.
I tend to agree with this.
That is to say, the address-of operator could have some additional
semantic rules, along these lines:
When the operand of the address-of operator is a pointer-dereferencing
expression based on the unary * operator, the two operators effectively
cancel each other out, so that &*(E) is equivalent to (E), provided
that (E) is a valid pointer: either a pointer to an object, a null
pointer, or a pointer one element past the end of an array object.
I can't think of a way of allowing (E)->member or (*(E)).member to be
defined when E is null, or otherwise valid but not pointing to an
object. Is there a use for this other than implementing offsetof,
which is already done for you?
I've seen passages e.g. like this:
typedef struct foo {.......} foo;
foo *foo; ........; sizeof(foo);
I don't even bother to learn whether or how the (a?) standard (or a
dialect) resolves foo for the purpose of sizeof. Like in a spoken
language, there are many more correct constructs that make no or
ambiguous sense to us mortals.
To that end, MISRA is a great effort to define what intelligent C coders
may say in a polite society. IMHO, it's an overkill in many respects but
it's a great starting point.
In your union example, the result obviously depends on machine
endianness. Depending on the exact access patterns this dependency may
cancel itself out, but it's an unreasonable burden on the maintainer to
verify it throughout. Thus the code like this should be banned from the
portable club.
> Later on in the code, you come to:
>
> double tangents[5];
> ...
> double *p = tangents;
> double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants. Again, you know that the
> Standard in all its marvelous rigidity doesn't want you to dereference
> that pointer, but are you bothered? Are you, as the Portability Advisor,
> going to reject this code?
>
> What I'm trying to get across is, that, while we may discuss in black
> and white what the Standard permits and what it forbids... are we really
> going to be so obtuse as to reject this code in the real world? Are we
> really going to reject some code for a reason that we see as stupidly
> restrictive in the language's definition?
Yes! Respectable people here in a yesterday's thread read the language
rules on this very subject differently. It means that the behavior is
not crystal clear. [And it might not be crystal clear to the compiler
writers either.]
Some people cobble together a compiler for, I'd say, a C-inspired
language just to sell their chip. If your portability requirements
include trimmers and such, you may step into this swamp. The standard
may not apply very well there.
If we favor simple constructs over "look what I can do!" we may get not
only more portable but also more efficient code, because the compiler's
optimizer may recognize more idioms. E.g. a dumb rotation of an unsigned
32-bit `a' left by `n' bits, (a<<n)|(a>>(32-n)) is translated by ARM ADS
into one rotate instruction; any attempt to get clever produces worse code.
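One caveat on the expression as written: with n equal to 0 it shifts by 32,
which is undefined for a 32-bit operand. A minimal sketch of a guarded
version follows. The name rotl32 and the use of unsigned long (at least 32
bits, never promoted to a signed type) are my assumptions, and I make no
claim about what any particular compiler generates for it.

```c
/* Rotate the low 32 bits of a left by n bit positions.
   n is reduced modulo 32, so n == 0 and n == 32 return a unchanged. */
unsigned long rotl32(unsigned long a, unsigned n)
{
    a &= 0xFFFFFFFFUL;           /* keep only the low 32 bits */
    n &= 31u;                    /* shift count now in 0..31 */
    if (n == 0) {
        return a;                /* avoids the shift by 32 */
    }
    return ((a << n) | (a >> (32u - n))) & 0xFFFFFFFFUL;
}
```

For example, rotl32(0x80000001UL, 1) moves the top bit round to the bottom
and gives 0x00000003.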
OTOH, if you have a choice, you shop for a good compiler first; that
depends on how many "look what I can do!" you need to keep in the code.
Just my $0.02F
--
Ark
> Your job is to screen the code that other programmers in the company
> write.
<snip>
> You're looking
> thru it and you come to the following section of code:
>
> typedef union ConfigFlags {
> unsigned entire; /* Write to all bytes
> char unsigned bytes[sizeof(unsigned)]; at once or access
> } ConfigFlags; them individually
> */
>
> int IsRemoteAdminEnabled(ConfigFlags const cf)
> {
> return cf.bytes[3] & 0x3u;
> }
>
> You look at this code and you think, "Hmm, this chap plans to write to
> 'entire' and then subsequently read individual bytes by using the
> 'bytes' member of the structure". You have a second suspicion that
> perhaps James might have made assumptions about the size of "unsigned",
> but inspecting the code you find that he hasn't.
Already pointed out that the code does assume that sizeof(unsigned) >= 4.
> Now, the question is, in the real world, at 10:13am on a sunny Thursday
> morning, sitting at your desk with a hot cup of tea, munching away on a
> fig-roll bar getting small crumbs between the keys on the keyboard, are
> you really going to reject this code?
>
> You're sitting there 100% aware that the Standard explicitly forbids you
> to write to member A of a union and then read from member B, but how
> much do you care?
It does not. Not as far as I can see, anyway. It says the result is
"unspecified" with an informative footnote to tell us what range of
unspecified behaviour to expect. Basically we are referred to the
Representation of Types section so, as Portability Tsar, I am quite
happy with the type punning aspect of the code...
*But* I'd reject it, even if we could assume that sizeof(unsigned) >=
4 because one part of that unspecified behaviour is that the
resulting configuration file won't move between systems. If the
config file is on an NFS server, the bits in cf.bytes[3] will depend
on the target architecture the program was compiled for.
This is a penalty that /might/ be worth paying, but not if the
alternative is as simple as writing a 4-byte array.
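To make that alternative concrete, here is a sketch of the kind of thing I
have in mind. The function names and the choice of least-significant byte
first are assumptions for the example, not anything taken from the original
code.

```c
/* Store the low 32 bits of flags in out, least significant byte first.
   The layout is fixed by the shifts, not by the machine's byte order. */
void config_to_bytes(unsigned long flags, unsigned char out[4])
{
    out[0] = (unsigned char)(flags & 0xFFUL);
    out[1] = (unsigned char)((flags >> 8) & 0xFFUL);
    out[2] = (unsigned char)((flags >> 16) & 0xFFUL);
    out[3] = (unsigned char)((flags >> 24) & 0xFFUL);
}

/* Rebuild the value from four bytes written by config_to_bytes. */
unsigned long config_from_bytes(const unsigned char in[4])
{
    return  ((unsigned long)in[0] & 0xFFUL)
         | (((unsigned long)in[1] & 0xFFUL) << 8)
         | (((unsigned long)in[2] & 0xFFUL) << 16)
         | (((unsigned long)in[3] & 0xFFUL) << 24);
}
```

A file written this way means the same thing to every program that reads
it, whichever architecture that program was compiled for.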
> Later on in the code, you come to:
>
> double tangents[5];
> ...
> double *p = tangents;
> double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants. Again, you know that the
> Standard in all its marvelous rigidity doesn't want you to dereference
> that pointer, but are you bothered? Are you, as the Portability Advisor,
> going to reject this code?
Yes. This is a clear-cut case. The behaviour is undefined by the
standard. I have posted opinions that suggest I'd like it not to be,
and I am not 100% persuaded that there was any practical reason for
making it so -- but it is. As we speak, compiler writers are tuning
their optimisers, safe in the knowledge that they can do anything they
like with this code. I would not want my product to be in their
hands.
Again, if the payoff is huge, and the alternatives costly, then a case
could be made, but there are too many alternatives here. At the very
least, (void *)(&tangents + 1) has a clearer meaning than above and is
well-defined.
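A small sketch of that form in use, on the reading above that the cast
version is well-defined. The array contents and the function name are
invented for the example.

```c
static double tangents[5] = { 1.0, 2.0, 3.0, 4.0, 5.0 };

/* Sum the array using an end pointer obtained without dereferencing
   the one-past-the-end pointer-to-array. */
double sum_tangents(void)
{
    /* &tangents + 1 points one past the whole array object; the cast
       to void * converts it without applying unary * to it. */
    double const *const pend = (void *)(&tangents + 1);
    double const *p;
    double sum = 0.0;

    for (p = tangents; p != pend; ++p) {
        sum += *p;
    }
    return sum;
}
```

Whether this reads better than tangents + 5 is another matter.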
> What I'm trying to get across is, that, while we may discuss in black
> and white what the Standard permits and what it forbids... are we really
> going to be so obtuse as to reject this code in the real world?
How is it obtuse to stick to the standard where practical? Both your
examples have potential risks attached and few benefits. There is
nothing obtuse about avoiding these risks.
> Are we
> really going to reject some code for a reason that we see as stupidly
> restrictive in the language's definition?
>
> Perhaps it might be useful to point out what exactly can go wrong when
> we're treading on a particular rule. In both these cases I've mentioned,
> I don't think anything can go wrong, not naturally anyway. What _can_
> cause problems tho is aspects of the compiler:
> 1) Over-zealous with its optimisation
I don't think it can be over-zealous if it does not break a correct
program. This is the whole point. If you stick to the letter of the
law you can't be banged up!
> 2) Deliberately putting in checks (such as terminating the program when
> it thinks you're going to access memory out-of-bounds).
>
> The first thought I think comes to everyone's mind when we're talking
> about these unnecessarily rigid rules, is that the Standard just needs
> to be neatly amended. But, of course, it's C we're talking about, where
> the current standard is from 1989 and where still not too many people
> are paying attention to the 1999 standard that came out nine years ago.
>
> So, I wonder, what can we do? If there was a consensus between many of
> the world's most skilled and experienced C programmers that a certain
> rule in the Standard were unnecessarily rigid, would it not be worth the
> compiler vendors' while to listen? Here at comp.lang.c, there are,
> without exaggeration, some of the world's best C programmers. Instead of
> contacting each and every compiler vendor to let them know that we'd
> prefer to optimise-away assignments to union members, would it be
> convenient, both for the programmers and the compiler vendors, to have a
> single place to go to to read what the world's best programmers think?
> Should we have a webpage that lists the common coding techniques that
> skilled programmers use, but which are officially forbidden or "a grey
> area" in the Standard?
>
> Two such rules I myself would put on the list are:
> 1) Accessing different union members
OK as it stands, I think, but often a portability nightmare for
practical reasons due to differing representations.
> 2) De-referencing a pointer to an array
You mean de-referencing a "one past the end" array pointer (for want
of more felicitous wording). I'd be happy if this was allowed in C0x,
but I'd live with any of the alternatives if it were not.
The best example of how this has happened in the past is the now
sanctioned struct array hack.
--
Ben.
IMHO, that's hardly the *only* reason to write portable code. There
are benefits even if the code will never be run on more than one
platform. Most of the time, portable code is simpler and clearer than
code that depends on implementation-specific features. (Not all the
time, just most of the time.)
> I've seen passages e.g. like this:
> typedef struct foo {.......} foo;
> foo *foo; ........; sizeof(foo);
> I don't even bother to learn whether or how the (a?) standard (or a
> dialect) resolves foo for the purpose of sizeof. Like in a spoken
> language, there are many more correct constructs that make no or
> ambiguous sense to us mortals.
That particular construct is illegal, unless the typedef is declared
in an outer scope and the object in an inner one. In that case, the
declaration is legal, and the "foo" in sizeof(foo) refers to the
innermost declaration (the object) -- but the object declaration hides
the typedef, which is a lousy idea. For example:
typedef struct foo { struct foo *foo; } foo;
{
foo *foo; /* legal, alas */
foo *bar; /* illegal, since the typedef name is hidden */
}
That's just a minor quibble, though; I agree that there are plenty of
things you can legally do that you nevertheless shouldn't. The
correct answer to "What does this code do?" is often "It gets rejected
at the code review.".
[snip]
And we're assuming that the issue about the comment is just a typo
by the OP, and not in "James Weir's source file".
> > You look at this code and you think, "Hmm, this chap plans to write to
> > 'entire' and then subsequently read individual bytes by using the
> > 'bytes' member of the structure".
>
> No, I look at this code and I think, "someone has assumed that unsigned
> ints are at least four bytes wide", which isn't true on typical MS-DOS
> systems (yes, they're still used, believe it or not), isn't true on
> various DSPs, and came within a gnat's whisker of being true on at least
> one Cray.
And perhaps unsigned ints are 8 bytes on some new 64-bit systems? (At
least the above then doesn't invoke UB, however.)
> > You have a second suspicion that
> > perhaps James might have made assumptions about the size of "unsigned",
> > but inspecting the code you find that he hasn't.
>
> Yes, he has.
He's also made an assumption about endianness.
> > Now, the question is, in the real world, at 10:13am on a sunny Thursday
> > morning, sitting at your desk with a hot cup of tea, munching away on a
> > fig-roll bar getting small crumbs between the keys on the keyboard, are
> > you really going to reject this code?
>
> Absolutely, yes.
Ditto. Even though every system I happen to work on at the moment
uses 4-byte unsigned ints, some are big-endian and others are little.
[...]
> > What I'm trying to get across is, that, while we may discuss in black
> > and white what the Standard permits and what it forbids... are we really
> > going to be so obtuse as to reject this code in the real world?
>
> No, we're really going to be so acute as to reject this code in the real
> world.
--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:ThisIsA...@gmail.com>
I would think "Hmm, this guy got Tomas/JKop from c.l.c to
write his code". (I've never seen anybody else write 'char
unsigned').
> Now, the question is, in the real world, at 10:13am on a sunny Thursday
> morning, sitting at your desk with a hot cup of tea, munching away on a
> fig-roll bar getting small crumbs between the keys on the keyboard,
What does that have to do with it?
> are you really going to reject this code?
Of course you are. On some systems the flag is cf.entire & 3;
on others it's cf.entire & (3 << 24); and on others it's
something else entirely. You'd have to dig through lots more
code to check that this code snippet is correct.
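Compare a version that works on the value rather than on the bytes. This is
only a sketch: I am assuming, purely for illustration, that the flag lives
in bits 24 and 25 of the configuration word, and the names are made up.

```c
/* Hypothetical position of the remote-admin flag within the value. */
#define REMOTE_ADMIN_SHIFT 24
#define REMOTE_ADMIN_MASK  0x3UL

/* Nonzero if either remote-admin bit is set. unsigned long has at
   least 32 bits, so the shift by 24 is always valid, and the result
   does not depend on how the bytes are laid out in memory. */
int is_remote_admin_enabled(unsigned long config)
{
    return ((config >> REMOTE_ADMIN_SHIFT) & REMOTE_ADMIN_MASK) != 0;
}
```

Here there is nothing more to dig through: the same bits are tested on every
system.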
> You're sitting there 100% aware that the Standard explicitly
> forbids you to write to member A of a union and then read
> from member B
Perhaps you are looking for comp.lang.c++
> but how much do you care?
If your job is to make the code portable, then you care
about undefined behaviour.
> Later on in the code, you come to:
>
> double tangents[5];
> ...
> double *p = tangents;
> double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants.
Come on, you wrote this whole post just to keep trying
to rationalize your ridiculous "clever" idea?
Any sane person would think: What the hell is that?,
fire the guy, and fix the code to be readable and correct.
<snip>
> I would think "Hmm, this guy got Tomas/JKop from c.l.c to
> write his code". (I've never seen anybody else write 'char
> unsigned').
What about that chap of Irish descent from a year or two back? Graham
somebody, wasn't it? The one who kept using "domestic" instead of
"canonical"? Hmmm, "domestic", grummage grummage grummage, oh yes, it was
indeed Graham somebody, for certain values of Graham: Frederick Gotham, in
fact. He was a great one for saving up adjectives, and "char unsigned" was
indeed amongst his specialites[1] du 2006.
[1] Insert accents to taste.
> This won't even run on MS-DOS, let alone "anything".
Wups, that's what I get for rushing an example.
The moral of the original post tho:
Do you think that comp.lang.c should have a page where it lists its
"amendments" to the Standard?
--
Tomás Ó hÉilidhe
Personally, no, I don't think that's necessary.
The comp.lang.c newsgroup discusses the C language as it is (de jure, de
facto, and de historio(?)), not as clc would like the C language to be.
That's comp.std.c's job, I guess.
> (I've never seen anybody else write 'char
> unsigned').
Are you somehow arguing that your ignorance of certain programming
styles and techniques renders them deprecated? 99% of C
programmers have never seen anyone use sizeof without it immediately
being followed by parentheses, so are you saying that we should always
put parentheses after sizeof lest we get bullied by the Style Police?
> Of course you are. On some systems the flag is cf.entire & 3;
> on others it's cf.entire & (3 << 24); and on others it's
> something else entirely. You'd have to dig through lots more
> code to check that this code snippet is correct.
As has already been pointed out, my snippet was erroneous. The point of
my post tho wasn't to do solely with that snippet.
> If your job is to make the code portable, then you care
> about undefined behaviour.
It's the first thing you'd care about, I'd imagine.
> Come on, you wrote this whole post just to keep trying
> to rationalize your ridiculous "clever" idea?
That's one reason, yes. I liked the idea I had for getting the end
pointer of an array, and I could see no reason not to use it other than
politics. More specifically, I couldn't use it just because the
Standard, maybe, might, perhaps, said that I couldn't.
--
Tomás Ó hÉilidhe
<snip tripe>
Forgetting ad hominem attacks for the moment, (or whatever it is you
(plural) are actually trying to do to cast a derisory eye on my proposal),
what do you think of the prospect of comp.lang.c having a page where it
lists its ammendments to the Standard, ammendments that are agreed upon by
some of the world's best C programmers? And before someone goes on to post
more tripe suggesting arrogance or pretentiousness on my part by implying
that I expressed, either explicitly or implicitly, that I'm a part of this
group, well I have made and make no such expression.
--
Tomás Ó hÉilidhe
> The comp.lang.c newsgroup discusses the C language as it is (de jure, de
> facto, and de historio(?)), not as clc would like the C language to be.
> That's comp.std.c's job, I guess.
Because some of the world's most skilled and most experienced C programmers
hang around comp.lang.c, do you think it would be reasonable to say that it
might be a bit of an authority on these matters?
I agree with you entirely that it is comp.std.c's job, but I think we both
know that lobbying for changes to the C standard is a lost cause. (Even if
the changes were to come to fruition, nobody would pay attention).
The page I'm proposing would be something like "Stupid stuff in the
Standard that shouldn't be there, and stuff that they left out that should
be there".
--
Tomás Ó hÉilidhe
> The page I'm proposing would be something like "Stupid stuff in the
> Standard that shouldn't be there, and stuff that they left out that should
> be there".
And the value of that would be precisely what?
Many programmers do not read this group anymore, completely fed up
with the "regulars".
For instance Paul Hsieh participates here only occasionally, and he is
way better than Heathfield and co.
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
> "Tomás Ó hÉilidhe" <t...@lavabit.com> wrote in message
> I think that, yes.
> size_t is a function if you adopt the mathematical definition of the word,
> not if you adopt the programming definition. However I feel it is more
> useful to give it the same syntax as other functions.
(You meant sizeof)
But it doesn't have the same syntax.
func(arg) is a postfix-expression, sizeof(arg) isn't.
So func(arg)[ptr] means ptr[func(arg)],
but sizeof(arg)[ptr] means sizeof ptr[arg], not ptr[sizeof arg].
This is only relevant to IOCCC entries, but they don't
have the same syntax.
(And sizeof isn't a mathematical function either, as 1 == 1ULL, but
sizeof(1) can be != sizeof(1ULL).)
--
Army1987 (Replace "NOSPAM" with "email")
If it were held in high regard, then compiler vendors would pay
attention to it, and programmers would have more freedom to implement more
techniques in their code. Just as an example, it might influence compiler
writers to disable bad-memory-location checks when dereferencing a pointer
to an array. Of course, that's just one example.
--
Tomás Ó hÉilidhe
> Richard Heathfield:
>
> <snip tripe>
In fact I was offering supporting evidence to demonstrate that you are not
the only person who writes char unsigned.
> Forgetting ad hominem attacks for the moment,
Er, what are you talking about?
> (or whatever it is you
> (plural) are actually trying to do to cast a derisory eye on my
> proposal),
I'm not trying to cast any eyes, derisory or otherwise. You've made a
proposal, and I've offered my opinion about it. I'm not deriding your
proposal; I'm merely expressing my own point of view.
> what do you think of the prospect of comp.lang.c having a page
> where it lists its ammendments to the Standard, ammendments that are
> agreed upon by some of the world's best C programmers?
I don't think the world's best C programmers /do/ agree on how C should be
changed, if indeed it should be changed, so such a Web page is a
non-starter.
> And before someone
> goes on to post more tripe suggesting arrogance or pretentiousness on my
> part by implying that I expressed, either explicitly or implicitly, that
> I'm a part of this group, well I have made and make no such expression.
You /are/ a part of the comp.lang.c newsgroup, because you subscribe to it
and contribute to it. If by "this group" you mean "some of the world's
best C programmers", perhaps you'd better identify who you mean by that,
to save the rest of us the trouble of expressing an opinion.
Your hostile reaction to ordinary to-ing and fro-ing is most peculiar. I
had you down as a fairly sensible chap, but I'm beginning to wonder
whether that was a mistake on my part.
> Richard Heathfield:
>
>> The comp.lang.c newsgroup discusses the C language as it is (de jure, de
>> facto, and de historio(?)), not as clc would like the C language to be.
>> That's comp.std.c's job, I guess.
>
> Because some of the world's most skilled and most experienced C
> programmers hang around comp.lang.c, do you think it would be reasonable
> to say that it might be a bit of an authority on these matters?
Bits of it are, yes. For example, Chris Torek is a de facto authority on C.
The problem is that there is no consensus on who /else/ is an authority.
> I agree with you entirely that it is comp.std.c's job, but I think we
> both know that lobbying for changes to the C standard is a lost cause.
Right.
> (Even if the changes were to come to fruition, nobody would pay
> attention).
Right again.
> The page I'm proposing would be something like "Stupid stuff in the
> Standard that shouldn't be there, and stuff that they left out that
> should be there".
In effect, this sounds like an attempt to create a "clc C standard". I
would suggest that, if ISO can't hack it, clc certainly can't. There is no
consensus here. Ask a C question of half a dozen clcers and you'll get at
least seven different opinions, no deadline for ending the debate, and no
formal voting mechanism. This is not the stuff that Standards are made on.
In your dreams...
Are you sure you phrased that right? I sometimes get the feeling that
the consensus is that no one else is an authority...
>> <snip tripe>
>
> In fact I was offering supporting evidence to demonstrate that you are
> not the only person who writes char unsigned.
Oh, sorry about that, I latched onto Old Wolf's tone and blindly assumed
your post was an extension of it. My bad.
--
Tomás Ó hÉilidhe
> In effect, this sounds like an attempt to create a "clc C standard". I
> would suggest that, if ISO can't hack it, clc certainly can't. There
> is no consensus here. Ask a C question of half a dozen clcers and
> you'll get at least seven different opinions, no deadline for ending
> the debate, and no formal voting mechanism. This is not the stuff that
> Standards are made on.
Actually yeah come to think of it you're right. Seems like a no-goer.
--
Tomás Ó hÉilidhe
He's beginning to feel like one of those people who, as my partner puts
it, can start an argument in an empty room.
>
> "Army1987" <army...@NOSPAM.it> wrote in message news:
>> Malcolm McLean wrote:
>>
>>> "Tomás Ó hÉilidhe" <t...@lavabit.com> wrote in message
>>
>>> I think that, yes.
>>> size_t is a function if you adopt the mathematical definition of the
>>> word,
>>> not if you adopt the programming definition. However I feel it is more
>>> useful to give it the same syntax as other functions.
>>
>> (You meant sizeof)
>>
> Yes. Who will free us of these gibberish types that are even too similar to
> keywords?
Any halfway decent IDE or PTE will make the difference manifest.
--
"Creation began." - James Blish, /A Clash of Cymbals/
Hewlett-Packard Limited, registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN
> I liked the idea I had for getting the end
> pointer of an array, and I could see no reason not to use it other than
> politics. More specifically, I couldn't use it just because the
> Standard, maybe, might, perhaps, said that I couldn't.
More precisely, several (but not all) of the posters stated that
_they_ wouldn't use it, and explained why not. YOU get to do anything
you want to any code over which you have control. As you decide what
to write, presuming it's for hire, remember that in that environment,
communicating with your coworkers is as important as communicating
with the compiler.
I like it, but it's a sufficiently odd idiom that I'd define a macro
for it, and as long as I was defining a macro, I'd likely use the more
common (in my experience) sizeof.
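#include <stdio.h>

/* Pointer one past the last element. The argument must be an array, not a pointer. */
#define ENDOF(array) ((array) + sizeof (array) / sizeof *(array))

int
main(void)
{
    char i[20];
    char *orig = *(&i+1);              /* the disputed idiom */
    char *modified = &(&i+1)[0][0];
    char *sizof = &i[sizeof i/sizeof i[0]];

    /* Pointer differences have type ptrdiff_t, so print them with %td (C99). */
    printf("%td, %td, %td, %td\n", orig-i, modified-i, sizof-i, ENDOF(i)-i);
    return 0;
}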
Martin
--
Martin Golding DoD #0236 | fog...@comcast.net
Always code as if the person who ends up maintaining your code will be a
violent psychopath who knows where you live.
As you acknowledged in a later followup, there were no ad hominem
attacks in Richard's article. As you haven't acknowledged, there were
also no ad hominem attacks in Old Wolf's article. He did not attack
your usage of "char unsigned", he merely noted it (it is rather
unusual, after all). He did strongly criticize one of the technical
ideas that you offered (that ``*(&array + 1)'' is a good way to obtain
a pointer just past the end of an array); that is not an ad hominem
attack. If he had said or implied that your ideas are stupid because
they came from you, *that* would have been ad hominem, but he didn't
do that. I'll grant you that Old Wolf's words were fairly harsh,
perhaps more so than necessary. Feel free to criticize him for that
if you like, but my friendly advice is to grow a thicker skin.
Getting to the substantive question you're asking, I'd like to be
clearer on just what you're proposing. When you talk about amendments
(note spelling) to the C standard, do you mean changes that we agree
should be adopted in the next version, or do you mean changes that we
could all safely adopt right now, even if they contradict what the
standard actually says? For the former, comp.std.c is the best place
to discuss such things. If you mean the latter (areas where the
standard's requirements can be safely ignored), then I have no
objection to creating such a list -- as long as it's empty.
Sometimes the line between behavior that the standard defines and
behavior that it doesn't define isn't based on what *can* possibly be
defined; rather, it's based on the need to have a clear and
understandable dividing line between defined and undefined behavior.
In the standard as it's currently written, dereferencing a pointer
value that doesn't point to an object (assuming a pointer-to-object
type, of course) is undefined. There may be some corner cases where
you could get away with it if only the standard were more reasonable.
But covering all those corner cases would, IMHO, be a waste of time;
the benefit would be small, and the result would be a much larger
standard -- with more opportunities for errors. And try getting
agreement on which cases are reasonable and which ones aren't.
As for the specific case of ``*(&array + 1)'', I'll grant you that
it's clever. But in programming, being "clever" is not always a good
thing. You have to factor in the time others will have to spend
<optimism>appreciating your cleverness</optimism>, rather than getting
on with the job of understanding what the code does.
Combine that with the fact that the standard does not define its
behavior, that most implementations apparently will let you get away
with it (which means it's a potential bug that you can't find by
testing), and that there are clearer and more flexible ways to do
exactly the same thing,
Not all code has to be 100% portable. For example, if I'm writing
code that depends intimately on POSIX, I can safely assume that
CHAR_BIT==8 (since that's what POSIX requires). There is a known and
coherent set of implementations on which that assumption is valid.
Your assumption that ``*(&array + 1)'' does what you expect it to do,
on the other hand, depends not on the known nature of the platform,
but on the whim of the compiler writer. Your code could be broken
tomorrow by a new version of an optimizer on a platform that you don't
even use.
--
Keith Thompson (The_Other_Keith) <ks...@mib.org>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
That's just what the regulars want/wish. You can take any random
sample here and even removing the spam and hw questions you can see
that that just isn't the case. Random posts that come in here
invariably are talking about gcc, UNIX, MSVC++, or even C++.
> [...] not as clc would like the C language to be. That's comp.std.c's job,
> I guess.
The (demand and lobbying for the) removal of gets() happened here
(clc). Not there (csc). Google groups is good for one thing: you can
look through the historical record.
--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/
To ostracize and put the lie to the C standard committee perhaps?
That's how gets() was removed from the standard.
> "Army1987" <army...@NOSPAM.it> wrote in message news:
>> Malcolm McLean wrote:
>>> size_t is [...]
>> (You meant sizeof)
>>
> Yes. Who will free us of these gibberish types that are even too similar to
> keywords?
Who will free us of those posters to whom a trailing `_t` isn't
immediately glaring?
<g,d&r>
Yes, if you define it as a function over the set of all C expressions (with
complete type, and let's let VLAs alone).
"Indeed".
Ah yes. Let us not forget Martin Wells either.
You're off-topic on c.l.c. This sort of thing belongs on
comp.std.c.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Admirable. Then:
int aray[28];
....
b = aray[100];
should never fail. Please advise what the value stored in b should
be? and why.
Good lord. I gather you don't read anything any of them publish.
Any such page will be 47% full of "Things in the standard I am too
ignorant to understand the reason for". Another 47% will be spent
on "Things I am too ignorant to understand why they are not
implementable in the generalized machines used for C". Total: 94%
useless.
> However, let's consider this: Let's say you're appointed as the
> Portability Advisor for a multi-national company that makes billions
> of dollar each year.
[Snip.]
> Your job is to screen the code that other programmers in the company
> write. Every couple of days there's a fresh upload of code to the
> network drive, and your job is to scan thru the code and point out and
> alter anything that's not portable. Of course tho, you're given a
> context in which to judge the code, for instance:
> a) This code must run on everything from a hedge-trimmer to an iPod,
> to a Playstation 3,
> b) This code must run on all the well-known Desktop PC's
[Snip.]
> You get down to it. You open up the network drive and navigate to
> James Weir's source file. Its context is "run on anything". You're
> looking thru it and you come to the following section of code:
>
> /* Write to all bytes at once or access them individually */
> typedef union ConfigFlags {
>     unsigned entire;
>     char unsigned bytes[sizeof(unsigned)];
> } ConfigFlags;
>
> int IsRemoteAdminEnabled(ConfigFlags const cf)
> {
> return cf.bytes[3] & 0x3u;
> }
>
> You look at this code and you think, "Hmm, this chap plans to write to
> 'entire' and then subsequently read individual bytes by using the
I may suspect this, but I would check that that's what he actually is
planning on doing before I jump to conclusions. If it's not in his
code, as it appears so far, then I would ask him to clarify his
intentions.
> 'bytes' member of the structure". You have a second suspicion that
> perhaps James might have made assumptions about the size of
> "unsigned", but inspecting the code you find that he hasn't.
Yes he has. He's assumed that sizeof(unsigned) is at least 4, since
he's accessing cf.bytes[3].
> Now, the question is, in the real world, at 10:13am on a sunny
> Thursday morning, sitting at your desk with a hot cup of tea, munching
> away on a fig-roll bar getting small crumbs between the keys on the
> keyboard, are you really going to reject this code?
Of course. He's making an assumption about the size of unsigned.
That's a portability deal breaker.
> You're sitting there 100% aware that the Standard explicitly forbids
> you to write to member A of a union and then read from member B, but
> how much do you care?
The standard doesn't forbid it, but mandates an unspecified value,
which may be a trap representation, and does so for a very good reason.
The causes of problems resulting from doing this may or may not exist
on a given implementation, even on the vast majority; but even if it
fails on one implementation, that's a deal breaker when the deal is
absolute portability.
> Later on in the code, you come to:
>
> double tangents[5];
> ...
> double *p = tangents;
> double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants. Again, you know that the
> Standard in all its marvelous rigidity doesn't want you to dereference
> that pointer, but are you bothered? Are you, as the Portability
> Advisor, going to reject this code?
This is dereferencing a pointer to Lala Land. That's not portable, and
is likely to cause problems. You want to leave something like this in,
knowing full well that it's not kosher?
> What I'm trying to get across is, that, while we may discuss in black
> and white what the Standard permits and what it forbids... are we
> really going to be so obtuse as to reject this code in the real world?
Yes, and it's not obtuse. If I'm in charge of making code portable,
I'm going to make code portable to the best of my ability. I may miss
some things, but what I don't miss won't get through. That's my job.
I'm a Portability Advisor.
> Are we really going to reject some code for a reason that we see as
> stupidly restrictive in the language's definition?
You mean a reason that *you* see as stupidly restrictive. But that's
your hangup, man.
> Perhaps it might be useful to point out what exactly can go wrong when
> we're treading on a particular rule.
That's a good point. But sometimes it is unclear what can go wrong.
But that doesn't mean that nothing can go wrong. That's why we often
give the stock answer that "undefined behaviour" means anything can
happen. Anything *can* happen, including the program working correctly
(whatever that may mean in the context of undefined behaviour).
> In both these cases I've
> mentioned, I don't think anything can go wrong, not naturally anyway.
Sure they can. Take the expression *(&tangents + 1), for example.
tangents may be allocated at the end of memory. Dereferencing a pointer
pointing beyond it may yield a bogus value, even a trap that causes an
immediate memory access error. Or it could just wrap to memory location
0, which just happens to be a null pointer on a particular
implementation (very many of them, in fact). Any way you look at it,
dereferencing a pointer that points beyond an object is an error, even
if it's not allocated at the end of memory.
> What _can_ cause problems tho is aspects of the compiler:
> 1) Over-zealous with its optimisation
> 2) Deliberately putting in checks (such as terminating the program
> when it thinks you're going to access memory out-of-bounds).
If you are accessing memory out of bounds, then it makes sense for an
implementation to kill the program or take some other appropriate
action. That's usually the domain of the host system, though, rather
than the compiler/library. If you're not accessing memory out of bounds
(or doing other things that may cause problems), then you should have
no problems. An implementation shouldn't terminate a program just
because it "thinks" you may be about to cause undefined behaviour. But
it's very helpful to terminate when you actually do so.
> The first thought I think comes to everyone's mind when we're talking
> about these unnecessarily rigid rules, is that the Standard just needs
> to be neatly amended.
That may be what comes to your mind. But since when do you speak for
everybody?
[Snip.]
> Should we have a webpage that lists the common coding
> techniques that skilled programmers use, but which are officially
> forbidden or "a grey area" in the Standard?
What for? People who care about portability (in a given project or in
general) endeavour to write portably (in that project or in general),
and those who don't don't.
A more important aspect of writing code portably is separating
non-portable aspects of code (if there are any) from the portable
aspects. For example, one may be writing something that must access a
particular device. That is a non-portable endeavour; but some parts of
the code will essentially be portable. A budding coder needs to learn
how to separate the non-portable code that accesses the device from the
portable code that does other things. Make a web site about *that* and
you may just have something useful.
--
Dig the sig!
----------- Peter 'Shaggy' Haywood ------------
Ain't I'm a dawg!!
You seem to have missed much of this debate! I agree the construct is
undefined but it is very hard to see why any actual dereference
(i.e. memory access) will occur. The type means that the *(...)
object is an array, so it must be converted to a pointer to its first
element. This is a technical (i.e. letter of the law) problem, but is
probably not a practical one since that pointer is valid -- you can
construct it in other ways quite legally.
> Or it could just wrap to memory location
> 0, which just happens to be a null pointer on a particular
> implementation (very many of them, in fact).
If 'tangents' is allocated at "the top" of memory, C requires that
there be a valid pointer to a place just one past the end of it. This
often means that it is allocated "one element down" from the top but
other solutions are possible. On a system with fat pointers (that
encode the range as well as the start) the pointer '&tangents+1' must
be properly representable since that pointer is entirely valid.
The issue is whether the * is allowed. It clearly is not, but not
because the operation implies construction of an invalid (and possibly
wrapped) pointer. The * is not allowed "by fiat", even though it is
hard to show that it denotes an actual memory access.
--
Ben.
> Peter 'Shaggy' Haywood <phay...@alphalink.com.au.NO.SPAM> writes:
> <big snip>
>> Sure they can. Take the expression *(&tangents + 1), for example.
>> tangents may be allocated at the end of memory. Dereferencing a pointer
>> pointing beyond it may yield a bogus value, even a trap that causes an
>> immediate memory access error.
>
> You seem to have missed much of this debate!
Perhaps - but *what* a sig!