Recall that in C, int and float are both 4 byte types. Consider the
following code.
main()
{
float f, g;
union { float f; int i; } u;
srand48(time(0));
f = drand48();
u.f = f;
u.i++;
g = u.f;
// ===== POINT A ===== //
}
At point A, will g be greater than f? Will it always be the next
representable floating point value after f? Can you explain your answers?
I tried printf'ing f and g at point A, but they both show up as equal...
I'm really confused. I don't really understand what the union does.
Thanks for any help!
Anyone??
They might be (and often are), but they may not be the same size.
> Consider the
> following code.
>
> main()
int main(void) please!
> {
> float f, g;
> union { float f; int i; } u;
> srand48(time(0));
> f = drand48();
> u.f = f;
> u.i++;
> g = u.f;
> // ===== POINT A ===== //
> }
>
> At point A, will g be greater than f? Will it always be the next
> representable floating point value after f? Can you explain your answers?
It's very difficult to answer without giving a full answer!
Study unions, see how they are laid out and then investigate the
representation of floats on your system.
--
Ian Collins
Warning: dumb analogy follows. A union is like a box into which one can
put a rabbit or a fox or a cabbage. One can put a cabbage into the box
but one can't then expect to take out a rabbit. Sometimes it's useful to
be able to pass around a "box" that can contain one of several objects
instead of having separate containers for each.
The construct your homework is examining -- put one thing into a union
but then take something else out -- isn't that uncommon (I first recall
seeing it back in the day in comp.sources.unix and thought "Hey, this is
neat!" but I was young and foolish in those days), however the result of
doing so is not specified.
That is, the result of taking the value of a union member other than the
value last stored into the union is unspecified by the standard. The
good news is that it's not undefined behavior, though, and a compiler
may, but is not required to, specify what happens if you do so.
I *think* what the question is getting at is to force you to look at
your compiler's representation of float and whether there may be one or
more bit patterns that are not legal representations of float. That part
is a good question but how it gets there is both shaky and compiler
dependent.
--
Rich Webb Norfolk, VA
To how many decimals? Try as many as possible.
Or try printing the difference.
--
Bartc
I don't think it is a very good one, but that doesn't help you. I will
point out the issues with it as well as suggesting some things...
> Recall that in C, int and float are both 4 byte types.
That may be true on the implementation you use, but it isn't always.
I've used C implementations where int was 2 (8 bit) bytes, and others
where int was 1 (16 bit) byte. All were valid. The size of float can
vary too.
> Consider the
> following code.
>
> main()
This form of definition of main is no longer valid according to the C
standard released in 1999, and even before then many would consider it
bad style. It is no longer valid because "implicit int" is no longer
part of the language. You should really use
int main(void)
> {
> float f, g;
> union { float f; int i; } u;
> srand48(time(0));
> f = drand48();
Those are not standard C functions, although I believe POSIX defines
them. However, according to my information they are deprecated and you
should use the standard C functions srand and rand.
> u.f = f;
> u.i++;
> g = u.f;
This is a very dodgy way to do a very dodgy thing.
> // ===== POINT A ===== //
> }
>
> At point A, will g be greater than f? Will it always be the next
> representable floating point value after f? Can you explain your answers?
>
> I tried printf'ing f and g at point A, but they both show up as equal...
> I'm really confused. I don't really understand what the union does.
The union is allowing you to treat the bit pattern which represents the
float you stored as if it was an int. After modifying that bit pattern
as if it was an int it then allows you to treat it as if it is a float
again. I believe C does not actually guarantee that this will do what
your instructor thinks it does.
> Thanks for any help!
I'll assume you know how integers are represented in binary. Do you know
how floating point numbers are represented? If not, have a think about
it. A float (on your implementation) can handle some numbers a lot
bigger than an int (on your implementation), how does it do this? It can
also handle some numbers which are not whole numbers, how does it do
this? Obviously it cannot be using a simple representation like int does.
So you need to read up a bit on how your implementation stores float,
and I would suggest looking in the documentation for things like NaN and
INF (infinities).
--
Flash Gordon
When you get the answer back could you post it here? I'm just curious
what the person who set this *thought* the answer was. It's either a
cunning trick question or the setter of the question is very
confused.
>At point A, will g be greater than f? Will it always be the next
>representable floating point value after f? Can you explain your answers?
This is not really a C question at all, since you're clearly supposed
to make enough assumptions about your C implementation that it comes
down to a question about floating point representations: if you treat
the bits of a floating point number as an integer, what happens when
you increment it? For IEEE floats, in most cases, you'll just be
incrementing the mantissa. Will that always increase the number?
What happens when it overflows into the exponent?
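To make the hint concrete, here is a minimal sketch, not the exercise's own code. It assumes a 32-bit IEEE 754 float stored with the same byte order as uint32_t, and it uses memcpy rather than the union; next_by_bits is a name made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the float's bits as a 32-bit integer, add one, and
   reinterpret the result as a float again. */
float next_by_bits(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);   /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);    /* float -> raw bits */
    bits++;                            /* the "u.i++" step */
    memcpy(&f, &bits, sizeof f);       /* raw bits -> float */
    return f;
}
```

Under those assumptions, the largest float below 1.0 (0x1.fffffep-1f) has every fraction bit set, so the carry goes into the exponent field and the result compares equal to 1.0f, which is still the next representable value. For a negative input the same increment moves away from zero, so the result is smaller than the input.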
-- Richard
--
Please remember to mention me / in tapes you leave behind.
>It's either a cunning trick question or the setter of the question is
>very confused.
I don't think it's either. Most likely the setter is just using the C
program to make concrete a question about floating point formats, and
doesn't care about the fact that C is not guaranteed to behave as he
expects.
> This is a homework question so please don't give full answer, but I
> really need a hint, I have no idea where to start...
>
> Recall that in C, int and float are both 4 byte types. Consider the
> following code.
>
> main()
> {
> float f, g;
> union { float f; int i; } u;
> srand48(time(0));
> f = drand48();
> u.f = f;
> u.i++;
> g = u.f;
> // ===== POINT A ===== //
> }
>
> At point A, will g be greater than f? Will it always be the next
> representable floating point value after f? Can you explain your answers?
>
> I tried printf'ing f and g at point A, but they both show up as equal...
I won't repeat all the comments about the problem with this
exercise -- it does not help you get it done. What might help is to
use the %A (or %a) format specifier in printf. If you have a version
of the C library that supports it (for example gcc with -std=c99) then
you can get printf to show you a representation of floating point
numbers that makes it much easier to see what is going on.
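A rough sketch of why the default output can hide the difference, assuming an IEEE 754 float and a C99 library; same_text is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format x and y with the same printf format and report whether the
   resulting text is identical. fmt must consume exactly one double
   (a float argument is promoted to double when passed to snprintf). */
int same_text(const char *fmt, float x, float y)
{
    char a[64], b[64];
    assert(fmt != NULL);
    snprintf(a, sizeof a, fmt, x);
    snprintf(b, sizeof b, fmt, y);
    return strcmp(a, b) == 0;
}
```

With f = 0.5f and its neighbour 0x1.000002p-1f, "%f" produces the same text for both values, because six decimals are too few to separate them, while "%a" and "%.9g" produce different text.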
<snip>
--
Ben.
Maybe. Every `float' I happen to have seen had four bytes,
but I've encountered both two- and four-byte `int'. Other sizes
are possible, and not even the four-byte `float' is guaranteed.
> Consider the
> following code.
Missing some #include directives here, I think.
> main()
> {
> float f, g;
> union { float f; int i; } u;
> srand48(time(0));
> f = drand48();
No declarations for time(), srand48(), or drand48(). The
first is a Standard library function (for which you should have
#include'd <time.h>). The other two are not.
You get undefined behavior here for calling the time()
function via an expression of the wrong type, and with an
argument of the wrong type. You may also be in trouble with
srand48() and drand48(), depending on what they are and how
they expect to be called.
> u.f = f;
> u.i++;
Undefined behavior. In a union, only the element most
recently stored has a predictable value. You've stored the
`f' element, so the `i' is indeterminate. There's no telling
what you may get when you try to fetch, increment, and re-store
that indeterminate value.
> g = u.f;
> // ===== POINT A ===== //
> }
>
> At point A, will g be greater than f? Will it always be the next
> representable floating point value after f? Can you explain your answers?
The program is not even guaranteed to *get* to point A.
That said, it probably will get there, despite having invoked
undefined behavior three times. But there's no telling what
kind of curdled value you'll find in `g'.
> I tried printf'ing f and g at point A, but they both show up as equal...
> I'm really confused. I don't really understand what the union does.
It does what you told it, which is "Do something undefined."
Or, in other words, "Anything at all that you do is fine with me."
--
Eric Sosman
eso...@ieee-dot-org.invalid
When I saw your question I did some googling and came across this page:
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
It should answer your question. I just read it and it looks great, but
reading the other comments in this thread makes me skeptical about its
validity in C.
With reference to the same link, can anyone please tell me what
aliasing optimization actually means?
Thanks, Mohan
as i saw your question i did some googling and fell upon this page .
etc, etc
I just love this. The OP asks a simple, meaningful question, and all he
ever gets back is "int main(void)", "missing include files", and "that
won't work on the DS9K".
Show of hands now. Who here honestly thinks that this (or any other) OP
is going to read Eric's shit response and say "Wow. I am so enlightened." ?
Notes/comments:
1) The basic idea is: They aren't supposed to like it. We are doing
it for their own good.
Needless to say, I find this childish in the extreme.
2) All the OP is ever going to take away from a post like this (like
Eric's) is: Gee, he must have read (and responded to) some other
article. His post has nothing to do with any issues raised by
me. Maybe there is a problem with his newsreader.
>> Recall that in C, int and float are both 4 byte types.
> Maybe. Every `float' I happen to have seen had four bytes,
> Missing some #include directives here, I think.
> No declarations for time(), srand48(), or drand48(). The
>
> You get undefined behavior here for calling the time()
>> u.f = f;
>> u.i++;
>
> Undefined behavior. In a union, only the element most
> The program is not even guaranteed to *get* to point A.
>> I tried printf'ing f and g at point A, but they both show up as equal...
>> I'm really confused. I don't really understand what the union does.
>
> It does what you told it, which is "Do something undefined."
> Or, in other words, "Anything at all that you do is fine with me."
It sounds like the class is investigating how floating point works on their
machines, which presumably they already know have 32-bit floats and ints.
And they're using C as one way of doing that.
They're probably not (at the minute anyway), looking at writing C code that
will work portably on every conceivable class of hardware in the world. And
this task is necessarily specific to their machine anyway.
(I've tried a similar program on my machine, which also has 32-bit floats
and ints, and I've found a couple of interesting things. I wouldn't have
found out those things by religiously following everything in C standard.
Sometimes you either have to learn to read between the lines, or throw the
thing out the window.)
--
Bartc
> This is a homework question so please don't give full answer,
> but I really need a hint, I have no idea where to start...
>
> Assume that under the particular C environment, int and float
> are both 4 byte types.
[following source code slightly edited to match forum conventions]
#include <stdio.h>
#include <time.h>
#include <stdlib.h> // assumed to also define srand48/drand48
                    // which seed/generate uniformly distributed
                    // double-precision random numbers
int main(void)
{
    float f, g;
    union { float f; int i; } u;
    srand48(time(NULL));
    f = drand48();
    u.f = f;
    u.i++;
    g = u.f;
    // ===== POINT A ===== //
    return 0;
}
> At point A, will g be greater than f? Will it always be the
> next representable floating point value after f? Can you
> explain your answers?
>
> I tried printf'ing f and g at point A, but they both show up
> as equal...
> I'm really confused. I don't really understand what the union
> does.
>
> Thanks for any help!
Think of
union { float f; int i; } u;
as a container for both a "float" and an "int" sharing the same
bits; whatever you do to one will affect the other.
Investigate how the bits of a "float" are stored on your machine,
and what ++ does to an int. The former could be as in
http://en.wikipedia.org/wiki/IEEE_754-1985
A spoiler: what if f happens to be just below 1?
Francois Grieu
As has been noted by several posters, you make quite
a few assumptions, use a couple of non-standard functions,
and you do not know what will happen if you try to
store one value in a union and manipulate another.
That said what probably happens is that you take a random
number between 0 and 1, express it as a float, interpret
this bit pattern as an integer, add one, then interpret this
bit pattern as a float. What will happen depends on the
method used to store floating point numbers on your system.
Again, there are no guarantees but you are almost certainly
using IEEE floats. Check out the link Mohan sent.
The stuff on Comparing Using Integers is right on point.
- William Hughes
That's possible. But if so, the O.P.'s question is not
about the C language at all, but about the idiosyncrasies of
his machine. On his machine, with the particular compiler and
optimization level he happens to use, his question may be
answerable. But it's not answerable where he asked it.
--
Eric Sosman
eso...@ieee-dot-org.invalid
> >> Recall that in C, int and float are both 4 byte types.
>
> > Maybe. Every `float' I happen to have seen had four bytes,
> > Missing some #include directives here, I think.
> > No declarations for time(), srand48(), or drand48(). The
>
> > You get undefined behavior here for calling the time()
> >> u.f = f;
> >> u.i++;
>
> > Undefined behavior. In a union, only the element most
> > The program is not even guaranteed to *get* to point A.
>
> >> I tried printf'ing f and g at point A, but they both show up as equal...
> >> I'm really confused. I don't really understand what the union does.
>
> > It does what you told it, which is "Do something undefined."
> > Or, in other words, "Anything at all that you do is fine with me."
>
> It sounds like the class is investigating how floating point works on their
> machines, which presumably they already know have 32-bit floats and ints.
>
> And they're using C as one way of doing that.
>
> They're probably not (at the minute anyway), looking at writing C code that
> will work portably on every conceivable class of hardware in the world. And
> this task is necessarily specific to their machine anyway.
But does the instructor know that's what he's doing? I don't think so.
> (I've tried a similar program on my machine, which also has 32-bit floats
> and ints, and I've found a couple of interesting things. I wouldn't have
> found out those things by religiously following everything in C standard.
> Sometimes you either have to learn to read between the lines, or throw the
> thing out the window.)
But why not do it in a portable fashion? Why not use an array of
unsigned char? You can then do the union trick (or use an unsigned
char* or memcpy it into an array of unsigned char).
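One way this suggestion might look, as a sketch only: copy the float's object representation into an array of unsigned char with memcpy and inspect it there. The names float_bytes, from_bytes and print_float_bytes are illustrative, not from any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy the object representation of f into out[]. */
void float_bytes(float f, unsigned char out[sizeof(float)])
{
    assert(out != NULL);
    memcpy(out, &f, sizeof f);
}

/* Rebuild a float from bytes previously produced by float_bytes. */
float from_bytes(const unsigned char in[sizeof(float)])
{
    float f;
    assert(in != NULL);
    memcpy(&f, in, sizeof f);
    return f;
}

/* Print the bytes of f in address order, as hex. */
void print_float_bytes(float f)
{
    unsigned char b[sizeof(float)];
    size_t i;
    float_bytes(f, b);
    for (i = 0; i < sizeof b; i++)
        printf("%02x ", (unsigned)b[i]);
    putchar('\n');
}
```

The byte order printed depends on the machine, but copying the bytes back unchanged gives back the original value, and any type may be examined through unsigned char this way.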
[note that which version of The Standard you read will refine
some of these issues, to some extent]
This isn't necessarily true. In *some* implementations, ints
(unqualified -- i.e., not shorts or longs) are 4 bytes. In
*some* implementations, floats are 32 bits (IEEE 754-ish).
But, none of these are guaranteed. For example, a float
can be the same size as a double. And a double can be
the same size as a long double. I.e., a float can be
implemented AS IF it was a long double (8 bytes?).
Likewise, an int needs only be at least as large as a short int.
So, an int can be 2 bytes!
Having said this, just keep it in the back of your mind
as to how it further muddies the situation explained below...
(i.e., let's pretend your statement *is* true)
> main()
> {
> float f, g;
> union { float f; int i; } u;
> srand48(time(0));
> f = drand48();
> u.f = f;
> u.i++;
> g = u.f;
> // ===== POINT A ===== //
> }
How about we write this simpler?
void function(void)
{
    float x, y;
    union {
        float f;
        int i;
    } u;
    x = 3.1415926f; /* some random floating point value */
    u.f = x;
    u.i = u.i + 1;
    y = u.f;
    // ===== POINT A ===== //
}
I.e., you are storing some "bit pattern" (depending on how
your compiler represents the floating point number 3.1415926)
in the "float" called x.
You are then *copying* that bit pattern into a float called
f that is located within a union called u. How this looks
in memory now not only depends on how the compiler chose to
represent that value, but, also, on how it arranges the storage
requirements for f *within* u!
You are then modifying some *portion* of the union using
access through some *other* object in that union (i.e., the i).
Then, you are reexamining the bit pattern from the original
means by which you accessed it (f).
Now, let's come up with a *similar* example:
void function(void)
{
    int x, y;
    union {
        float f;
        int i;
    } u;
    x = 512; /* some random integer value */
    u.i = x;
    u.f = u.f + 1.0;
    y = u.i;
    // ===== POINT A ===== //
}
Note that this is essentially the same problem: storing
a bit pattern in a member of the union, modifying some
*other* member of that same union, then reexamining the
original member's "value". Right?
Now, a third example:
void function(void)
{
    int x, y;
    union {
        unsigned char a[4];
        int i;
    } u;
    x = 27; /* some random integer */
    u.i = x;
    u.a[0] = u.a[0] + 1;
    y = u.i;
    // ===== POINT A ===== //
}
This is the same problem as the first two.
Now (guessing as to what you know of C), what do
you expect the results in the third example to be?
In what circumstances do your assumptions make sense?
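For anyone who wants to try the third example, here is a small sketch using unsigned int so that the arithmetic is well defined. bump_first_byte is a made-up name, and the outcome described afterwards assumes 8-bit bytes and no padding bits.

```c
#include <assert.h>

/* Store x in the union, add one to the byte at the lowest address,
   then read the whole unsigned int back. */
unsigned int bump_first_byte(unsigned int x)
{
    union {
        unsigned char a[sizeof(unsigned int)];
        unsigned int i;
    } u;
    assert(sizeof u.i > 1);  /* with a 1-byte int there is nothing to see */
    u.i = x;
    u.a[0] = (unsigned char)(u.a[0] + 1);
    return u.i;
}
```

On a little-endian machine bump_first_byte(27u) would be expected to give 28, because a[0] holds the least significant byte. On a big-endian machine with a 4-byte unsigned int, a[0] is the most significant byte, so the result would instead be 27 + 2**24.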
[apologies if I've let some typos slip through]
I am wondering if you could tell me about a platform in which the following
would compile, but fail at the assertion:
__________________________________________________________
#include <assert.h>
typedef char static_assert
[
    sizeof(unsigned int) == sizeof(unsigned long int) ? 1 : -1
];
union foo
{
    unsigned int a;
    unsigned long int b;
};
int main(void)
{
    union foo f = { 0 };
    ++f.a;
    assert(f.b == 1);
    return 0;
}
__________________________________________________________
Or perhaps even this example:
__________________________________________________________
#include <assert.h>
#include <limits.h>
#define UCHAR_PER_UINT \
    (sizeof(unsigned int) / sizeof(unsigned char))
#if (UCHAR_MAX != 0xFFU)
# error could not compile
#endif
typedef char static_assert
[
    sizeof(unsigned char) *
    UCHAR_PER_UINT == sizeof(unsigned int) ? 1 : -1
];
union foo
{
    unsigned int value;
    unsigned char parts[UCHAR_PER_UINT];
};
int main(void)
{
    unsigned int i;
    union foo f = { 0U };
    /* sized to match the loops below rather than hard-coded to 4 */
    unsigned char parts[UCHAR_PER_UINT] = { 0U };
    for (i = 0; i < UCHAR_PER_UINT; ++i)
    {
        ++f.parts[i];
    }
    for (i = 0; i < UCHAR_PER_UINT; ++i)
    {
        unsigned int offset = i * CHAR_BIT;
        unsigned int mask = 0xFFU << offset;
        parts[i] = (f.value & mask) >> offset;
    }
    for (i = 0; i < UCHAR_PER_UINT; ++i)
    {
        assert(parts[i] == 1);
    }
    return 0;
}
__________________________________________________________
[...]
If unsigned int is big-endian, and unsigned long is something else,
this assertion might fail.
Is there anything prohibiting a bit layout such that f.a and f.b
(assuming that both are at least 32 bits, which unsigned long would
have to be and the static_assert would fail if unsigned int isn't
the same size) have the bits with values 2**3, 2**7, 2**13, 2**29,
and 2**30 (those expressions are in math, not C, and use the FORTRAN
exponentiation ** operator), and *no others*, line up in corresponding
positions?
There are 32! different ways to map 32 bits in a register to a
32-bit area in storage. You have to be going out of your way to
map these differently for unsigned int and unsigned long, but it's
possible, and probably works that way on the DS9K.
> return 0;
>}
I could see it now... I write code like that, and the FIRST time it runs
happens to be on a hardcore weirdo platform and the damn thing launches
missiles or something.
;^o
Holy Shi%
Humm... Is there a real platform in use today that would cause the assertion
to fail in the first and/or second one of Chris' examples?
I don't think you need be worried about this.
The highly secure U.S. missile launching systems
run Microsoft Vista on Intel boxes.
James
One would think that the U.S. military could have developed a highly
classified operating system by now!
:^|
Of course just for extra safety the programmers have absolutely no idea of
the word sizes, byte-orientation, alignment rules or typical memory sizes of
the processors inside the missile.
(Programming 'blind' like this is considered good practice in this group.)
--
Bartc
It fails, of course, on the DeathStation 9000. ;-)
The gotcha for unionized type-punning isn't so much the
remote possibility that types with identical sizes and similar
semantics might have different representations, but that an
aggressive optimizer might cache one member's value in a register
while the program stores something to the other member, making
the cached value stale. The optimizer might reason that storing
an int value somewhere "couldn't possibly" affect the already-
cached value of a long or a double or whatever.
But there are at least two possible objections to my gloomy
assessment: First, unions are so commonly used for type-punning
that an implementor might well "make it work" even if the code
is (technically) wrong. Second, I may have misunderstood the
matter; sometimes the language of the Standard requires study
of an intensity exceeding my ability.
--
Eric Sosman
eso...@ieee-dot-org.invalid
To carry this forward a bit, the specific Thou Shalt Not sentence is in
informative appx J.1, Unspecified Behavior: "The value of a union member
other than the last one stored into (6.2.6.1)."
But if we go back to 6.2.6.1, the discussion on unions states: "When a
value is stored in an object of structure or union type, including in a
member object, the bytes of the object representation that correspond to
any padding bytes take unspecified values."
Given this statement, if the objects used in the type punning are the
same size and thus no padding bytes are involved, does that imply that
the unspecified behavior is not invoked?
--
Rich Webb Norfolk, VA
Point of order - not UB, just unspecified.
> I am wondering if you could tell me about a platform in which the
> following would compile, but fail at the assertion:
> __________________________________________________________
> #include <assert.h>
>
>
> typedef char static_assert
> [
> sizeof(unsigned int) == sizeof(unsigned long int) ? 1 : -1
> ];
Your indenting is bizarre to say the least.
Why didn't you write
typedef char static_assert
[
sizeof
(
unsigned int
)
== sizeof
(
unsigned long int
)
? 1 : -1
];
if you have a strange obsession to split and indent on brackets not
of the curly variety.
> union foo
> {
> unsigned int a;
> unsigned long int b;
> };
>
> int main(void)
> {
> union foo f = { 0 };
> ++f.a;
> assert(f.b == 1);
> return 0;
> }
It's quite possibly an implementation where the least significant
byte of the uint is at address 3, and that of the ulong is at 7.
I've used at least three different platforms like that in the past,
quite possibly more, but as I do my best to avoid non-portable
constructs, such issues have been abstracted away into irrelevance.
Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1
That's the source of my doubt: I'm not 100% sure the
normative text makes the punning undefined in the "perfect
overlap" case. The appendix says so, but it's non-normative.
And, as you point out, it seems to overstate the normative
text's case a little bit.
Even with perfect overlap, of course, there's still the
possibility that storing one member could produce something
that would be a trap representation when viewed via a different
member. The O.P.'s code stored a float and then manipulated
its representation as an int; integers usually don't have trap
representations but the result when viewed as a float might be
a signalling NaN for all anyone knows to the contrary.
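As an illustration of how an incremented pattern can stop being an ordinary number, here is a sketch that assumes a 32-bit IEEE 754 float and an implementation where reading the other union member simply reinterprets the bytes. bump_bits and is_nan are names invented for the example.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* The exercise's trick: store a float, increment the overlapping
   integer, read the float back. */
float bump_bits(float f)
{
    union { float f; uint32_t i; } u;
    assert(sizeof u.f == sizeof u.i);  /* assumption: perfect overlap */
    u.f = f;
    u.i++;
    return u.f;
}

/* A NaN is the only value that compares unequal to itself. */
int is_nan(float x)
{
    return x != x;
}
```

Under those assumptions bump_bits(FLT_MAX) is positive infinity, and bumping that once more gives a pattern with an all-ones exponent and a nonzero fraction, which is a NaN. Whether it signals when used is up to the platform.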
--
Eric Sosman
eso...@ieee-dot-org.invalid
Lest any lurker think we're joking, strategic nuclear-weapons
submarines *do* run Windows:
http://m.linuxjournal.com/content/blue-screen-megadeath
http://www.tomshardware.com/news/Submarines-Windows-Royal-Navy,6718.html
Engineering and maintenance run on documentation.
Every bolt on a nuclear submarine has, I'm sure,
drawings and specifications. Yet the source code
to their computers' OS isn't bundled with the other
documentation -- that's Microsoft trade secret!
I dimly recall that Microsoft eventually agreed
to provide source to U.S. government. I don't know
about U.K., but for them to use closed-source in
critical weapons system would not be without precedent.
Their Chinook helicopters were grounded for several
years:
http://news.bbc.co.uk/2/hi/uk_news/7923341.stm :
> "But the aircraft have never been able to fly
> because the MoD failed to secure access to
> key software source code."
It seems odd that Trident missiles, awesome weapons
indeed, would be controlled by source code offlimits
to the Pentagon (or Britain's MoD) but it seems fitting
in an intellectual environment where governments
cannot test voting machines because that might violate
the machine manufacturer's intellectual property rights.
James Dow Allen
Most of the time you don't need to know that stuff, and I *have* done C
development in the defense industry. Not on missile launching systems,
but...
> (Programming 'blind' like this is considered good practice in this group.)
No, what is considered good practice is programming so that those things
don't matter *except* when you have specific *need* to know them. Or
more generally, so that you are not dependent on the specifics of your
implementation except where you need to be.
Oh, and I have debugged software on hardware with drastically different
characteristics to the target hardware. Specifically 8 bit bytes where
the target had 16 bit bytes and I don't (and didn't even then) know what
other differences. All I replaced was the actual routines which
interfaced to the HW, which was a minuscule amount of the code (two
small functions, one of which needed to be written in assembler for
speed on the target HW).
--
Flash Gordon
> >>>> There are 32! different ways to map 32 bits in a register to a
> >>>> 32-bit area in storage.
but irrelevant to C programmers as C is bit-endian blind. How can you
find out which way the bits are ordered? Does the question even make
sense?
If char is 8-bit then a 32-bit quantity only has 4! ways of being
arranged.
<snip>
> >> I don't think you need be worried about this.
> >> The highly secure U.S. missile launching systems
> >> run Microsoft Vista on Intel boxes.
>
> > Of course just for extra safety the programmers have absolutely no idea
> > of the word sizes, byte-orientation, alignment rules or typical memory
> > sizes of the processors inside the missile.
>
> Most of the time you don't need to know that stuff, and I *have* done C
> development in the defense industry. Not on missile launching systems,
> but...
[semi-embedded in my case] quite right. You define the interface
between the processors in terms of streams of bytes (or octets if you
want to sound good). The two ends *really* don't need to know about
each other. I've worked on a system where the "controller" changed
endianness half way through. The impact on the "controlled" system?
None. I'm pretty sure the controller *didn't* know the endianness,
word-size or alignment rules of the controlled system (I'd have been
worried if they *had* cared).
Take a look at ASN.1 or RPC.
> > (Programming 'blind' like this is considered good practice in this group.)
>
> No, what is considered good practice is programming so that those things
> don't matter *except* when you have specific *need* to know them. Or
> more generally, so that you are not dependent on the specifics of your
> implementation except where you need to be.
>
> Oh, and I have debugged software on hardware with drastically different
> characteristics to the target hardware. Specifically 8 bit bytes where
> the target had 16 bit bytes and I don't (and didn't even then) know what
> other differences. All I replaced was the actual routines which
> interfaced to the HW, which was a minuscule amount of the code (two
> small functions, one of which needed to be written in assembler for
> speed on the target HW).
desk-top people, what can you do with 'em...
--
"Anyone who considers protocol unimportant has never dealt with a
cat."
Robert A. Heinlein
> Most of the time you don't need to know that stuff, and I *have* done
> C development in the defense industry. Not on missile launching
> systems, but...
That's a relief because the head of the missile launching systems was Dr
Hans Zarkov.
Tony
In one sense you can't. I've made the point before by
positing a computer built from four-state components each
storing one "quit" that encodes two bits: It makes no sense
to ask about the order of the two bits encoded in a single
quit.
But in another sense you *can* find an order by looking
at the int as an array of unsigned char, and inspecting the
values in those chars. It is conceivable (although highly
unlikely) that you might find
        int   c[0]  c[1]  c[2]  c[3]
          1     1     0     0     0
          2     0     1     0     0
          4     0     0     1     0
          8     0     0     0     1
         16     2     0     0     0
        ...
That is, there are 32! one-to-one mappings of the value bits
of the int to the value bits of the four bytes.
> If char is 8-bit then a 32-bit quantity only has 4! ways of being
> arranged.
Ah, but that assumes a sane machine designer! The Standard
does not require sanity (sometimes it engenders the opposite). ;-)
--
Eric Sosman
eso...@ieee-dot-org.invalid
> >>>>>> There are 32! different ways to map 32 bits in a register to a
> >>>>>> 32-bit area in storage.
>
> > but irrelevant to C programmers as C is bit-endian blind. How can you
> > find out which way the bits are ordered? Does the question even make
> > sense?
I was thinking along these lines
int x = 0;
x |= 1; /* set the least significant bit */
how can this *not* set the bottom most bit? If I overlay the int with
an array of unsigned char (union, pointer or memcpy) then the mapping
of int octets to unsigned char is at the whim of the implementor. But
he doesn't have the same degree of freedom at the bit level. A
particular bit in the int can only be in one of 4 possible places in
the byte array.
So I think I'm missing your point.
> In one sense you can't. I've made the point before by
> positing a computer built from four-state components each
> storing one "quit" that encodes two bits: It makes no sense
> to ask about the order of the two bits encoded in a single
> quit.
could you implement an ISO compliant C compiler on such a machine?
Well yes but I think you'd have to hide the quits. That x|= 1 would
still have to work even if the C compiler was run on a machine made of
knotted string.
> But in another sense you *can* find an order by looking
> at the int as an array of unsigned char, and inspecting the
> values in those chars. It is conceivable (although highly
> unlikely) that you might find
>
>      int   c[0]  c[1]  c[2]  c[3]
>        1     1     0     0     0
>        2     0     1     0     0
>        4     0     0     1     0
>        8     0     0     0     1
>       16     2     0     0     0
>      ...
>
> That is, there are 32! one-to-one mappings of the value bits
> of the int to the value bits of the four bytes.
now here I got lost... Are we still discussing quits?
> > If char is 8-bit then a 32-bit quantity only has 4! ways of being
> > arranged.
>
> Ah, but that assumes a sane machine designer! The Standard
> does not require sanity (sometimes it engenders the opposite). ;-)
I think it forces some degree of sanity on the implementor.
Your claim, then, is that if we look at the bytes of `x'
we will find one byte with the value 1 and the others with
the value 0 (assuming no padding bits). Can you find language
in the Standard to support this claim? I can't, but I may
have overlooked something.
My belief (still assuming no padding bits) is that one
of the bytes must be equal to a power of two and the others
must be zero. But I think that both the identity of the non-
zero byte and the particular power of two are implementation-
defined (or maybe even unspecified).
--
Eric Sosman
eso...@ieee-dot-org.invalid
>> but irrelevant to C programmers as C is bit-endian blind. How can you
>> find out which way the bits are ordered? Does the question even make
>> sense?
[...]
> But in another sense you *can* find an order by looking
>at the int as an array of unsigned char, and inspecting the
>values in those chars. It is conceivable (although highly
>unlikely) that you might find
>
>      int   c[0]  c[1]  c[2]  c[3]
>        1     1     0     0     0
>        2     0     1     0     0
>        4     0     0     1     0
>        8     0     0     0     1
>       16     2     0     0     0
>      ...
To determine a "bit endianness" you need to have a way of addressing
bits. That's what endianness is: at which end (or more generally
position) of the large unit at a given address do you find the small
unit at that address.
Eric's example doesn't tell us the bit endianness; it just shows
something about how the endianness for bytes and ints is related. And
in this case there would be no such thing as byte endianness for ints,
since the byte doesn't appear at a position within the int.
One situation where bit endianness is exposed is old monochrome
bit-mapped displays: you can see whether the low order bit of a byte
is the left or the right pixel.
-- Richard
--
Please remember to mention me / in tapes you leave behind.
My $.02:
You (or more precisely, your instructor) are making completely unsupported
assumptions about the underlying hardware. The question is unanswerable
as stated.
I *think* that if you assume 4-byte ints and floats, and that the ints
are stored in 2's-complement form, and the floats are stored in standard
IEEE format, then the problem is answerable. Learn what the IEEE format
is, learn the range that drand48() can return, try writing down on paper
what a few floating-point values might look like in binary, and you will
find your answer.
(I'm going to assume you know what 2's-complement form is, but then I
may be making my own unsupported assumption.)
--
-Ed Falk, fa...@despams.r.us.com
http://thespamdiaries.blogspot.com/
C is also supposed to be byte-endian blind. Things like using a
union to store data in one form and examine it in another form allow
it to peek.
You set up a union with an unsigned long, and an array of 4 unsigned
chars. You set a bit in the unsigned long, store it in the union,
then examine the unsigned chars and figure out which bit is set.
Repeat for each bit in the unsigned long. This is the same sort
of way people determine endianness.
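A minimal sketch of that probe, for anyone who wants to try it. The
names (union probe, ulong_width, locate_bit, print_bit_map) are made
up for illustration, and the array is sized with sizeof rather than
hard-coded to 4 because unsigned long is often 8 bytes. What it
prints is specific to the implementation it runs on; nothing here
assumes a particular answer.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Overlay an unsigned long with the bytes of its object representation. */
union probe {
    unsigned long value;
    unsigned char bytes[sizeof(unsigned long)];
};

/* Number of value bits in unsigned long (excludes any padding bits). */
unsigned ulong_width(void)
{
    unsigned width = 0;
    for (unsigned long v = ULONG_MAX; v != 0; v >>= 1)
        width++;
    return width;
}

/*
 * Store the value that has only `bit` set, then scan the bytes.
 * Returns how many bits are set in the object representation and
 * writes the location of the last set bit that was found.
 */
int locate_bit(unsigned bit, size_t *byte_index, unsigned *bit_in_byte)
{
    union probe p;
    int found = 0;

    assert(bit < ulong_width());
    p.value = 1UL << bit;
    for (size_t i = 0; i < sizeof p.bytes; i++) {
        for (unsigned b = 0; b < (unsigned)CHAR_BIT; b++) {
            if (p.bytes[i] & (1u << b)) {
                *byte_index = i;
                *bit_in_byte = b;
                found++;
            }
        }
    }
    return found;
}

/* Print one line per value bit; the output is implementation-specific. */
void print_bit_map(void)
{
    for (unsigned bit = 0; bit < ulong_width(); bit++) {
        size_t byte_index = 0;
        unsigned bit_in_byte = 0;
        int found = locate_bit(bit, &byte_index, &bit_in_byte);

        printf("value bit %2u -> byte %zu, bit %u (%d bit(s) set)\n",
               bit, byte_index, bit_in_byte, found);
    }
}
```

Call print_bit_map() from main to see the table. On an ordinary
machine every line should report exactly one set bit; a count other
than 1 would point to one of the stranger mappings discussed in this
thread.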
>If char is 8-bit then a 32-bit quantity only has 4! ways of being
>arranged.
Why?
What requirement in the C standard prevents the most significant
bit and the least significant bit of a long from being stored in
the same byte in memory? Or ordering the bits alphabetically by
English name ("first" comes after "fifth" and before "seventh")?
Sanity of machine designers probably prevents this, but that's
not a C requirement.
Note that if you have an Nx8-bit RAM chip with data input/outputs on
pins D0, D1, ... D7, and you swap two of the data I/O pins, nothing
really changes (unless it's got another set of data lines without
similar swapping). Some data is stored in different physical places
on the chip, but so what? RAM is volatile, so the original contents
don't matter (presumably you powered the chip off while swapping
connections to the pins). True, a "bad bit" on the chip might
appear to be in a different place, but usually you throw out those
chips anyway, or your bad RAM mapping scheme will handle either.
I'll go even further than that: What requirement in the C standard
prevents *encryption* of an unsigned long when it is stored in
memory, and decryption when it is fetched? The size can't change,
and the encryption has to be reversible, so it must map 2**N values
onto 2**N values. The key can be different for each type, except
that const and non-const variants of the same type have to use the
same key. And char * and void * have to use the same key.
I'll note that if you have a machine which *might* encrypt an
unsigned long when storing it, it requires at least (2**32)-1 stores
to truly determine if a machine is big-endian. It *could* merely
represent 0 as 0xffffffff and -1 as 0x00000000 and otherwise behave
as big-endian, when it's not.
The relevant part of 6.2.6.1 is actually the following paragraph:
7 When a value is stored in a member of an object of union type,
the bytes of the object representation that do not correspond to
that member but do correspond to other members take unspecified
values.
> Given this statement, if the objects used in the type punning are the
> same size and thus no padding bytes are involved, does that imply that
> the unspecified behavior is not invoked?
Not necessarily, you still have to worry about potential trap
representations, padding bits, and the unspecified nature of most
representations. In particular, if you store an int value and read it
back as a float, even if ints and floats are the same size, the actual
float value you get is unspecified and may be a trap representation.
--
Larry Jones
...That would be pretty cool, if they weren't out to kill me. -- Calvin
A PDP-11, perhaps. Depends on what size the compiler uses for int
and whether or not chars are unsigned. If chars are signed and
ints are 16 bits, then static_assert would pass but assert() would
fail.
PDP-11 stores its bytes little-endian but its halfwords big-endian.
Of course, we're now way off the rails as far as C standards
are concerned.
>#include <assert.h>
>
>
>typedef char static_assert
>[
> sizeof(unsigned int) == sizeof(unsigned long int) ? 1 : -1
>];
>
>
>union foo
>{
> unsigned int a;
> unsigned long int b;
>};
>
>
>int main(void)
>{
> union foo f = { 0 };
>
> ++f.a;
>
> assert(f.b == 1);
>
> return 0;
>}
Haven't you been reading the news? It's all PS3s now.
>For IEEE floats, in most cases, you'll just be
>incrementing the mantissa.
>What happens when it overflows into the exponent?
Actually the IEEE format is very well designed in this respect for
finite numbers ... when you overflow into the exponent you get the
desired result, the next floating point value in the model (exponent
gets bumped up by one and the mantissa bits all get reset to zero). It
makes rounding up easy to implement at the bit level. And if you were
at the max finite number, it nicely overflows into the bit
representation for infinity (all exponent bits set and mantissa all
zero bits).
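
A small sketch of the cases described above, under loudly stated
assumptions: float is IEEE 754 binary32, it has the same size and
byte order as uint32_t, and the implementation gives the union read
the obvious meaning. The helper name bump_bits is invented for this
example, and uint32_t replaces the int from the original question so
that the increment cannot overflow a signed type.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/*
 * Return the float whose bit pattern is one greater than that of f.
 * ASSUMPTION: IEEE 754 binary32 float, same size and byte order as
 * uint32_t. None of this is guaranteed by the C standard.
 */
float bump_bits(float f)
{
    union { float f; uint32_t i; } u;

    assert(sizeof u.f == sizeof u.i);
    u.f = f;
    u.i++;          /* usually just increments the mantissa field */
    return u.f;
}
```

Under those assumptions bump_bits(1.0f) is 1.0f + FLT_EPSILON, the
largest float below 2.0f carries over into the exponent and gives
exactly 2.0f, and bump_bits(FLT_MAX) is infinity. For non-negative
finite inputs the result should agree with nextafterf(f, INFINITY)
from <math.h>. For negative inputs the increment grows the magnitude,
so the result is the next value further from zero.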
James Tursa
I think that your last paragraph is sufficient reason for concluding
that union-punning is indeed officially undefined behaviour in the
general case (even though it does work in the more usual cases).
Richard
I believe that union punning requires implementations only to defeat
their caching optimizations which depend on strict aliasing assumptions.
That is to say, access to a union using a member which was not used for
the most recent store should not fetch stale data.
Whereas straight type punning with pointers is susceptible to this stale
cache problem, in addition to all of the potential representational
issues.
The behavior of union access can be inferred for a given implementation,
if you know the details of the representation of types on a given
implementation. If, according to the representations, the bits all make
sense, then it has to work.
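One way to check this on a given implementation is to compare the
union read against a memcpy of the same bytes, which copies the
object representation without going through a second member. This is
only a sketch: the function names are invented for the example, and
it assumes float and uint32_t have the same size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read the bits of a float through a union member that was not stored. */
uint32_t bits_via_union(float f)
{
    union { float f; uint32_t u; } pun;

    assert(sizeof pun.f == sizeof pun.u);
    pun.f = f;
    return pun.u;
}

/* Read the same bits by copying the object representation. */
uint32_t bits_via_memcpy(float f)
{
    uint32_t u;

    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof u);
    return u;
}
```

If the two functions agree for the values you care about, the union
access is fetching current data on that implementation. With IEEE 754
floats both should return 0x3F800000 for 1.0f, but that value is an
assumption about the representation, not something C promises.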
> In article <6ekQm.34188$kY2....@newsfe01.iad>,
> Chris M. Thomasson <n...@spam.invalid> wrote:
> >
> >I am wondering if you could tell me about a platform in which the following
> >would compile, but fail at the assertion:
>
> A PDP-11, perhaps. Depends on what size the compiler uses for int
> and whether or not chars are unsigned. If chars are signed and
> ints are 16 bits, then static_assert would pass but assert() would
> fail.
>
Signedness of char doesn't matter; his dummy array was never accessed.
For I16 (the traditional and IMO best choice for -11) the static
assert will fail; for I32L32 it will compile and run okay, since there
is only one reasonable (hardware-supported) 32-bit integer.