On Wednesday, August 12, 2015 at 10:34:26 AM UTC-4, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>
> > On Wednesday, August 12, 2015 at 9:16:30 AM UTC-4, Ben Bacarisse wrote:
> >> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
> >> > In my humble opinion...
> >> > I would not worry about this issue.
> >>
> >> In my opinion you should worry about it. Not because you'll write code
> >> that breaks on machine X (though who knows what your code will be
> >> running on in 40 years' time) but because the cast tells the reader what
> >> you mean and the union does not. It's about being clear by writing what
> >> you mean.
> >
> > The cast provides explicit information line-by-line, but the union also
> > tells you this information by the type of that union member, and without
> > the mechanical requirement of typing out casts line-by-line. In addition,
> > upon seeing that the pointer in use is part of a union (something a self-
> > respecting editor should provide in some way when hovering over the
> > variable), then it will be clear to the adequately versed C developer
> > that the reason a union is being used is because it's a shared data
> > type.
>
> I would hope a C programmer would come to a very different conclusion.
Still don't see it, Ben. I have a feature in my editor called "Code
Definition Window" which, as I navigate through my code, is constantly
loading up the location where the variable I'm on was defined. I can
see, for example, when I'm on a structure or some member, where it was
defined, what's around it. It pops up automatically, so I have that
information without having to have it flaunted in my face in source
code form. It's one of the many benefits of using a GUI-based editor,
which I realize many remote developers and those working on embedded
systems can't use. I can. And even in the cases where I've had to
program for such a target, I've been able to write some tests in that
editor, transfer them over, and do the last bit of debugging on the
target, rather than doing it all in that limited toolset.
> What's more, I am sure you would agree that they would (and should) come
> to a different conclusion if we were not talking about pointers.
I would agree...
In all cases where they're not pointers, UNLESS it's from something
that's known to be the same size. Pointers in 32-bit code on Windows
and ARM- and x86-based Linux are 32 bits. So, you can use this
example safely because the compiler constraints are enforced by the
architecture itself:
    union {
        uint32_t _ptr;
        void*    ptr;
    };
I also extend this more generally to 32-bit and 64-bit compilations
using an #if test at the start, and defining custom names based on
the bit size:
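    // Simplified for illustration, actual test is more complex.
    // UINTPTR_MAX from <stdint.h> is one possible way to test the width.
    #include <stdint.h>
    #if UINTPTR_MAX == 0xFFFFFFFFu              // 32-bit pointers
        #define uptr uint32_t
    #elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu    // 64-bit pointers
        #define uptr uint64_t
    #else
        #error "Unsupported pointer size"
    #endif
    union {
        uptr  _ptr;
        void* ptr;
    };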
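A minimal sketch of how the size assumption behind that union could be
checked at build time, assuming a C11 compiler. Note that sizeof is not
visible to the preprocessor, so the comparison has to go in a static
assertion rather than an #if. The names ptr_bits and ptr_size_matches
are hypothetical, made up for this illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t, uint64_t, UINTPTR_MAX */

/* Pick the integer type that matches the pointer width. Assumption:
   only 32-bit and 64-bit targets are supported. */
#if UINTPTR_MAX == 0xFFFFFFFFu
typedef uint32_t uptr;
#elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
typedef uint64_t uptr;
#else
#error "unsupported pointer width"
#endif

/* Hypothetical named version of the pointer/integer union. */
union ptr_bits {
    uptr  _ptr;
    void *ptr;
};

/* Fail the build if the sizes ever disagree. */
static_assert(sizeof(void *) == sizeof(uptr),
              "void* and uptr differ in size");
static_assert(sizeof(union ptr_bits) == sizeof(void *),
              "union is larger than a pointer");

/* The same comparison as a function, for use from a test harness. */
int ptr_size_matches(void)
{
    return sizeof(void *) == sizeof(uptr);
}
```

This only checks size. It says nothing about whether the integer member
and a cast of the pointer hold the same value.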
My targets are primarily x86 32-bit and 64-bit, and ARM 32-bit and
64-bit, and I'm content with that because it's a huge percentage of
the market share. And if I ever need to change anything, I can
rename my union members and find every place where they're used
in code by attempting to compile.
> Anyway, I don't expect to change your mind, but I hope any learners who
> might come across this thread will see the danger in writing code that
> does not say what the programmer means, even if it almost always
> works.
In the case of pointers, using unions does say what the programmer means.
It just does it in a way which doesn't slap you on the face on every line
of source code which references it. And to me, that's a good thing. In
fact, it's a requirement.
> >> > First of all it will only affect the most obtuse or
> >> > special-equipment compilers. Second, you can use some compile-time
> >> > checks to ensure things you do rely upon are the same size, such as
> >> > comparing sizeof(void*) to sizeof(other*), etc., and if they are
> >> > different reporting it during compilation with #if blocks.
> >>
> >> It's not just about size.
> >
> > Then it would be possible to also use a runtime test which, if it fails,
> > reports that condition and exits out explaining in comments how it can
> > be fixed.
>
> Two points: First, can you really do that (and I mean "you" not "one")?
Sure.
> Can you really write a compile-time test that checks the actual
> assumptions that your suggested use of unions relies on, or are you just
> speculating that it must, surely, be possible?
Sure. If there's a test case, such as your example, where the address
of A won't equal B, when A and B should be equal, then it can be tested.
You can use pointer math if your compiler supports it, and if not, then
use a union which encompasses a pointer and a sufficiently large integer,
and simply do a compare. They have to be exactly equal to pass the test.
If they don't, report the failure observed at runtime.
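A minimal sketch of such a runtime test, assuming a hosted C99 compiler
that provides uintptr_t. The union ptr_overlay and the function
union_matches_cast are hypothetical names for this illustration. Ben's
example isn't reproduced here, so this simply compares the integer read
through the union against an explicit cast of the same pointer.

```c
#include <stdint.h>   /* uintptr_t */
#include <stdio.h>    /* fprintf, stderr */

/* Hypothetical union: a pointer overlaid with a pointer-sized integer. */
union ptr_overlay {
    uintptr_t bits;
    void     *ptr;
};

/* Returns 1 if the integer seen through the union equals the value an
   explicit cast produces, 0 (after reporting the failure) otherwise. */
int union_matches_cast(void *p)
{
    union ptr_overlay u = {0};   /* zero the integer member first */
    u.ptr = p;

    if (u.bits != (uintptr_t)p) {
        /* How to fix: replace the union access with an explicit cast
           on this target. */
        fprintf(stderr, "union/cast mismatch for pointer %p\n", p);
        return 0;
    }
    return 1;
}
```

A program could call union_matches_cast() on a few representative
pointers at startup and exit if any call returns 0. Passing here only
shows the assumption holds for the pointers tested on this target.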
> Second, it's usually possible to test for unwarranted assumptions being
> made by some bit of code, but it's much better simply not to make those
> unwarranted assumptions in the first place.
It might be for the well-versed C developer, the one who has spent time
learning and thinking about those things regularly, and has come to form
a pattern of thought which exhibits those traits naturally from within
its creation mechanisms.
However, for anyone who doesn't know that information, it places a
mountainous workload upon someone who could otherwise do the task
quickly, yet is now bogged down to such an extent that it's actually
off-putting. It turns what should've been a 5-minute fix into a 90-minute
exercise in R&D for something that, in 100% of the cases I know about in
the code I've written, has never been an issue. I'm sure other people
have similar experiences, unless they are something like contract-for-hire
C developers who work on whatever platform the customer wants, and
therefore must be in the pattern of writing code like that because it's
important to their clients.
To put it simply: it's just too much work for most code because most
targets don't need it.
> <snip>
> > I just don't believe the burden should be on every C developer to write
> > code for every possible architecture all the time, especially when they're
> > targeting a handful of architectures where it's not at issue.
>
> But here we are simply talking about doing it right. There is no
> burden in writing the code the correct and portable way. There is no
> possible justification (that I can see) for doing it the wrong way.
This is always our same argument. You place EXTREME value on things I
do not place ANY value upon, except in cases where such value is
warranted, such as were I to decide to write some code which **I INTENDED**
to be released on every architecture. But, I simply don't do that.
I'm perfectly content to support only those architectures which have the
same sized pointers, and where pointer references in a union will all
point to the same location regardless of whether they were cast or not.
In fact, I would completely avoid an architecture that did not support
those features because knowing what I know about assembly language, and
machine data access at the hardware level, it's ridiculous to not have
the same size pointers for general processing, and I wouldn't trust such
a design decision made by someone for that architecture. I would be
leery of it from the get-go, and would pass for something more standard,
more familiar. I would do this every time, and without hesitation or
reservation. In fact, I would do this even if the decision required
a larger battery, or a different motherboard to be created, rather than
cater to some obtuse architecture's design quirks for the gain of some
small amount of something.
You're talking about doing it "right" in the context of supporting all
of those obtuse architectures, those which the code will likely never
be compiled on, under the thinking that at some point, somewhere, in
some distant corner of the globe, some user might haphazardly try to
compile some of that code and (oh my gosh!! much to their horror!!) it
doesn't work on the first try and they have to go into the source code
and make a few changes to make it work on their known-to-be-obtuse
architecture.
I find that conclusion shocking, and wrong. I think it is so amazingly
wrong that I would never, under any circumstances, advocate it, save one
and only one instance: if you had some desire to write code which was
kept to the C standard because you wanted to be able to globally support
all architectures. Which brings up the question, by the way, is there
ANY code ANYWHERE that does that? Is there truly a form of source code
you can write which will ALWAYS compile on ANY architecture without
requiring ANY changes whatsoever?
You place enough value on the "possible contingency" that the code may
be used by someone else that you would require all code be written so as
to address it.
No, sir. No how. No way. Not ever. Not on my watch. I have too
many things to do with my time to worry about supporting obtuse
architectures. When it comes to me encountering that need in this
world, I will gladly send the work your way and say, "Ben can do it!
He's fantastic about supporting this architecture." And then I'll
go back to doing something else.