C pointer semantics with dynamically allocated struct

Morten W. Petersen

Aug 11, 2015, 2:48:17 PM

Hi there.

I'm working on a project that dynamically allocates different structs
in memory; they all share a basic set of variables.

When compiling the project I get an error on the following line:

https://github.com/morphex/smash_xml/blob/9fd7cc902c1a19c9e0f383da9ceda18b15fd1f09/decode_xml.c#L948

It says "error: request for member `parent' in something not a
struct or union".

What am I doing wrong here?

Regards,

Morten

Melzzzzz

Aug 11, 2015, 2:59:12 PM

You don't cast previous, but parent...

Melzzzzz

Aug 11, 2015, 3:03:32 PM

On 11 Aug 2015 18:57:44 GMT
r...@zedat.fu-berlin.de (Stefan Ram) wrote:

> "Morten W. Petersen" <mor...@gmail.com> writes:
> >When compiling the project I get an error on the following line:

> >https://github.com/morphex/smash_xml/blob/9fd7cc902c1a19c9e0...
>
> This is a label followed by a comment and compiles fine here:
>
> #include <stdio.h>
> int main( void )
> { printf( "hello, " );
> https://example.com/example/12345
> printf( "world" ); }
>
> . Prints:
>
> hello, world
>
> .
>

gcc -std=c90 -Wall wontcompile.c
wontcompile.c: In function ‘main’:
wontcompile.c:4:9: error: C++ style comments are not allowed in ISO C90
https://example.com/example/12345
^
wontcompile.c:4:9: error: (this will be reported only once per input file)
wontcompile.c:4:3: warning: label ‘https’ defined but not used [-Wunused-label]
https://example.com/example/12345
^
wontcompile.c:5:3: warning: control reaches end of non-void function [-Wreturn-type]
printf( "world" ); }

jacobnavia

Aug 11, 2015, 3:03:50 PM

Le 11/08/2015 20:48, Morten W. Petersen a écrit :
> What am I doing wrong here?

You have declared "previous" as a void *, then you try to use it as a
structure...

Not good, not good :-)

FIX:

1) Send me my consulting fees
2) Change the declaration

jacob

Jens Thoms Toerring

Aug 11, 2015, 3:14:24 PM

Morten W. Petersen <mor...@gmail.com> wrote:

That line seems to be:

new->parent = (struct xml_element*) previous->parent;

and somewhere before you have

void *previous = NULL;

So previous is a void pointer and thus isn't a struct pointer.
I guess you actually intended to write

new->parent = ((struct xml_element*) previous)->parent;

since the '->' operator binds more tightly than a cast,
so you were casting not 'previous' but the result of
'previous->parent' (which, had it compiled, would already
have had that type).

The following line

(struct xml_element* ) previous->next = new;

shows the same problem, though.

A better solution instead of using all those ugly casts
probably would be to define 'previous' as a 'struct
xml_element' pointer - it doesn't seem to be used for
anything else anyway. And avoiding casts is IMHO always
a good thing.
Regards, Jens
--
\ Jens Thoms Toerring ___ j...@toerring.de
\__________________________ http://toerring.de

Ben Bacarisse

Aug 11, 2015, 3:24:23 PM

"Morten W. Petersen" <mor...@gmail.com> writes:

The line is:

new->parent = (struct xml_element*) previous->parent;

and the declaration of previous shows that it's a void * -- i.e. it
points to something that is not (or at least not known to be) a struct.
You might have meant

new->parent = ((struct xml_element*) previous)->parent;

but then why is parent not a pointer to an xml_element already? A cast
should indicate something odd going on and setting one element's parent
pointer from that of another does not seem to me to be an odd thing to
do.

BTW, it seems a shame to make the code system specific with the use of
fileno and fstat but if you decide to keep that code, you should probably
put it off to one side in a system-specific helper function.

--
Ben.

James Kuyper

Aug 11, 2015, 3:32:38 PM

On 08/11/2015 03:03 PM, Melzzzzz wrote:
> On 11 Aug 2015 18:57:44 GMT
> r...@zedat.fu-berlin.de (Stefan Ram) wrote:
>
>> "Morten W. Petersen" <mor...@gmail.com> writes:
>>> When compiling the project I get an error on the following line:
>>> https://github.com/morphex/smash_xml/blob/9fd7cc902c1a19c9e0...
>>
>> This is a label followed by a comment and compiles fine here:
>>
>> #include <stdio.h>
>> int main( void )
>> { printf( "hello, " );
>> https://example.com/example/12345
>> printf( "world" ); }
>>
>> . Prints:
>>
>> hello, world
>>
>> .
>>
>
> gcc -std=c90 -Wall wontcompile.c

His joke is obviously targeted at C99 or later.
--
James Kuyper

Morten W. Petersen

Aug 11, 2015, 3:47:53 PM

3) Watch Jacob laugh all the way to the bank.

Heh, heh.

-Morten

Morten W. Petersen

Aug 11, 2015, 3:50:36 PM

On 11.08.2015 21:14, Jens Thoms Toerring wrote:
[...]

> So previous is a void pointer and thus isn't a struct pointer.
> I guess you actually intended to write
>
> new->parent = ((struct xml_element*) previous)->parent;
>
> since the '->' operator binds more tightly than a cast
> and you were casting not 'previous' but the result of
> 'previous->parent' (but, if you get it, already had
> that type).
>
> The following line
>
> (struct xml_element* ) previous->next = new;
>
> shows the same problem, though.
>
> A better solution instead of using all those ugly casts
> probably would be to define 'previous' as a 'struct
> xml_element' pointer - it doesn't seem to be used for
> anything else anyway. And avoiding casts is IMHO always
> a good thing.

Ah yes you were right about the casting and precedence, that's
what I was looking for.. As for making a stricter definition,
I haven't quite sorted out how the thing should work yet
and don't want to use variables that are unnecessary.

-Morten

Morten W. Petersen

Aug 11, 2015, 3:51:48 PM

On 11.08.2015 21:24, Ben Bacarisse wrote:
[...]

> BTW, it seems a shame to make the code system specific with the use of
> fileno and fstat but if you decide to keep that code, you should probably
> put it off to one side in a system-specific helper function.

Ah yes, the idea is to make it platform-independent, guess I just found
an example and didn't think it through.

It should compile and run on every major platform.

-Morten

James Kuyper

Aug 11, 2015, 4:13:14 PM

That's reasonable in itself, but nowhere near as important as avoiding
using unnecessary casts. Give previous a more appropriate definition,
and the casts are no longer necessary.

--
James Kuyper

Rick C. Hodgin

Aug 11, 2015, 4:32:29 PM

Another idea assuming your compiler supports anonymous unions:

    struct SWhatever {
        void* previous;
        union {
            struct xml_element* prev_xml_element;
            struct xml_whatever* prev_xml_whatever;
        };
    };

And then in your code below, rather than referencing previous member,
reference the appropriate-by-context member directly:

new->parent = prev_xml_element->parent;

Best regards,
Rick C. Hodgin

Rick C. Hodgin

Aug 11, 2015, 4:36:55 PM

On Tuesday, August 11, 2015 at 4:32:29 PM UTC-4, Rick C. Hodgin wrote:
> struct SWhatever {
>     union {
>         void* previous;
>         struct xml_element* prev_xml_element;
>         struct xml_whatever* prev_xml_whatever;
>     };
> };

Before I get pounced on by snipers, it should be as above.

Morten W. Petersen

Aug 11, 2015, 4:45:30 PM

Well, if you take a look at the code, the struct definitions have all
got the same header. So when working with a parsed internal
representation of the data, casting will be done based on the first
entry in the given struct, the type.

As for the parsing loop, I haven't quite figured out how it should
work yet, so instead of being specific, I'm going to try to be
generic until the point where it doesn't pay off.

void *previous is defined on line 914, and the following loop could
encounter for example character data which will be represented by
xml_text, and xml_text can have a next pointer and could because of
that be a struct that previous points to.

-Morten

Morten W. Petersen

Aug 11, 2015, 4:53:00 PM

On 11.08.2015 22:32, Rick C. Hodgin wrote:
[...]

> Another idea assuming your compiler supports anonymous unions:
>
> struct SWhatever {
> void* previous;
> union {
> struct xml_element* prev_xml_element;
> struct xml_whatever* prev_xml_whatever;
> };
> };
>
> And then in your code below, rather than referencing previous member,
> reference the appropriate-by-context member directly:
>
> new->parent = prev_xml_element->parent;

And what's the benefit of doing it this way? Please elaborate. :)

-Morten

David Brown

Aug 11, 2015, 4:57:33 PM

Can I pounce on you a bit, to suggest that you don't edit quotations
(even to correct a quotation from your own post)? Just put the
corrected code as a normal post - Usenet standard is for quoted parts to
be exact quotations.

Other than that, unions such as you use here are often a better solution
than casting - they can avoid complications due to strict aliasing
optimisations, as well as keeping the type changes more limited. (By
that I mean that a union such as the one here gives you a choice of
three different pointer types, while a cast lets you pick /any/ type.
Limiting the options reduces the risk of errors.)

Morten W. Petersen

Aug 11, 2015, 5:23:53 PM

On 11.08.2015 22:57, David Brown wrote:
[...]

> Other than that, unions such as you use here are often a better solution
> than casting - they can avoid complications due to strict aliasing
> optimisations, as well as keeping the type changes more limited. (By
> that I mean that a union such as the one here gives you a choice of
> three different pointer types, while a cast lets you pick /any/ type.
> Limiting the options reduces the risk of errors.)

Aha, I see. Yes that could be a useful feature to have, but I'm
not sure I want that level of strictness.

However, I don't expect XML to change much, same for the internal
representation of the XML, and I don't see other developers developing
a snide xml_* struct that has to work.. So maybe it's just good to be
strict and allow the compiler to give a good, simple explanation when
something is wrong.

-Morten

Rick C. Hodgin

Aug 11, 2015, 7:02:06 PM

On Tuesday, August 11, 2015 at 4:53:00 PM UTC-4, Morten W. Petersen wrote:
> On 11.08.2015 22:32, Rick C. Hodgin wrote:
> [...]
> > Another idea assuming your compiler supports anonymous unions:
> >
> > struct SWhatever {
> >     union {
> >         void* previous;
> >         struct xml_element* prev_xml_element;
> >         struct xml_whatever* prev_xml_whatever;
> >     };
> > };
> >
> > And then in your code below, rather than referencing previous member,
> > reference the appropriate-by-context member directly:
> >
> > new->parent = prev_xml_element->parent;
>
> And what's the benefit of doing it this way? Please elaborate. :)

For me, the biggest advantage is it helps document your code so you
can see clearly what's being done line-by-line in a straight-forward
manner without "overcasting" yourself into a visual frenzy.

I place a high degree of importance on code maintenance because you
only initially write your code once, but you will be changing it over
time likely many times. It is much more important to have easy-to-
maintain code, than highly optimized code in about 99.9% of cases.
And for the other 0.1%, you can write an easy-to-maintain version
nearby which works, and then comment it out so that it's there for
follow-on maintenance, and then optimize it to its stripped-down form.

Personally, I'd find this a lot more easy to read and maintain:

new->parent = prev_xml_element->parent;

Than this:

new->parent = ((struct xml_element*)previous)->parent;

And even if you add a #define for "struct xml_element" to be
something like "xmlptr" you're still left with:

new->parent = ((xmlptr*)previous)->parent;

If you wanted to rename the xml_element structure, you only have
one place to go now ... the original structure with the union.
If you want to change it when using casting, your editor had better
have a symbol rename feature or you'll be engaged in some manual
work which increases the risk of mistakes, wasting time needlessly.

Keith Thompson

Aug 11, 2015, 7:38:59 PM

David Brown <david...@hesbynett.no> writes:
[...]

> Can I pounce on you a bit, to suggest that you don't edit
> quotations (even to correct a quotation from your own post)?
> Just put the corrected code as a normal post - Usenet standard
> is for quoted parts to be exact quotations.

I sometimes reformat quoted text (not code), particularly if
the original text has very long lines, changing *only* spacing
and line breaks. I've done so with the quoted paragraph above.
I usually don't bother to mention it.

I don't do this for code samples.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

glen herrmannsfeldt

Aug 11, 2015, 7:46:46 PM

Keith Thompson <ks...@mib.org> wrote:

(snip on quoting rules for usenet)

> I sometimes reformat quoted text (not code), particularly if
> the original text has very long lines, changing *only* spacing
> and line breaks. I've done so with the quoted paragraph above.
> I usually don't bother to mention it.

My news host requires posts to mostly fit in 80 columns.
(Usually it will allow a small number of long lines,
especially URLs.) So I often have to reformat line breaks.

> I don't do this for code samples.

I sometimes remove blank lines from code samples, and maybe reformat
if lines are longer than 80 columns.

I might only quote part of a code sample, that is, remove lines
before and after the part of interest, especially if it is long.

I didn't know that there was an actual rule before, though.

-- glen

Morten W. Petersen

Aug 11, 2015, 8:05:47 PM

glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:

> I sometimes remove blank lines from code samples, and maybe reformat
> if lines are longer than 80 columns.
>
> I might only quote part of a code sample, that is, remove lines
> before and after the part of interest, especially if it is long.
>
> I didn't know that there was an actual rule before, though.

Well just for fun I fired up Gnus in Emacs, and it seems to be working
quite well. Not doing any manual line wrapping now and looks like
Gnus/Emacs does the right thing.

-Morten

Keith Thompson

Aug 11, 2015, 8:07:36 PM

I don't know that there's an actual rule. I've seen documents that
discuss "netiquette", but I haven't read any of them lately. I
generally just try to follow common sense.

Richard Heathfield

Aug 11, 2015, 8:11:58 PM

I don't think it's an "actual rule". It's really only common sense. A
quotation is a playing-back of something someone said (or, to be more
precise, wrote). If you change the text (other than in trivial ways,
such as whitespace), it is no longer a quotation.

--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within

Rick C. Hodgin

Aug 11, 2015, 8:35:25 PM

There's a place for the language of C,
where if ever you find you to be,
take ye heed of the locals,
for their "compilers" are vocal,
in matters of triviality.

Richard Heathfield

Aug 11, 2015, 8:51:48 PM

On 12/08/15 01:35, Rick C. Hodgin wrote:
> There's a place for the language of C,
> where if ever you find you to be,
> take ye heed of the locals,
> for their "compilers" are vocal,
> in matters of triviality.

A limerick's rules are so tight
That you have to compose them just right.
The rhyming, the rhythm,
You gotta stick with 'em,
This is not haiku.

Morten W. Petersen

Aug 11, 2015, 9:02:25 PM

On 12.08.2015 02:51, Richard Heathfield wrote:
> On 12/08/15 01:35, Rick C. Hodgin wrote:
>> There's a place for the language of C,
>> where if ever you find you to be,
>> take ye heed of the locals,
>> for their "compilers" are vocal,
>> in matters of triviality.
>
> A limerick's rules are so tight
> That you have to compose them just right.
> The rhyming, the rhythm,
> You gotta stick with 'em,
> This is not haiku.

If we're going to flaunt, I created this a while ago:

https://soundcloud.com/morten-w-petersen/as-i-walk

Lyrics:

So I was thinking to myself, things are gonna change.
For whatever reason I will, cover the rain.

As I go to my school, I, notice sharks in the pool.
That's just the way it is and, I can keep it cool.

Oooooooooo. Oooooooooooooooo, ohm.

Again I think to myself, things are gonna chaaange.

-Morten

Ben Bacarisse

Aug 11, 2015, 9:57:01 PM

Sorry, I have to snipe because that's not a safe technique (unless I'm
misunderstanding how you intended to use the union). A cast permits the
pointer to be converted from one type to another, but the union causes
the bits to be re-interpreted and that's not 100% portable. Of course
it may work everywhere that matters to you, but other readers may want
to take care.

--
Ben.

Morten W. Petersen

Aug 11, 2015, 10:07:58 PM

Well, isn't a pointer a pointer regardless of what it points to? I
thought the point of casting was so that the compiler could find the
right offsets etc. to work on in a struct.

-Morten

Ben Bacarisse

Aug 11, 2015, 10:09:07 PM

"Morten W. Petersen" <mor...@gmail.com> writes:

<snip>

> Well, if you take a look at the code, the struct definitions have all
> got the same header. So when working with a parsed internal
> representation of the data, casting will be done based on the first
> entry in the given struct, the type.

I'd say that using void * and casting is not the right way to go about
this. I'd put the common elements into a struct and put that struct
first in all the others. Most of the time you can use exactly the
right type

xml_complex_type *p = malloc(sizeof *p);
p->common.parent = ...;

and sometimes you can use a pointer to the common part as a generic
pointer because generic operations usually use only the common parts.
Every now and then you need to convert from a pointer to the common
part to a pointer to one of the more complex structures, and the cast
will alert you and others to the fact that this is an insecure
operation.

(If you use a "discriminated union" you can check this conversion at run
time, but that's a separate matter.)

<snip>
--
Ben.

Rick C. Hodgin

Aug 11, 2015, 10:43:47 PM

Ben, I appreciate your knowledge of C (and Keith Thompson's, and a few
others) very much. I frequently learn a lot from each of you.

Ben Bacarisse

Aug 11, 2015, 10:45:34 PM

"Morten W. Petersen" <mor...@gmail.com> writes:

That's only part of it. On one machine I've used, a cast from void * to
a pointer to any struct type generated code to halve the address: void *
was a byte pointer and struct pointers were word pointers. Were you to
do this:

struct xml_element elem;
struct SWhatever sw = { .previous = &elem };

you'd find that &elem != sw.prev_xml_element. The initialisation
converts the word pointer to a byte pointer, but the union access simply
re-interprets the converted bits as something they are not.

--
Ben.

Rick C. Hodgin

Aug 11, 2015, 10:50:57 PM

On Tuesday, August 11, 2015 at 9:02:25 PM UTC-4, Morten W. Petersen wrote:
> https://soundcloud.com/morten-w-petersen/as-i-walk

[Off topic alert]

Nice. A while back I posted some music from Liberty, Missouri,
recorded back in 2008:

https://soundcloud.com/rickchodgin/sets/rick-and-jerry-2008

Jerry Adkins played lead. I played rhythm. We recorded this on a
little handheld recorder sitting on the floor between us:

http://www.amazon.com/Olympus-DS-30-Digital-Voice-Recorder/dp/B000MSDL6K

Neither one of us had played music with the other very much before
this recording. Just a little on a couple songs. This was all
free-flow live.

Rick C. Hodgin

Aug 11, 2015, 11:09:51 PM

And here's me playing live again in front of a camera around that
same time (late 2000s anyway). I'm a little greyer now, but apart
from that I look about the same. :-)

https://www.youtube.com/watch?v=LTeRYsPcCrI

David Brown

Aug 12, 2015, 3:18:02 AM

On 12/08/15 02:11, Richard Heathfield wrote:
> On 12/08/15 00:46, glen herrmannsfeldt wrote:
>> Keith Thompson <ks...@mib.org> wrote:
>>
>> (snip on quoting rules for usenet)
>>
>>> I sometimes reformat quoted text (not code), particularly if
>>> the original text has very long lines, changing *only* spacing
>>> and line breaks. I've done so with the quoted paragraph above.
>>> I usually don't bother to mention it.
>>
>> My news host requires posts to mostly fit in 80 columns.
>> (Usually it will allow a small number of long lines,
>> especially URLs.) So I often have to reformat line breaks.
>>
>>> I don't do this for code samples.
>>
>> I sometimes remove blank lines from code samples, and maybe reformat
>> if lines are longer than 80 columns.
>>
>> I might only quote part of a code sample, that is, remove lines
>> before and after the part of interest, especially if it is long.
>>
>> I didn't know that there was an actual rule before, though.
>
> I don't think it's an "actual rule". It's really only common sense. A
> quotation is a playing-back of something someone said (or, to be more
> precise, wrote). If you change the text (other than in trivial ways,
> such as whitespace), it is no longer a quotation.
>

Yes, that's the point. Although there are a few /real/ rules for
Usenet, most are "unwritten rules". As noted by others, trivial
formatting, spacing or line breaking changes are not an issue (except in
comp.lang.python...). But changing the content of a quotation can be
confusing, and if it is someone else's text (not the case here) it may
be rude.

David Brown

Aug 12, 2015, 3:22:31 AM

In practice, that is correct on the great majority of systems. But in
theory, and on a few odd systems (such as the one Ben has used), there
can be more to it. It all depends on how portable you want to get -
sometimes in C you can write significantly neater and clearer code by
restricting your portability a little.

Rosario19

Aug 12, 2015, 5:13:16 AM

On 11 Aug 2015 18:57:44 GMT, r...@zedat.fu-berlin.de (Stefan Ram)
wrote:

>"Morten W. Petersen" <mor...@gmail.com> writes:

>>When compiling the project I get an error on the following line:
>>https://github.com/morphex/smash_xml/blob/9fd7cc902c1a19c9e0...
>
> This is a label followed by a comment and compiles fine here:
>
>#include <stdio.h>
>int main( void )
>{ printf( "hello, " );
> https://example.com/example/12345
> printf( "world" ); }
>
> . Prints:
>
>hello, world

perhaps

#include <stdio.h>
int main( void )
{ printf( "hello, " );
https://example.com/example/12345:
printf( "world" );
}

or

#include <stdio.h>
int main(void)
{printf("hello, ");
"https://example.com/example/12345"; printf("world");}

Morten W. Petersen

Aug 12, 2015, 5:54:49 AM

This is cool. I noticed the sound of a lighter somewhere in there. :)

I usually accompany myself when doing (slight) improvisations, that
is I record one guitar track at a time.

If you had miked each guitar on a separate track, minor misses on the
lead guitar here and there could simply have been muted, and the "sound
picture" could have sounded a bit more organized with the right
timing and mixing. This is just something I say as a tip for further
refining the process, I think the result is quite good given the
circumstances.

One of the reasons I started with C was that I thought about creating
a simple multi-track recorder using the Raspberry Pi, and with that
amount of data flowing C looked like the right choice.

-Morten

Morten W. Petersen

Aug 12, 2015, 6:10:12 AM

On 12.08.2015 04:45, Ben Bacarisse wrote:
> "Morten W. Petersen" <mor...@gmail.com> writes:

[...]

>> Well, isn't a pointer a pointer regardless of what it points to? I
>> thought the point of casting was so that the compiler could find the
>> right offsets etc. to work on in a struct.
>
> That's only part of it. On one machine I've used a cast from void * to
> a pointer to any struct type generated code to halve the address: void *
> was a byte pointer and struct pointers were word pointers. Were you to
> do this:
>
> struct xml_element elem;
> struct SWhatever sw = { .previous = &elem };
>
> you'd find that &elem != sw.prev_xml_element. The initialisation
> converts the word pointer to a byte pointer, but the union access simply
> re-interprets the converted bits as something they are not.

OK.. I kind of get it, but can you elaborate a bit?

-Morten

Rick C. Hodgin

Aug 12, 2015, 6:35:27 AM

On Wednesday, August 12, 2015 at 5:54:49 AM UTC-4, Morten W. Petersen wrote:
> On 12.08.2015 04:50, Rick C. Hodgin wrote:
> > On Tuesday, August 11, 2015 at 9:02:25 PM UTC-4, Morten W. Petersen wrote:
> >> https://soundcloud.com/morten-w-petersen/as-i-walk
> >
> > [Off topic alert]
> >
> > Nice. A while back I posted some music from Liberty, Missouri,
> > recorded back in 2008:
> >
> > https://soundcloud.com/rickchodgin/sets/rick-and-jerry-2008
> >
> > Jerry Adkins played lead. I played rhythm. We recorded this on a
> > little handheld recorder sitting on the floor between us:
> >
> > http://www.amazon.com/Olympus-DS-30-Digital-Voice-Recorder/dp/B000MSDL6K
> >
> > Neither one of us had played music with the other very much before
> > this recording. Just a little on a couple songs. This was all
> > free-flow live.
>
> This is cool. I noticed the sound of a lighter somewhere in there. :)

Yes. Jerry was a smoker. You can hear me coughing in there too. :-)
He used to buy coffee can size containers of tobacco and roll his
own cigarettes without filters. He did this nearly all day long. It
was a mark upon him.

> I usually accompany myself when doing (slight) improvisations, that
> is I record one guitar track at a time.
>
> If you had miked each guitar on a separate track, minor misses on the
> lead guitar here and there could simply have been muted, and the "sound
> picture" could have sounded a bit more organized with the right
> timing and mixing. This is just something I say as a tip for further
> refining the process, I think the result is quite good given the
> circumstances.
>
> One of the reasons I started with C was that I thought about creating
> a simple multi-track recorder using the Raspberry Pi, and with that
> amount of data flowing C looked like the right choice.

C's an incredible language. Only a dozen things added to it by C++
make it better. Unfortunately, C has not adopted those things even
in 2015. Fortunately some compilers have adopted them as extensions.

Ben Bacarisse

Aug 12, 2015, 6:39:24 AM

I'm sure you are not surprised by this:

union arith { long long i; double d; };
union arith u = { .d = 42 };
printf("%lld\n", u.i); // output on this machine 4631107791820423168

That's because, whilst the initialisation correctly converts an int (42)
to a double for storage in u.d, the access of u.i simply reinterprets
the bits, and the bit pattern for (long long){42} is very different to
that for (double){42.0}.

So it is for pointers in a union, but we have got used to the idea that
all pointer types have the same representation -- that a void * and a
char * and struct S * pointing at the same object all use exactly the
same pattern of bits. This has not been true historically, and the C
standard does not require that it be true. C does guarantee *some*
things about pointer representations, but the details are not important
here.

The key thing is to try, wherever possible, to use the right pointer
type and to use a cast to convert one pointer type to another where
this is not possible.

Unions have two functions. One, to save space by enabling the storage
of more than one kind of object in the same location at different times.
In this usage you always access the union member that was last stored.
Sometimes you do this by having a run-time "type" member somewhere --
this is a so-called discriminated union -- and sometimes the logic of
the program can ensure that the right member is always being accessed
without needing to store any record of it.

The second use is to re-interpret the representation of one type as if
it were another. This is a much more specialised usage.

--
Ben.

Richard Heathfield

Aug 12, 2015, 6:46:34 AM

On 12/08/15 10:54, Morten W. Petersen wrote:

> One of the reasons I started with C was that I thought about creating
> a simple multi-track recorder using the Raspberry Pi, and with that
> amount of data flowing C looked like the right choice.

Another way in which C can be used quite easily is in the composition of
aleatoric music. Lilypond has a text format that is well-suited to this
task. For example, here's a simple aleatorically-generated phrase in C
major:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    char note[] = "abcdefg";
    int notenumber = 0;
    srand(time(0));
    printf("mymelody =\n{\n \\relative c'\n {\n");
    printf(" \\clef treble\n");
    printf(" \\override Staff.TimeSignature #'style = #'()\n");
    printf(" \\time 4/4\n ");
    while(notenumber++ < 16)
    {
        printf(" %c4", note[rand() % 7]);
    }
    printf("\n \\bar \"|.\"\n");
    printf(" }\n}\n");
    return 0;
}

That isn't a complete Lilypond file, but it's a fragment that can easily
be inserted into one.

(Obviously that's just a very, very simplistic example. You can get as
complicated as you want.)

Morten W. Petersen

Aug 12, 2015, 7:00:27 AM

On 12.08.2015 12:35, Rick C. Hodgin wrote:
> On Wednesday, August 12, 2015 at 5:54:49 AM UTC-4, Morten W. Petersen wrote:
>> On 12.08.2015 04:50, Rick C. Hodgin wrote:
>>> On Tuesday, August 11, 2015 at 9:02:25 PM UTC-4, Morten W. Petersen wrote:
>>>> https://soundcloud.com/morten-w-petersen/as-i-walk
>>>
>>> [Off topic alert]
>>>
>>> Nice. A while back I posted some music from Liberty, Missouri,
>>> recorded back in 2008:
>>>
>>> https://soundcloud.com/rickchodgin/sets/rick-and-jerry-2008
>>>
>>> Jerry Adkins played lead. I played rhythm. We recorded this on a
>>> little handheld recorder sitting on the floor between us:
>>>
>>> http://www.amazon.com/Olympus-DS-30-Digital-Voice-Recorder/dp/B000MSDL6K
>>>
>>> Neither one of us had played music with the other very much before
>>> this recording. Just a little on a couple songs. This was all
>>> free-flow live.
>>
>> This is cool. I noticed the sound of a lighter somewhere in there. :)
>
> Yes. Jerry was a smoker. You can hear me coughing in there too. :-)
> He used to buy coffee can size containers of tobacco and roll his
> own cigarettes without filters. He did this nearly all day long. It
> was a mark upon him.

Well, something is gonna kill ya anyway, right?

That said I'm glad I don't smoke, used to, but that shit is just a
bag of different nasty side-effects.

> C's an incredible language. Only a dozen things added to it by C++
> make it better. Unfortunately, C has not adopted those things even
> in 2015. Fortunately some compilers have adopted them as extensions.

Maybe I should try out programming in C++ as well, but focusing on
C and Assembler is enough for now. If anything it is good getting
practice in procedural programming.

And if it ain't broken..

-Morten

Morten W. Petersen

Aug 12, 2015, 7:06:48 AM

On 12.08.2015 12:46, Richard Heathfield wrote:
> On 12/08/15 10:54, Morten W. Petersen wrote:
>
>> One of the reasons I started with C was that I thought about creating
>> a simple multi-track recorder using the Raspberry Pi, and with that
>> amount of data flowing C looked like the right choice.
>
> Another way in which C can be used quite easily is in the composition of
> aleatoric music. Lilypond has a text format that is well-suited to this
> task. For example, here's a simple aleatorically-generated phrase in C
> major:

[...]

Generating music randomly is an interesting thing, and it could also be
fun to try to map different mathematical geometric functions to sound.

Couple that with a like/dislike feature and a network of users and you
might just have something.

-Morten

Morten W. Petersen

Aug 12, 2015, 7:41:53 AM

On 12.08.2015 12:39, Ben Bacarisse wrote:
[...]

> So it is for pointers in a union, but we have got used to the idea that
> all pointer types have the same representation -- that a void * and a
> char * and struct S * pointing at the same object all use exactly the
> same pattern of bits. This has not been true historically, and the C
> standard does not require that it be true. C does guarantee *some*
> things about pointer representations, but the details are not important
> here.

Out of curiosity, what were some of the cases where pointers differed
in size, why were some smaller etc.?

> The key thing is to try, wherever possible, to use the right pointer
> type and to use a cast to convert one pointer type to another where
> this is not possible.

Yes, having some strictness seems like the way to go.

-Morten

Rick C. Hodgin

Aug 12, 2015, 7:52:48 AM

On Tuesday, August 11, 2015 at 8:35:25 PM UTC-4, Rick C. Hodgin wrote:
> There's a place for the language of C,
> where if ever you find you to be,
> take ye heed of the locals,
> for their "compilers" are vocal,
> in matters of triviality.

Version 2.0:

Title: "The Jagged comp.lang.c Shoreline"

There's a place for the language of C,
I advise there you listen to me,
take wise heed of thy locals,
their "compilers" are vocal,
and do clamor on excessively.

Rick C. Hodgin

Aug 12, 2015, 7:59:46 AM

In my humble opinion...

I would not worry about this issue. First of all it will only affect
the most obtuse or special-equipment compilers. Second, you can use
some compile-time checks to ensure things you do rely upon are the
same size, such as comparing sizeof(void*) to sizeof(other*), etc.,
and if they are different reporting it during compilation with #if
blocks.
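
A minimal sketch of the size check suggested above, assuming a C11 compiler. The preprocessor cannot evaluate sizeof inside #if, so the sketch uses _Static_assert instead; struct node and pointer_size_difference are hypothetical names standing in for the real types. Matching sizes do not by themselves show that the representations match.

```c
#include <stddef.h>

/* Hypothetical pointee type standing in for one of the union's members. */
struct node {
    int value;
};

/* Compilation stops here if the two pointer types differ in size. */
_Static_assert(sizeof(void *) == sizeof(struct node *),
               "void * and struct node * differ in size");

/* Runtime view of the same fact, e.g. for logging or a unit test. */
size_t pointer_size_difference(void)
{
    size_t a = sizeof(void *);
    size_t b = sizeof(struct node *);
    return a > b ? a - b : b - a;
}
```

If the translation unit compiles at all, pointer_size_difference() returns 0.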

Every modern standard machine that people use and write applications for
should have all pointers the same size. And even if
they don't, it's still not worth all of the extra code added to cast
everything when the code you have was specifically designed for the
purposes of using the various and disparate values the union exposes.
That's what unions are for -- to share a data space based on runtime
context of reuse.

If you must cast, cast to one of your union types at one place only,
and then always use the union type thereafter. And I do pray that's
what the advice givers meant by using casting, though I am suspicious
of that hopeful conclusion.

Rick C. Hodgin

Aug 12, 2015, 8:02:07 AM

Version 2.1:

Title: "The Jagged comp.lang.c Shoreline"

There's a place for the language of C,
I advise there you listen to me,
take wise heed of thy locals,
their "compilers" are vocal,
and oft clamor on excessively.

Richard Damon

Aug 12, 2015, 8:07:45 AM

On 8/12/15 7:41 AM, Morten W. Petersen wrote:
> On 12.08.2015 12:39, Ben Bacarisse wrote:
> [...]
>> So it is for pointers in a union, but we have got used to the idea that
>> all pointer types have the same representation -- that a void * and a
>> char * and struct S * pointing at the same object all use exactly the
>> same pattern of bits. This has not been true historically, and the C
>> standard does not require that it be true. C does guarantee *some*
>> things about pointer representations, but the details are not important
>> here.
>
> Out of curiosity, what were some of the cases where pointers differed
> in size, why were some smaller etc.?
>

The classic case is a machine where native pointers point to a 'word'
(because that is how the machine addresses memory) so to handle a
pointer to a specific byte, you need to add another word with that
information. Thus pointers to char and void will be larger than other
pointers.
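
As a small illustration, the sizes can be printed for any given implementation. Among the three types below, the only relationship the C standard guarantees is that void * and char * share the same representation; struct word and both function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical word-sized object type. */
struct word {
    int w;
};

/* Guaranteed by the C standard: void * and char * have the same size. */
int void_and_char_sizes_match(void)
{
    return sizeof(void *) == sizeof(char *);
}

/* On a word-addressed machine of the kind described, the last line
   could print a smaller number than the first two. */
void print_pointer_sizes(void)
{
    printf("void *        : %zu bytes\n", sizeof(void *));
    printf("char *        : %zu bytes\n", sizeof(char *));
    printf("struct word * : %zu bytes\n", sizeof(struct word *));
}
```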

Richard Heathfield

Aug 12, 2015, 8:08:45 AM

On 12/08/15 12:06, Morten W. Petersen wrote:

<snip>

> Generating music randomly is an interesting thing, and it could also be
> fun to try to map different mathematical geometric functions to sound.

See "Dirk Gently's Holistic Detective Agency", by Douglas Adams, where
this idea is discussed at some length.

> Couple that with a like/dislike feature and a network of users and you
> might just have something.

That's not a bad idea, actually - a genetic algorithm for composition,
where the fitness function consists of a straw poll.

Rick C. Hodgin

Aug 12, 2015, 8:10:47 AM

Version 2.2:

Title: "The Jagged comp.lang.c Shoreline"

There's a place for the language of C,
I advise there you listen to me,
take wise heed of thy locals,
their "compilers" are vocal,
and -pedantic's their friend you will see.

Richard Bos

Aug 12, 2015, 8:18:17 AM

"Morten W. Petersen" <mor...@gmail.com> wrote:

> On 12.08.2015 12:46, Richard Heathfield wrote:
> > On 12/08/15 10:54, Morten W. Petersen wrote:
> >
> >> One of the reasons I started with C was that I thought about creating
> >> a simple multi-track recorder using the Raspberry Pi, and with that
> >> amount of data flowing C looked like the right choice.
> >
> > Another way in which C can be used quite easily is in the composition of
> > aleatoric music. Lilypond has a text format that is well-suited to this
> > task. For example, here's a simple aleatorically-generated phrase in C
> > major:

Guess what this program(me!) is called:

#include <stdio.h>

int main(void)
{
    puts("");
    return 273;
}

> Generating music randomly is an interesting thing, and it could also be
> fun to try to map different mathematical geometric functions to sound.
>
> Couple that with a like/dislike feature and a network of users and you
> might just have something.

We already have that. It's called The X Factor.

Richard

Morten W. Petersen

Aug 12, 2015, 8:26:53 AM

On 12.08.2015 14:18, Richard Bos wrote:
> "Morten W. Petersen" <mor...@gmail.com> wrote:

[...]

>> Couple that with a like/dislike feature and a network of users and you
>> might just have something.
>
> We already have that. It's called The X Factor.

LOL. I was thinking along those lines myself. :)

-Morten

Richard Heathfield

Aug 12, 2015, 8:37:05 AM

I've seen a few X Factor clips on Youtube, and it seems to focus
primarily (although by no means solely) on "pop" music. I think
aleatorically generated music would probably be better suited to light
classical - the kind of thing used in computer games. The trick here is
to program in /some/ repetition (so that there's a sense of structure to
the music), but not too much (otherwise it becomes mind-numbing).

Ben Bacarisse

Aug 12, 2015, 9:08:45 AM

Richard Damon <Ric...@Damon-Family.org> writes:

> On 8/12/15 7:41 AM, Morten W. Petersen wrote:
>> On 12.08.2015 12:39, Ben Bacarisse wrote:
>> [...]
>>> So it is for pointers in a union, but we have got used to the idea that
>>> all pointer types have the same representation -- that a void * and a
>>> char * and struct S * pointing at the same object all use exactly the
>>> same pattern of bits. This has not been true historically, and the C
>>> standard does not require that it be true. C does guarantee *some*
>>> things about pointer representations, but the details are not important
>>> here.
>>
>> Out of curiosity, what were some of the cases where pointers differed
>> in size, why were some smaller etc.?
>
> The classic case is a machine where native pointers point to a 'word'
> (because that is how the machine addresses memory) so to handle a
> pointer to a specific byte, you need to add another word with that
> information. Thus pointers to char and void will be larger than other
> pointers.

On the word-addressed machine I was referring to (the Perq), there was a
different scheme for C[1]. All data pointers were the same size, but in
character pointers the low-order bit indicated which byte to access (it
was a long time ago when 16 bits of data per-program seemed like
enough). The result was that some casts generated a shift instruction.
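
The scheme can be modelled in a few lines. This is only a simulation under assumptions, not the Perq compiler's actual code: it assumes 16-bit pointers, two bytes per word, and a character pointer formed as the word address shifted left by one, with the byte index in the low bit. All names are hypothetical.

```c
#include <stdint.h>

/* Word pointer -> character pointer: one shift left; low bit 0 selects
   the first byte of the word. */
uint16_t word_to_char_ptr(uint16_t word_addr)
{
    return (uint16_t)(word_addr << 1);
}

/* Character pointer -> word pointer: one shift right; the byte index
   in the low bit is discarded. */
uint16_t char_to_word_ptr(uint16_t char_ptr)
{
    return (uint16_t)(char_ptr >> 1);
}

/* Which byte of the word a character pointer selects (0 or 1). */
unsigned byte_in_word(uint16_t char_ptr)
{
    return char_ptr & 1u;
}
```

Under this model the two pointer kinds have the same size but different bit patterns for the same object, which is why reading one through the other without a conversion would give the wrong address.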

Code addresses were double size with a 16-bit segment and a 16-bit
offset, but every function was compiled to its own segment with the
entry point being offset zero. Thus, as far as C was concerned,
function pointers were just 16-bit numbers. This was great because
you'd get to know the numbers! Early on, when there was no symbolic
debugger, you could tell that you'd crashed in function 28 (printf) or
178 (qsort) and so on. Very handy.

<snip>

[1] And the "for C" matters. The machine had a micro-coded instruction
set that was part of the process state. A C program would have the
instruction set designed for C whereas, for example, a Lisp program had
a Lisp machine's instruction set. Sometimes I miss those days -- we no
longer live in interesting times as far as machine architecture goes.
--
Ben.

Morten W. Petersen

Aug 12, 2015, 9:13:31 AM

On 12.08.2015 14:36, Richard Heathfield wrote:
> On 12/08/15 13:26, Morten W. Petersen wrote:
>> On 12.08.2015 14:18, Richard Bos wrote:
>>> "Morten W. Petersen" <mor...@gmail.com> wrote:
>> [...]
>>>> Couple that with a like/dislike feature and a network of users and you
>>>> might just have something.
>>>
>>> We already have that. It's called The X Factor.
>>
>> LOL. I was thinking along those lines myself. :)
>
> I've seen a few X Factor clips on Youtube, and it seems to focus
> primarily (although by no means solely) on "pop" music. I think
> aleatorically generated music would probably be better suited to light
> classical - the kind of thing used in computer games. The trick here is
> to program in /some/ repetition (so that there's a sense of structure to
> the music), but not too much (otherwise it becomes mind-numbing).

Mm, yes. Well there are rules in most genres of music, and
creating something that follows those rules shouldn't be too hard.

It would be interesting to work with something like this, but there
is enough digital stuff in my life already. Nothing quite beats
the feeling of a good jam or screaming out a bit of vocals with
force and emotional emphasis.

-Morten

Ben Bacarisse

Aug 12, 2015, 9:16:30 AM

"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> On Wednesday, August 12, 2015 at 7:41:53 AM UTC-4, Morten W. Petersen wrote:
>> On 12.08.2015 12:39, Ben Bacarisse wrote:
>> [...]
>> > So it is for pointers in a union, but we have got used to the idea that
>> > all pointer types have the same representation -- that a void * and a
>> > char * and struct S * pointing at the same object all use exactly the
>> > same pattern of bits. This has not been true historically, and the C
>> > standard does not require that it be true. C does guarantee *some*
>> > things about pointer representations, but the details are not important
>> > here.
>>
>> Out of curiosity, what were some of the cases where pointers differed
>> in size, why were some smaller etc.?
>>
>> > The key thing is to try, wherever possible, to use the right pointer
>> > type and to use a cast to convert one pointer type to another where
>> > this is not possible.
>>
>> Yes, having some strictness seems like the way to go.
>>
>> -Morten
>
> In my humble opinion...
>
> I would not worry about this issue.

In my opinion you should worry about it. Not because you'll write code
that breaks on machine X (though who knows what your code will be
running on in 40 years' time) but because the cast tells the reader what
you mean and the union does not. It's about being clear by writing what
you mean.
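
A minimal sketch of what the explicit conversion looks like, assuming a hypothetical struct element whose parent field is a void *, similar in spirit to the original question. The conversion happens once, at a visible point, and the typed pointer is used from then on.

```c
#include <stddef.h>

/* Hypothetical node type with an untyped parent link. */
struct element {
    int id;
    void *parent;
};

/* Returns the parent's id, or -1 when there is no parent. */
int parent_id(const struct element *e)
{
    /* The cast states the intent: this void * is known to point at
       a struct element. */
    const struct element *p = (const struct element *)e->parent;
    return p != NULL ? p->id : -1;
}
```

In C the conversion from void * would also happen implicitly on assignment; the cast is there for the reader.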

> First of all it will only affect
> the most obtuse or special-equipment compilers. Second, you can use
> some compile-time checks to ensure things you do rely upon are the
> same size, such as comparing sizeof(void*) to sizeof(other*), etc.,
> and if they are different reporting it during compilation with #if
> blocks.

It's not just about size.

<snip>
--
Ben.

Rick C. Hodgin

Aug 12, 2015, 9:37:25 AM

On Wednesday, August 12, 2015 at 9:16:30 AM UTC-4, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
> > In my humble opinion...
> > I would not worry about this issue.
>
> In my opinion you should worry about it. Not because you'll write code
> that breaks on machine X (though who knows what your code will be
> running on in 40 years' time) but because the cast tells the reader what
> you mean and the union does not. It's about being clear by writing what
> you mean.

The cast provides explicit information line-by-line, but the union also
tells you this information by the type of that union member, and without
the mechanical requirement of typing out casts line-by-line. In addition,
upon seeing that the pointer in use is part of a union (something a self-
respecting editor should provide in some way when hovering over the
variable), then it will be clear to the adequately versed C developer
that the reason a union is being used is because it's a shared data type.

> > First of all it will only affect
> > the most obtuse or special-equipment compilers. Second, you can use
> > some compile-time checks to ensure things you do rely upon are the
> > same size, such as comparing sizeof(void*) to sizeof(other*), etc.,
> > and if they are different reporting it during compilation with #if
> > blocks.
>
> It's not just about size.

Then it would be possible to also use a runtime test which, if it fails,
reports that condition and exits out explaining in comments how it can
be fixed.

-----
I've only used a few C compilers in my life, and they've all been on
x86, ARM, 68K, and a few mainframe systems in school whose processors
I don't even know what they were.

I cannot recall any occasion where pointer sizes were different
(except back in the DOS days with near and far pointers).

I realize this doesn't mean it can't happen, and won't someday happen,
but my point is that I think it's exceedingly rare where it would be
an issue. And as I see it, the person using some architecture where
something doesn't work (with shared union pointers apart from assignment
as by explicit casts), that person would know this, and has probably
designed their compilers to issue warnings about such shared pointer
use, or that values are assigned to union members without explicit
casts.

I just don't believe the burden should be on every C developer to write
code for every possible architecture all the time, especially when they're
targeting a handful of architectures where it's not at issue. I think it
should be a burden upon the shoulders of those developers who choose to
operate on those architectures where such limits exist, and it should be
their job to fixup any code they receive from me should they choose to
use that resource for their benefit. After all, THEY are the experts on
that architecture, and the ones who should know all about its quirks.

This is my position on this matter, and it's also one I believe over time
would draw future hardware makers in toward standards. And in the case
where such a hardware exception warrants the atypical behavior, as by
providing some grand increase in performance or throughput, then it's
just the cost of doing developer business on that particular and unique
machine.

David Brown

Aug 12, 2015, 10:11:04 AM

On 12/08/15 14:36, Richard Heathfield wrote:
> On 12/08/15 13:26, Morten W. Petersen wrote:
>> On 12.08.2015 14:18, Richard Bos wrote:
>>> "Morten W. Petersen" <mor...@gmail.com> wrote:
>> [...]
>>>> Couple that with a like/dislike feature and a network of users and you
>>>> might just have something.
>>>
>>> We already have that. It's called The X Factor.
>>
>> LOL. I was thinking along those lines myself. :)
>
> I've seen a few X Factor clips on Youtube, and it seems to focus
> primarily (although by no means solely) on "pop" music.

It should be no big surprise that in a music competition judged by
random people, pop music comes out top - that is pretty much the
definition of popular music!

> I think
> aleatorically generated music would probably be better suited to light
> classical - the kind of thing used in computer games. The trick here is
> to program in /some/ repetition (so that there's a sense of structure to
> the music), but not too much (otherwise it becomes mind-numbing).
>

I heard recently about a double-blind test comparing algorithmically
generated light music and human composed music. Most people - music
experts and "ordinary" people alike - rated the computer-generated music
as noticeably more pleasant, and guessed that it was the human-composed
pieces.

Ben Bacarisse

Aug 12, 2015, 10:34:26 AM

"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> On Wednesday, August 12, 2015 at 9:16:30 AM UTC-4, Ben Bacarisse wrote:
>> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>> > In my humble opinion...
>> > I would not worry about this issue.
>>
>> In my opinion you should worry about it. Not because you'll write code
>> that breaks on machine X (though who knows what your code will be
>> running on in 40 years' time) but because the cast tells the reader what
>> you mean and the union does not. It's about being clear by writing what
>> you mean.
>
> The cast provides explicit information line-by-line, but the union also
> tells you this information by the type of that union member, and without
> the mechanical requirement of typing out casts line-by-line. In addition,
> upon seeing that the pointer in use is part of a union (something a self-
> respecting editor should provide in some way when hovering over the
> variable), then it will be clear to the adequately versed C developer
> that the reason a union is being used is because it's a shared data
> type.

I would hope a C programmer would come to a very different conclusion.
What's more, I am sure you would agree that they would (and should) come
to a different conclusion if we were not talking about pointers.

Anyway, I don't expect to change your mind, but I hope any learners who
might come across this thread will see the danger in writing code that
does not say what the programmer means, even if it almost always
works.

>> > First of all it will only affect the most obtuse or
>> > special-equipment compilers. Second, you can use some compile-time
>> > checks to ensure things you do rely upon are the same size, such as
>> > comparing sizeof(void*) to sizeof(other*), etc., and if they are
>> > different reporting it during compilation with #if blocks.
>>
>> It's not just about size.
>
> Then it would be possible to also use a runtime test which, if it fails,
> reports that condition and exits out explaining in comments how it can
> be fixed.

Two points: First, can you really do that (and I mean "you" not "one")?
Can you really write a compile-time test that checks the actual
assumptions that your suggested use of unions relies on, or are you just
speculating that it must, surely, be possible?

Second, it's usually possible to test for unwarranted assumptions being
made by some bit of code, but it's much better simply not to make those
unwarranted assumptions in the first place.

<snip>

> I just don't believe the burden should be on every C developer to write
> code for every possible architecture all the time, especially when they're
> targeting a handful of architectures where it's not at issue.

But here we are simply talking about doing it right. There is no
burden in writing the code the correct and portable way. There is no
possible justification (that I can see) for doing it the wrong way.

<snip>
--
Ben.

Rick C. Hodgin

Aug 12, 2015, 11:17:55 AM

On Wednesday, August 12, 2015 at 10:34:26 AM UTC-4, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>
> > On Wednesday, August 12, 2015 at 9:16:30 AM UTC-4, Ben Bacarisse wrote:
> >> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
> >> > In my humble opinion...
> >> > I would not worry about this issue.
> >>
> >> In my opinion you should worry about it. Not because you'll write code
> >> that breaks on machine X (though who knows what your code will be
> >> running on in 40 years' time) but because the cast tells the reader what
> >> you mean and the union does not. It's about being clear by writing what
> >> you mean.
> >
> > The cast provides explicit information line-by-line, but the union also
> > tells you this information by the type of that union member, and without
> > the mechanical requirement of typing out casts line-by-line. In addition,
> > upon seeing that the pointer in use is part of a union (something a self-
> > respecting editor should provide in some way when hovering over the
> > variable), then it will be clear to the adequately versed C developer
> > that the reason a union is being used is because it's a shared data
> > type.
>
> I would hope a C programmer would come to a very different conclusion.

Still don't see it, Ben. I have a feature in my editor called "Code
Definition Window" which, as I navigate through my code, is constantly
loading up the location where the variable I'm on was defined. I can
see, for example, when I'm on a structure or some member, where it was
defined, what's around it. It pops up automatically, so I have that
information without having to have it flaunted in my face in source
code form. It's one of the many benefits of using a GUI-based editor,
which I realize many remote developers and those working on embedded
systems can't use. I can. And even in the cases where I've had to
do some programming on something, I've been able to write some tests
in that editor, then transfer it over, and do the last bit of final
debugging on the target, rather than doing it all in that limited
toolset.

> What's more, I am sure you would agree that they would (and should) come
> to a different conclusion if we were not talking about pointers.

I would agree...

In all cases where they're not pointers, UNLESS it's from something
that's known to be the same size. Pointers in 32-bit code on Windows
and ARM- and x86-based Linux are 32-bits. So, you can use this
example safely because the compiler constraints are enforced by the
architecture itself:

union {
uint32_t _ptr;
void* ptr;
};

I also extend this more generally to 32-bit and 64-bit compilations
using an #if test at the start, and defining custom names based on
the bit size:

// Simplified for illustration, actual test is more complex
#if 32-bits
#define uptr uint32_t

#elif 64-bits
#define uptr uint64_t

#else
#error
#endif

union {
uptr _ptr;
void* ptr;
};

My targets are primarily x86 32-bit and 64-bit, and ARM 32-bit and
64-bit, and I'm content with that because it's a huge percentage of
the market share. And if I ever need to change anything, I can
rename my union members and find out in every instance where they're
used in code by attempting to compile.

> Anyway, I don't expect to change your mind, but I hope any learners who
> might come across this thread will see the danger in writing code that
> does not say what the programmer means, even if it almost always
> works.

In the case of pointers, using unions does say what the programmer means.
It just does it in a way which doesn't slap you on the face on every line
of source code which references it. And to me, that's a good thing. In
fact, it's a requirement.

> >> > First of all it will only affect the most obtuse or
> >> > special-equipment compilers. Second, you can use some compile-time
> >> > checks to ensure things you do rely upon are the same size, such as
> >> > comparing sizeof(void*) to sizeof(other*), etc., and if they are
> >> > different reporting it during compilation with #if blocks.
> >>
> >> It's not just about size.
> >
> > Then it would be possible to also use a runtime test which, if it fails,
> > reports that condition and exits out explaining in comments how it can
> > be fixed.
>
> Two points: First, can you really do that (and I mean "you" not "one")?

Sure.

> Can you really write a compile-time test that checks the actual
> assumptions that your suggested use of unions relies on, or are you just
> speculating that it must, surely, be possible?

Sure. If there's a test case, such as your example, where the address
of A won't equal B, when A and B should be equal, then it can be tested.

You can use pointer math if your compiler supports it, and if not, then
use a union which encompasses a pointer and a sufficiently large integer,
and simply do a compare. They have to be exactly equal to pass the test.
If they don't, report the failure observed at runtime.
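
One way such a runtime test might look, as a sketch under assumptions: it compares the stored bytes of a void * and a struct node * that point at the same object. struct node and same_pointer_bytes are hypothetical names. The check can only report a mismatch for the pointers it actually examines; passing does not prove that sharing differently typed pointers in a union is valid in general.

```c
#include <string.h>

/* Hypothetical pointee type. */
struct node {
    int value;
};

/* Returns 1 if a void * and a struct node * to the same object are
   stored as identical bytes, 0 otherwise. */
int same_pointer_bytes(void)
{
    static struct node n;
    void *vp = &n;
    struct node *sp = &n;

    if (sizeof vp != sizeof sp)
        return 0;
    return memcmp(&vp, &sp, sizeof vp) == 0;
}
```

On mainstream x86 and ARM targets this is expected to return 1; a word-addressed machine of the kind discussed earlier in the thread is where it could return 0.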

> Second, it's usually possible to test for unwarranted assumptions being
> made by some bit of code, but it's much better simply not to make those
> unwarranted assumptions in the first place.

It might be for the well-versed C developer, the one who has spent time
learning and thinking about those things regularly, and has come to form
a pattern of thought which exhibits those traits naturally from within
its creation mechanisms.

However, for anyone who doesn't know that information, the mountainous
workload it places upon someone, someone who could otherwise do the task
quickly, yet is now bogged down to such an extent that it's actually
off-putting, making what should've been a 5 minute fix into a 90 minute
exercise in R&D for something that, in 100% of all cases I know about on
the code I've written, has never been an issue (and I'm sure other people
have similar experiences unless they are something like contract-for-hire
C developers who work on whatever platform the customer wants, and
therefore must be in the pattern of writing code like that because it's
important to their clients).

To put it simply: it's just too much work for most code because most
targets don't need it.

> <snip>
> > I just don't believe the burden should be on every C developer to write
> > code for every possible architecture all the time, especially when they're
> > targeting a handful of architectures where it's not at issue.
>
> But here we are simply talking about doing it right. There is no
> burden in writing the code the correct and portable way. There is no
> possible justification (that I can see) for doing it the wrong way.

This is always our same argument. You place EXTREME value on things I
do not place ANY value upon, except in cases where such value is
warranted, such as were I to decide to write some code which **I INTENDED**
to be released on every architecture. But, I simply don't do that.

I'm perfectly content to support only those architectures which have the
same sized pointers, and where pointer references in a union will all
point to the same location regardless of whether they were cast or not.

In fact, I would completely avoid an architecture that did not support
those features because knowing what I know about assembly language, and
machine data access at the hardware level, it's ridiculous to not have
the same size pointers for general processing, and I wouldn't trust such
a design decision made by someone for that architecture. I would be
leery of it from the get-go, and would pass for something more standard,
more familiar. I would do this every time, and without hesitation or
reservation. In fact, I would do this making decisions which required
a larger battery, or a different motherboard be created, rather than
cater to some obtuse architecture's design quirks for the gain of some
small amount of something.

You're talking about doing it "right" in the context of supporting all
of those obtuse architectures, those which the code will likely never
be compiled on, under the thinking that at some point, somewhere, in
some distant corner of the globe, some user might haphazardly try to
compile some of that code and (oh my gosh!! much to their horror!!) it
doesn't work on the first try and they have to go in to the source code
and make a few changes to make it work on their known-to-be-obtuse
architecture.

I find that conclusion shocking, and wrong. I think it is so amazingly
wrong that I would never, under any circumstances, purport it, save one
and only one instance: if you had some desire to write code which was
kept to the C standard because you wanted to be able to globally support
all architectures. Which brings up the question, by the way, is there
ANY code ANYWHERE that does that? Is there truly a form of source code
you can write which will ALWAYS compile on ANY architecture without
requiring ANY changes whatsoever?

You place enough value on that "possible contingency" that it may be
used by someone else that you would require all code be written so as
to address it.

No, sir. No how. No way. Not ever. Not on my watch. I have too
many things to do with my time than worry about supporting obtuse
architectures. When it comes to me encountering that need in this
world, I will gladly send the work your way and say, "Ben can do it!
He's fantastic about supporting this architecture." And then I'll
go back to doing something else.

Rick C. Hodgin

Aug 12, 2015, 11:38:19 AM

On Wednesday, August 12, 2015 at 11:17:55 AM UTC-4, Rick C. Hodgin wrote:
> On Wednesday, August 12, 2015 at 10:34:26 AM UTC-4, Ben Bacarisse wrote:
> > What's more, I am sure you would agree that they would (and should) come
> > to a different conclusion if we were not talking about pointers.
>
> I would agree...
>
> In all cases where they're not pointers, UNLESS it's from something
> that's known to be the same size.

And before the snipers get me again, I'll add the (to me obvious, so
obvious that I didn't include it in the first post) comment regarding
type switching, such as casting float to int. Those obviously require
casting, unless you're simply looking at the underlying data, in which
case you can do something like this on an architecture where float and
int are both known to be 32-bits:

struct SWhatever
{
    union {
        float f;
        int i;
    };
};

struct SWhatever a, b;

a.f = 1.0f;
b.i = a.i;
printf("%f\n", b.f);

I don't recommend doing this generally, but I can see where you may
want to do it at some point. I had occasion recently when writing
my generic "call DLL" function. I was only concerned about putting
data on the stack, and I didn't care what kind it was because it was
all just data that I had to move from A to B. So, everything that
was 32-bit was converted to an unsigned 32-bit integer, and 64-bit to
an unsigned 64-bit integer, and it was loaded that way.

Ben Bacarisse

Aug 12, 2015, 12:20:02 PM

I'm not talking about what you can see. I mean that I'd hope that a C
programmer who can see how you are using the union will conclude that
it's bad code written by someone who's playing non-portable tricks for
no good reason. I hope they will not think, "I see, he wants to have
several differently typed pointers to the same object so he's naturally
put them in a union". I want him or her to say, "oh, this program needs
very careful checking because the programmer does not know what a union
is for".

>> What's more, I am sure you would agree that they would (and should) come
>> to a different conclusion if we were not talking about pointers.
>
> I would agree...
>
> In all cases where they're not pointers, UNLESS it's from something
> that's known to be the same size.

Let's say float and int are the same size. I think you'd agree that
using a union is not the right way to convert between one and the other.
As I keep saying, it's not about size -- size is only the most obvious
problem.

> Pointers in 32-bit code on Windows
> and ARM- and x86-based Linux are 32-bits. So, you can use this
> example safely because the compiler constraints are enforced by the
> architecture itself:
>
> union {
> uint32_t _ptr;
> void* ptr;
> };

Why would you do that? Why write code that will break on so many other
systems? What do you think you are gaining by the incorrect use of a
union?

> I also extend this more generally to 32-bit and 64-bit compilations
> using an #if test at the start, and defining custom names based on
> the bit size:
>
> // Simplified for illustration, actual test is more complex
> #if 32-bits
> #define uptr uint32_t
>
> #elif 64-bits
> #define uptr uint64_t

Have you heard of uintptr_t?
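
For readers who have not met it: uintptr_t comes from <stdint.h> and is an unsigned integer type able to hold a void * value and give it back unchanged. It is an optional type in the standard, though widely available. A minimal sketch, with the function name round_trip_demo made up for this example:

```c
#include <assert.h>
#include <stdint.h>   /* uintptr_t */

/* Round-trip a pointer through uintptr_t and read through the result. */
static int round_trip_demo(void)
{
    int value = 42;
    uintptr_t as_int = (uintptr_t)(void *)&value;  /* pointer -> integer */
    int *back = (void *)as_int;                    /* integer -> pointer */
    assert(back == &value);   /* the round trip must compare equal */
    return *back;
}
```

With this type there is no need for an #if that picks uint32_t or uint64_t by hand; the implementation chooses the width.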

No, that won't do what you claimed. It's not about integers, it's about
void * and a struct xxx *. You said a compile time test could verify
what your union is assuming.

>> Second, it's usually possible to test for unwarranted assumptions being
>> made by some bit of code, but it's much better simply not to make those
>> unwarranted assumptions in the first place.
>
> It might be for the well-versed C developer, the one who has spent time
> learning and thinking about those things regularly, and has come to form
> a pattern of thought which exhibits those traits naturally from within
> its creation mechanisms.

Let's not fuss about getting it right in every case. At least as far as
converting a pointer is concerned, the OP seemed to know about the right
way; you introduced the wrong way to him! That's setting learning back.

<snip>

> To put it simply: it's just too much work for most code because most
> targets don't need it.

No, the right way is simple and no work at all. The wrong way involves
a new type and member access. This is not simpler than a cast here and
there. (And the really, really right way will not be converting
between void * and struct pointers anyway.)
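
A minimal sketch of what "a cast here and there" can look like in practice. In C a void * converts to any object pointer type by plain assignment, so the conversion is written once and every later access uses the typed pointer. The struct layout and the function names are hypothetical, chosen only for illustration:

```c
#include <assert.h>
#include <stddef.h>   /* NULL */

struct xml_element {              /* hypothetical, reduced to two fields */
    struct xml_element *parent;   /* typed parent pointer, not void *    */
    int depth;
};

/* A generic interface hands us a void *; convert once, then use it. */
static int element_depth(void *generic)
{
    struct xml_element *elem = generic;   /* implicit conversion in C */
    assert(elem != NULL);
    return elem->depth;                   /* no cast on use */
}

static int depth_demo(void)
{
    struct xml_element root  = { NULL, 0 };
    struct xml_element child = { &root, 1 };
    /* child.parent already has the right type, so nothing is converted. */
    return element_depth(&child) + element_depth(child.parent);
}
```

Here depth_demo() returns 1. When the member is declared with its real type, as parent is above, the conversion disappears altogether.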

>> <snip>
>> > I just don't believe the burden should be on every C developer to write
>> > code for every possible architecture all the time, especially when they're
>> > targeting a handful of architectures where it's not at issue.
>>
>> But here we are simply talking about doing it right. There is no
>> burden in writing the code the correct and portable way. There is no
>> possible justification (that I can see) for doing it the wrong way.
>
> This is always our same argument. You place EXTREME value on things I
> do not place ANY value upon, except in cases where such value is
> warranted, such as were I to decide to write some code which **I INTENDED**
> to be released on every architecture. But, I simply don't do that.

I have not been arguing from the point of view of portability. My point
(which I'll just assume you've missed rather than ignored) is that the
union tells the reader the wrong thing. To any well-read C programmer
it raises the question of why the author is re-interpreting bits rather
than converting between values.

What would you think if you saw a number being negated using ~x + 1?
Would you say, ah, OK this is only supposed to work on integer types and
on some machines, or would you start to scour the code for other places
where the author is misusing C features?
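
To make the analogy concrete, a small sketch with invented function names. Both functions give the same answer on a two's-complement machine, but only the first one says what is meant:

```c
#include <assert.h>
#include <limits.h>   /* INT_MIN */

/* Says what it means: arithmetic negation. */
static int negate_plain(int x)
{
    assert(x != INT_MIN);   /* -INT_MIN would overflow */
    return -x;
}

/* Relies on the representation: flip the bits, then add one. */
static int negate_bits(int x)
{
    assert(x != INT_MIN);   /* ~INT_MIN + 1 would overflow too */
    return ~x + 1;
}
```

The second version agrees with the first only where int is two's complement, and it has no meaning at all for floating-point operands.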

> I'm perfectly content to support only those architectures which have
> the same sized pointers, and where pointer references in a union will
> all point to the same location regardless of whether they were cast or
> not.
>
> In fact, I would completely avoid an architecture that did not support
> those features

That's fine. I have no interest in the portability of your code. You
made a suggestion to someone who's learning C and it needed to be
cleared up. I don't want you to start doing it right, I just don't want
more and more people to get the wrong ideas about unions vs. conversions.

> You're talking about doing it "right" in the context of supporting all
> of those obtuse architectures,

No. I'm talking about doing it right to be clear about what the code
means. I would not use a union even for a throw-away program on one
architecture because it's more complicated and does not communicate
correctly what the code is doing.

> those which the code will likely never
> be compiled on, under the thinking that at some point, somewhere, in
> some distant corner of the globe, some user might haphazardly try to
> compile some of that code and (oh my gosh!! much to their horror!!) it
> doesn't work on the first try and they have to go in to the source code
> and make a few changes to make it work on their known-to-be-obtuse
> architecture.
>
> I find that conclusion shocking, and wrong. I think it is so amazingly
> wrong that I would never, under any circumstances, purport it, save one
> and only one instance: if you had some desire to write code which was
> kept to the C standard because you wanted to be able to globally support
> all architectures. Which brings up the question, by the way, is there
> ANY code ANYWHERE that does that? Is there truly a form of source code
> you can write which will ALWAYS compile on ANY architecture without
> requiring ANY changes whatsoever?
>
> You place enough value on that "possible contingency" that it may be
> used by someone else that you would require all code be written so as
> to address it.
>
> No, sir. No how. No way. Not ever.

That's a straw man. I don't advocate that position. You (and anyone
else) can write code as architecture-specific as you like. Just don't
tell people learning C to use a union to (not) convert pointer types.

> Not on my watch.

Eh? You are prepared to lay down your life for the right to tell people
who are learning C to use a complex and confusing way to convert between
pointer types! I'll have to pry that union out of your cold dead hands!

No, there's no nobility here. Just write it the right way. It's
simpler and conveys the code's meaning more clearly. Try not to be
offended by the fact that it also happens to be portable to many odd
architectures. That's just an incidental advantage.

<snip>
--
Ben.

Rick C. Hodgin

Aug 12, 2015, 12:52:54 PM

On Wednesday, August 12, 2015 at 12:20:02 PM UTC-4, Ben Bacarisse wrote:
> I'm not talking about what you can see. I mean that I'd hope that a C
> programmer who can see how you are using the union will conclude that
> it's bad code written by someone who's playing non-portable tricks for
> no good reason. I hope they will not think, "I see, he wants to have
> several differently typed pointers to the same object so he's naturally
> put them in a union". I want him or her to say, "oh, this program needs
> very careful checking because the programmer does not know what a union
> is for".

It's exactly what a union is for, sharing memory when only one type
is needed at a time, or when you want to examine or access its
fundamental form as a series of bytes, for example.

> > Pointers in 32-bit code on Windows
> > and ARM- and x86-based Linux are 32-bits. So, you can use this
> > example safely because the compiler constraints are enforced by the
> > architecture itself:
> >
> > union {
> > uint32_t _ptr;
> > void* ptr;
> > };
>
> Why would you do that? Why write code that will break on so many other
> systems? What do you think you are gaining by the incorrect use of a
> union?

Simplicity in populating values without casting when it's just data.
I am able to pass an unsigned integer (32-bit or 64-bit as per the
compile flags), and then populate into the target using a simple
assignment.

Most often it goes like this:

union {
sptr _funcWhatever;
int (*funcWhatever) (SWhatever* w, int i, float f, ...);
};

Rather than trying to pass my function address as a parameter by
casting it appropriately, or to cast my function into that
funcWhatever form, I simply pass it by sptr value, and then
populate into. The address of the function might be in a loaded
DLL, for example, or it might come from some native code.
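
For comparison, a minimal sketch of the cast-based alternative for function addresses. C allows a pointer to one function type to be converted to a pointer to another function type and back again, so a generic function pointer type can carry the address instead of an integer. All names here (generic_fn, add_fn, lookup_symbol, call_looked_up) are hypothetical, and lookup_symbol merely stands in for whatever really supplies the address:

```c
#include <assert.h>
#include <stddef.h>   /* NULL */

typedef void (*generic_fn)(void);   /* hypothetical "any function" carrier */
typedef int  (*add_fn)(int, int);   /* the signature actually expected     */

static int add(int a, int b)
{
    return a + b;
}

/* Stand-in for a symbol lookup, e.g. from a loaded library. */
static generic_fn lookup_symbol(void)
{
    return (generic_fn)add;
}

static int call_looked_up(int a, int b)
{
    add_fn fn = (add_fn)lookup_symbol();   /* one visible cast back */
    assert(fn != NULL);
    return fn(a, b);
}
```

The single cast marks the one place where the type system is being overridden; every call after that goes through a correctly typed pointer.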

I use this so often I've added it as a feature of RDC. Every variable
that's defined has the ability to access its fundamental value as an
unsigned integer of appropriate size using the "_name" convention for
a "name" definition. It is not required to be defined anywhere, but
is available everywhere. And it can be disabled with a switch.

> > I also extend this more generally to 32-bit and 64-bit compilations
> > using an #if test at the start, and defining custom names based on
> > the bit size:
> >
> > // Simplified for illustration, actual test is more complex
> > #if 32-bits
> > #define uptr uint32_t
> >
> > #elif 64-bits
> > #define uptr uint64_t
>
> Have you heard of uintptr_t?

Nope. I think the u****_t forms are silly. I typedef them at the
start of my program to s32, u32, f32, etc., and never look back.

It's about the address of a void* and a struct xyz*, which can be done
by comparing integers of the appropriate size. In order for them to be
the same, they must be equal, regardless of what value they are.

> >> Second, it's usually possible to test for unwarranted assumptions being
> >> made by some bit of code, but it's much better simply not to make those
> >> unwarranted assumptions in the first place.
> >
> > It might be for the well-versed C developer, the one who has spent time
> > learning and thinking about those things regularly, and has come to form
> > a pattern of thought which exhibits those traits naturally from within
> > its creation mechanisms.
>
> Let's not fuss about getting it right in every case. At least as far as
> converting a pointer is concerned, the OP seemed to know about the right
> way; you introduced the wrong way to him! That's setting learning back.

The way I introduced him to is not wrong. It is the purpose of the
union. The only place where it wouldn't work is in the case where
you identify on one architecture where the addresses of the void* may
be different than another pointer. And, the startup runtime test
would've already concluded that it's not an issue by the time it gets
to this location in the running program, thereby highlighting the issue
at startup, or completely removing the concern by the time it gets here.

> <snip>
> > To put it simply: it's just too much work for most code because most
> > targets don't need it.
>
> No, the right way is simple and no work at all. The wrong way involves
> a new type and member access. This is not simpler than a cast here and
> there. (And the really, really right way will not be converting
> between void * and struct pointers anyway.)

I'm sorry, Ben, but in my view your solution is insane. The cast would
be required on every use instance, and that is insane.

If that's something the C standard purports by its definition, then the
C standard is wrong and needs to be changed. It is insane to impose
that burden upon every developer when the compiler is perfectly capable
of doing all that busywork for you.

It might be worth forcing a cast on the first assignment into the union
member, which is why I use C++ compilers for my "C code," because I want
that extra type checking. But, not to the extent you're talking about.

I appreciate your knowledge of C and its standard, and I would rely upon
you (and others here) were I asking questions about that end, but for
practical code, your comment is to me like the conversations between
Leah Brahms and Geordi regarding the several things he does differently
on board the Enterprise when out in deep space, compared to what she had
defined to be the requirements in the workshops at Utopia Planitia. The
two just don't mate up in the real world, and are of no value in the
vast majority of cases I've seen (at least in this area).

> >> <snip>
> >> > I just don't believe the burden should be on every C developer to write
> >> > code for every possible architecture all the time, especially when they're
> >> > targeting a handful of architectures where it's not at issue.
> >>
> >> But here we are simply talking about doing it right. There is no
> >> burden in writing the code the correct and portable way. There is no
> >> possible justification (that I can see) for doing it the wrong way.
> >
> > This is always our same argument. You place EXTREME value on things I
> > do not place ANY value upon, except in cases where such value is
> > warranted, such as were I to decide to write some code which **I INTENDED**
> > to be released on every architecture. But, I simply don't do that.
>
> I have not been arguing from the point of view of portability. My point
> (which I'll just assume you've missed rather than ignored) is that the
> union tells the reader the wrong thing. To any well-read C programmer
> it raises the question of why the author is re-interpreting bits rather
> than converting between values.

Any well-read C programmer spent some of that well-reading time outside
of textbooks, I hope, and into practical code out there in the wild.

Using unions for shared memory that takes on a form based on some cue
IS COMPLETELY STANDARD. It's done every day. It's the purpose of a
union, especially when one recognizes that char and int types are not
some mystical things, but are locations in memory of a certain size,
sporting a particular value range. It's just data. They are not
physical constructions.

> What would you think if you saw a number being negated using ~x + 1?

I would think: "Craziness. Lunacy."

> Would you say, ah, OK this is only supposed to work on integer types and
> on some machines, or would you start to scour the code for other places
> where the author is misusing C features?

I would stop looking at the code and find some other library which does
the same thing, or rewrite it myself if it were small enough.

> > I'm perfectly content to support only those architectures which have
> > the same sized pointers, and where pointer references in a union will
> > all point to the same location regardless of whether they were cast or
> > not.
> >
> > In fact, I would completely avoid an architecture that did not support
> > those features
>
> That's fine. I have no interest in the portability of your code. You
> made a suggestion to someone who's learning C and it needed to be
> cleared up. I don't want you to start doing it right, I just don't want
> more and more people to get the wrong ideas about unions vs. conversions.

I don't see how it's wrong, Ben. You have yet to explain it apart from
those cases where pointers to a block of bytes in memory will physically
change location when casting that block to a void*, or casting it to a
struct xyz*. You'll have to give me a concrete example, a real-world
example you've seen, one I can download, test, and compile, and examine
so I can arrive at the, "Oh yeah! Ben was right!" moment.

> > You're talking about doing it "right" in the context of supporting all
> > of those obtuse architectures,
>
> No. I'm talking about doing it right to be clear about what the code
> means. I would not use a union even for a throw-away program on one
> architecture because it's more complicated and does not communicate
> correctly what the code is doing.

Using unions is completely clear in this context. They are all pointers.
The reality is you are passed some block of data which, based on a cue
of some kind, indicates what type it is.

How is that not clear? It couldn't be more clear.

> > those which the code will likely never
> > be compiled on, under the thinking that at some point, somewhere, in
> > some distant corner of the globe, some user might haphazardly try to
> > compile some of that code and (oh my gosh!! much to their horror!!) it
> > doesn't work on the first try and they have to go in to the source code
> > and make a few changes to make it work on their known-to-be-obtuse
> > architecture.
> >
> > I find that conclusion shocking, and wrong. I think it is so amazingly
> > wrong that I would never, under any circumstances, purport it, save one
> > and only one instance: if you had some desire to write code which was
> > kept to the C standard because you wanted to be able to globally support
> > all architectures. Which brings up the question, by the way, is there
> > ANY code ANYWHERE that does that? Is there truly a form of source code
> > you can write which will ALWAYS compile on ANY architecture without
> > requiring ANY changes whatsoever?
> >
> > You place enough value on that "possible contingency" that it may be
> > used by someone else that you would require all code be written so as
> > to address it.
> >
> > No, sir. No how. No way. Not ever.
>
> That's a straw man. I don't advocate that position. You (and anyone
> else) can write code as architecture-specific as you like. Just don't
> tell people learning C to use a union to (not) convert pointer types.

I will tell all of them to use unions. You're going to have to show
me an example which will fail doing it that way, and won't fail when
using a dizzying array of casts.

> > Not on my watch.
>
> Eh? You are prepared to lay down your life for the right to tell people
> who are learning C to use a complex and confusing way to convert between
> pointer types! I'll have to pry that union out of your cold dead hands!

There's nothing complex about it. In fact, it greatly simplifies source
code, and the understanding therein.

> No, there's no nobility here. Just write it the right way. It's
> simpler and conveys the code's meaning more clearly. Try not to be
> offended by the fact that it also happens to be portable to many odd
> architectures. That's just an incidental advantage.

I do write it the right way. It's the only way it should be done
because any other way requires a mass of casts, and that's just eye
clutter, especially for someone with dyslexia.

Rick C. Hodgin

Aug 12, 2015, 1:17:36 PM

On Tuesday, August 11, 2015 at 10:45:34 PM UTC-4, Ben Bacarisse wrote:
> "Morten W. Petersen" <mor...@gmail.com> writes:
>

> > On 12.08.2015 03:56, Ben Bacarisse wrote:
> >> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
> >>

> >>> On Tuesday, August 11, 2015 at 4:32:29 PM UTC-4, Rick C. Hodgin wrote:
> >>>> struct SWhatever {
> >>>> union {
> >>>> void* previous;
> >>>> struct xml_element* prev_xml_element;
> >>>> struct xml_whatever* prev_xml_whatever;
> >>>> };
> >>>> };
> >>>
> >>> Before I get pounced on by snipers, it should be as above.
> >>
> >> Sorry, I have to snipe because that's not a safe technique (unless I'm
> >> misunderstanding how you intended to use the union). A cast permits the
> >> pointer to be converted from one type to another, but the union causes
> >> the bits to be re-interpreted and that's not 100% portable. Of course
> >> it may work everywhere that matters to you, but other readers may want
> >> to take care.

> >
> > Well, isn't a pointer a pointer regardless of what it points to? I
> > thought the point of casting was so that the compiler could find the
> > right offsets etc. to work on in a struct.
>
> That's only part of it. On one machine I've used a cast from void * to
> a pointer to any struct type generated code to halve the address: void *
> was a byte pointer and struct pointers were word pointers. Were you to
> do this:
>
> struct xml_element elem;
> struct SWhatever sw = { .previous = &elem };
>
> you'd find that &elem != sw.prev_xml_element. The initialisation
> converts the word pointer to a byte pointer, but the union access simply
> re-interprets the converted bits as something they are not.

There are no circumstances where I would consider not using a union
to reinterpret bits based on the fact that an architecture like this
exists.

Instead, I would switch the warning you issue around completely and
WARN EVERYONE to NOT use such an architecture because it breaks the
ability to use unions properly. Or, to WARN EVERYONE that if they
do use this architecture, they'll have to handle unions differently.

-----
I really think you've got it backwards here, Ben. This should be the
far-and-away occasional exception you might encounter once in your
career which then forces you, for that project, to alter the way you
normally write code, and not the you-must-daily-cater-to-it rule.

Again, I think it is completely insane to suggest avoiding using
unions in this way because there are architectures like this which
exist. On top of which, I'm frankly amazed there are architectures
like this which exist. It must be for some optimized application,
which in and of itself would warrant special coding requirements.

Rick C. Hodgin

Aug 12, 2015, 1:24:30 PM

This is one valid reason I can see for not writing unions the way I
do, which is if you place a high emphasis on writing to the standard.
I can see that being important to some people. It's just not important
to me, at least not important enough to make it a requirement that all
developers on all architectures everywhere do it.

> C does guarantee *some*
> things about pointer representations, but the details are not important
> here.
>
> The key thing is to try, wherever possible, to use the right pointer
> type and to use a cast to convert one pointer type to another where
> this is not possible.

I agree with the idea of casting it into the union member the first
time. I do not agree with casting it on use.

> Unions have two functions. One, to save space by enabling the storage
> of more than one kind of object in the same location at different times.
> In this usage you always access the union member that was last stored.
> Sometimes you do this by having a run-time "type" member somewhere --
> this is a so-called discriminated union -- and sometimes the logic of
> the program can ensure that the right member is always being accessed
> without needing to store any record of it.
>
> The second use is to re-interpret the representation of one type as if
> it were another. This is a much and specialised usage.

The second was the form Morten was interested in using. A pointer was
passed to him, and it pointed to some location in memory which had its
data arranged as the indicated structure. And rather than casting every
reference, by populating it into the void* location, and then using the
appropriately named structure member, each member of that structure
could be accessed without a cast.

I see this as a 500:1 preferable solution over casting on every use.
If there was only one reference, I could see casting it on use. If
there were two or more, nope.

luser droog

Aug 12, 2015, 4:35:58 PM

On Wednesday, August 12, 2015 at 5:39:24 AM UTC-5, Ben Bacarisse wrote:

> Unions have two functions. One, to save space by enabling the storage

>[...]

>
> The second use is to re-interpret the representation of one type as if
> it were another. This is a much and specialised usage.

Indeed this is the /only/ way if you want to avoid weird warnings
about "strict aliasing" that pointer-punning can produce.

--
alias what?
just alias

Ben Bacarisse

Aug 12, 2015, 5:41:42 PM

"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> On Wednesday, August 12, 2015 at 6:39:24 AM UTC-4, Ben Bacarisse wrote:

<snip>

>> Unions have two functions. One, to save space by enabling the storage
>> of more than one kind of object in the same location at different times.
>> In this usage you always access the union member that was last stored.
>> Sometimes you do this by having a run-time "type" member somewhere --
>> this is a so-called discriminated union -- and sometimes the logic of
>> the program can ensure that the right member is always being accessed
>> without needing to store any record of it.
>>
>> The second use is to re-interpret the representation of one type as if
>> it were another. This is a much and specialised usage.

["This is a much more rare and specialised usage"]
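
A minimal sketch of the first use, the discriminated union, where a run-time tag records which member was stored last and only that member is read. The type and member names are invented for this example:

```c
#include <assert.h>

enum node_kind { NODE_ELEMENT, NODE_TEXT };   /* hypothetical tags */

struct node {
    enum node_kind kind;       /* records which member was stored last */
    union {
        int child_count;       /* valid when kind == NODE_ELEMENT */
        const char *text;      /* valid when kind == NODE_TEXT    */
    } u;
};

/* Read a member only after checking that it is the one last stored. */
static int node_child_count(const struct node *n)
{
    assert(n->kind == NODE_ELEMENT);
    return n->u.child_count;
}

static int tagged_demo(void)
{
    struct node n;
    n.kind = NODE_ELEMENT;     /* set the tag ...                 */
    n.u.child_count = 3;       /* ... and store the matching member */
    return node_child_count(&n);
}
```

Here tagged_demo() returns 3. Nothing is re-interpreted: the union only saves space, and the tag keeps every access tied to the member that was last written.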

> The second was the form Morten was interested in using.

No. It's the form you are suggesting he use. He wants (logically) a
type conversion, and you are suggesting a hack that happens to work.

> A pointer was
> passed to him, and it pointed to some location in memory which had its
> data arranged as the indicated structure. And rather than casting every
> reference, by populating it into the void* location, and then using the
> appropriately named structure member, each member of that structure
> could be accessed without a cast.

You've not worked out the details, presumably because you've never
written the code using the correct types, but the "correct type"
alternative for Morten is much simpler than your hack. The irony is
that doing it correctly will, in his specific use-case, involve no casts
at all because he should not be using void * in the first place!

I've written exactly the sort of code Morten is writing numerous times.
The result is not a mess of casts. There are one or two in very
specific cases which have not yet even been illustrated, and the fact
that the casts stand out in those cases is important: they flag a
potentially risky type conversion. You don't need them at all for
setting the parent pointer of one type of node from the parent pointer
of another.

> I see this as a 500:1 preferable solution over casting on every use.
> If there was only one reference, I could see casting it on use. If
> there were two or more, nope.

Can we agree to leave it at that? You continue to use the
reinterpretation of representations in those cases where it happens to
work, and I'll continue to tell people learning C to use type conversion
when they want to convert between types?

Neither of us will change our minds and I'm happy to let others simply
decide for themselves whose advice to take. If they are unsure, maybe a
search for previous posts by you and by me might help them decide, but
there's clearly no point in continuing this exchange.

--
Ben.

Ben Bacarisse

Aug 12, 2015, 5:41:42 PM

Nor I. When I want to reinterpret bits, I use a union (or the more
dodgy *(T *)&object idiom). The argument is over the fact that you
advocate using a union when you mean a type conversion from one type to
another.

> Instead, I would switch the warning you issue around completely and
> WARN EVERYONE to NOT use such an architecture because it breaks the
> ability to use unions properly.

The union is doing exactly what it should. If you simplified the code
by removing the unnecessary union, and then replaced the member access
with a conversion (which may not even need a cast -- it depends) you'll
have clearer code (because it's explicit about what it does) which just
happens to work on odd machines.

<snip>

> Again, I think it is completely insane to suggest avoiding using
> unions in this way because there are architectures like this which
> exist.

There are other reasons to avoid using a union where a conversion is
intended.

<snip>
--
Ben.

Ben Bacarisse

Aug 12, 2015, 5:41:43 PM

"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> On Wednesday, August 12, 2015 at 12:20:02 PM UTC-4, Ben Bacarisse wrote:
>> I'm not talking about what you can see. I mean that I'd hope that a C
>> programmer who can see how you are using the union will conclude that
>> it's bad code written by someone who's playing non-portable tricks for
>> no good reason. I hope they will not think, "I see, he wants to have
>> several differently typed pointers to the same object so he's naturally
>> put them in a union". I want him or her to say, "oh, this program needs
>> very careful checking because the programmer does not know what a union
>> is for".
>
> It's exactly what a union is for, sharing memory when only one type
> is needed at a time, or when you want to examine or access its
> fundamental form as a series of bytes, for example.

That's not what's being done here. Instead, a value of one type needs
to be converted to a value of another. The union does not do that.
When both values have the same representation you can, indeed, get the
desired effect using the otherwise entirely unnecessary new type -- the
union.

>> > Pointers in 32-bit code on Windows
>> > and ARM- and x86-based Linux are 32-bits. So, you can use this
>> > example safely because the compiler constraints are enforced by the
>> > architecture itself:
>> >
>> > union {
>> > uint32_t _ptr;
>> > void* ptr;
>> > };
>>
>> Why would you do that? Why write code that will break on so many other
>> systems? What do you think you are gaining by the incorrect use of a
>> union?
>
> Simplicity in populating values without casting when it's just data.
> I am able to pass an unsigned integer (32-bit or 64-bit as per the
> compile flags), and then populate into the target using a simple
> assignment.

You have code where uint32_t is 32 or 64 bits as per compile flags?

> Most often it goes like this:
>
> union {
> sptr _funcWhatever;
> int (*funcWhatever) (SWhatever* w, int i, float f, ...);
> };
>
> Rather than trying to pass my function address as a parameter by
> casting it appropriately, or to cast my function into that
> funcWhatever form, I simply pass it by sptr value, and then
> populate into. The address of the function might be in a loaded
> DLL, for example, or it might come from some native code.

You keep introducing new examples. First a void * and two struct
pointers. Then a uint32_t (that might be 64 bits?) and a void *. Now
two function pointer types. But in all cases all you do is show the
unnecessary union. In no case do you show the code that gets simplified
by its use. You may have a point here (since I'd never use a union to
fake a type conversion I've no experience in writing in that style) but
if you never show an example I could re-write using the correct types
your claim is hard to verify.

<snip>

>> >> Can you really write a compile-time test that checks the actual
>> >> assumptions that your suggested use of unions relies on, or are you just
>> >> speculating that it must, surely, be possible?
>> >
>> > Sure. If there's a test case, such as your example, where the address
>> > of A won't equal B, when A and B should be equal, then it can be tested.
>> >
>> > You can use pointer math if your compiler supports it, and if not, then
>> > use a union which encompasses a pointer and a sufficiently large integer,
>> > and simply do a compare. They have to be exactly equal to pass the test.
>> > If they don't, report the failure observed at runtime.
>>
>> No, that won't do what you claimed. It's not about integers, it's about
>> void * and a struct xxx *. You said a compile time test could verify
>> what your union is assuming.
>
> It's about the address of a void* and a struct xyz*, which can be done
> by comparing integers of the appropriate size. In order for them to be
> the same, they must be equal, regardless of what value they are.

Unless you write the test I'm going to have to assume you can't. I
really don't know how to do it, so I'm not being coy here, but since I
also can't prove that you can't there is only one way to settle it:
write the test. How do you test, at compile time, that a particular
implementation has the same representation for all void * and struct
pointer pairs that point to the same object?
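
For what it is worth, here is a sketch of how far an obvious check can go, under stated assumptions. The size comparison happens at compile time (C11 static_assert); the representation comparison happens at run time and covers one sample object only, so it cannot establish the property for every pointer pair. The names probe and same_representation_for_sample are invented for this illustration:

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>   /* memcmp */

struct probe { int dummy; };   /* placeholder struct type */

/* Compile time: only the sizes can be compared. */
static_assert(sizeof(void *) == sizeof(struct probe *),
              "void * and struct pointer differ in size");

/* Run time: compare the stored bytes of both pointers for ONE object.
   Returns 1 if they are identical for this sample, 0 otherwise. */
static int same_representation_for_sample(void)
{
    static struct probe sample;
    void *vp = &sample;
    struct probe *sp = &sample;
    return memcmp(&vp, &sp, sizeof vp) == 0;
}
```

A passing result says only that this sample behaved as hoped on this implementation; it is not the general compile-time test asked for above.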

>> >> Second, it's usually possible to test for unwarranted assumptions being
>> >> made by some bit of code, but it's much better simply not to make those
>> >> unwarranted assumptions in the first place.
>> >
>> > It might be for the well-versed C developer, the one who has spent time
>> > learning and thinking about those things regularly, and has come to form
>> > a pattern of thought which exhibits those traits naturally from within
>> > its creation mechanisms.
>>
>> Let's not fuss about getting it right in every case. At least as far as
>> converting a pointer is concerned, the OP seemed to know about the right
>> way; you introduced the wrong way to him! That's setting learning back.
>
> The way I introduced him to is not wrong. It is the purpose of the
> union. The only place where it wouldn't work is in the case where
> you identify on one architecture where the addresses of the void* may
> be different than another pointer.

If a technique is very inefficient it can be wrong even if it works. If
a technique is complex or obscure it can be wrong even if it works. If
a technique interferes with the usual techniques of code review it can
be wrong even if it works. This technique is deceptive since it
circumvents the type system without using the usual signal for that
circumvention (a cast). That makes it wrong even where it works.

<snip>

> I'm sorry, Ben, but in my view your solution is insane. The cast would
> be required on every use instance, and that is insane.

The member access would be required on every instance. You are adding
an unnecessary type, and replacing every type conversion with a union
access (which does no conversion) and you call my advice insane?

I think you've just invested so much into the idea that bits are bits and
types are just the compiler being fussy that you can't see the other
side of the argument.

> If that's something the C standard purports by its definition, then the
> C standard is wrong and needs to be changed. It is insane to impose
> that burden upon every developer when the compiler is perfectly capable
> of doing all that busywork for you.

There is no such burden. You need to write some code that uses your
void * struct * union so that I can rewrite it using the correct types.
Until you do that I can't show you that your "simplification" is just
obfuscation.

I can't really do the reverse because in the code I tend to write casts
are very rare indeed. I'm struggling to think of an example I could
write using the correct types that you could "improve" with the union.
Maybe you could suggest a micro code project that you think will need
this technique and I'll have a go?

<snip>

>> >> But here we are simply talking about doing it right. There is no
>> >> burden in writing the code the correct and portable way. There is no
>> >> possible justification (that I can see) for doing it the wrong way.
>> >
>> > This is always our same argument. You place EXTREME value on things I
>> > do not place ANY value upon, except in cases where such value is
>> > warranted, such as were I to decide to write some code which **I INTENDED**
>> > to be released on every architecture. But, I simply don't do that.
>>
>> I have not been arguing from the point of view of portability. My point
>> (which I'll just assume you've missed rather than ignored) is that the
>> union tells the reader the wrong thing. To any well-read C programmer
>> it raises the question of why the author is re-interpreting bits rather
>> than converting between values.
>
> Any well-read C programmer spent some of that well-reading time outside
> of textbooks, I hope, and into practical code out there in the wild.
>
> Using unions for shared memory that takes on a form based on some cue
> IS COMPLETELY STANDARD. It's done every day.

Yes, but using a union where a type conversion is meant is not standard
and should never be thought of as standard. I've never seen it done in
any of the hundreds of thousands of lines of C I've read, contributed to
and written in the last 32 years.

If you think your suggestion is the same as

    struct value {
        int type;
        union {
            int i;
            double d;
            void *p;
        } v;
    };

then you've completely missed the points I'm making.

<snip>

>> What would you think if you saw a number being negated using ~x + 1?
>
> I would think: "Craziness. Lunacy."

Why? It works, doesn't it? OK, so it works by using the representation
in x rather than the value of x, but you are OK with that provided the
code works.
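To make the comparison concrete, here is a small sketch of the two ways of writing a negation. The function names are made up for illustration. The second form matches the first only where `int` uses two's complement, and both overflow for the most negative `int`.

```c
/* Negation written against the value. */
int negate_value(int x)
{
    return -x;
}

/* Negation written against the representation: flip all bits, add one.
   Agrees with -x only where int uses two's complement. */
int negate_bits(int x)
{
    return ~x + 1;
}
```

Both return the same result on common hardware, but only the first says what is meant.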

>> Would you say, ah, OK this is only supposed to work on integer types and
>> on some machines, or would you start to scour the code for other places
>> where the author is misusing C features?
>
> I would stop looking at the code and find some other library which does
> the same thing, or rewrite it myself if it were small enough.
>
>> > I'm perfectly content to support only those architectures which have
>> >the same sized pointers, and where pointer references in a union will
>> >all point to the same location regardless of whether they were cast or
>> >not.
>> >
>> > In fact, I would completely avoid an architecture that did not support
>> > those features
>>
>> That's fine. I have no interest in the portability of your code. You
>> made a suggestion to someone who's learning C and it needed to be
>> cleared up. I don't want you to start doing it right, I just don't want
>> more and more people to get the wrong ideas about unions vs. conversions.
>
> I don't see how it's wrong, Ben. You have yet to explain it apart from
> those cases where pointers to a block of bytes in memory will physically
> change location when casting that block to a void*, or casting it to a
> struct xyz*. You'll have to give me a concrete example, a real-world
> example you've seen, one I can download, test, and compile, and examine
> so I can arrive at the, "Oh yeah! Ben was right!" moment.

Sorry, but that's not my concern here. I'm happy for you to continue to
think I'm wrong. All I really care about is that people who are
undecided (and that will be people learning C) don't end up thinking
you're right. To do that, I just need to make my case to them --
readers who might stumble across this thread in the future as well as a
few who might be reading now.

>> > You're talking about doing it "right" in the context of supporting all
>> > of those obtuse architectures,
>>
>> No. I'm talking about doing it right to be clear about what the code
>> means. I would not use a union even for a throw-away program on one
>> architecture because it's more complicated and does not communicate
>> correctly what the code is doing.
>
> Using unions is completely clear in this context. They are all
> pointers.

Yes, the union is clear but the code that uses it is not because the use
of a union hides the purpose -- to convert one type to another.

Of course, I know that in your mind, the purpose is to take the bits and
use them as you want -- i.e. to get round the pesky type system that C
puts in the way of a programmer working in that odd place called the
real world.

> The reality is you are passed some block of data which, based on a cue
> of some kind, indicates what type it is.
>
> How is that not clear? It couldn't be more clear.

You could make it more clear by converting the value from the type it
currently is to the one you want to consider it to be. That's the right
way because it works for all convertible type pairs -- you can do it for
int and float in exactly the same way as for void * and int *. And you
don't need to invent a new type that has no other value than to fail to
do this conversion.

<snip>

>> That's a straw man. I don't advocate that position. You (and anyone
>> else) can write code as architecture-specific as you like. Just don't
>> tell people learning C to use a union to (not) convert pointer types.
>
> I will tell all of them to use unions. You're going to have to show
> me an example which will fail doing it that way, and won't fail when
> using a dizzying array of casts.
>
>> > Not on my watch.
>>
>> Eh? You are prepared to lay down your life for the right to tell people
>> who are learning C to use a complex and confusing way to convert between
>> pointer types! I'll have to pry that union out of your cold dead hands!
>
> There's nothing complex about it. In fact, it greatly simplifies source
> code, and the understanding therein.

You need to show an example of that. Your "solution" appears to need a
new type, not otherwise needed, and it replaces a possible cast with a
definite member access. That's not "greatly simplified" in my book.

Let's see... A beginner asks "I have a void * that points to a node in
a linked list. How do I access the members of the node?". I say:

    struct list_node *lnp = vp;
    lnp->data ... lnp->next ...

not even one cast in sight. Now, what is your answer to the keen
beginner?
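A complete version of that fragment might look like the sketch below. The layout of `struct list_node` (an `int` payload plus a `next` pointer) and the function `sum_list` are assumptions made for illustration; the thread only names the members `data` and `next`.

```c
#include <stddef.h>

struct list_node {
    int data;                  /* assumed payload type */
    struct list_node *next;
};

/* Sum a list handed over as a void *, as a generic callback might receive it. */
int sum_list(void *vp)
{
    struct list_node *lnp = vp;   /* implicit conversion from void *, no cast */
    int total = 0;
    for (; lnp != NULL; lnp = lnp->next)
        total += lnp->data;
    return total;
}
```

The single initialisation does the conversion; every later access goes through a correctly typed pointer with no cast and no extra type.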

>> No, there's no nobility here. Just write it the right way. It's
>> simpler and conveys the code's meaning more clearly. Try not to be
>> offended by the fact that it also happens to be portable to many odd
>> architectures. That's just an incidental advantage.
>
> I do write it the right way. It's the only way it should be done
> because any other way requires a mass of casts, and that's just eye
> clutter, especially for someone with dyslexia.

No mass of casts, no. Not unless there is a mass of member accesses in
your version, of course. There will be no more casts than there will be
member accesses (and there can be fewer -- you can't ever omit a member
access). But the casts tell the truth while every member access is a
little lie, and for every one of those a bit dies somewhere in the
world.

--
Ben.

Rick C. Hodgin

Aug 12, 2015, 6:01:47 PM

In the land of the vast union brigade,
sat *THE* finest coding tool that's ever made,
for bridging data cross the wiley format gaps,
it was the envy, oh the envy of the apps.

Yet there were those who served only to repress,
union's ability to awe and impress,
they'd sooner pin e'ry union to the ground,
to cater to some old fogies on the outskirts of town.

But mighty union was completely undaunted,
for it knew all along just what it wanted,
and that was to give every developer a hand,
in leading them to the coding promised land!

Where the rigors of past coding style's dying,
being replaced with modern tools that are less trying,
those which help keep new developers from crying,
whilst also keeping all the new apps safely flying.

[doomp doomp dooo]

Lo the moral of the story is courageous,
you can toil using yer tools for bygone ages,
or you can step up and get with the modern program,
where the tools have all moved on as code has called them.

But you better make your choice soon I'd advise ya,
lest ye wind up like the "mighty dinosau-r,"
for while they once were mighty fearsome towering creatures,
today you only find them round about museums.

So as you ponder yer ways in coding forward,
don't regard past coding styles less than stallward,
for while they served might-i-ly during their tenure,
but at some point all things get put out to pasture.

And as the new architectures roll along,
and they sing their sweet victory song,
we must remember that their data needs are blooming,
enter in the steadfast homeplace of the union.

Undaunted all along and filled with patience,
for the day it would rise up and take command,
displacing former ways with those of el-e-gance,
and simplifying code in the devie's hand.

ALL HAIL THE MIGHTY UNION!

The end.

Rick C. Hodgin

Aug 12, 2015, 11:00:09 PM

On Wednesday, August 12, 2015 at 5:41:42 PM UTC-4, Ben Bacarisse wrote:
> ... The argument is over the fact that you advocate using a union when
> you mean a type conversion from one type to another.

Well, Ben... I know you interpret what I mean under the context of the
understanding you have about what you think I mean, but you are doing
so with values placed in areas where I do not have values, and you are
doing so with no values placed in areas where I do have values. As
such, we will never see eye-to-eye on this until one or more of us
changes... But, that being said...

I do not mean type conversion. I mean there is a pointer pointing to
a location in memory where data has been arranged a particular way.
The data exists. And the pointer can enter in to wherever it's used
via any form of carrier, yet in the end, that pointer always points
back to the start of that block of data, regardless of whether it's
assigned to a void*, struct xyz*, int*, or any other form of pointer.
This is the requirement of what I consider to be a proper C standard,
and a proper machine architecture.

Use of a union in that case allows that data to be viewed through a
series of lenses, each with a particular polarization allowing different
views from the same form. When a particular structure is applied on that
data, it presents a particular way. When another structure is applied,
it presents in another way. But when viewed through both structures,
the underlying data did not change, nor has the address of that data in
memory changed, nor has any value used by any of the pointers which
access it. The pointer value resolves back to some address which points
to that block of data.

That's the way it is on x86 and ARM. And that's the way it should be
in the standard. That's my position. And that's my argument.

-----
I would go so far in my declaration on the necessity of that ability
existing, that I would say any architecture which does not support it
is, by definition, a /fringe/ architecture, and one that by its very
design requires some type of specialized programming beyond the scope
of normal things.

And as such, the requirements of that fringe architecture have no
place whatsoever in the C standard, but should be entirely relegated
to footnotes, caveats, and addendums, all of which exist above or
outside the standard, in a layer of extensions.

-----
As I've said many times, the C standard should be for the standard
core set of abilities, with extensions defined allowing all sorts
of fringe things to also exist, but only being expressly identified
as being exactly that: fringe things, and not standard things.

I cannot imagine the mindset where the needs of a few tiny machine
architectures, featuring traits so obscure as to be the noted fringe
things which exist against the catalog of modern architectures, would
then dictate (via the C standard) that all code written for the much
larger catalog of machines which do not require nor exhibit those rare
traits, cater to those most peculiar of traits, forcing all developers
everywhere to support the smallest handful of the obtuse architectures.

It's beyond my imagination that anyone would want to support such a
standard, save having a personal and specific need of using that
architecture for whatever reason -- but even then, the fact that it
is of such a peculiar form, it would and should have some special
programming needs which are flagged and defined well outside of the
standard.

I will never live at that place. And I will advise anyone who would
approach an architecture with different pointer sizes to reconsider
something more modern, more standard, less peculiar, for the sake of
developer sanity. And as such, I will advise everyone on using unions
for pointers because they greatly simplify code at the expense of some
compile-time type checking, but when you realize that pointers are just
N-bit quantities, then you also realize that it does not matter how they
are carried, and that they can be passed by any N-bit hauler to get from
A to B, to then be cast into their role as by the mighty union.

In any event, barring answering questions, this is my final post on the
matter. And I really am shocked that there are people purporting to
write code in such a way when it will only make a difference on the
smallest fringe architectures that most developers will never see.

RDC will define pointers which are all the same size and can be used in
a union without reservation. And on those obtuse architectures RDC might
be ported to, which lack the ability to have union-based shared pointers
without casting, a few compiler warnings will be injected to advise the
developer of the odd use, but only on that architecture; those warnings
will not be generated for the masses.

I really am shocked at the thinking. To me it is completely backwards,
and not as it should be.

Richard Damon

Aug 12, 2015, 11:30:36 PM

On 8/12/15 1:17 PM, Rick C. Hodgin wrote:
>
> There are no circumstances where I would consider not using a union
> to reinterpret bits based on the fact that an architecture like this
> exists.
>
> Instead, I would switch the warning you issue around completely and
> WARN EVERYONE to NOT use such an architecture because it breaks the
> ability to use unions properly. Or, to WARN EVERYONE that if they
> do use this architecture, they'll have to handle unions differently.
>
> -----
> I really think you've got it backwards here, Ben. This should be the
> far-and-away occasional exception you might encounter once in your
> career which then forces you, for that project, to alter the way you
> normally write code, and not the you-must-daily-cater-to-it rule.
>
> Again, I think it is completely insane to suggest avoiding using
> unions in this way because there are architectures like this which
> exist. On top of which, I'm frankly amazed there are architectures
> like this which exist. It must be for some optimized application,
> which in and of itself would warrant special coding requirements.
>
> Best regards,
> Rick C. Hodgin
>

Remember that the rule in C is that you are only allowed to read the
member of the union that was last written. Anything else has put you
into the realm of undefined behavior. (There are a few exceptions where
the standard does give us defined behavior, but void * to struct x* is
not one of them).

There are also non-portable, but perhaps implementation guaranteed cases
where this can be used (like getting the bit representation of a pointer
or floating point value).
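One common way to get at such a bit representation without relying on a union at all is `memcpy`. The sketch below assumes a 32-bit `float` (checked at compile time with C11 `_Static_assert`); the name `float_bits` is hypothetical, and the bit pattern observed depends on the implementation's floating point format.

```c
#include <stdint.h>
#include <string.h>

/* Return the object representation of a float as a 32-bit unsigned value. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    _Static_assert(sizeof bits == sizeof f, "float is not 32 bits wide");
    memcpy(&bits, &f, sizeof bits);   /* copies bytes; no conversion happens */
    return bits;
}
```

On an implementation using IEEE 754 single precision, `float_bits(1.0f)` yields 0x3F800000.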

Morten W. Petersen

Aug 13, 2015, 12:19:33 AM

On 12.08.2015 21:45, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:

[...]

Well, I think it is an important point that once the parser has run
through the file, accessing the data for example for rendering the
document as an XML file again, will entail going through each struct
in memory and casting it to the right type, based on the first entry
in the struct, the type.

Rick talks about the union solution while you Ben talk about the
casting. Would a good and correct compromise here be to use a union to
restrict what is being pointed to, and at the same time use casting to
work with the data?

-Morten

David Brown

Aug 13, 2015, 7:31:45 AM

On 12/08/15 18:52, Rick C. Hodgin wrote:
> On Wednesday, August 12, 2015 at 12:20:02 PM UTC-4, Ben Bacarisse wrote:
>> I'm not talking about what you can see. I mean that I'd hope that a C
>> programmer who can see how you are using the union will conclude that
>> it's bad code written by someone who's playing non-portable tricks for
>> no good reason. I hope they will not think, "I see, he wants to have
>> several differently typed pointers to the same object so he's naturally
>> put them in a union". I want him or her to say, "oh, this program needs
>> very careful checking because the programmer does not know what a union
>> is for".
>
> It's exactly what a union is for, sharing memory when only one type
> is needed at a time, or when you want to examine or access its
> fundamental form as a series of bytes, for example.

(The following information is correct to the best of my knowledge, but
it's a complicated subject. If I've got something wrong, hopefully one
of the other experts here will correct it. And if it's right, then
hopefully others will confirm it. Until that time, treat it with a
little bit of caution.)

That is one use of unions. Another is for type punning. But type
punning using unions is not about using unions to change pointer types -
it is about using pointers to unions to make the changes safe.

Let's take an example - code that swaps the high and low 16-bit halves
of a 32-bit integer. I am using uint16_t and uint32_t, so assumptions
about CHAR_BIT being 8 or 16 and so on apply.

    uint32_t swapWrong1(uint32_t a) {
        uint32_t b;
        uint16_t* pA = (uint16_t*) &a;
        uint16_t* pB = (uint16_t*) &b;

        pB[1] = pA[0];
        pB[0] = pA[1];

        return b;
    }

This code is wrong, because of aliasing - the compiler knows that "a"
and "b" are uint32_t, and therefore any accesses (read or write) done
via pointers to uint16_t cannot apply to either a or b, because uint32_t
and uint16_t objects cannot be aliased (i.e., they cannot exist at the
same address). So the code has undefined behaviour.

    uint32_t swapWrong2(uint32_t a) {
        uint32_t b;

        union {
            uint32_t* p32;
            uint16_t* p16;
        } pA, pB;

        pA.p32 = &a;
        pB.p32 = &b;

        pB.p16[1] = pA.p16[0];
        pB.p16[0] = pA.p16[1];

        return b;
    }

This code uses your "union of pointers" in order to do the same thing,
but avoiding the casts. It has /exactly/ the same problems as
swapWrong1 regarding aliasing and undefined behaviour.

(I agree with Ben that the use of a union of pointers to convert between
pointer types, as shown above and in your code, is poor style. But that
is a separate issue from the point I am making here.)

    uint32_t swapRight1(uint32_t a) {
        uint32_t b;

        union {
            uint32_t x32;
            uint16_t x16[2];
        } uA, uB;

        uA.x32 = a;

        uB.x16[1] = uA.x16[0];
        uB.x16[0] = uA.x16[1];

        b = uB.x32;

        return b;
    }

This works, because type-punning using unions is allowed in C (if the
implementation-dependent representations of the union members are
compatible - which the existence of uint16_t and uint32_t types guarantees).

The incorrect versions will /probably/ work, especially as stand-alone
functions - but they are not guaranteed to do so. If they are inlined,
with a good optimising compiler, then they may not do what the
programmer intended.
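For this particular task there is also a way that avoids type punning entirely: work on the value with shifts. The sketch below is an alternative to the union version, not a replacement for the point being made about aliasing; the name `swap_halves` is made up for illustration.

```c
#include <stdint.h>

/* Swap the high and low 16-bit halves using values only: no aliasing
   questions and no dependence on how the bytes are laid out in memory. */
uint32_t swap_halves(uint32_t a)
{
    return (a << 16) | (a >> 16);
}
```

For example, `swap_halves(0x12345678)` gives 0x56781234.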

>
> Simplicity in populating values without casting when it's just data.
> I am able to pass an unsigned integer (32-bit or 64-bit as per the
> compile flags), and then populate into the target using a simple
> assignment.
>
> Most often it goes like this:
>
> union {
>     sptr _funcWhatever;
>     int (*funcWhatever) (SWhatever* w, int i, float f, ...);
> };
>
> Rather than trying to pass my function address as a parameter by
> casting it appropriately, or to cast my function into that
> funcWhatever form, I simply pass it by sptr value, and then
> populate into. The address of the function might be in a loaded
> DLL, for example, or it might come from some native code.

The way to handle this is to pass the union as a parameter.

David Brown

Aug 13, 2015, 7:47:32 AM

On 13/08/15 04:59, Rick C. Hodgin wrote:

> That's the way it is on x86 and ARM. And that's the way it should be
> in the standard. That's my position. And that's my argument.
>

I think at least some of this argument is because you seem to think a
/pointer/ is just an /address/. That is not the case in C. A /pointer/
certainly contains an address, but it also has a type - and the type is
important. A pointer to an "int" is not the same thing as a pointer to
a "struct foo". It does not matter whether the pointers have the same
size, or the same implementation in the hardware - that's irrelevant.
The two pointers have different types.

This is important to the programmer, in order to be able to write
clearer code and to get the best possible help from the tools in
avoiding or spotting mistakes. If you mix up objects of different
types, whether they are pointers or not, it's often a good indication of
an error in the code. Writing casts explicitly can be a good way of
making it clear that you know what you are doing in this slightly
dangerous or difficult code.

The type differences are also important to the compiler - as well as
obvious uses (such as when type changes mean real code and value
changes, for example when changing between an int and a double), the
compiler uses type information to improve code generation because it
knows that a pointer to type A cannot alias a pointer to type B except
in certain prescribed circumstances.

None of this has anything to do with how pointers happen to be
implemented in different targets or compilers - it applies equally to
x86 and ARM.
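A small sketch of how the type changes what a pointer does even when the address is identical: the distance covered by `p + 1` is determined by the pointed-to type, not by the address bits. `struct foo` and both function names are hypothetical.

```c
#include <stddef.h>

struct foo { int id; double weight; };  /* hypothetical struct */

/* Bytes covered by "p + 1" when the address is treated as an int *. */
size_t step_as_int(void *p)
{
    int *ip = p;
    return (size_t)((char *)(ip + 1) - (char *)ip);
}

/* Bytes covered by "p + 1" when the same address is a struct foo *. */
size_t step_as_foo(void *p)
{
    struct foo *fp = p;
    return (size_t)((char *)(fp + 1) - (char *)fp);
}
```

Called with the address of the same `struct foo` object, the first returns `sizeof(int)` and the second `sizeof(struct foo)`.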

Ben Bacarisse

Aug 13, 2015, 8:18:12 AM

"Morten W. Petersen" <mor...@gmail.com> writes:

<snip>

> Well, I think it is an important point that once the parser has run
> through the file, accessing the data for example for rendering the
> document as an XML file again, will entail going through each struct
> in memory and casting it to the right type, based on the first entry
> in the struct, the type.
>
> Rick talks about the union solution while you Ben talk about the
> casting. Would a good and correct compromise here be to use a union to
> restrict what is being pointed to, and at the same time use casting to
> work with the data?

That's not detailed enough to be sure what you mean.

Here's my take on it: don't use a union to convert between types -- any
types. That's not what it does. Using a union in the way Rick has been
suggesting is simply a hack to get round the type system, and it works
for pointers because, on most systems, all pointers to the same object
use the same bit pattern. I hope you will simply reject this hack out
of hand. If you need to convert a value of one type to a value of
another type, C provides a way to do that with a cast (though for void *
you don't even need a cast if the context makes the desired pointer type
explicit).

As for your final design, I don't see any need to convert pointers at
all. This all started because you happened to have a void * and you
simply put a cast in the wrong place. But you don't need either the void *
or the cast at all. Maybe there will, one day, be a need to convert
pointers, and I hope you will do that correctly with a cast, but in the
meantime I would imagine you can do most of what you want with a
discriminated union: place the common elements at the start of a struct
and include in those some type indicator. After that, place a union
that describes the various different data that are specific to each of
your node types.
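A minimal sketch of such a discriminated union, under assumed names: the node kinds, the member names inside the union, and `node_text_length` are all hypothetical, and only `parent` and `previous` echo names mentioned earlier in the thread.

```c
#include <stddef.h>

enum node_type { NODE_ELEMENT, NODE_TEXT };   /* hypothetical node kinds */

struct node {
    /* Common members first, including the type indicator. */
    enum node_type type;
    struct node *parent;
    struct node *previous;
    /* Data specific to each node type. */
    union {
        struct { const char *name; struct node *first_child; } element;
        struct { const char *chars; size_t length; } text;
    } u;
};

/* Return the text length of a node, or 0 for nodes that carry no text. */
size_t node_text_length(const struct node *n)
{
    switch (n->type) {
    case NODE_TEXT:    return n->u.text.length;
    case NODE_ELEMENT: return 0;
    }
    return 0;
}
```

Every pointer in the tree is a `struct node *`, so walking it needs neither a `void *` nor a cast; the `type` member says which union member is valid.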

For your purposes, this whole discussion has been a distraction.

--
Ben.

Rick C. Hodgin

Aug 13, 2015, 8:35:55 AM

The C language relates information through translation to the needs of
physical hardware, which conducts the work. As such, pointers are
addresses fundamentally. The type form is only important to the
compiler because it conveys information needed to access members when
they are used in processing. However, the memory address for the start
of something is always the same regardless of whether it's been cast to
a void*, a struct xyz*, or some native form pointer. And where it's
not, I dub that to be a /fringe/ architecture that has no place in the
standard.

Fundamentally, my argument is that the standard is too restrictive in
its current form, and mandates that unnecessary and unwieldy coding
requirements be applied to all developers everywhere, when they are
not necessary except on those systems for which they are necessary, and
in all of those cases it should be the exception and not the rule,
being applied to them only.

I don't know how to be more clear about this.

If you examine the assembly code generated for a compiled C program,
you'll find nothing about types. You'll find machine-based fundamental
accesses to memory and other machine resources. That's what's really
taking place when you write a C program. The types are just there to
make it easy for human beings to wield and relate to. But once it
gets past the compiler, never again are any types considered, and from
that point forward the instructions/command sequences the compiler
generated for the hardware will conduct fundamental machine-based unit
workloads to carry out the work, which gives the data the appearance
of the original thing, without respect to optimization, for the purposes
of doing the work as indicated. And in the case of optimization, it was
the compiler which determined what could be removed because it wasn't
used, needed that way, or whatever, while still fulfilling the workload
portion that is required, is used, is needed, or whatever.

I come from a low-level background, and I think in those terms. I do
recognize what the translation of the pointer type means to hardware,
and I place value on that hardware-side of the equation because within
the confines of C alone, it is unnecessarily restrictive (when it doesn't
need to be at the hardware level).

These "hacks" as Ben calls them work correctly in Windows, Linux, and on
Microsoft's compilers and GCC for x86 and ARM, and I bet they work on
many other systems, most other systems even, and on many other hardware
forms. They don't work on all of them, and therein lies the issue I take
with that reality existing in the C /standard/. I believe that C should
account for those peculiar hardware forms, but they should be extensions,
and not part of the standard, because the standard should relate more
closely to physical hardware in its current semiconductor form, which is
the form which has existed since at least the 1980s in 32-bit form, and
back to the early 70s in 4-bit and larger forms.

We have a tool that is well distributed (semiconductor-based processors).
We have a toolset designed to program those processors (languages like
C). And there's no reason the two of them shouldn't have a more
fundamental awareness of each other, or at least that the toolset should
not have a more fundamental awareness of the tool itself.

And for what it's worth, I honestly don't understand why people I would
consider to be experts like Ben, do not understand this fundamental
relationship of the toolset knowing the tool in a more directly connected
way.

Ben Bacarisse

Aug 13, 2015, 9:46:33 AM

"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> On Wednesday, August 12, 2015 at 5:41:42 PM UTC-4, Ben Bacarisse wrote:
>> ... The argument is over the fact that you advocate using a union when
>> you mean a type conversion from one type to another.
>
> Well, Ben... I know you interpret what I mean under the context of the
> understanding you have about what you think I mean, but you are doing
> so with values placed in areas where I do not have values, and you are
> doing so with no values placed in areas where I do have values. As
> such, we will never see eye-to-eye on this until one or more of us
> changes... But, that being said...
>
> I do not mean type conversion.

Yes, I know. That's exactly why you are wrong. You *should* mean
conversion because that's what's required here. You are proposing an
alternative to conversion that is not really an alternative. Yes, your
suggestion works (on the systems you care about) because the same bit
pattern gives the right value for both pointer types, and the union
therefore gets you round the type system, but the whole thing involves
an otherwise unnecessary type and is, in effect, a tiny lie. It will
complicate the code and will make a reader wonder what else you might be
doing that's a bit hacky.

> I mean there is a pointer pointing to
> a location in memory where data has been arranged a particular way.
> The data exists. And the pointer can enter in to wherever it's used
> via any form of carrier, yet in the end, that pointer always points
> back to the start of that block of data, regardless of whether it's
> assigned to a void*, struct xyz*, int*, or any other form of pointer.
> This is the requirement of what I consider to be a proper C standard,
> and a proper machine architecture.

Yes, I know. You don't think pointer values should ever need
converting -- the bits are enough. Readers of your code who know C will
sit up and take note. They will start looking for other things you
might be assuming are true about "proper C".

<snip>

> In any event, barring answering questions, this is my final post on the
> matter. And I really am shocked that there are people purporting to
> write code in such a way when it will only make a difference on the
> smallest fringe architectures that most developers will never see.

You keep saying this, presumably because you have no response to the
other argument -- that the code is a lie. Rather than writing what you
mean, you take advantage of a special case and re-interpret the bits.
Do you do this when you want to convert an unsigned int to a signed
int? I'm guessing you don't. You certainly can't do it for int and
float because it simply doesn't work on *any* system.
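
A minimal sketch of the difference between converting a value and re-interpreting its bits. It assumes float is 32 bits wide (checked with an assert); the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value conversion: the result represents the same number in the new type. */
float convert_to_float(int32_t i)
{
    return (float)i;
}

/* Bit re-interpretation: the object representation is copied unchanged.
   Assumption: float and int32_t have the same size on this implementation. */
float reinterpret_as_float(int32_t i)
{
    float f;
    assert(sizeof f == sizeof i);
    memcpy(&f, &i, sizeof f);
    return f;
}
```

Calling convert_to_float(1) gives 1.0f, while reinterpret_as_float(1) gives whatever float the bit pattern of the integer 1 happens to encode, which is not 1.0f on common implementations.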

<snip>

> I really am shocked at the thinking. To me it is completely backwards,
> and not as it should be.

Yes, I am sure you are shocked, but have you found anyone who agrees with
you? If not (and I've never seen this hack advocated before) you might
want to see if you can find any merit in the conventional wisdom so as
to temper your certainty that everyone else is wrong.

--
Ben.

David Brown

Aug 13, 2015, 9:59:30 AM

The type information is /critical/ - types are a vital part of C. This
is why pointers are more than just a simple address.

> However, the memory address for the start
> of something is always the same regardless of whether it's been cast to
> a void*, a struct xyz*, or some native form pointer. And where it's
> not, I dub that to be a /fringe/ architecture that has no place in the
> standard.

I know you like to treat all systems that don't have a single linear
address space, and identical pointer representation for all types, as
being weird or "fringe" and therefore not relevant to most programmers.

I am somewhat in agreement with you, although not necessarily about
pointer size (since I fail to see any reason why it is particularly
relevant - if simplifying portability assumptions don't lead to easier or
clearer code, then there is no point in making them). Having a single
address space is much more relevant - and I (and most programmers)
typically write code with that assumption. That is even though I /do/
work with processors that have more than one address space - I am happy
to accept that code written for these processors may not be portable
back and forth between other processors, because writing such fully
portable code is ugly.

However, for the issues under discussion here, the fact that pointers to
different types are the same size on most architectures is irrelevant.
Please put it out of your mind for now - everything I wrote applies to
x86 and ARM as much as to AVR's or "fringe" systems.

>
> Fundamentally, my argument is that the standard is too restrictive in
> its current form, and mandates that unnecessary and unwieldy coding
> requirements be applied to all developers everywhere, when they are
> not necessary except on those system to which they are necessary, and
> in all of those cases it should be the exception and not the rule,
> being applied to them only.
>
> I don't know how to be more clear about this.

Again, I don't disagree with you on this general principle (though we
differ in the details). But you are missing the point.

You have to understand how types work in C, and what they mean. Just as
a 32-bit "int" is a completely different thing than a 32-bit "float"
despite being the same size, an int* pointer is a completely different
thing than a float* pointer.

There are times when types are different, but near enough that you can
usefully convert between them - and perhaps usefully do so without any
code being generated at all. But logically, in C, it is still a conversion.
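
A short sketch of such a conversion, loosely modelled on the original question (a generic pointer that really refers to a struct). The names struct node, parent and parent_of are hypothetical and not taken from the poster's code.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *parent;
    int value;
};

/* 'previous' arrives as a generic pointer. The cast states which type the
   pointed-to object really has; on most targets it generates no code. */
struct node *parent_of(void *previous)
{
    struct node *n;
    assert(previous != NULL);
    n = (struct node *)previous;   /* a conversion in C, even if free at run time */
    return n->parent;
}
```

The cast is only correct if the object really is a struct node; the conversion documents that intent to both the compiler and the reader.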

>
> If you examine the assembly code generated for a compiled C program,
> you'll find nothing about types.

That is because assembly is a typeless language, while C has a type
system. When the C code is compiled to assembly, the type information
is lost.

> You'll find machine-based fundamental
> accesses to memory and other machine resources. That's what's really
> taking place when you write a C program. The types are just there to
> make it easy for human beings to wield and relate to.

Types also exist to help you write /correct/ code, and readable and
maintainable code - even if that is not easier. They can also help the
compiler generate better object code.

> But once it
> gets past the compiler, never again are any types considered, and from
> that point forward the instructions/command sequences the compiler
> generated for the hardware will conduct fundamental machine-based unit
> workloads to carry out the work, which gives the data the appearance
> of the original thing, without respect to optimization, for the purposes
> of doing the work as indicated. And in the case of optimization, it was
> the compiler which determined what could be removed because it wasn't
> used, needed that way, or whatever, while still fulfilling the workload
> portion that is required, is used, is needed, or whatever.
>
> I come from a low-level background, and I think in those terms. I do
> recognize what the translation of the pointer type means to hardware,
> and I place value on that hardware-side of the equation because within
> the confines of C alone, it is unnecessarily restrictive (when it doesn't
> need to be at the hardware level).

I am not trying to be competitive here, but I believe I come from a
lower-level background than you do, with a wider experience of many more
processor architectures, including how they are designed as well as
their ISA and assemblies, and have long experience as an assembler and
low-level C programmer. I haven't done much on the x86, unlike you, but
I have worked with many more architectures. And I believe that I pay a
lot more attention to the generated code than most C programmers,
yourself included - because for many of the systems I work with,
careless C coding can mean correct but inefficient generated code, which
means bigger and more expensive microcontrollers.

It is a good thing to understand the compilation process, and be aware
of how things work underneath. But it is not a good thing to get
obsessive about it, or to make the mistake of thinking that you
understand exactly what the compiler will do - compilers have a great
deal of flexibility in how they generate code.

>
> These "hacks" as Ben calls them work correctly in Windows, Linux, and on
> Microsoft's compilers and GCC for x86 and ARM, and I bet they work on
> many other systems, most other systems even, and on many other hardware
> forms.

There are two issues with your hacks, one of which Ben is concerned
with, and the other of which I have been discussing. Neither is target
dependent - and they apply as much to gcc, x86, ARM, Linux, and Windows
as any other system.

First, Ben dislikes your use of a union for type conversion because it
is unclear, and does not express the programmer's intentions well.
While I agree with Ben, this is a somewhat subjective opinion - and
obviously when one has used a particular style or habit for a long time,
it becomes clear and obvious to that person. Thus we must rely on the
opinions of others when judging whether a style is common or easily
understood by other programmers.

Secondly, I pointed out the strict aliasing issues with your solution.
These are, to the best of my knowledge, facts - the C standards have
rules about aliasing, and compilers use these rules when generating
code. If you break these rules, your code is incorrect and has
undefined behaviour, and whether it does or does not work as intended is
a matter of circumstance.

Thus your code is wrong, /even if it appears to work/. Being bad style
is not necessarily a serious "crime", but it is not something to
recommend to others. (Or since this is a self-correcting newsgroup, you
/can/ recommend it to someone else - and then listen when others say it
is bad style. We learn by answering questions as well as posing them.)

And while your code may work in the circumstances that you tested it, it
will fail in other cases. In particular, it will fail when heavy
optimisations in a good compiler lead the compiler to take advantage of
the strict aliasing rules.
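
A commonly used illustration of what the aliasing rules let a compiler assume (a sketch; store_both is a made-up name). Because an int object may not be modified through a float lvalue, the compiler is free to treat the two stores as independent.

```c
#include <assert.h>

/* Callers must pass pointers to distinct objects. */
int store_both(int *ip, float *fp)
{
    assert((void *)ip != (void *)fp);
    *ip = 1;
    *fp = 2.0f;   /* the compiler may assume this store does not change *ip */
    return *ip;   /* so this may legitimately be compiled as "return 1" */
}
```

If both pointers were made to refer to the same storage, for example through a cast or a union of pointers, the behaviour would be undefined, and the result could differ between optimisation levels on the same x86 or ARM compiler.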

> They don't work on all of them, and therein lies the issue I take
> with that reality existing in the C /standard/. I believe that C should
> account for those peculiar hardware forms, but they should be extensions,
> and not part of the standard, because the standard should relate more
> closely to physical hardware in its current semiconductor form, which is
> the form which has existed since at least the 1980s in 32-bit form, and
> back to the early 70s in 4-bit and larger forms.

Again, this is all irrelevant.

>
> We have a tool that is well distributed (semiconductor-based processors).
> We have a toolset designed to program those processors (languages like
> C). And there's no reason the two of them shouldn't have a more
> fundamental awareness of each other, or at least that the toolset should
> not have a more fundamental awareness of the tool itself.
>
> And for what it's worth, I honestly don't understand why people I would
> consider to be experts like Ben, do not understand this fundamental
> relationship of the toolset knowing the tool in a more directly connected
> way.
>

You are hung up on the wrong points, that don't really matter here. Ben
has tried to make you see that.

Rick C. Hodgin

Aug 13, 2015, 10:33:17 AM

On Thursday, August 13, 2015 at 9:46:33 AM UTC-4, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>
> > On Wednesday, August 12, 2015 at 5:41:42 PM UTC-4, Ben Bacarisse wrote:
> >> ... The argument is over the fact that you advocate using a union when
> >> you mean a type conversion from one type to another.
> >
> > Well, Ben... I know you interpret what I mean under the context of the
> > understanding you have about what you think I mean, but you are doing
> > so with values placed in areas where I do not have values, and you are
> > doing so with no values placed in areas where I do have values. As
> > such, we will never see eye-to-eye on this until one or more of us
> > changes... But, that being said...
> >
> > I do not mean type conversion.
>
> Yes, I know. That's exactly why you are wrong. You *should* mean
> conversion because that's what's required here. You are proposing an
> alternative to conversion that is not really an alternative.

My proposal is a change to the C standard so that all pointers are the
same size, and their underlying bits reflect an address in memory, and
are therefore completely interchangeable with no ill side effects, as
per the standard. If there are machine architectures which do not
adhere to this requirement, they can either work-around it by doing
whatever fixup is required to allow the C developer to write code
which assumes all pointers are the same size, and the underlying bits
do reflect an address, or issue warnings or even errors on use.

I'm wanting a change because I look at what the hardware is doing,
should be doing, and will likely be doing from this point forward on
all new or modern architectures.

Best regards
Rick C. Hodgin

David Brown

Aug 13, 2015, 10:39:29 AM

1) Such changes are never going to happen.

2) Even if such changes are realistic, they will not have a backward
effect on existing code, compilers or C standards. And they would not
help the original poster in any way.

3) The changes /do/ /not/ /matter/. It is irrelevant that the sizes of
the pointers are the same - the code should /still/ be a conversion of
types. The compiler does not need to generate any code for this
conversion, but it is still a conversion. An x86 compiler today does
not generate any code here - all your proposed change to the standard
would do is guarantee that there is no need for generated code. The
/conversion/ is a type conversion, and it is still needed.

Rick C. Hodgin

Aug 13, 2015, 10:57:33 AM

It's only critical in instructing the compiler how to generate code
which accesses data relative to the location it's pointing to. It's
not critical beyond that, and the underlying address of the pointer
will not change; only the relative offsets to that address are what's
important, and that's where the type comes in ... at compile time.

Once we're in runtime, any accesses through that type will have been
generated to use that location in memory, plus the offsets to access
the members. And as such, it shouldn't matter how the pointer address
was populated, which should allow it to be valid in a union.

> However, for the issues under discussion here, the fact that pointers to
> different types are the same size on most architectures is irrelevant.
> Please put it out of your mind for now - everything I wrote applies to
> x86 and ARM as much as to AVR's or "fringe" systems.

A pointer is a pointer. And I would not use any system that had different
pointer sizes for data referenced as an int, or a struct, or as any other
form. I would consider that a bad design, a fatal flaw, and something to
be avoided.

The only case where I would use a system which did have different pointer
sizes is if it was part of a heterogeneous design, which employed different
cores internally which operated on different things. But, that is a
special beast, and I would expect to do some custom programming to
properly wield its peculiar data structure ... just as I would any other
machine that didn't use the same pointer sizes.

I would not make it part of the standard.

> You have to understand how types work in C, and what they mean. Just as
> a 32-bit "int" is a completely different thing than a 32-bit "float"
> despite being the same size, an int* pointer is a completely different
> thing than a float* pointer.

They are only different to the compiler, and how the compiler would
then access the underlying data. As far as the hardware is concerned,
it doesn't matter in cache or memory if the data is from a float, an
int, a char, a pointer, or a structure containing all kinds of various
types arranged throughout. It's just data.

The purpose of the union is to allow that address in memory to be
viewed differently, as by the peculiarities of each pointer type.
This could mean looking at an IEEE-754 floating point representation
as a series of bytes, or as an unsigned integer, or as part of a
structure which actually then shows invalid data because it was
never meant to be applied to it, but through an application of that
particular pointer, it is pulled through in that invalid way.

> There are times when types are different, but near enough that you can
> usefully convert between them - and perhaps usefully do so without any
> code being generated at all. But logically, in C, it is still a conversion.

If you are wanting to migrate from one form to another form then you use
the conversion. If you are simply wanting to look at the underlying data
differently, then you do not apply the conversion, but merely reference
it through the other pointer.

> It is a good thing to understand the compilation process, and be aware
> of how things work underneath. But it is not a good thing to get
> obsessive about it, or to make the mistake of thinking that you
> understand exactly what the compiler will do - compilers have a great
> deal of flexibility in how they generate code.

#1 I'm not obsessive about it. It just happens to be THE relationship
source code has with compiled code, and then compiled code has with the
hardware.

#2 I don't particularly care what the compiler will do. When I use a
new pointer type to access data within, and I access the members of
that data, I will be reaching into something the compiler should've
known about by the fact that it was referenced in that way. This
knowing would've prevented it from making some kind of optimization
which then alters what I would see in the structure.

> First, Ben dislikes your use of a union for type conversion because it
> is unclear, and does not express the programmer's intentions well.

I disagree that it is unclear because the fact that the pointer is in
a union conveys an implicit cast, and in the case of the extra type
member, it makes it clear how it is setup by quick inspection.

> While I agree with Ben, this is a somewhat subjective opinion - and
> obviously when one has used a particular style or habit for a long time,
> it becomes clear and obvious to that person. Thus we must rely on the
> opinions of others when judging whether a style is common or easily
> understood by other programmers.
>
> Secondly, I pointed out the strict aliasing issues with your solution.
> These are, to the best of my knowledge, facts - the C standards have
> rules about aliasing, and compilers use these rules when generating
> code. If you break these rules, your code is incorrect and has
> undefined behaviour, and whether it does or does not work as intended is
> a matter of circumstance.

And my point is the standard should be changed so use of unions in this
way is allowed. It provides much cleaner code, and in the case where
all pointers are equal, an exceedingly clear transform without the need
for the actual transform in code, just the reference.

> Thus your code is wrong, /even if it appears to work/. Being bad style
> is not necessarily a serious "crime", but it is not something to
> recommend to others. (Or since this is a self-correcting newsgroup, you
> /can/ recommend it to someone else - and then listen when others say it
> is bad style. We learn by answering questions as well as posing them.)

It is not wrong on the architectures I compile for. This ability is
provided for by the compiler as an extension to the C standard today
because the underlying architectures allow for it by design. It is
only "wrong" when going back against the C standard, and my argument
is the C standard needs changed.

> And while your code may work in the circumstances that you tested it, it
> will fail in other cases. In particular, it will fail when heavy
> optimisations in a good compiler lead the compiler to take advantage of
> the strict aliasing rules.

If that's the case, I'll either find another compiler which does not
have a broken optimization mechanism in this regard, or #pragma that
block to disable optimizations.

It's much cleaner code using unions, and much easier to follow along
with. I would rather have a marginally slower runtime environment and
use that coding style, than to be forced to code around it.

> > They don't work on all of them, and therein lies the issue I take
> > with that reality existing in the C /standard/. I believe that C should
> > account for those peculiar hardware forms, but they should be extensions,
> > and not part of the standard, because the standard should relate more
> > closely to physical hardware in its current semiconductor form, which is
> > the form which has existed since at least the 1980s in 32-bit form, and
> > back to the early 70s in 4-bit and larger forms.
>
> Again, this is all irrelevant.

And I disagree, of course.

> > We have a tool that is well distributed (semiconductor-based processors).
> > We have a toolset designed to program those processors (languages like
> > C). And there's no reason the two of them shouldn't have a more
> > fundamental awareness of each other, or at least that the toolset should
> > not have a more fundamental awareness of the tool itself.
> >
> > And for what it's worth, I honestly don't understand why people I would
> > consider to be experts like Ben, do not understand this fundamental
> > relationship of the toolset knowing the tool in a more directly connected
> > way.
>
> You are hung up on the wrong points, that don't really matter here. Ben
> has tried to make you see that.

And I disagree, of course.

And, since I'm working on my own compiler, it's not really an issue
because I see very clearly that C will never change, not even to include
the basic class syntax (something I view as a fundamental and catastrophic
mistake for the language because the basic class adds that much to our
source code abilities).

It's just the way it is. RDC ALL THE WAY, BABY!

David Brown

Aug 13, 2015, 11:46:12 AM

You are not accessing hardware - you are writing code in C. If
something is important to the compiler, then it is important. It's as
simple as that.

C is not a target-independent assembler, as some people seem to think.
It is a programming language, and like most programming languages it has
a type system. The type system is important, though it is weak in C
compared to many other languages. It exists for good reasons - work
with it, not against it.

And C provides an abstraction from many of the details of the hardware.
It rarely matters how arithmetic is performed. You can write "x = y *
5;" in your code, and know what it will do, with a total disregard for
how it is implemented on the target. On some targets, this will be
handled with a shift and an add. Others will "misuse" odd addressing
modes. Others will use a multiply instruction, while some will use a
library function call. It does not matter to the C programmer.

Similarly, it does not matter how pointers are implemented.

What matters is what you write in the C language, and how that works -
and types are important there.

>
> Once we're in runtime, any accesses through that type will have been
> generated to use that location in memory, plus the offsets to access
> the members. And as such, it shouldn't matter how the pointer address
> was populated, which should allow it to be valid in a union.

To the extent that this is true, it does not matter.

But since types matter to the compiler, you don't have any assurance
that the code is the same. For example, if the compiler can tell under
the strict aliasing rules that a pointer operation cannot legally affect
the outcome of an operation, then it can skip it - but if the pointer
types are such that the code is properly defined, then it will do the
operation.

(If you don't understand what I mean by "strict aliasing", please say so.)

>
>> However, for the issues under discussion here, the fact that pointers to
>> different types are the same size on most architectures is irrelevant.
>> Please put it out of your mind for now - everything I wrote applies to
>> x86 and ARM as much as to AVR's or "fringe" systems.
>
> A pointer is a pointer. And I would not use any system that had different
> pointer sizes for data referenced as an int, or a struct, or as any other
> form. I would consider that a bad design, a fatal flaw, and something to
> be avoided.

First, you are in no position to make such judgements. You are welcome
to /dislike/ such architectures, but not to label them as "bad designs"
with "fatal flaws". There are architectures with more complex pointers
than you like - they are not made that way because of poor design, or to
annoy people, but because the balance of requirements made that a good
design decision.

And secondly, /again/, it is irrelevant to the points here.

>
> The only case where I would use a system which did have different pointer
> sizes is if it was part of a heterogeneous design, which employed different
> cores internally which operated on different things. But, that is a
> special beast, and I would expect to do some custom programming to
> properly wield its peculiar data structure ... just as I would any other
> machine that didn't use the same pointer sizes.
>
> I would not make it part of the standard.
>
>> You have to understand how types work in C, and what they mean. Just as
>> a 32-bit "int" is a completely different thing than a 32-bit "float"
>> despite being the same size, an int* pointer is a completely different
>> thing than a float* pointer.
>
> They are only different to the compiler, and how the compiler would
> then access the underlying data. As far as the hardware is concerned,
> it doesn't matter in cache or memory if the data is from a float, an
> int, a char, a pointer, or a structure containing all kinds of various
> types arranged throughout. It's just data.

Again, you are programming C, not hardware.

>
> The purpose of the union is to allow that address in memory to be
> viewed differently, as by the peculiarities of each pointer type.

Unions are orthogonal to pointers as features. The union lets you
access data in the same storage as different types. That is not its
main purpose - the main point of a union is to provide a storage unit
where different types of data can be stored in the same physical space.
In the earlier versions of the C specs, type punning by writing to one
union member and then reading from another was specifically disallowed
as undefined behaviour.

> This could mean looking at an IEEE-754 floating point representation
> as a series of bytes, or as an unsigned integer, or as part of a
> structure which actually then shows invalid data because it was
> never meant to be applied to it, but through an application of that
> particular pointer, it is pulled through in that invalid way.
>
>> There are times when types are different, but near enough that you can
>> usefully convert between them - and perhaps usefully do so without any
>> code being generated at all. But logically, in C, it is still a conversion.
>
> If you are wanting to migrate from one form to another form then you use
> the conversion. If you are simply wanting to look at the underlying data
> differently, then you do not apply the conversion, but merely reference
> it through the other pointer.

No, you should use a conversion if you want to make an object of one
type appear as another type. Conversions happen all the time in C, most
of them being implicit conversions.

If you want to examine the underlying bit representation, specifically
avoiding a conversion, then a union is the way to go.
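
A sketch of that use, assuming float and uint32_t are the same size (checked with an assert). The names float_bits, bits_via_union and bits_via_memcpy are invented for the example; the memcpy version is shown as the equivalent that does not depend on how a given standard revision words the rule for reading a different union member.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

union float_bits {
    float    f;
    uint32_t u;
};

/* Store a float, then read the same bytes back as an unsigned integer. */
uint32_t bits_via_union(float f)
{
    union float_bits fb;
    assert(sizeof fb.f == sizeof fb.u);
    fb.f = f;
    return fb.u;   /* no value conversion: the bytes are re-interpreted */
}

/* The same inspection done by copying the object representation. */
uint32_t bits_via_memcpy(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Both functions return the same bit pattern for a given float. Note that the union here holds the data itself, not pointers to it.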

>
>> It is a good thing to understand the compilation process, and be aware
>> of how things work underneath. But it is not a good thing to get
>> obsessive about it, or to make the mistake of thinking that you
>> understand exactly what the compiler will do - compilers have a great
>> deal of flexibility in how they generate code.
>
> #1 I'm not obsessive about it. It just happens to be THE relationship
> source code has with compiled code, and then compiled code has with the
> hardware.
>
> #2 I don't particularly care what the compiler will do. When I use a
> new pointer type to access data within, and I access the members of
> that data, I will be reaching into something the compiler should've
> known about by the fact that it was referenced in that way. This
> knowing would've prevented it from making some kind of optimization
> which then alters what I would see in the structure.
>
>> First, Ben dislikes your use of a union for type conversion because it
>> is unclear, and does not express the programmer's intentions well.
>
> I disagree that it is unclear because the fact that the pointer is in
> a union conveys an implicit cast, and in the case of the extra type
> member, it makes it clear how it is setup by quick inspection.

As I noted, this is somewhat a matter of opinion.

>
>> While I agree with Ben, this is a somewhat subjective opinion - and
>> obviously when one has used a particular style or habit for a long time,
>> it becomes clear and obvious to that person. Thus we must rely on the
>> opinions of others when judging whether a style is common or easily
>> understood by other programmers.
>>
>> Secondly, I pointed out the strict aliasing issues with your solution.
>> These are, to the best of my knowledge, facts - the C standards have
>> rules about aliasing, and compilers use these rules when generating
>> code. If you break these rules, your code is incorrect and has
>> undefined behaviour, and whether it does or does not work as intended is
>> a matter of circumstance.
>
> And my point is the standard should be changed so use of unions in this
> way is allowed. It provides much cleaner code, and in the case where
> all pointers are equal, an exceedingly clear transform without the need
> for the actual transform in code, just the reference.

The issues with strict aliasing are not restricted to unions. I don't
believe you understand what the phrase means.

Even using Ben's suggestion of casts, it is easy to make mistakes with
strict aliasing.

>
>> Thus your code is wrong, /even if it appears to work/. Being bad style
>> is not necessarily a serious "crime", but it is not something to
>> recommend to others. (Or since this is a self-correcting newsgroup, you
>> /can/ recommend it to someone else - and then listen when others say it
>> is bad style. We learn by answering questions as well as posing them.)
>
> It is not wrong on the architectures I compile for. This ability is
> provided for by the compiler as an extension to the C standard today
> because the underlying architectures allow for it by design. It is
> only "wrong" when going back against the C standard, and my argument
> is the C standard needs changed.

Please learn about (or ask about) strict aliasing. This has nothing to
do with the architectures you use or any other ideas about what the C
standards /should/ say.

Unless, of course, you want to say that the standard should not have
such strict aliasing rules (and then I will explain why it does have them).

>
>> And while your code may work in the circumstances that you tested it, it
>> will fail in other cases. In particular, it will fail when heavy
>> optimisations in a good compiler lead the compiler to take advantage of
>> the strict aliasing rules.
>
> If that's the case, I'll either find another compiler which does not
> have a broken optimization mechanism in this regard, or #pragma that
> block to disable optimizations.

It is merely your understanding that is broken, not the compiler.

But don't take it badly - a large number of people have trouble fully
understanding aliasing.

>
> It's much cleaner code using unions, and much easier to follow along
> with. I would rather have a marginally slower runtime environment and
> use that coding style, than to be forced to code around it.

Note that I have nothing against unions, and I fully agree they can lead
to cleaner code in some cases - also smaller and faster code, and, most
importantly, /correct/ code. But in general, what you want is a pointer
to a union - /not/ a union of pointers.
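
One way this might look for several node kinds that share some members (a sketch only; every name here is hypothetical). A single pointer type refers to an object that records which member is active and stores the variant data in a union.

```c
#include <assert.h>
#include <stddef.h>

enum node_kind { NODE_ELEMENT, NODE_TEXT };

struct element { const char *name; };
struct text    { const char *content; };

struct node {
    enum node_kind kind;     /* records which union member is active */
    struct node *parent;     /* member shared by every kind of node */
    union {
        struct element element;
        struct text    text;
    } as;
};

/* Reads only the member that 'kind' says was stored. */
const char *node_label(const struct node *n)
{
    assert(n != NULL);
    switch (n->kind) {
    case NODE_ELEMENT: return n->as.element.name;
    case NODE_TEXT:    return n->as.text.content;
    }
    return NULL;
}
```

Because every access goes through struct node * and reads the member that was last written, no pointer is ever re-interpreted as a different pointer type.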

If you don't understand the aliasing issues in C, you are bound to get
them wrong in your own language. Learn C properly first, before trying
to improve on it.

Rick C. Hodgin

Aug 13, 2015, 12:15:15 PM

On Thursday, August 13, 2015 at 11:46:12 AM UTC-4, David Brown wrote:
> You are not accessing hardware - you are writing code in C.

This sums up the disagreement I have with both the C standard, and
with the position you and Ben (and many others) hold. You look to
the standard as some abstract thing away from hardware. I look at
it as a translator which conveys my English-in-C-syntax into the
binary language the machine needs to conduct the workload I've
instructed. I recognize the machine at the far end of my
communication through C.

Bottom line: Were the sun to start emitting a form of radiation
which made all CPUs on Earth unable to work, I would never write
another line of C code. It's not an efficient mechanism to convey
complex ideas. It's just a procedural one. With people, I can use
language, music, song, poetry, and pictures, much more easily. I
might write C code as some form of nostalgia recounting about what
I did before we all became laborers again, or in an academic teaching
venue about what it was like before the sun changed... but apart from
that ... it would be farming for me. Raise a few animals, plant a
few crops. I'd settle somewhere near the Amish.

-----
We write C code because there ARE machines on the other side of
that code. It's as simple as that. I place both recognition and
value on that reality. I don't look only to one side of the
equation and say, well the other side ... it's of no concern to
me.

Balanced equations are important, my friend!

10 PRINT "RDC... TO THE FUTURE!"
20 GOTO 10

Richard Heathfield

Aug 13, 2015, 12:21:54 PM

On 13/08/15 17:15, Rick C. Hodgin wrote:
> On Thursday, August 13, 2015 at 11:46:12 AM UTC-4, David Brown wrote:
>> You are not accessing hardware - you are writing code in C.
>
> This sums up the disagreement I have with both the C standard, and
> with the position you and Ben (and many others) hold. You look to
> the standard as some abstract thing away from hardware. I look at
> it as a translator which conveys my English-in-C-syntax into the
> binary language the machine needs to conduct the workload I've
> instructed. I recognize the machine at the far end of my
> communication through C.

That's a perfectly valid attitude. And, as long as you never have to
move your code to another platform, it is one that will serve you well.
Those of us who can't always know in advance which platform(s) our code
will be running on are less free to take that stance. Instead, we (try
to) rely only on the guarantees that the Standard gives us. And /that's/
a perfectly valid attitude too.

--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within

Rick C. Hodgin

unread,

Aug 13, 2015, 12:26:42 PM8/13/15

to

On Thursday, August 13, 2015 at 12:21:54 PM UTC-4, Richard Heathfield wrote:
> On 13/08/15 17:15, Rick C. Hodgin wrote:
> > On Thursday, August 13, 2015 at 11:46:12 AM UTC-4, David Brown wrote:
> >> You are not accessing hardware - you are writing code in C.
> >
> > This sums up the disagreement I have with both the C standard, and
> > with the position you and Ben (and many others) hold. You look to
> > the standard as some abstract thing away from hardware. I look at
> > it as a translator which conveys my English-in-C-syntax into the
> > binary language the machine needs to conduct the workload I've
> > instructed. I recognize the machine at the far end of my
> > communication through C.
>
> That's a perfectly valid attitude. And, as long as you never have to
> move your code to another platform, it is one that will serve you well.
> Those of us who can't always know in advance which platform(s) our code
> will be running on are less free to take that stance. Instead, we (try
> to) rely only on the guarantees that the Standard gives us. And /that's/
> a perfectly valid attitude too.

Hi, Richard!

Well there's nothing to prevent a C compiler from recognizing the use
of unions for pointers on a machine that doesn't have the same size
pointers for different types, and automatically injecting the fixup
code for you either. It could inject an implicit cast on each use
to compensate for the machine's shortcomings.

The fact that the C standard forces it the other way today is that
which I take issue with. But there's no reason why it couldn't be
switched, apart from legacy code, in which case you use the
--legacy-enabled compile switch and voila! (of course, you could
also use the --legacy-disabled switch under the same type of
consideration I advocate above.)

Oh well.

10 PRINT "RDC... ENGAGE!"

Ben Bacarisse

unread,

Aug 13, 2015, 2:40:29 PM8/13/15

to

"Rick C. Hodgin" <rick.c...@gmail.com> writes:

<snip>

> Well there's nothing to prevent a C compiler from recognizing the use
> of unions for pointers on a machine that doesn't have the same size
> pointers for different types, and automatically injecting the fixup
> code for you either.

Yes there is -- the definition of language prevents that. You could
have a C-like language whose unions did not behave like C unions, but it
would not be C. C's unions have a long and satisfactory history of
being used to re-interpret object representations, and it would be very
much more than a tweak to the language to break that.

(And it's not just about size. You do get that don't you?)
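
To make the distinction concrete, here is a minimal sketch (not code from this thread; the function names are made up for illustration) that contrasts a conversion with a union-based reinterpretation. It uses float and uint32_t rather than pointers, because on mainstream hardware the two operations give visibly different answers for those types. It assumes float is a 32-bit IEEE 754 type, which the C standard does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Conversion: the *value* is carried over, so 1.0f becomes the integer 1. */
uint32_t convert_value(float f)
{
    /* Converting an out-of-range value is undefined, so guard against it. */
    assert(f >= 0.0f && f < 4294967296.0f);
    return (uint32_t)f;
}

/* Reinterpretation: the same bytes are read back through another type. */
uint32_t reinterpret_bits(float f)
{
    union { float f; uint32_t u; } pun;

    /* Assumption: float and uint32_t have the same size on this platform. */
    assert(sizeof pun.f == sizeof pun.u);
    pun.f = f;
    return pun.u;
}
```

Under the stated assumption, convert_value(1.0f) yields 1 while reinterpret_bits(1.0f) yields the bit pattern 0x3F800000. A cast between pointer types is the first kind of operation; a union of pointers is the second, and the two only coincide where every pointer type happens to share one representation.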

<snip>
--
Ben.

Ben Bacarisse

unread,

Aug 13, 2015, 2:41:31 PM8/13/15

to

David Brown <david...@hesbynett.no> writes:
<snip>

> First, Ben dislikes your use of a union for type conversion because it
> is unclear, and does not express the programmer's intentions well.
> While I agree with Ben, this is a somewhat subjective opinion - and
> obviously when one has used a particular style or habit for a long time,
> it becomes clear and obvious to that person. Thus we must rely on the
> opinions of others when judging whether a style is common or easily
> understood by other programmers.

That sounds like I object on rather subjective grounds of style. I've
called Rick's union wrong, a lie and a hack. This is not really the
same as it being unclear, or not expressing the programmer's intentions
well. It's wrong (rather than unclear), and it expresses the wrong
intention very well indeed!

<snip>
--
Ben.

Ben Bacarisse

unread,

Aug 13, 2015, 2:41:33 PM8/13/15

to

"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> On Thursday, August 13, 2015 at 9:46:33 AM UTC-4, Ben Bacarisse wrote:
>> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>>
>> > On Wednesday, August 12, 2015 at 5:41:42 PM UTC-4, Ben Bacarisse wrote:
>> >> ... The argument is over the fact that you advocate using a union when
>> >> you mean a type conversion from one type to another.
>> >
>> > Well, Ben... I know you interpret what I mean under the context of the
>> > understanding you have about what you think I mean, but you are doing
>> > so with values placed in areas where I do not have values, and you are
>> > doing so with no values placed in areas where I do have values. As
>> > such, we will never see eye-to-eye on this until one or more of us
>> > changes... But, that being said...
>> >
>> > I do not mean type conversion.
>>
>> Yes, I know. That's exactly why you are wrong. You *should* mean
>> conversion because that's what's required here. You are proposing an
>> alternative to conversion that is not really an alternative.
>
> My proposal is a change to the C standard [...]

Knock yourself out. It won't happen.

But it sounded for all the world like you were suggesting an alternative
to a cast that someone using the language as it is currently defined
should use. Provided that's *not* what you are proposing, we're good.

(I doubt I'd care very much about anything you want to change
about C, just as I doubt I'd care much about any UN resolutions you want
to get passed. I may not agree with them, but it's as much a waste of
my time arguing about them as it is of yours to propose them.)

<snip>
--
Ben.

Rick C. Hodgin

unread,

Aug 13, 2015, 2:44:09 PM8/13/15

to

On Thursday, August 13, 2015 at 2:40:29 PM UTC-4, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
> <snip>
> > Well there's nothing to prevent a C compiler from recognizing the use
> > of unions for pointers on a machine that doesn't have the same size
> > pointers for different types, and automatically injecting the fixup
> > code for you either.
>
> Yes there is -- the definition of language prevents that.

Did you read the start of the next sentence there?

"The fact that the C standard forces it the other way today is
that which I take issue with..."

Rick C. Hodgin

unread,

Aug 13, 2015, 2:55:56 PM8/13/15

to

On Thursday, August 13, 2015 at 2:41:33 PM UTC-4, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
> > My proposal is a change to the C standard [...]
>
> Knock yourself out. It won't happen.

I know. It will be supplanted as the horse drawn carriage was,
then relegated to nights out on the town for couples, and some
displays at old technology festivals.

"And here's where you see developers using a command line to
compile programs written in an ancient language, called C."

"Why'd they call it C, teacher?"

"There was a language before that called B, and C took many of
its traits from B."

"Why'd they call that language B, teacher?"

"Well we're not sure. It was made by Bell Labs, and had some
ties to something called BCPL. It may be short for one of
those."

"B? C? Why didn't they give their languages a real name,
teacher, and not just a letter?"

"That's a good question, Billy. I think maybe they were
limited on RAM and disk space back in those early days. Okay,
class, let's move along to the next exhibit. We don't want to
keep others from having their chance to see C. Oh! Look there.
I made a funny."

"Yeah. That was a good one, teacher."

"Thank you, Billy."

-----
And did you read this part in one of my replies?

"And, since I'm working on my own compiler, it's not really an
issue because I see very clearly that C will never change ...
It's just the way it is..."

Ben Bacarisse

unread,

Aug 13, 2015, 4:30:49 PM8/13/15

to

"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> On Thursday, August 13, 2015 at 2:40:29 PM UTC-4, Ben Bacarisse wrote:
>> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>> <snip>
>> > Well there's nothing to prevent a C compiler from recognizing the use
>> > of unions for pointers on a machine that doesn't have the same size
>> > pointers for different types, and automatically injecting the fixup
>> > code for you either.
>>
>> Yes there is -- the definition of [the] language prevents that.
(typo edited)

> Did you read the start of the next sentence there?
>
> "The fact that the C standard forces it the other way today is
> that which I take issue with..."

Of course, but I don't really know what you'd have preferred. Help me
out. How should I have corrected the apparently erroneous impression
given by your first paragraph above? Should I just have left it, hoping
that readers will see (because of what follows) that you must mean that
a C-like compiler can do what you claim? But then I'd still want to
point out the problems with such a C-like compiler. What is the best
reply I could have made to clear this detail up?

<snip>
--
Ben.

Ben Bacarisse

unread,

Aug 13, 2015, 4:31:09 PM8/13/15

to

"Rick C. Hodgin" <rick.c...@gmail.com> writes:

> On Thursday, August 13, 2015 at 2:41:33 PM UTC-4, Ben Bacarisse wrote:
>> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>> > My proposal is a change to the C standard [...]
>>
>> Knock yourself out. It won't happen.
>
> I know. It will be supplanted as the horse drawn carriage was,
> then relegated to nights out on the town for couples, and some
> displays at old technology festivals.

What is the "it"? C? I think C is over-used; and it's often used for
quasi-emotional reasons. C is sometimes seen as the language that "real
programmers" use (something a bit like "real men" -- the programming
equivalent of swinging a big axe) whilst also being simple enough for
beginners to learn. I suspect (with, I freely admit, little evidence)
that many buggy bits of software are born from this unfortunate
perception.

For many tasks there are better languages, but I doubt that C will be
totally supplanted any time soon. It occupies a particular niche in the
language ecosystem, due, in part, to some of the things you don't like
about it. Ironically, it's partly because people like you continue to
use it that it won't be supplanted. Could you not think of a better
language to use?

<snip>

> And did you read this part in one of my replies?
>
> "And, since I'm working on my own compiler, it's not really an
> issue because I see very clearly that C will never change ...
> It's just the way it is..."

Yes, I did. You've said that many times. I think it's an excellent
plan, but I hope you are planning to write the RDC compiler in RDC, then
you won't have any reason to devise new C hacks.

--
Ben.

Rick C. Hodgin

unread,

Aug 13, 2015, 5:56:37 PM8/13/15

to

On Thursday, August 13, 2015 at 4:30:49 PM UTC-4, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>
> > On Thursday, August 13, 2015 at 2:40:29 PM UTC-4, Ben Bacarisse wrote:
> >> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
> >> <snip>
> >> > Well there's nothing to prevent a C compiler from recognizing the use
> >> > of unions for pointers on a machine that doesn't have the same size
> >> > pointers for different types, and automatically injecting the fixup
> >> > code for you either.
> >>
> >> Yes there is -- the definition of [the] language prevents that.
> (typo edited)
>
> > Did you read the start of the next sentence there?
> >
> > "The fact that the C standard forces it the other way today is
> > that which I take issue with..."
>
> Of course, but I don't really know what you'd have preferred. Help me
> out. How should I have corrected the apparently erroneous impression
> given by your first paragraph above?

My first paragraph was conveying the idea that there is nothing inherent
in hardware, or the universe, which would prevent it from operating in
the manner I suggest (unions for pointers working correctly).

My second paragraph conveyed the source of the constraint as it exists
today, which is only the C standard itself.

> Should I just have left it, hoping
> that readers will see (because of what follows) that you must mean that
> a C-like compiler can do what you claim? But then I'd still want to
> point out the problems with such a C-like compiler.

There is no problem with such a C-like compiler.

> What is the best
> reply I could have made to clear this detail up?

I can't help you out there, Ben. I feel a kinship to you on some of
your posts, and on others I sit back scratching my head wondering to
myself, "Huh?" :-)

Rick C. Hodgin

unread,

Aug 13, 2015, 6:15:26 PM8/13/15

to

On Thursday, August 13, 2015 at 4:31:09 PM UTC-4, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>
> > On Thursday, August 13, 2015 at 2:41:33 PM UTC-4, Ben Bacarisse wrote:
> >> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
> >> > My proposal is a change to the C standard [...]
> >>
> >> Knock yourself out. It won't happen.
> >
> > I know. It will be supplanted as the horse drawn carriage was,
> > then relegated to nights out on the town for couples, and some
> > displays at old technology festivals.
>
> What is the "it"?

An indefinite pronoun.

> C?

The third letter of the English alphabet.

> I think C is over-used;

I think E is over-used.

> and it's often used for
> quasi-emotional reasons.

"C"rying, for example, though I would liken that to a real emotional
state, though the underlying cause(s) of the tears could stem from
quasi-real reasons... but I digress.

> C is sometimes seen as the language that "real
> programmers" use (something a bit like "real men" -- the programming
> equivalent of swinging a big axe) whilst also being simple enough for
> beginners to learn. I suspect (with, I freely admit, little evidence)
> that many buggy bits of software are born from this unfortunate
> perception.

I was with you, and then you lost me at "many buggy bits."

> For many tasks there are better languages, but I doubt that C will be
> totally supplanted any time soon. It occupies a particular niche in the
> language ecosystem, due, in part, to some of the things you don't like
> about it. Ironically, it's partly because people like you continue to
> use it that it won't be supplanted.

I use it because I don't have a choice. It is a particular flavor of a
language I haven't seen elsewhere, close to hardware, yet not overrun by
it, and to my knowledge there isn't a better language for what C does.
It just needs some tweaking (says Rick from an assembly background where
real freedom is found, along with a tremendous amount of typing).

> Could you not think of a better
> language to use?

I could think of a better language to use: RDC. It's just that at the
time it hadn't been invented yet. And even now, it's still largely on
the drawing board ... but strides are being made.

> <snip>
> > And did you read this part in one of my replies?
> >
> > "And, since I'm working on my own compiler, it's not really an
> > issue because I see very clearly that C will never change ...
> > It's just the way it is..."
>
> Yes, I did. You've said that many times. I think it's an excellent
> plan, but I hope you are planning to write the RDC compiler in RDC, then
> you won't have any reason to devise new C hacks.

Did I really devise a "new C hack?" Why, I had no idea! Is there money
in that? Maybe I could apply for a patent.

Seriously though ... it is actually my goal. I intend to write RDC in
my incorrect C/C++ usage on a machine that supports it correctly, until
I get the source code to the point where RDC is sufficiently developed
to essentially bootstrap itself. Then I'll switch over, and never look
back.

-----
You may find this interesting as well, Ben:

I also plan for RDC to provide eventual full compliance with the C
standard versions (at least up to C99), by using a compiler switch,
enabling those people who wish to migrate their C code away from the
confines of C, to swim instead the warm seas of coding freedom...

Of course I could be a little over dramatic there. :-)

I actually think C (and when I say "C" I mean "the version of C /I
think/ should exist, which is mostly C with a few things removed, and
a few things added, basically RDC" in other words) is one of the best
programming languages ever devised.

I disagree with the C standard on many points and believe there should
be a closer tie to hardware. But, that's me looking at it through my
x86/ARM/similar-architectures' eyes. I still think it's a correct path
in moving forward. Having oddities in hardware needs to have a real
payoff to make it worthwhile, and I can't help but wonder how many of
those oddities were introduced simply to work around patents, for
example, rather than to provide any real useful contribution. Just
my thinking on the matter. I could be wrong.

crisd...@gmail.com

unread,

Aug 13, 2015, 6:48:28 PM8/13/15

to

On Tuesday, August 11, 2015 at 5:51:48 PM UTC-7, Richard Heathfield wrote:
> On 12/08/15 01:35, Rick C. Hodgin wrote:
> > There's a place for the language of C,
> > where if ever you find you to be,
> > take ye heed of the locals,
> > for their "compilers" are vocal,
> > in matters of triviality.
>
> A limerick's rules are so tight
> That you have to compose them just right.
> The rhyming, the rhythm,
> You gotta stick with 'em,
> This is not haiku.
>

<golf clap>

David Brown

unread,

Aug 14, 2015, 2:51:23 AM8/14/15

to

On 13/08/15 18:21, Richard Heathfield wrote:
> On 13/08/15 17:15, Rick C. Hodgin wrote:
>> On Thursday, August 13, 2015 at 11:46:12 AM UTC-4, David Brown wrote:
>>> You are not accessing hardware - you are writing code in C.
>>
>> This sums up the disagreement I have with both the C standard, and
>> with the position you and Ben (and many others) hold. You look to
>> the standard as some abstract thing away from hardware. I look at
>> it as a translator which conveys my English-in-C-syntax into the
>> binary language the machine needs to conduct the workload I've
>> instructed. I recognize the machine at the far end of my
>> communication through C.
>
> That's a perfectly valid attitude. And, as long as you never have to
> move your code to another platform, it is one that will serve you well.
> Those of us who can't always know in advance which platform(s) our code
> will be running on are less free to take that stance. Instead, we (try
> to) rely only on the guarantees that the Standard gives us. And /that's/
> a perfectly valid attitude too.
>

As usual, there is a happy medium to be found (or at least, we can
search for it!). I think one can never really be a happy and productive
programmer if you are fanatically at one end or the other of this line -
people who are tied to the idea of a compiler being a direct
C-to-assembly translator will always be upset when the compiler
generates code that doesn't match their expectations, and people who
want to be absolute about coding strictly to the standards will find
occasional pieces of code to be particularly unpleasant or inefficient,
or even impossible, and will get frustrated with compilers which don't
stick to the standards properly.

Personally, I prefer to understand how the compiler "thinks" in terms of
the standards, /and/ consider how it generates the resulting code. It
is not enough for me that a particular run of the compiler generates
code that happens to work - I want to be sure that the compiler
generates working object code because the source code is /correct/. And
I also - for at least some of my code - need to know the object code
that is being produced, because my code may rely on the speed or size of
the code as well as the functionality.

So (to the greatest extent I am able) I write code that matches the
standards regarding defined and undefined behaviour, but it is not
unusual for me to rely on implementation-defined behaviour in order to
get the best code (in terms of neatest and clearest source code, and/or
most efficient object code).

And the reason I work to C standards is not because they are some sort
of holy book, but because they form the contract between the compiler
and the programmer - basically, they are part of the user manual for the
toolchain.

Rick C. Hodgin

unread,

Aug 14, 2015, 3:04:06 AM8/14/15

to

On Friday, August 14, 2015 at 2:51:23 AM UTC-4, David Brown wrote:
> And the reason I work to C standards is not because they are some sort
> of holy book...

David, has anyone ever shared the gospel message with you? Taught you
who Jesus Christ is? And why He is important?

David Brown

unread,

Aug 14, 2015, 3:09:17 AM8/14/15

to

I was trying to be a little diplomatic and understating your objections,
in the hope of getting Rick to concentrate on and understand the real
issues here.

In particular, if you use an idiom yourself in your own code, and only
you work with it (or you inform others of its meanings), then it is a
matter of style even if it makes no sense to everyone else. Clarity, and
the meaning of code, is somewhat in the eye of the beholder, and a
matter of style. (There are some regulars in this group that post code
that appears as line noise to most of us, yet the authors think they
have a good and clear style.) Of course it is usually best to have a
style that is clear and understandable to as wide an audience as
possible. Thus Rick's unions are not a lie to himself in his own code,
but could be called a hack, or a lie to other people.

But am I right in thinking you label Rick's unions as /wrong/ primarily
because it is a bad way to write the code, hiding the intentions and
workings behind an odd construct rather than making the clear and
correct casts? (If we pretend for the moment that all compilers have
the same size and representation for all pointers, and that any pointer
casts are equivalent to reinterpretations of the same bits - as they are
on the platforms Rick uses.) In other words, the code is wrong even
though it works?
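
For reference, the cast-based version being contrasted with the union looks roughly like the sketch below. It is modelled on the original question in this thread, where `previous' is a void * and the compiler rejects member access on it. The struct layouts and the function name are invented for illustration and are not taken from the linked project.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical set of fields shared by every node type. */
struct node_header {
    int   type;
    void *parent;
    void *previous;
};

/* Hypothetical concrete node; the shared header is its first member,
   so a pointer to the node can be converted to a pointer to the header. */
struct text_node {
    struct node_header header;
    const char *text;
};

/* A void * has no members, so it must be converted to a struct pointer
   before -> can be applied to it. */
void *parent_of_previous(void *node)
{
    struct node_header *self = node;   /* implicit conversion from void * */

    assert(self != NULL);
    if (self->previous == NULL)
        return NULL;
    /* Convert `previous' itself, then read the member from the result. */
    return ((struct node_header *)self->previous)->parent;
}
```

The conversion is written exactly where it is needed, so a reader can see which pointer is being treated as which type, and the compiler remains free to adjust the representation on platforms where that matters.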

(I am quite happy with labelling code as "wrong" even when it works,
although I know some people view run-time functionality as the only
criterion for right or wrong code.)

What did you think of my concerns about aliasing, discussed in other posts?

David Brown

unread,

Aug 14, 2015, 3:30:09 AM8/14/15

to

On 13/08/15 22:11, Ben Bacarisse wrote:
> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>
>> On Thursday, August 13, 2015 at 2:41:33 PM UTC-4, Ben Bacarisse wrote:
>>> "Rick C. Hodgin" <rick.c...@gmail.com> writes:
>>>> My proposal is a change to the C standard [...]
>>>
>>> Knock yourself out. It won't happen.
>>
>> I know. It will be supplanted as the horse drawn carriage was,
>> then relegated to nights out on the town for couples, and some
>> displays at old technology festivals.
>
> What is the "it"? C? I think C is over-used; and it's often used for
> quasi-emotional reasons. C is sometimes seen as the language that "real
> programmers" use (something a bit like "real men" -- the programming
> equivalent of swinging a big axe) whilst also being simple enough for
> beginners to learn. I suspect (with, I freely admit, little evidence)
> that many buggy bits of software are born from this unfortunate
> perception.
>
> For many tasks there are better languages, but I doubt that C will be
> totally supplanted any time soon. It occupies a particular niche in the
> language ecosystem, due, in part, to some of the things you don't like
> about it. Ironically, it's partly because people like you continue to
> use it that it won't be supplanted. Could you not think of a better
> language to use?
>

I too think C is over-used. It is certainly the best choice for some
uses (in small embedded systems, it is dominant for good reason - though
even here there are worthwhile alternatives such as C++, Ada, or Forth).
But there is a great deal of code written in C for the wrong reasons,
such as speed (because speed is often unimportant, because algorithms
are often more important than the details of the language, because many
C programmers don't understand how to write fast code, because many
alternative languages are not nearly as slow as people think, etc.).

I see a lot of questions in this newsgroup where the really useful
answer is "don't try to do this in C - use language XX instead". Often
that language is C++, but often it is a higher level language such as
Python. (It's the higher level language I know best, rather than
necessarily being the best high level language.) Look at the long
thread on "How to read a data file whose size is unknown". The answer
is simple, at least for ordinary files (which is the main consideration):

data = open(filename).read()

No faffing around with different OS's and different API's, no messy
memory allocations and re-allocations, or loops to read the file blocks,
or figuring out how fstat works - just a single line of Python code that
works on all major platforms. Someone else has already done all the
work - with all the details handled in the Python runtime and libraries
(written in C for speed!). And they will have spent more time and
effort getting /all/ the details right, and testing on a range of
platforms and circumstances - the code will be better than a typical C
programmer would write themselves, even with the advice from this group.
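
For comparison, a minimal C sketch of the same job might look like the following. It is not code from the thread mentioned above; the function name read_all and the 4096-byte starting capacity are arbitrary choices made for illustration. It uses only standard library calls, and it still leaves the caller to open the file, pick text or binary mode, and free the result.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads everything remaining in fp into a malloc'd buffer.
   Returns NULL on failure; on success stores the byte count in *len
   and the caller must free() the returned buffer. */
char *read_all(FILE *fp, size_t *len)
{
    size_t cap = 4096, used = 0;
    char *buf;

    assert(fp != NULL && len != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        used += fread(buf + used, 1, cap - used, fp);
        if (used < cap)               /* short read: end of file or error */
            break;
        if (cap > (size_t)-1 / 2) {   /* doubling would overflow size_t */
            free(buf);
            return NULL;
        }
        cap *= 2;
        char *tmp = realloc(buf, cap);
        if (tmp == NULL) {            /* keep the old block so it can be freed */
            free(buf);
            return NULL;
        }
        buf = tmp;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

Even this short version has three separate failure paths and an overflow check, which is the kind of detail the one-line Python call hides.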

C is a hammer. It's a good hammer - well-built, solid and reliable, and
can be taken anywhere. But it's still just a hammer - if you don't use
it properly, you'll hit your thumb or bend your nail. And it is not the
only tool in the toolbox - sometimes that big expensive power drill
really is a better tool for the job in hand.