struct big_struct {
    int a, b;
    double c;
};

struct small_struct {
    int a, b;
};

int
main(void)
{
    struct big_struct big;
    struct small_struct *smallp = (void *)&big;

    smallp->a = 11; /* is this OK? */
    return big.a;
}
gcc with -ansi -pedantic -W -Wall is quiet about it...
> I have many structs, and it would be very helpful if I could read and
> write to common initial sequence of them. I know I can use union of
> structs to read common initial sequence, but I'm not sure about writing
> to common fields, also I'd like to avoid union because there will be many
> small structs and few big ones, so there is no need to waste memory. To
> illustrate my question here is example what I'd like to have:
>
> struct big_struct {
> int a, b;
> double c;
> };
>
> struct small_struct {
> int a, b;
> };
>
> int
> main(void)
> {
> struct big_struct big;
> struct small_struct *smallp = (void *)&big;
>
> smallp->a = 11; /* is this OK? */
No, it is not OK.
> return big.a;
> }
>
> gcc with -ansi -pedantic -W -Wall is quiet about it...
Turn on optimisation and you get:
$ gcc -ansi -pedantic -W -Wall -O2 x.c
x.c: In function 'main':
x.c:16: warning: dereferencing pointer 'smallp' does break
strict-aliasing rules
x.c:14: note: initialized from here
x.c:17: warning: 'big' is used uninitialized in this function
You can tell gcc not to apply strict aliasing rules but that's making
your code depend on a compiler flag.
A union is almost certainly the way to go but with more detail you
might get more detailed help.
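
To make the union suggestion concrete, here is a minimal sketch, assuming the two struct types from the original post. The union name any_struct and the two helper functions are made up for illustration. With the complete union type visible, the common initial sequence (a and b) may be inspected through either member (C99 6.5.2.3 p5), and every access goes through the union object itself:

```c
struct big_struct { int a, b; double c; };
struct small_struct { int a, b; };

/* Hypothetical union; its declaration must be visible wherever the
   common initial sequence is accessed. */
union any_struct {
    struct small_struct small;
    struct big_struct big;
};

/* Write the shared member through the "small" view of the union. */
void set_a_via_small(union any_struct *u, int value)
{
    u->small.a = value;
}

/* Read the shared member through the "big" view of the same union. */
int get_a_via_big(const union any_struct *u)
{
    return u->big.a;
}
```

The cost is the one the original poster wanted to avoid: every union any_struct is as large as the largest member, even when only the small struct is needed.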
--
Ben.
Here is my thought.
I *believe* that, if you declare the union, you are covered by the guarantee
even if the individual structs are not specifically known to be in the union.
And furthermore, since that would be true even if the union were declared in
another translation unit, in practice, I am pretty sure it has to be true
even if you don't declare the union.
But if all else fails, I think you have some alternatives:
1. Create a header that has, for all pairs of struct types:
union foo_bar {
    struct foo foo;
    struct bar bar;
};
Include this, and you are then (so far as I can tell), guaranteed that the
layout of foo and bar must be such that you can access a common initial
subsequence of them.
2. If you want a common initial subsequence, make it a struct type:
struct initial_subsequence {
int common;
};
struct foo {
struct initial_subsequence i;
};
struct bar {
struct initial_subsequence i;
};
3. Or make it a macro:
#define INITIAL_SUBSEQUENCE \
int common;
struct foo {
INITIAL_SUBSEQUENCE
};
struct bar {
INITIAL_SUBSEQUENCE
};
I think all of these are going to work in practice, and I'm pretty sure 2 and
3 are even guaranteed to work rather than merely inevitably going to work.
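
No usage code accompanies alternatives 2 and 3 above, so the following is only a minimal sketch of how alternative 2 might be used. The function bump_common and the extra members x and name are invented for illustration. The idea is that code touches the common part only through a struct initial_subsequence *, obtained from the member i itself:

```c
struct initial_subsequence {
    int common;
};

struct foo {
    struct initial_subsequence i;   /* first member, so a pointer to
                                       struct foo also converts to it */
    double x;                       /* hypothetical extra member */
};

struct bar {
    struct initial_subsequence i;
    const char *name;               /* hypothetical extra member */
};

/* Works on the shared part of any struct that embeds it. */
int bump_common(struct initial_subsequence *p)
{
    p->common += 1;
    return p->common;
}
```

A call such as bump_common(&f.i) needs no cast at all, and the object is accessed through an lvalue of its own type.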
-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
> On 2010-02-26, Dominik Zaczkowski <d...@pro.wp.pl> wrote:
>> Hi,
>> I have many structs, and it would be very helpful if I could read and
>> write to common initial sequence of them. I know I can use union of
>> structs to read common initial sequence, but I'm not sure about writing
>> to common fields, also I'd like to avoid union because there will be many
>> small structs and few big ones, so there is no need to waste memory. To
>> illustrate my question here is example what I'd like to have:
>
> Here is my thought.
>
> I *believe* that, if you declare the union, you are covered by the guarantee
> even if the individual structs are not specifically known to be in the union.
I am not sure what you mean by this. I assume "the guarantee" is the
special provision of 6.5.2.3 p5. That requires the structs to be
members of a single union and for "a declaration of the complete type
of the union" to be visible. That *sounds* as if it is at odds with
what you say here.
> And furthermore, since that would be true even if the union were declared in
> another translation unit, in practice, I am pretty sure it has to be true
> even if you don't declare the union.
This sounds like a more dramatic version of the previous statement but
I can't see what support there is for it in the language.
> But if all else fails, I think you have some alternatives:
>
> 1. Create a header that has, for all pairs of struct types:
> union_foo_bar {
> struct foo foo;
> struct bar bar;
> }
> Include this, and you are then (so far as I can tell), guaranteed that the
> layout of foo and bar must be such that you can access a common initial
> subsequence of them.
Since the access must be via the union, what advantage is there in
this pairing system? If you know you have a bar, just access it as
one. The point of the special guarantee about common initial
sequences is that you can use any of them before you know what else is
really in the union (if you see what I mean), but all the options must
be there in the /same/ union. At least that's how I read it.
> 2. If you want a common initial subsequence, make it a struct type:
>
> struct initial_subsequence {
> int common;
> };
>
> struct foo {
> struct initial_subsequence i;
> };
>
> struct bar {
> struct initial_subsequence i;
> };
>
> 3. Or make it a macro:
> #define INITIAL_SUBSEQUENCE \
> int common;
>
> struct foo {
> INITIAL_SUBSEQUENCE
> };
>
> struct bar {
> INITIAL_SUBSEQUENCE
> };
>
> I think all of these are going to work in practice, and I'm pretty sure 2 and
> 3 are even guaranteed to work rather than merely inevitably going to work.
You don't describe how to use these but I don't see anything here that
helps with the problem. Have I missed your point?
I think you are concentrating on the layout, not on what a compiler
might assume about access to an object of type T1 via an lvalue
expression of type T2. The killer here is the language's permission
to the compiler to assume "strict aliasing".
--
Ben.
Which implies that the entirety of Windows COM is bogus. Not to
mention most of the Unix networking code (If I remember correctly,
which I might not). I mean a COM class structure begins with a pointer
to a VTable (a structure containing function pointers). It is common
for any COM class to pretend it's an IUnknown, or an IDispatch given
that they share common initial functions.
In other words (I've started on this COM thing so I suppose I should
explain):
An IUnknown VTable starts with:
QueryInterface
AddRef
Release
And our IFooBar might look like:
QueryInterface
AddRef
Release
DoSomething
DoOtherthings
And an IFooBar can be referenced as if it was an IUnknown. Exactly
like big_struct above is pretending to be a small_struct. It isn't
breaking any alignment rules and it would be perfectly OK in assembly,
so my gut feeling is that gcc can go jump in a lake :-)
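
For what it's worth, the "pretend to be an IUnknown" idea can also be written without casting between unrelated struct types, by embedding the smaller function table as the first member of the larger one. A minimal sketch follows; every name in it only mimics COM and none of it is the real Windows declaration:

```c
/* Shared part of every table (a stand-in for the IUnknown functions). */
struct unknown_vtbl {
    int (*add_ref)(void *self);
    int (*release)(void *self);
};

/* Larger table: the shared part is embedded, not repeated. */
struct foobar_vtbl {
    struct unknown_vtbl unknown;
    int (*do_something)(void *self, int arg);
};

struct foobar {
    const struct foobar_vtbl *vtbl;
    int refs;
};

static int fb_add_ref(void *self) { return ++((struct foobar *)self)->refs; }
static int fb_release(void *self) { return --((struct foobar *)self)->refs; }
static int fb_do_something(void *self, int arg)
{
    return ((struct foobar *)self)->refs + arg;
}

static const struct foobar_vtbl foobar_table = {
    { fb_add_ref, fb_release },
    fb_do_something
};

void foobar_init(struct foobar *fb)
{
    fb->vtbl = &foobar_table;
    fb->refs = 1;
}

/* Code that only knows the shared part of the table. */
int generic_add_ref(void *object, const struct unknown_vtbl *v)
{
    return v->add_ref(object);
}
```

Generic code receives &fb.vtbl->unknown, so it reads a struct unknown_vtbl as a struct unknown_vtbl and never looks at a foobar_vtbl through the wrong type.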
Conor.
> I am not sure what you mean by this. I assume "the guarantee" is the
> special provision of 6.5.2.3 p5. That requires the structs to be
> members of a single union and for "a declaration of the complete type
> of the union" to be visible. That *sounds* as if it is at odds with
> what you say here.
Here's the thing.
struct foo { int a, b; double d; };
struct bar { int a, b; long l; };
union foobar { struct foo f; struct bar b; };
Whether I have put a struct foo or a struct bar in foobar, I can treat it as
the other for purposes of that common initial subsequence (a and b).
What that means is that I can prove that, given a union foobar in which I
know I have either a struct foo or a struct bar:
&(fb.f.b) == &(fb.b.b);
I also know that:
&(fb.f) == &(fb);
Convert to (unsigned char *), mess around a bit, and it turns out that, given
a struct foo, I know that &(f.b) has the same address that &(b.b) would have
if it were actually a struct bar.
Which turns out to mean that the structs have to actually have a common
initial layout. Because I can convert the pointers to the various parts back
and forth, and prove that they must be the same pointers, because they point
to the same things.
Which means that the mere fact that the union exists means that the type
punning must work unless the compiler is going out of its way to detect and
break that.
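
The layout half of this argument can be checked directly. The sketch below (helper names are made up) compares member offsets with offsetof and compares the addresses reached through the two union members. It says nothing about what an optimizer may assume about aliasing; it only shows that the layouts agree, which is expected on any implementation that honours the union guarantee:

```c
#include <stddef.h>   /* offsetof */

struct foo { int a, b; double d; };
struct bar { int a, b; long l; };
union foobar { struct foo f; struct bar b; };

/* 1 if the common initial members sit at the same offsets. */
int same_initial_offsets(void)
{
    return offsetof(struct foo, a) == offsetof(struct bar, a)
        && offsetof(struct foo, b) == offsetof(struct bar, b);
}

/* 1 if both views of the union reach the same bytes for member b,
   and the union starts where its member f starts. */
int same_address_in_union(union foobar *fb)
{
    return (void *)&fb->f.b == (void *)&fb->b.b
        && (void *)&fb->f == (void *)fb;
}
```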
Except, of course, for one trivial little thing: The compiler is allowed to
make optimization decisions based on the assumption that, if I have a function
like:
int dummy(struct foo *f, struct bar *b);
that it can assume that writes through f don't modify b.
... Or can it? Consider the case where f and b happen to be pointers to the
respective members of the same union, and dummy() never makes any reference
to fields outside the common initial sequence.
I'm pretty sure it has to work, because they ARE allowed to alias each other
for that common initial sequence of the union.
Now consider what happens if I compile dummy, and then later I write a new
module which declares union foobar, and call dummy with objects contained
in that union.
In practice, it *has* to work. I don't think it's required by the spec, but
it would require a ludicrous amount of effort to break it without breaking
something which is required by the spec.
> This sounds like a more dramatic version of the previous statement but
> I can't see what support there is for it in the language.
None directly. It's an observation based on what we know about the generated
code for dummy() above.
> Since the access must be via the union, what advantage is there in
> this pairing system?
Hmm. Imagine that I've got my union, and I call:
dummy(&fb.f, &fb.b);
Have I accessed them via the union?
> You don't describe how to use these but I don't see anything here that
> helps with the problem. Have I missed your point?
I don't know.
> I think you are concentrating on the layout, not on what a compiler
> might assume about access to an object of type T1 via an lvalue
> expression of type T2. The killer here is the language's permission
> to the compiler to assume "strict aliasing".
Ahh, I see.
I'm not sure about that. I think that once you're dealing with pointers to
objects, structs with common initial sequences probably need to be exempt
from the strict aliasing rules, because they could be in a union
together. Not sure, though.
>> struct big_struct {
>> int a, b;
>> double c;
>> };
>>
>> struct small_struct {
>> int a, b;
>> };
>>
>> int
>> main(void)
>> {
>> struct big_struct big;
>> struct small_struct *smallp = (void *)&big;
>>
>> smallp->a = 11; /* is this OK? */
>
> No, it is not OK.
What's the problem with it, what could go wrong?
(Perhaps smallp->b might need extra care, but accessing the very first
field?)
--
Bartc
>> 2. If you want a common initial subsequence, make it a struct type:
>>
>> struct initial_subsequence {
>> int common;
>> };
>>
>> struct foo {
>> struct initial_subsequence i;
>> };
>>
>> struct bar {
>> struct initial_subsequence i;
>> };
> You don't describe how to use these but I don't see anything here that
> helps with the problem.
In article <87d3zsq...@kraina-oz.ath.cx>,
Dominik Zaczkowski <d...@pro.wp.pl> writes:
> I have many structs, and it would be very helpful if I could read and
> write to common initial sequence of them. I know I can use union of
> structs to read common initial sequence, but I'm not sure about writing
> to common fields, also I'd like to avoid union because there will be many
> small structs and few big ones, so there is no need to waste memory. To
> illustrate my question here is example what I'd like to have:
>
> struct big_struct {
> int a, b;
> double c;
> };
>
> struct small_struct {
> int a, b;
> };
>
> int
> main(void)
> {
> struct big_struct big;
> struct small_struct *smallp = (void *)&big;
>
> smallp->a = 11; /* is this OK? */
> return big.a;
> }
>
> gcc with -ansi -pedantic -W -Wall is quiet about it...
struct small_struct {
    int a, b;
};

struct big_struct {
    struct small_struct small;
    double c;
};

int
main(void)
{
    struct big_struct big;
    struct small_struct *smallp = (struct small_struct *)&big;

    smallp->a = 11; /* This is OK, C99 6.7.2.1 p13 */
    return big.small.a;
}
To me it seems that Seebs' second suggestion solves exactly the problem
Dominik asked about (accessing a common initial subsequence without
knowing the exact type -- the only criterion is that the first member of
all relevant structures is the base structure).
The same requirement is present in C89 6.5.2.1.
Cheers,
lacos
You aren't supposed to do this either:
typedef struct
{
uint16_t a;
uint16_t b;
uint16_t c;
} Foo;
typedef struct
{
uint16_t a;
uint16_t b;
uint16_t c;
} Bar;
Foo* foo;
Bar* bar;
The compiler will assume that *foo and *bar never refer to the same
location, even though the contents of the structures are the same.
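
If the contents of a Foo really do need to end up in a Bar, copying the bytes is one route that does not depend on how the compiler treats aliasing. A minimal sketch, assuming the two typedefs above; foo_to_bar is an invented helper, and it relies on the two structs having the same size and layout, which it checks with an assert instead of assuming silently:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>   /* memcpy */

typedef struct { uint16_t a, b, c; } Foo;
typedef struct { uint16_t a, b, c; } Bar;

/* Copy the object representation instead of aliasing the pointers. */
Bar foo_to_bar(const Foo *foo)
{
    Bar bar;

    assert(sizeof bar == sizeof *foo);
    memcpy(&bar, foo, sizeof bar);
    return bar;
}
```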
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
There is currently a big thing going on in the GNU mailing lists about
this too:
http://gcc.gnu.org/ml/gcc/2010-01/msg00013.html
which was linked to from:
http://davmac.wordpress.com/2010/01/08/gcc-strict-aliasing-c99/
I dunno. Linus has had various things to say on the subject too,
including that the Linux kernel uses "-fno-strict-aliasing".
Rules are meant to be borken.
> not as if gcc with -O2 generates thread safe code
Please give some examples. -O2 works alright for me. It should for you
too, unless you try to do tricks with volatile and "atomics" (I'm
talking C) and generally trying to avoid locking stuff properly.
AFAICT neither SUSv1 nor SUSv2 specifies a memory model. SUSv3 (which is
also POSIX:2004) and SUSv4 (which is also POSIX:2008) kind of do
("Memory Synchronization").
http://www.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap04.html#tag_04_10
http://www.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11
I firmly believe it is safe to expect the same synchronization
guarantees from the listed pthread_*() functions under SUSv2 as well.
A few links:
Threads Basics (Boehm) ("basic" in Boehm's terminology only)
http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/threadsintro.html
Some blog posts by Bartosz Milewski:
Multicores and Publication Safety
http://bartoszmilewski.wordpress.com/2008/08/04/multicores-and-publication-safety/
Who ordered memory fences on an x86?
http://bartoszmilewski.wordpress.com/2008/11/05/who-ordered-memory-fences-on-an-x86/
Who ordered sequential consistency?
http://bartoszmilewski.wordpress.com/2008/11/11/who-ordered-sequential-consistency/
C++ atomics and memory ordering
http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/
Cheers,
lacos
Has this come up in comp.std.c? If this was valid (casting
pointers to structs around if they both start with the same
members), it would allow inheritance to be simulated in C.
--
Andrew Poelstra
http://www.wpsoftware.net/andrew
I just read Lacos' post explaining how to do this with
the current rules of C, so I guess this missed feature
was not missed after all!
I think it's in the spec...
Basically, the idea is, compilers can produce better code if they can safely
assume that objects don't overlap. One way of demonstrating that is to know
that they are of different types.
I think you're overstating the case. The Standard states,
explicitly, that a pointer to a struct object is interconvertible
with a pointer to the struct's first element, and that a pointer
to a union is interconvertible with pointers to any of its elements.
Hence,
struct foo { int a; char *b; } f;
struct foo *fptr = &f;
int *aptr = &f.a;
Lo! Two pointers of different (and complete) types, both to the
same spot. And no aliasing rule is broken (6.5p7).
It is also permissible to use pointers to signed and unsigned
versions of "the same" integer type to access a single object of
that type, and if the only values stored are in the common subset
of both ranges no harm will be done:
int value;
unsigned int *ptr = (unsigned int*) &value;
*ptr = 42;
assert (value == 42);
Finally, it is *always* permissible to use any of the three
kinds of character pointer to access the individual bytes of any
object of any type whatever. Plain char and signed char may need
to worry about trap representation troubles here, but unsigned char
is unafraid of even that bogeyman.
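
The character-pointer exemption is the one that generic byte-copying code relies on. A small sketch (the function name is made up) that copies any object byte by byte through unsigned char lvalues:

```c
#include <stddef.h>   /* size_t */

/* Copy n bytes from src to dst through unsigned char lvalues, which may
   be used to access the bytes of an object of any type. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t i;

    for (i = 0; i < n; ++i) {
        d[i] = s[i];
    }
}
```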
--
Eric Sosman
eso...@ieee-dot-org.invalid
Normally the compiler can assume that pointers to different types
point to different memory. If p and q point to different struct types,
and you write say p->x = 1; q->x++; then the compiler can assume that
after the second assignment p->x still equals 1. Optimising compilers
will use this knowledge; for example they can exchange the order of
the two statements.
However, if *p and *q are part of the same union, _and_ the compiler
has seen the declaration of the union type, then it is guaranteed that
changing q->x will change p->x (assuming they are common initial
fields of both struct types).
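
The p/q case can be written out as a function. In this sketch (type and function names are illustrative), the compiler is allowed to return the constant 1 without re-reading p->x, because a struct first and a struct second are assumed not to overlap. With genuinely separate objects that is also what the abstract machine produces:

```c
struct first  { int x; };
struct second { int x; };

int set_then_bump(struct first *p, struct second *q)
{
    p->x = 1;
    q->x++;          /* assumed not to modify *p */
    return p->x;     /* may be compiled as "return 1" */
}
```

Passing the same storage for both arguments by way of a cast is exactly the situation this thread warns about, so no such call is shown.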
Ah, now it's clear to me. Thanks for the detailed answer.
I did say that was a deliberate troll. I had to google a bit, because
I could not remember, but I found the thread I was referring to:
http://gcc.gnu.org/ml/gcc/2007-10/msg00266.html
and then
http://gcc.gnu.org/ml/gcc/2007-10/msg00275.html
followed by a large thread! I was never sure what the conclusions
were. Suppose I should read the whole thread again.
As for the original post, I agree with you and Seebs (in his
second suggestion, as you say) that this:
struct big_struct {
struct small_struct small;
double c;
};
is a compliant way of solving the problem. Good one.
Eric Sosman said:
> Lo! Two pointers of different (and complete) types, both to the
> same spot. And no aliasing rule is broken (6.5p7).
Yes, I was overstating the case somewhat. I got the impression that
the gcc developers were saying that was not allowed and the Standard
was wrong, even to the extent that one was not allowed to reuse
malloc'ed memory for a different type. I'm afraid I'll have to reread
those gcc threads a little more slowly and try to understand what's
going on.
As for COM, I suppose that will never be C99 conformant. I can't
imagine MS are sweating about it.
Conor.
I guess the cited paragraph is in both mentioned editions of the ISO C
standard precisely to keep "single inheritance" working. You would put a
pointer to a static struct of function pointers (vtbl) into the base
struct.
---- base.h ----
#include <stdio.h> /* FILE */
/* opaque (incomplete) type */
struct base;
/* constructor */
struct base *base_new(int a);
/* virtual functions for base-derived objects */
int virt_print(FILE *f, const struct base *base);
void virt_release(struct base *base);
---- base_internals.h ----
/* Could be merged into base.h for convenience. */
#include "base.h" /* struct base */
/* vtbl type */
struct ops
{
    int (*print)(FILE *f, const struct base *);
    void (*release)(struct base *);
};

/* base type */
struct base
{
    int a;
    const struct ops *ops;
};

/* initializer for base type without allocation */
void
base_init(struct base *base, int a, const struct ops *ops);
---- base.c ----
#include <stdlib.h> /* malloc() */
#include "base_internals.h" /* struct base, struct ops */
/*
  implementations of "virtual methods" for the base type
  -- note the internal linkage
*/
static int
base_print(FILE *f, const struct base *base)
{
    return fprintf(f, "(%d)\n", base->a);
}

static void
base_release(struct base *base)
{
    free(base);
}

void
base_init(struct base *base, int a, const struct ops *ops)
{
    base->a = a;
    base->ops = ops;
}

/* constructor */
struct base *
base_new(int a)
{
    struct base *base;

    base = malloc(sizeof *base);
    if (0 != base) {
        /* vtbl instance of static storage duration for base objects */
        static const struct ops ops = { &base_print, &base_release };

        base_init(base, a, &ops);
    }
    return base;
}

/* virtual functions */
int
virt_print(FILE *f, const struct base *base)
{
    return (*base->ops->print)(f, base);
}

void
virt_release(struct base *base)
{
    (*base->ops->release)(base);
}
---- derived.h ----
#include "base.h" /* struct base, virtual functions */
/* opaque (incomplete) type */
struct derived;
/* constructor */
struct derived *
derived_new(int a, const char *s);
---- derived.c ----
#include <stdlib.h> /* malloc() */
#include <string.h> /* strdup(), not ISO C */
#include "derived.h" /* struct derived */
#include "base_internals.h" /* struct base, struct ops */
struct derived
{
    struct base base;
    char *s;
};

/* "virtuals" -- internal linkage */
static int
derived_print(FILE *f, const struct base *base)
{
    const struct derived *derived;

    /*
      This works because the cited paragraphs contain the phrase "vice
      versa".
    */
    derived = (const struct derived *)base;
    return fprintf(f, "(%d, \"%s\")\n", base->a, derived->s);
}

static void
derived_release(struct base *base)
{
    free(((struct derived *)base)->s);
    free(base);
}

/* constructor */
struct derived *
derived_new(int a, const char *s)
{
    char *tmp;

    tmp = strdup(s);
    if (0 != tmp) {
        struct derived *derived;

        derived = malloc(sizeof *derived);
        if (0 != derived) {
            static const struct ops ops = {
                &derived_print, &derived_release
            };

            base_init(&derived->base, a, &ops);
            derived->s = tmp;
            return derived;
        }
        free(tmp);
    }
    return 0;
}
---- main.c ----
#include <stdlib.h> /* size_t, EXIT_FAILURE */
#include "derived.h"
static void
try_dump(const struct base * const *objects, size_t num)
{
    size_t idx;

    for (idx = 0u; idx < num; ++idx) {
        (void)virt_print(stdout, objects[idx]);
    }
}

int
main(void)
{
    int ret;
    struct base *base;

    ret = EXIT_FAILURE;
    base = base_new(42);
    if (0 != base) {
        struct derived *derived;

        derived = derived_new(-1, "derived");
        if (0 != derived) {
            const struct base *objects[2];

            objects[0] = base;
            objects[1] = (const struct base *)derived;
            try_dump(objects, sizeof objects / sizeof objects[0]);
            ret = EXIT_SUCCESS;
            virt_release((struct base *)derived);
        }
        virt_release(base);
    }
    return ret;
}
---- Makefile ----
# Watch out for tabs!
.POSIX:

CC=gcc
CFLAGS=-ansi -pedantic -Wall -Wextra -D _XOPEN_SOURCE=500 -g3 -O2
LDFLAGS=
LIBS=

derive: base.o derived.o main.o
	$(CC) -o derive $(LDFLAGS) base.o derived.o main.o $(LIBS)

base.o: base.c base_internals.h base.h
	$(CC) $(CFLAGS) -c base.c

derived.o: derived.c derived.h base.h base_internals.h
	$(CC) $(CFLAGS) -c derived.c

main.o: main.c derived.h base.h
	$(CC) $(CFLAGS) -c main.c

clean:
	rm -f *.o derive
$ make
gcc -ansi -pedantic -Wall -Wextra -D _XOPEN_SOURCE=500 -g3 -O2 -c base.c
gcc -ansi -pedantic -Wall -Wextra -D _XOPEN_SOURCE=500 -g3 -O2 -c derived.c
gcc -ansi -pedantic -Wall -Wextra -D _XOPEN_SOURCE=500 -g3 -O2 -c main.c
$ ./derive
(42)
(-1, "derived")
$ echo $?
0
Cheers,
lacos
> "christian.bau" <christ...@cbau.wanadoo.co.uk> writes:
Just to be clear, you can point to a common initial structure (at the
expense of more complex access code, of course) but that does not
remove the problem if you try to treat one kind of object as if it
were another. The "common struct" version of something similar to
your initial example would be:
#include <stdio.h>

struct common { int a, b; };
struct big_struct { struct common com; double c; };
struct small_struct { struct common com; };

int main(void)
{
    struct big_struct big;
    struct small_struct *otherp = (void *)&big;
    big.com.a = 0;
    otherp->com.a = 11;
    printf("big.com.a = %d\n", big.com.a);
    return 0;
}
and this can go wrong (my gcc with -O2 prints "big.com.a = 0"). The
correct code would have
struct common *otherp = &big.com;
(or the version with the cast that Laszlo wrote).
I hope I am not labouring the point when you've already got the idea
but I was concerned that you might think that using an initial struct
solved the exact problem as you originally posed it (treating a struct
big_struct * as if it were a struct small_struct *).
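
Spelled out, the corrected variant might look like the following sketch. It keeps the struct names from the program above; the function name set_via_common is made up. Both ways of obtaining the struct common * are shown: taking the address of the member, and converting the pointer to the whole struct (the form Laszlo used).

```c
struct common { int a, b; };
struct big_struct { struct common com; double c; };

int set_via_common(void)
{
    struct big_struct big;
    struct common *by_member = &big.com;
    struct common *by_cast = (struct common *)&big;

    big.com.a = 0;
    by_member->a = 11;    /* accesses a struct common as a struct common */
    by_cast->b = 22;      /* same object, reached via the first member */
    return big.com.a + big.com.b;
}
```

In both cases the object being modified really is a struct common, so the compiler has no licence to assume the store leaves big.com untouched.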
--
Ben.
Thanks, very interesting. This may be a good example why "Threads Cannot
be Implemented as a Library"
(http://www.hpl.hp.com/techreports/2004/HPL-2004-209.html). On the other
hand, I would surely have written the code with locking in place (within
the "if"), and that would have given gcc a chance to see that a
speculative(?) store would be dangerous.
Cheers,
lacos
It means that the struct layouts must match but that (to me) was
never the issue.
I've left everything because I am not sure what is key anymore and I
agree with almost everything you've said. The effect of your argument
seems to be that the mere possibility of a union means that pointers to
structs that share initial sequences must always be considered as
possible aliases to each other and that this, in practice, is enough
for the OP to rely on. Is gcc wrong here:
#include <stdio.h>

struct big_struct { int a, b; double c; };
struct small_struct { int a, b; };

int main(void)
{
    struct big_struct big;
    struct small_struct *otherp = (void *)&big;
    big.a = 0;
    otherp->a = 11;
    printf("big.a = %d\n", big.a);
    return 0;
}
After complaining about the aliasing, the program prints 0 (with -O2
or above). This seems to me entirely permissible and exactly the kind
of thing the OP needs to avoid.
[Small point: the other thing that bothers me is that your persuasive
arguments about what must happen even when the union is not present
seem to be at odds with an example in the standard (6.5.2.3 p8
Example 3, second part). Are you sure that example is pointless and
that there are no circumstances in which a compiler might reasonably
cause such code to fail? I can't see any such circumstances, but I
never like to rely on my lack of imagination!]
--
Ben.
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/63f6360d939612b3
One example of a bug in GCC that breaks POSIX rules wrt mutex.
[...]
Greets
Then they would be wrong! Thanks for explaining this. I was aware that
optimisation was very difficult in the presence of pointers that can
point anywhere. I assumed optimisers were obliged to take the cautious
route unless sure.
Now I understand why some compilers don't optimise by default. I
thought it was just a speed issue.
James
> Dominik Zaczkowski <d...@pro.wp.pl> writes:
>
>> "christian.bau" <christ...@cbau.wanadoo.co.uk> writes:
>>
One final thing: Laszlo's example is valid only when the big_struct type is
complete, right?
If the compiler can assume that pointers to different types point to
different memory locations, and at the point where I'm casting a
big_struct pointer to a common struct pointer the internals of big_struct
aren't known, is the behaviour still undefined?
Now I see that the best way to solve my problem will be Ben's example;
however, I was hoping that I could keep big_struct incomplete. Well, I can't
have all I want. :)
> #include <stdio.h>
>
> struct big_struct { int a, b; double c; };
> struct small_struct { int a, b; };
>
> int main(void)
> {
> struct big_struct big;
> struct small_struct *otherp = (void *)&big;
> big.a = 0;
> otherp->a = 11;
> printf("big.a = %d\n", big.a);
> return 0;
> }
> After complaining about the aliasing, the program prints 0 (with -O2
> or above). This seems to me entirely permissible and exactly the kind
> of thing the OP needs to avoid.
Hmm.
Well, here's the interesting case: What happens if you do it through a union?
Perhaps more interesting, what happens if the assignment to 11 goes in another
file, that sees those two declarations, and a union declaration, but then
you don't actually use the union when setting up the aliasing?
> [Small point: the other thing that bothers me is that your persuasive
> arguments about what must happen even when the union is not present
> seem to be at odds with an example in the standard (6.5.2.3 p8
> Example 3, second part). Are you sure that example is pointless and
> that there are no circumstances in which a compiler might reasonably
> cause such code to fail? I can't see any such circumstances, but I
> never like to rely on my lack of imagination!]
I'm not sure at all. It may well be that the intent is that, unless you can
actually see the union, you are allowed to assume that there's no aliasing --
thus, if you wanted to pass either of these to a function, and allow it to
involve aliasing, you'd have to do it by storing a copy in a union. That
seems a bit odd, though.
> for the OP to rely on. Is gcc wrong here:
>
> #include <stdio.h>
>
> struct big_struct { int a, b; double c; };
> struct small_struct { int a, b; };
>
> int main(void)
> {
> struct big_struct big;
> struct small_struct *otherp = (void *)&big;
> big.a = 0;
> otherp->a = 11;
> printf("big.a = %d\n", big.a);
> return 0;
> }
>
> After complaining about the aliasing, the program prints 0 (with -O2
> or above). This seems to me entirely permissible and exactly the kind
> of thing the OP needs to avoid.
Why does it print 0? (I could only get 11 with several compilers including
mingw 3.4.5)
And what happens to the 11? Is it stored in the right place but gcc assumes
that location still contains 0? (Despite the fact that the address of big
has been taken just a couple of lines before, so there must be some pointers
to it floating around.)
I'd be more concerned with gcc producing the wrong results. The programmer
shouldn't need to jump through hoops to achieve something so basic.
--
Bartc
> "Ben Bacarisse" <ben.u...@bsb.me.uk> wrote in message
> news:0.d243db41f5155047f9fa.2010...@bsb.me.uk...
>
>> for the OP to rely on. Is gcc wrong here:
>>
>> #include <stdio.h>
>>
>> struct big_struct { int a, b; double c; };
>> struct small_struct { int a, b; };
>>
>> int main(void)
>> {
>> struct big_struct big;
>> struct small_struct *otherp = (void *)&big;
>> big.a = 0;
>> otherp->a = 11;
>> printf("big.a = %d\n", big.a);
>> return 0;
>> }
>>
>> After complaining about the aliasing, the program prints 0 (with -O2
>> or above). This seems to me entirely permissible and exactly the kind
>> of thing the OP needs to avoid.
>
> Why does it print 0? (I could only get 11 with several compilers
> including mingw 3.4.5)
It looks to me that the compiler is noting that big.a has been set to
zero, so "knows" that when it comes to printing big.a it can print 0.
And yet, it's spotting that things are happening, and complaining.
I wonder what it would do if the second line of main had read:
struct small_struct *otherp = (struct small_struct *)&big;
> And what happens to the 11? Is it stored in the right place but gcc
> assumes that location still contains 0? (Despite the fact that the
> address of big has been taken just a couple of lines before, so there
> must be some pointers to it floating around.)
That's my guess. But as nothing is ever done with the otherp->a,
there's no way of knowing.
> I'd be more concerned with gcc producing the wrong results. The
> programmer shouldn't need to jump through hoops to achieve something
> so basic.
Yes, this feels over-enthusiastic to me. But if it's producing a
warning you presumably can change the code until the warning goes away.
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk
> http://groups.google.com/group/comp.programming.threads/browse_frm/thread/63f6360d939612b3
>
> One example of a bug in GCC that breaks POSIX rules wrt mutex.
Thanks, very instructive thread.
Cheers,
lacos
> One final thing: Laszlo's example is valid only when the big_struct type is
> complete, right?
I can't recall such a requirement in the standard.
C99 6.7.2.1 "Structure and union specifiers", p13:
----v----
Within a structure object, the non-bit-field members and the units in
which bit-fields reside have addresses that increase in the order in
which they are declared. A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There
may be unnamed padding within a structure object, but not at its
beginning.
----^----
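
One consequence of that paragraph can be checked mechanically: there is no padding before the first member, so it sits at offset zero and shares its address with the enclosing struct. A minimal sketch using the small_struct/big_struct pair from earlier in the thread (the helper name is invented):

```c
#include <stddef.h>   /* offsetof */

struct small_struct { int a, b; };
struct big_struct { struct small_struct small; double c; };

/* Returns 1 when the struct and its initial member start at the same place. */
int first_member_at_start(const struct big_struct *big)
{
    return offsetof(struct big_struct, small) == 0
        && (const void *)big == (const void *)&big->small;
}
```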
> [If] compiler can assume that pointers to different types point to
> different memory locations,
I don't think this could be asserted in general.
C99 6.3.2.3 "Pointers", p7:
----v----
A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer is
not correctly aligned [...] for the pointed-to type, the behavior is
undefined. Otherwise, when converted back again, the result shall
compare equal to the original pointer. [...]
----^----
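{
void *p;
double *d;
int *i;
/* one allocation, large enough for either type */
p = malloc(sizeof *d > sizeof *i ? sizeof *d : sizeof *i);
i = p; /* i and d now point to the same location, */
d = p; /* although they point to different types */
}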
Cheers,
lacos
Looking at these two passages simultaneously, Dominik may be right,
because the upper citation talks about a pointer to a structure
*object*, and that is different from a pointer to an incomplete type, in
the wording of the lower passage. In that case, my example code in
<TQDJXwuFBlNP@ludens> is incorrect, and the public headers must make the
complete struct type definitions visible to the client translation
units.
lacos
> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
>
>> Dominik Zaczkowski <d...@pro.wp.pl> writes:
>>
>>> "christian.bau" <christ...@cbau.wanadoo.co.uk> writes:
>>>
No, I don't think that matters, but I am having a hard time getting to
grips with why.
> If compiler can assume that pointers to different types point to
> different memory locations, and at the point where I'm casting
> big_struct pointer to common struct pointer the internals of big_struct
> aren't known, behaviour is still undefined?
I think it is OK because the incomplete type might indeed start with
common struct so the compiler must behave as if it does.
> Now I see that best way to solve my problem will be Ben's example,
I didn't post an example solution (I know, so many words, and no real
help!); I just pointed out a problem. I think you have a solution: put
the common parts in a struct of their own and point to that. It
slightly complicates the access and can affect the layout of the data
but if these are not a problem go with that.
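A rough sketch of that suggestion, assuming the shared fields are the two ints from the original post. The names struct common, hdr, set_a and demo_common_header are invented here for illustration.

```c
struct common {
    int a, b;
};

struct big_struct {
    struct common hdr;   /* must be the first member */
    double c;
};

struct small_struct {
    struct common hdr;
};

/* Works for any struct whose first member is a struct common. */
void set_a(struct common *cp, int value)
{
    cp->a = value;
}

int demo_common_header(void)
{
    struct big_struct big = { { 0, 0 }, 0.0 };
    struct small_struct small = { { 0, 0 } };

    set_a(&big.hdr, 11);                 /* no cast needed */
    set_a((struct common *)&small, 22);  /* pointer to the initial member */
    return big.hdr.a + small.hdr.a;
}
```

The access becomes big.hdr.a rather than big.a, which is the slight complication mentioned above, but every access now goes through an lvalue of the type that is really stored there.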
> however I was hoping that I can keep big_struct incomplete.
I think that will work, but I'd wait and see what other people think.
<snip>
--
Ben.
> On 2010-02-27, Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
>> I've left everything because I am not sure what is key anymore and I
>> agree with almost everything you've said. The effect of your argument
>> seems to be that the mere possibility of a union means that pointers to
>> structs that share initial sequences must always be considered as
>> possible aliases to each other and that this, in practice, is enough
>> for the OP to rely on. Is gcc wrong here:
>
>> #include <stdio.h>
>>
>> struct big_struct { int a, b; double c; };
>> struct small_struct { int a, b; };
>>
>> int main(void)
>> {
>> struct big_struct big;
>> struct small_struct *otherp = (void *)&big;
>> big.a = 0;
>> otherp->a = 11;
>> printf("big.a = %d\n", big.a);
>> return 0;
>> }
>
>> After complaining about the aliasing, the program prints 0 (with -O2
>> or above). This seems to me entirely permissible and exactly the kind
>> of thing the OP needs to avoid.
>
> Hmm.
>
> Well, here's the interesting case: What happens if you do it through a union?
Unsurprisingly, that works and there is no warning because the
possibility of aliasing is now explicit.
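For reference, a minimal sketch of what "doing it through a union" might look like, with every access going through the union object itself. The function name demo_union is hypothetical; the struct and union declarations follow the ones used elsewhere in this thread.

```c
struct big_struct   { int a, b; double c; };
struct small_struct { int a, b; };

union either {
    struct big_struct b;
    struct small_struct s;
};

int demo_union(void)
{
    union either u;

    u.b.a = 0;
    u.b.b = 0;
    u.b.c = 0.0;
    u.s.a = 11;      /* write via the small view, through the union */
    return u.b.a;    /* read the common initial sequence via the big view */
}
```

The price is the one the OP was worried about: every object is as large as the largest member of the union.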
> Perhaps more interesting, what happens if the assignment to 11 goes in another
> file, that sees those two declarations, and a union declaration, but then
> you don't actually use the union when setting up the aliasing?
Like this:
#include <stdio.h>
struct big_struct { int a, b; double c; };
struct small_struct { int a, b; };
union either { struct big_struct b; struct small_struct s; };
extern void f(struct small_struct *sp);
int main(void)
{
struct big_struct big;
struct small_struct *otherp = (void *)&big;
big.a = 0;
f(otherp);
printf("big.a = %d\n", big.a);
return 0;
}
along with:
struct big_struct { int a, b; double c; };
struct small_struct { int a, b; };
union either { struct big_struct b; struct small_struct s; };
void f(struct small_struct *sp) { sp->a = 11; }
That works: no alias complaints and 11 is printed. This shows (to my
mind) that your arguments about what an optimiser must be prepared for
are correct, but it does not show that the mere presence of the union
forbids all aliasing assumptions in all cases. In fact, the half-way
house between using the union and separate compilation shows that gcc
at least does not share your view of 6.5.2.3 p5:
#include <stdio.h>
struct big_struct { int a, b; double c; };
struct small_struct { int a, b; };
union either { struct big_struct b; struct small_struct s; };
int main(void)
{
struct big_struct big;
struct small_struct *otherp = (void *)&big;
big.a = 0;
otherp->a = 11;
printf("big.a = %d\n", big.a);
return 0;
}
This prints 0 and I feel it is permitted to. The fact that separate
compilation makes the optimiser less confident about its aliasing
assumptions does not mean it is wrong to make them when it has more
information to go on.
<snip>
--
Ben.
It most definitely can't be asserted.
signed char a;
signed char *b = &a;
unsigned char *c = (unsigned char *)&a;
Both (b) and (c) point to the same byte.
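A small sketch of the same point as a function, assuming nothing beyond standard C; the name same_byte is made up. A store through the unsigned char pointer is visible through the signed char pointer, because signed and unsigned variants of a type may alias each other.

```c
int same_byte(void)
{
    signed char a = -1;
    signed char *b = &a;
    unsigned char *c = (unsigned char *)&a;

    *c = 0x7F;   /* store through the unsigned char pointer */

    /* both pointers refer to the same byte, and b sees the new value */
    return (void *)b == (void *)c && *b == 127;
}
```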
> >
> > C99 6.3.2.3 "Pointers", p7:
> >
> > ----v----
> > A pointer to an object or incomplete type
> > may be converted to a pointer
> > to a different object or incomplete type.
> > If the resulting pointer is
> > not correctly aligned [...] for the pointed-to type, the behavior is
> > undefined. Otherwise, when converted back again, the result shall
> > compare equal to the original pointer. [...]
> > ----^----
>
> Looking at these two passages simultaneously, Dominik may be right,
> because the upper citation talks about a pointer to a structure
> *object*, and that is different from a pointer to an incomplete type,
I disagree with that.
A pointer to void, is a pointer to an incomplete type.
A pointer to void, can point to a struct object.
> in
> the wording of the lower passage. In that case, my example code in
> <TQDJXwuFBlNP@ludens> is incorrect,
> and the public headers must make the
> complete struct type definitions visible to the client translation
> units.
>
> lacos
--
pete
> "Ben Bacarisse" <ben.u...@bsb.me.uk> wrote in message
> news:0.d243db41f5155047f9fa.2010...@bsb.me.uk...
>
>> for the OP to rely on. Is gcc wrong here:
>>
>> #include <stdio.h>
>>
>> struct big_struct { int a, b; double c; };
>> struct small_struct { int a, b; };
>>
>> int main(void)
>> {
>> struct big_struct big;
>> struct small_struct *otherp = (void *)&big;
>> big.a = 0;
>> otherp->a = 11;
>> printf("big.a = %d\n", big.a);
>> return 0;
>> }
>>
>> After complaining about the aliasing, the program prints 0 (with -O2
>> or above). This seems to me entirely permissible and exactly the kind
>> of thing the OP needs to avoid.
>
> Why does it print 0? (I could only get 11 with several compilers
> including mingw 3.4.5)
>
> And what happens to the 11? Is it stored in the right place but gcc
> assumes that location still contains 0? (Despite the fact that the
> address of big has been taken just a couple of lines before, so there
> must be some pointers to it floating around.)
I don't know what happens to the 11 but the compiler can assume that
access through 'otherp' can't change 'big'. This is a limited special
case where gcc can both make the assumption and warn you about it at
the same time. If you don't get a warning, then it probably means
that the gcc you have does not do optimisations based on strict
aliasing rules.
> I'd be more concerned with gcc producing the wrong results. The
> programmer shouldn't need to jump through hoops to achieve something
> so basic.
That's debatable. Is messing with the types of pointers basic? Is
using a union or putting the common elements into a struct to which
one can point "jumping through hoops"? Maybe. You are not alone in
thinking that this is an error in the language but I don't think it
is an error in gcc (for one thing, you can turn it off).
--
Ben.
<SNIP>
>> >> [If] compiler can assume that pointers to different types point to
>> >> different memory locations,
>> >
>> > I don't think this could be asserted in general.
>
> It most definitely can't be asserted.
>
> signed char a;
> signed char *b = &a;
> unsigned char *c = (unsigned char *)&a;
>
> Both (b) and (c) point to the same byte.
>
I said "If compiler can assume". In the above example it can't, but what
about different struct types?
If it can, then I see no way how this could be OK:
#include "new_a.h"
struct A;
struct B {
int a, b;
};
struct A *aptr = new_A();
struct B *bptr = (struct B *)aptr;
Or maybe the compiler is permitted to assume that struct pointers with
different types point to different locations only when both struct types
are complete and B is not the first member of A?
>> > C99 6.7.2.1 "Structure and union specifiers", p13:
>> >
>> > ----v----
>> > Within a structure object,
>> > the non-bit-field members and the units in
>> > which bit-fields reside have addresses that increase in the order in
>> > which they are declared. A pointer to a structure object, suitably
>> > converted, points to its initial member (or if that member is a
>> > bit-field, then to the unit in which it resides),
>> > and vice versa. There
>> > may be unnamed padding within a structure object, but not at its
>> > beginning.
>> > ----^----
>> > C99 6.3.2.3 "Pointers", p7:
>> >
>> > ----v----
>> > A pointer to an object or incomplete type
>> > may be converted to a pointer
>> > to a different object or incomplete type.
>> > If the resulting pointer is
>> > not correctly aligned [...] for the pointed-to type, the behavior is
>> > undefined. Otherwise, when converted back again, the result shall
>> > compare equal to the original pointer. [...]
>> > ----^----
>>
>> Looking at these two passages simultaneously, Dominik may be right,
>> because the upper citation talks about a pointer to a structure
>> *object*, and that is different from a pointer to an incomplete type,
>
> I disagree with that.
> A pointer to void, is a pointer to an incomplete type.
> A pointer to void, can point to a struct object.
I like this. Perhaps:
- when 6.7.2.1p13 says "pointer to a structure object", it means that
the memory pointed to by the pointer actually holds a structure object,
- while when 6.3.2.3p7 says "pointer to an (object|incomplete) type",
ie. the word "object" is not standalone but part of the expression
"object type", it talks about the type of the pointer.
Or something like that.
Thanks,
lacos
If A and B are different complete struct types,
then pointers to each type
can both point to the same object allocated by malloc,
as long as the object is big enough.
--
pete
> Seebs <usenet...@seebs.net> writes:
>
> [snip,snip,snip]
>
> In fact, the half-way
> house between using the union and separate compilation shows that gcc
> at least does not share your view of 6.5.2.3 p5:
>
> #include <stdio.h>
>
> struct big_struct { int a, b; double c; };
> struct small_struct { int a, b; };
> union either { struct big_struct b; struct small_struct s; };
>
> int main(void)
> {
> struct big_struct big;
> struct small_struct *otherp = (void *)&big;
> big.a = 0;
> otherp->a = 11;
> printf("big.a = %d\n", big.a);
> return 0;
> }
>
> This prints 0 and I feel it is permitted to. [snip]
Right. There is no union object that contains big, and the
compiler knows that, so the provisions of 6.5.2.3p5 don't come
into play. Notice the exact wording: "if a union contains
several structures [...], and if the union object currently
contains one of these structures [...]". That's an actual
object, not just a type; hence, no union object means what
follows doesn't apply.
> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
>
>> Dominik Zaczkowski <d...@pro.wp.pl> writes:
>>
>>> "christian.bau" <christ...@cbau.wanadoo.co.uk> writes:
>>>
I don't see the point of the question. If one pointer points to an
incomplete type, no accesses can be made through that pointer, so
there needn't be any concern about aliasing. How can this question
be relevant to what you're trying to do? More specifically, can
you give a short (but complete) example that illustrates the
situation you're asking about?
Clearly you're having difficulty expressing the question
you really want to ask. Certainly the compiler is _not_
permitted to assume that 'aptr' and 'bptr' point to
different locations, since they do in fact point to
the same location (modulo the usual qualifications
about alignment). May I suggest phrasing your question
in terms of "how is code <X> allowed to behave?" rather
than "what is the compiler permitted to assume?"? I
think that would sharpen your sense of just what it
is you're trying to ask.
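One way to phrase it as "how is code <X> allowed to behave?" is the sketch below. It ASSUMES that the implementation completes struct A with a struct B as its first member, which the earlier fragment does not say. The names client_set_a, demo_incomplete and the member name common are invented here, and new_A is given a body only so the example is self-contained. Under that assumption the conversion falls under 6.7.2.1p13 and the access through bp is an access to a real struct B object.

```c
#include <stdlib.h>

struct B { int a, b; };

struct A;                  /* incomplete here, as a client would see it */
struct A *new_A(void);     /* constructor named in the question */

/* "Client" code: it cannot write ap->anything, but it may convert. */
int client_set_a(struct A *ap, int value)
{
    struct B *bp = (struct B *)ap;
    bp->a = value;
    return bp->a;
}

/* "Implementation" side (assumption): struct A begins with a struct B. */
struct A {
    struct B common;
    double c;
};

struct A *new_A(void)
{
    struct A *ap = malloc(sizeof *ap);
    if (ap != NULL) {
        ap->common.a = 0;
        ap->common.b = 0;
        ap->c = 0.0;
    }
    return ap;
}

/* Returns the value seen through the complete type, or -1 if malloc failed. */
int demo_incomplete(void)
{
    struct A *ap = new_A();
    int result;

    if (ap == NULL)
        return -1;
    client_set_a(ap, 11);
    result = ap->common.a;
    free(ap);
    return result;
}
```

If struct A merely had two leading ints instead of a struct B member, this sketch would be back in the situation discussed earlier in the thread, and nothing here should be read as covering that case.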
Right on both counts. (And in 6.7.2.1p13, that the pointer
actually does point to some object, ie, is not a null pointer.)
> (And in 6.7.2.1p13, that the pointer
> actually does point to some object, ie, is not a null pointer.)
Thank you for your help with interpreting this (and the topic of the
other thread as well).
Cheers,
lacos