
Struct tm ?


Waranun Bunjongsat

Aug 17, 1999

Hi,

I have read in gnu standard C that the data in "struct tm" can be in any
order as long as it covers all of the data in the specification. That is, it
can be

struct tm {
int sec;
int min;
...
int mon;
int year;
};

or it can be

struct tm {
int year;
int mon;
....
int min;
int sec;
};

In my understanding, this depends on each implementation, e.g. GNU
C and Turbo C may implement struct tm differently, as in the examples
above. (ANSI allows this.) However, I suspect that this could cause
compatibility problems. For example, if a user defines a data structure
or a data manipulator that interacts with "struct tm" at the binary
level, the data structure will not work the same way across compilers.
The simplest case is when a user wants to use data that was processed
and stored by a program compiled with one compiler in another build of
the same program compiled with another compiler. The same program
compiled with different compilers may not work the same way on the same
data.

I do not know whether I have misunderstood ANSI C on this point.

Any insight or comments would be much appreciated,

Regards,
--
Waranun,


[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]


Steve Clamage

Aug 17, 1999
"Waranun Bunjongsat" <bunj...@earthlink.net> writes:

>I have read in gnu standard C that the data in "struct tm" can be in any
>order as long as it covers all of the data in the specification.

Right.

>In my understanding, this will be depended on each implementation, e.g. gnu
>C and turbo C may implement struct tm differently, as above example.

Right.

>... if user defines a data structure or a data manipulator
>which will interact with "struct tm" in a binary level, the data structure
>will not work the same across the compiler. Or the simplest case is when
>user want to use the data that was processed and stored by program compiled
>on one compiler, with another version of the same program compiled on
>another compiler. The two same programs compiled on different compilers may
>not work the same on the same data.

Right.

But the format of the tm struct is only one of the problems
in linking code from different compilers.

Here is a small sample of the things that can differ among C
compilers regarding generated code:

    The sizes of the basic types
    The representation of the basic types, especially floating point
    Alignment of the basic types
    Alignment of struct types (sometimes structs have special alignment)
    Calling sequence for functions, especially how parameters are
        passed and values returned
    Conventions for external names (e.g., added leading underscore)

For C++ (you cross-posted to a C++ newsgroup) the list is much,
much, much longer.

If you intend to link code from different compilers, they must
agree on what is usually called an Application Binary Interface,
which specifies all of those details.

The C standard library would be part of the ABI, and compilers
that conform to the ABI would use all the same interfaces --
including the layout of struct tm.
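
A few of the items on the list above can be inspected for one particular compiler by printing them. The following is only a minimal sketch: the function name report_layout is made up for illustration, and it assumes a C11 compiler for _Alignof. Two compilers that follow the same ABI would be expected to print the same values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: writes a few layout facts that an ABI would fix.
 * Returns the number of lines written. Assumes C11 for _Alignof. */
int report_layout(FILE *out)
{
    int lines = 0;

    assert(out != NULL);
    lines += fprintf(out, "sizeof(int)                  = %lu\n",
                     (unsigned long)sizeof(int)) > 0;
    lines += fprintf(out, "sizeof(long)                 = %lu\n",
                     (unsigned long)sizeof(long)) > 0;
    lines += fprintf(out, "sizeof(double)               = %lu\n",
                     (unsigned long)sizeof(double)) > 0;
    lines += fprintf(out, "_Alignof(double)             = %lu\n",
                     (unsigned long)_Alignof(double)) > 0;
    lines += fprintf(out, "sizeof(struct tm)            = %lu\n",
                     (unsigned long)sizeof(struct tm)) > 0;
    lines += fprintf(out, "offsetof(struct tm, tm_year) = %lu\n",
                     (unsigned long)offsetof(struct tm, tm_year)) > 0;
    return lines;
}
```

Calling report_layout(stdout) from programs built with two different compilers and comparing the output gives a rough first check, though it covers only a small part of what an ABI specifies (calling sequences and external names are not visible this way).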

--
Steve Clamage, stephen...@sun.com
---

David R Tribble

Aug 18, 1999

Assuming that the code produced by all of those vendors calls
the same (system-supplied) library, which is not typically the
case. This is especially true for things like struct tm, which are
implemented by each vendor as a library shipped with its compiler;
it is entirely plausible that every vendor implements struct tm
differently, just as they are free to implement the FILE and time_t
types as they see fit. Only system interface types (such as the
arguments to a system-specific open() function, for example) need to
adhere to any kind of ABI.

(This assumes that you're not attempting to run a program compiled
by vendor A with vendor B's interface libraries. Most C++ compilers,
in fact, will not let you do this.)

-- David R. Tribble, da...@tribble.com --

Waranun Bunjongsat

Aug 18, 1999

Hmmm, the point about ABIs is interesting. However, I am not looking
as far as the binary interface. Simply, at the programming level, I
can't touch the layout of struct tm at all, because I do not know the
specific order of its members.

For example, if I want to operate only on the year, month and day,
then to save space I would not want to keep the whole struct tm in
memory and in storage. If I knew the order of struct tm's members, I
could store and retrieve the year, month and day directly in a dummy
struct tm and pass it to the standard stream or locale facilities to
manipulate it as usual. However, since I do not know the order of the
members, I do not even know how struct tm would be written in C struct
syntax, so I have to copy my data into the dummy struct tm one member
at a time, for example tm.year=mystruct.year, and so on.

Moreover, if I have more combinations of year, month, day, hour, min,
sec, weekday, yearday, and so on, I have to do the same for each of
them, and write the program (classes/objects) differently for each one.
If I could rely on the order of the members in struct tm, I could use
the offset of my data relative to the start of struct tm to store and
retrieve the data directly. Then I would need only one program for,
say, 20 combinations of date/time data. Otherwise, I have to either do
a less efficient conversion in order to use one class for all the data,
or create 20 classes. I was just surprised that I can't even write down
struct tm, which is a standard C type, in C struct syntax.
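
For reference, the member-by-member copy described above can be sketched as follows. This is only an illustration under assumptions: the compact record struct ymd and the function ymd_to_tm are hypothetical names, and the real member names in <time.h> are tm_year, tm_mon and tm_mday. Because each member is addressed by name, the code does not depend on the order of members inside struct tm.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical compact record holding only the fields the program stores. */
struct ymd {
    short       year;   /* full year, e.g. 1999 */
    signed char mon;    /* 1..12 */
    signed char mday;   /* 1..31 */
};

/* Expand a compact date into a struct tm by naming each member, so the
 * result is independent of the member order chosen by the library. */
struct tm ymd_to_tm(struct ymd d)
{
    struct tm t = {0};            /* members not set below start at zero */

    assert(d.mon >= 1 && d.mon <= 12);
    t.tm_year  = d.year - 1900;   /* tm_year counts years since 1900 */
    t.tm_mon   = d.mon - 1;       /* tm_mon runs from 0 to 11 */
    t.tm_mday  = d.mday;
    t.tm_isdst = -1;              /* let mktime() work out daylight saving */
    return t;
}
```

The resulting struct tm can then be handed to mktime() or strftime() as usual.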

Regards,

Waranun,

Paul Jarc

Aug 18, 1999

Waranun Bunjongsat <bunj...@trek.CS.ORST.EDU> writes:
> However, if I can expect the order of the data in struct tm, I would
> just be able to use the offset of my data related with the head of
> struct tm to store and retrieve data directly into struct tm.

You can do that with `offsetof(struct tm, tm_year)', etc. But the
value of this offset may be different from one platform to another.
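
A minimal sketch of how offsetof() could drive a single routine for many field combinations is given here. The enum tm_field and the functions tm_set and tm_get are names invented for this illustration; only offsetof() and struct tm come from the standard headers. The offsets are computed by the compiler for the platform at hand, so nothing about the member order is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Field identifiers chosen for this sketch; they are not part of <time.h>. */
enum tm_field { F_YEAR, F_MON, F_MDAY, F_HOUR, F_MIN, F_SEC, F_COUNT };

/* One offset per field, evaluated at compile time for this implementation. */
static const size_t tm_offset[F_COUNT] = {
    offsetof(struct tm, tm_year),
    offsetof(struct tm, tm_mon),
    offsetof(struct tm, tm_mday),
    offsetof(struct tm, tm_hour),
    offsetof(struct tm, tm_min),
    offsetof(struct tm, tm_sec)
};

/* Store 'value' into the selected int member of *t. */
void tm_set(struct tm *t, enum tm_field f, int value)
{
    assert((int)f >= 0 && f < F_COUNT);
    *(int *)((char *)t + tm_offset[f]) = value;
}

/* Read the selected int member of *t. */
int tm_get(const struct tm *t, enum tm_field f)
{
    assert((int)f >= 0 && f < F_COUNT);
    return *(const int *)((const char *)t + tm_offset[f]);
}
```

A record that stores only some fields can then carry a list of enum tm_field values and loop over it, instead of needing one hand-written conversion per combination.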


paul

Greg Brewer

Aug 18, 1999

Paul Jarc <p...@po.cwru.edu> wrote in message
news:m3emh04...@multivac.student.cwru.edu...


> You can do that with `offsetof(struct tm, tm_year)', etc. But the
> value of this offset may be different from one platform to another.

That's an interesting one that I had never heard of before. I'll file it away
for future use.

Greg Brewer

Steve Clamage

Aug 19, 1999
Waranun Bunjongsat <bunj...@trek.CS.ORST.EDU> writes:

>On 17 Aug 1999, Steve Clamage wrote:

>> "Waranun Bunjongsat" <bunj...@earthlink.net> writes:
>>
>> >I have read in gnu standard C that the data in "struct tm" can be in any
>> >order as long as it covers all of the data in the specification.
>>
>> Right.

>> ...


>>
>> If you intend to link code from different compilers, they must
>> agree on what is usually called an Application Binary Interface,
>> which specifies all of those details.
>>
>> The C standard library would be part of the ABI, and compilers
>> that conform to the ABI would use all the same interfaces --
>> including the layout of struct tm.

> Hmmm, it is interesting about ABI. However, I'm looking so far
>until the binary interface. Simply, at the programming level, I can't
>touch struct tm at all. This is because I do not know the specific order
>of its data.

You don't need to know the order. On all implementations that
conform to the same ABI the order is the same. You can dump the
binary contents of a tm object to a file, and read it back in.
As long as the reading program conforms to the same ABI as the
writing program, everything is fine.

By definition, an ABI assures binary compatibility of object code
and data.

If you need to read the data using something else, you have more
than differing struct offsets to worry about. You also need to
worry about the sizes, alignment, representation, and byte order
of values stored in each field of the tm object.
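
One way to sidestep all of those concerns is to write each field explicitly as octets in a fixed order. The sketch below is an illustration only: the 12-octet wire format, the constant TM_WIRE_SIZE and the functions tm_encode and tm_decode are invented here, and each field is assumed to fit in 16 bits.

```c
#include <assert.h>
#include <time.h>

/* Wire format invented for this sketch: six fields, two octets each,
 * most significant octet first, two's complement. */
#define TM_WIRE_SIZE 12

static void put16(unsigned char *p, int v)
{
    unsigned long u = (unsigned long)(long)v & 0xFFFFUL;

    p[0] = (unsigned char)(u >> 8);
    p[1] = (unsigned char)(u & 0xFFUL);
}

static int get16(const unsigned char *p)
{
    long v = ((long)p[0] << 8) | (long)p[1];

    if (v >= 0x8000L)       /* restore the sign of negative values */
        v -= 0x10000L;
    return (int)v;
}

/* Write year, month, day, hour, minute, second in that fixed order. */
void tm_encode(const struct tm *t, unsigned char out[TM_WIRE_SIZE])
{
    assert(t != NULL && out != NULL);
    put16(out + 0,  t->tm_year);
    put16(out + 2,  t->tm_mon);
    put16(out + 4,  t->tm_mday);
    put16(out + 6,  t->tm_hour);
    put16(out + 8,  t->tm_min);
    put16(out + 10, t->tm_sec);
}

/* Read the six fields back; other members of *t are left untouched. */
void tm_decode(const unsigned char in[TM_WIRE_SIZE], struct tm *t)
{
    assert(in != NULL && t != NULL);
    t->tm_year = get16(in + 0);
    t->tm_mon  = get16(in + 2);
    t->tm_mday = get16(in + 4);
    t->tm_hour = get16(in + 6);
    t->tm_min  = get16(in + 8);
    t->tm_sec  = get16(in + 10);
}
```

Since the octet order and field widths are fixed by the code rather than by the compiler, a file written this way does not depend on the layout of struct tm, the size of int, or the byte order of the machine that wrote it.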

--
Steve Clamage, stephen...@sun.com
---

Steve Clamage

Aug 19, 1999
David R Tribble <da...@tribble.com> writes:

>Steve Clamage wrote:
>>
>> If you intend to link code from different compilers, they must
>> agree on what is usually called an Application Binary Interface,
>> which specifies all of those details.
>>
>> The C standard library would be part of the ABI, and compilers
>> that conform to the ABI would use all the same interfaces --
>> including the layout of struct tm.

>Assuming that the code produced by all of those vendors calls
>the same (system-supplied) library, which is not typically the
>case.

No, by definition, an ABI assures binary compatibility of
object code and data. The C ABI must include all of the C
standard library interfaces.

>This is especially true for things like struct tm, which are
>implemented by each vendor as a library shipped with its compiler;
>it is entirely plausible that every vendor implements struct tm
>differently, just as they are free to implement the FILE and time_t
>types as they see fit.

If they conform to the same ABI, they must be identical at the
binary level -- by definition of ABI.

For example, there is a published ABI for Solaris on each supported
platform. Sun provides ABI-conforming C implementations, and so do
various third-party vendors. You can take object code modules
compiled by any combination of these compilers and link them
together without difficulty.

> Only system interface types (such as the
>arguments to a system-specific open() function, for example) need to
>adhere to any kind of ABI.

You are thinking of an API, an Application Programming Interface.

>(This assumes that you're not attempting to run a program compiled
>by vendor A with vendor B's interface libraries. Most C++ compilers,
>in fact, will not let you do this.)

That is because there aren't any ABIs for C++ implementations yet.

David R Tribble

Aug 19, 1999
Paul Jarc wrote:
>
> Waranun Bunjongsat <bunj...@trek.CS.ORST.EDU> writes:
>> However, if I can expect the order of the data in struct tm, I would
>> just be able to use the offset of my data related with the head of
>> struct tm to store and retrieve data directly into struct tm.
>
> You can do that with `offsetof(struct tm, tm_year)', etc. But the
> value of this offset may be different from one platform to another.

Here's a little program I whipped up which deduces the ordering of
the members of struct tm, then prints out a compatible structure;
it uses the offsetof() macro to do its magic. You can use it if
you truly need to know in what order the members of struct tm occur
for the given platform at hand.

It's questionable, though, of what value this will be to you, since
there's not a lot of usefulness in knowing the order, even when
writing/reading the struct to files. If I were writing the contents
of a struct to a (binary) file that was meant to be portable across
various systems, I'd use a function that wrote the members out
(as octets) in a hardcoded order using hardcoded sizes and endianness.

Anyway, here is the program:

/*=================================================================
* tm.c
* Determines the ordering of the members of the standard 'tm'
* struct.
*
* Written by David R. Tribble, 1999-08-19.
* This code is in the public domain.
*/

/* System includes */

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*-----------------------------------------------------------------
* Local types
*/

struct Memb
{
    int             off;
    const char *    name;
};

/*-----------------------------------------------------------------
* cmpf()
*/

static int cmpf(const void *ap, const void *bp)
{
    const struct Memb * a = (const struct Memb *)ap;
    const struct Memb * b = (const struct Memb *)bp;

    return (a->off - b->off);
}


/*-----------------------------------------------------------------
* main()
*/

int main(void)
{
    struct Memb m[20];
    int c;

    /* Get the individual struct member offsets */
    {
        c = 0;

        m[c].name = "tm_year";
        m[c].off = (int)offsetof(struct tm, tm_year);
        c++;

        m[c].name = "tm_mon";
        m[c].off = (int)offsetof(struct tm, tm_mon);
        c++;

        m[c].name = "tm_mday";
        m[c].off = (int)offsetof(struct tm, tm_mday);
        c++;

        m[c].name = "tm_wday";
        m[c].off = (int)offsetof(struct tm, tm_wday);
        c++;

        m[c].name = "tm_yday";
        m[c].off = (int)offsetof(struct tm, tm_yday);
        c++;

        m[c].name = "tm_hour";
        m[c].off = (int)offsetof(struct tm, tm_hour);
        c++;

        m[c].name = "tm_min";
        m[c].off = (int)offsetof(struct tm, tm_min);
        c++;

        m[c].name = "tm_sec";
        m[c].off = (int)offsetof(struct tm, tm_sec);
        c++;

        m[c].name = "tm_isdst";
        m[c].off = (int)offsetof(struct tm, tm_isdst);
        c++;
    }

    /* Print the struct members in order of their offsets */
    {
        int i;
        int o;

        /* Sort the members by their offsets */
        qsort(m, c, sizeof(m[0]), cmpf);

        /* Print the sorted members */
        printf("struct tm\n");
        printf("{\n");

        o = 0;
        for (i = 0; i < c; i++)
        {
            /* Fill any hole before this member with dummy ints */
            while (o < m[i].off)
            {
                printf("    int\t\t_%d;\n", (int)(o/sizeof(int)));
                o += sizeof(int);
            }

            printf("    int\t\t%s;\t/* +%d\t*/\n",
                m[i].name, m[i].off);
            o += sizeof(int);
        }

        /* Fill any space after the last known member */
        while (o < (int)sizeof(struct tm))
        {
            printf("    int\t\t_%d;\n", (int)(o/sizeof(int)));
            o += sizeof(int);
        }

        printf("};\n\n");
    }

    return (0);
}

/* End tm.c */

Enjoy.

Douglas A. Gwyn

Aug 20, 1999
David R Tribble wrote:
> Here's a little program I whipped up which deduces the ordering of
> the members of struct tm, then prints out a compatible structure;
> it uses the offsetof() macro to do its magic.

One could also do this without offsetof(), by declaring a
struct tm object and differencing (char *)-cast pointers
between a selected reference member and each member.
Sort the differences (along with attached labels) into
ascending numerical order and you have reproduced the
order of the known members of struct tm.
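
A minimal sketch of that pointer-differencing approach follows. The function name tm_distances, the constant TM_MEMBERS, and the choice of tm_sec as the reference member are assumptions made for this illustration. Distances can be negative when a member precedes the reference member.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define TM_MEMBERS 9    /* the nine members the standard requires */

/* Fill dist[] with the signed byte distance from tm_sec (the reference
 * member chosen for this sketch) to each standard member, in the order
 * sec, min, hour, mday, mon, year, wday, yday, isdst. */
void tm_distances(ptrdiff_t dist[TM_MEMBERS])
{
    struct tm t;    /* only addresses are taken; no member is read */
    const char *ref = (const char *)&t.tm_sec;

    assert(dist != NULL);
    dist[0] = (const char *)&t.tm_sec   - ref;
    dist[1] = (const char *)&t.tm_min   - ref;
    dist[2] = (const char *)&t.tm_hour  - ref;
    dist[3] = (const char *)&t.tm_mday  - ref;
    dist[4] = (const char *)&t.tm_mon   - ref;
    dist[5] = (const char *)&t.tm_year  - ref;
    dist[6] = (const char *)&t.tm_wday  - ref;
    dist[7] = (const char *)&t.tm_yday  - ref;
    dist[8] = (const char *)&t.tm_isdst - ref;
}
```

Sorting these distances together with their labels, for example with qsort(), yields the member order for the implementation at hand.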

Kai Henningsen

Aug 23, 1999
da...@tribble.com (David R Tribble) wrote on 19.08.99 in <37BC73F1...@tribble.com>:

> while (o < i * sizeof(int))
> {
> printf(" int\t\t_%d;\n", o/sizeof(int));
> o += sizeof(int);
> }

Better use char; I don't think the standard guarantees all known members
are aligned to sizeof(int).

Kai
--
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
- Russ Allbery (r...@stanford.edu)

David R Tribble

Aug 23, 1999
Kai Henningsen wrote:
>
> da...@tribble.com (David R Tribble) wrote
>> while (o < i * sizeof(int))
>> {
>> printf(" int\t\t_%d;\n", o/sizeof(int));
>> o += sizeof(int);
>> }
>
> Better use char; I don't think the standard guarantees all known
> members are aligned to sizeof(int).

It guarantees that they are all int, so they'd better be aligned
on int boundaries.

But yes, I probably was overly clever using 'o/sizeof(int)' instead
of simply 'o'. But in any case, the names produced will still be
unique.

Paul Jarc

Aug 24, 1999
David R Tribble <da...@tribble.com> writes:
> Kai Henningsen wrote:
> > Better use char; I don't think the standard guarantees all known
> > members are aligned to sizeof(int).
>
> It guarantees that they are all int, so they'd better be aligned
> on int boundaries.

They'd better be aligned however ints need to be aligned. Padding
could consist of arbitrary numbers of bytes, if that wouldn't cause
problems. (It would cause problems on some implementations, but
others can use weird offsets.)


paul

David R Tribble

Aug 24, 1999

A struct that contains int members must, by definition, be aligned
on a boundary at least as restrictive as int alignment. Its int
members must, by definition, be aligned on int boundaries.

So dividing the offset of an int member within its struct by
sizeof(int), e.g., 'offsetof(struct tm, tm_year) / sizeof(int)',
is well-defined, and should always result in a quotient with no
remainder. Padding bytes following the int members within the
struct don't affect this, nor do the types and alignments of any
members that may precede them.

Paul Jarc

Aug 24, 1999
David R Tribble <da...@tribble.com> writes:
> A struct that contains int members must, by definition, be aligned
> on a boundary at least as restrictive as int alignment. Its int
> members must, by definition, be aligned on int boundaries.

`int boundaries', yes, if you mean `whatever alignment is required for
int *on each particular implementation*'.

> So dividing the offset of an int member within its struct by
> sizeof(int), e.g., 'offsetof(struct tm, tm_year) / sizeof(int)',
> is well-defined, and should always result in a quotient with no
> remainder.

There can be a non-zero remainder. If you're using C to implement
structs yourself, and you want to do it portably, then you have to
align objects on multiples of their size. But an implementation is
not bound by this, if it knows that its platform (hardware, OS, or
whatever's involved) does not require it. Assuming greater-than-1-
byte ints, and assuming the hardware does not have any alignment
restrictions, and assuming struct tm contains
{ char extra; int tm_year; /*...*/ }
then offsetof(struct tm, tm_year) can be 1 *on this implementation*.
Other platforms may require it to be sizeof(int), but that doesn't
apply to this implementation.

I wonder - suppose I do:
struct align { char pad; int align_int; };
Now, if offsetof(struct align, align_int)==1, am I free to put ints
anywhere? (I.e., *(int*)((char*)malloced_ptr+1)=42; and such?)
More generally:
struct align { char pad; T align_T; };
*(T*)((char*)malloced_ptr+n*offsetof(struct align, align_T))=/*...*/;
Is this strictly conforming?


paul

Dave Hansen

Aug 24, 1999
On Tue, 24 Aug 1999 13:03:44 -0500, David R Tribble
<da...@tribble.com> wrote:

[...]


>A struct that contains int members must, by definition, be aligned
>on a boundary at least as restrictive as int alignment. Its int
>members must, by definition, be aligned on int boundaries.

But what does that mean, really? Consider a system that has no
restrictions on data alignment (i.e., all data is "byte-aligned"). If
you can't think of any, I can. In such a system, given

    struct foo{
        char f1;
        int f2;
    };

offsetof(struct foo, f1) is 1.

Regards,

-=Dave
Just my (10-010) cents
I can barely speak for myself, so I certainly can't speak for B-Tree.
Change is inevitable. Progress is not.

Dave Hansen

Aug 24, 1999
On Tue, 24 Aug 1999 19:48:07 GMT, dha...@btree.com (Dave Hansen)
wrote:

>On Tue, 24 Aug 1999 13:03:44 -0500, David R Tribble
><da...@tribble.com> wrote:
>
>[...]
>>A struct that contains int members must, by definition, be aligned
>>on a boundary at least as restrictive as int alignment. Its int
>>members must, by definition, be aligned on int boundaries.
>
>But what does that mean, really? Consider a system that has no
>restrictions on data alignment (i.e., all data is "byte-aligned"). If
>you can't think of any, I can. In such a system, given
>
> struct foo{
> char f1;
> int f2;
> };
>
>offsetof(struct foo, f1) is 1.

Aargh. TSB offsetof(struct foo, f2). Sorry.


>
>Regards,
>
> -=Dave
>Just my (10-010) cents
>I can barely speak for myself, so I certainly can't speak for B-Tree.
>Change is inevitable. Progress is not.

ditto

Nick Maclaren

Aug 24, 1999
In article <37C2DE80...@tribble.com>,

David R Tribble <da...@tribble.com> wrote:
>Paul Jarc wrote:
>>
>> David R Tribble <da...@tribble.com> writes:
>>> Kai Henningsen wrote:
>>> > Better use char; I don't think the standard guarantees all known
>>> > members are aligned to sizeof(int).
>>>
>>> It guarantees that they are all int, so they'd better be aligned
>>> on int boundaries.
>>
>> They'd better be aligned however ints need to be aligned. Padding
>> could consist of arbitrary numbers of bytes, if that wouldn't cause
>> problems. (It would cause problems on some implementations, but
>> others can use weird offsets.)
>
>A struct that contains int members must, by definition, be aligned
>on a boundary at least as restrictive as int alignment. Its int
>members must, by definition, be aligned on int boundaries.

All right, here goes Nick the Nit Picker. Consider a machine
where int is size 4 and alignment 0 mod 4. Is there anything in
the standard that forbids

typedef struct {char a; int b;} fred;

to be size 8 and alignment 3 mod 4 with offsetof(fred,b) 1?
Note that this does NOT break the array packing rules.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email: nm...@cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

David R Tribble

Aug 24, 1999
Steve Clamage wrote:
>
> David R Tribble <da...@tribble.com> writes:
>
> >Steve Clamage wrote:
> >>
> >> If you intend to link code from different compilers, they must
> >> agree on what is usually called an Application Binary Interface,
> >> which specifies all of those details.
> >>
> >> The C standard library would be part of the ABI, and compilers
> >> that conform to the ABI would use all the same interfaces --
> >> including the layout of struct tm.
>
> >Assuming that the code produced by all of those vendors calls
> >the same (system-supplied) library, which is not typically the
> >case.
>
> No, by definition, an ABI assures binary compatibility of
> object code and data. The C ABI must include all of the C
> standard library interfaces.

This is surprising, since this means that the contents of struct tm,
FILE, and others can't be improved upon by other compiler/library
vendors. But after some thought, I see that makes sense - it
allows programs compiled by various compilers to all make use of the
same standard libraries.

(And yes, I was confusing "ABI" with "API".)

-- David R. Tribble, da...@tribble.com --

Larry Jones

Aug 24, 1999
Nick Maclaren (nm...@cus.cam.ac.uk) wrote:
>
> All right, here goes Nick the Nit Picker. Consider a machine
> where int is size 4 and alignment 0 mod 4. Is there anything in
> the standard that forbids
>
> typedef struct {char a; int b;} fred;
>
> to be size 8 and alignment 3 mod 4 with offsetof(fred,b) 1?

I think you'd have a hard time implementing malloc in such a system.

-Larry Jones

It must be sad being a species with so little imagination. -- Calvin

Paul Jarc

Aug 24, 1999
nm...@cus.cam.ac.uk (Nick Maclaren) writes:
> All right, here goes Nick the Nit Picker. Consider a machine
> where int is size 4 and alignment 0 mod 4. Is there anything in
> the standard that forbids
>
> typedef struct {char a; int b;} fred;
>
> to be size 8 and alignment 3 mod 4 with offsetof(fred,b) 1?
> Note that this does NOT break the array packing rules.

I think it's forbidden for Useful Implementations. :) There would be
implications for malloc() and friends: they need not ever succeed, but
if they do, they return a pointer suitably aligned for any type,
including int (here, 0 mod 4) and fred (here, 3 mod 4). This is
impossible, so malloc cannot succeed on this implementation.
Hence the implementation is not Useful, Q.E.D. Note that the
definition of Usefulness is subject to my whim. :)

More generally, for malloc() to be able to succeed: for all types T
with alignment R(T) mod N(T), there must be an address A (hopefully,
there will be many) equivalent to R(T) modulo N(T). I.e., malloc can
return only those A such that for all T, there exists K(T) such that
A=R(T)+K(T)*N(T). If R(T)=0 for all T, then malloc can return any
multiple of the least common multiple of all N(T). Otherwise, I think
the Chinese Remainder Theorem might be useful, but I don't remember
exactly. A is unique modulo the least common multiple of all N(T), or
something like that.
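
The requirement that one address satisfy every type at once can be spot-checked on a given implementation. The sketch below is only an illustration under assumptions: the function names are made up, it needs C11 for _Alignof, it tests just a handful of types, and it assumes that converting a pointer to uintptr_t yields the byte address, which holds on common flat-memory platforms but is not promised by the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Nonzero if address p is a multiple of 'align' (assumes a flat address
 * space where the uintptr_t value is the byte address). */
static int is_aligned(const void *p, size_t align)
{
    return (uintptr_t)p % align == 0;
}

/* Returns 1 if a block of n bytes from malloc() meets the alignment of
 * every type sampled here, 0 if not, and -1 if malloc() failed. */
int malloc_satisfies_sampled_types(size_t n)
{
    void *p;
    int ok;

    assert(n > 0);
    p = malloc(n);
    if (p == NULL)
        return -1;
    ok = is_aligned(p, _Alignof(char))
      && is_aligned(p, _Alignof(int))
      && is_aligned(p, _Alignof(long))
      && is_aligned(p, _Alignof(double))
      && is_aligned(p, _Alignof(void *));
    free(p);
    return ok;
}
```

With all requirements of the form "0 mod N(T)", any multiple of the least common multiple of the N(T) passes, which is the easy case described above.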


paul

Nick Maclaren

Aug 24, 1999
In article <7pv24k$3...@nfs0.sdrc.com>,
Larry Jones <larry...@sdrc.com> wrote:

>Nick Maclaren (nm...@cus.cam.ac.uk) wrote:
>>
>> All right, here goes Nick the Nit Picker. Consider a machine
>> where int is size 4 and alignment 0 mod 4. Is there anything in
>> the standard that forbids
>>
>> typedef struct {char a; int b;} fred;
>>
>> to be size 8 and alignment 3 mod 4 with offsetof(fred,b) 1?
>
>I think you'd have a hard time implementing malloc in such a system.

A good point (also made by Paul Jarc.) But would it be forbidden
in a free-standing implementation? I don't know.

James Russell Kuyper Jr.

Aug 24, 1999
David R Tribble wrote:
...

> A struct that contains int members must, by definition, be aligned
> on a boundary at least as restrictive as int alignment. Its int
> members must, by definition, be aligned on int boundaries.

But they need not be aligned on multiples of sizeof(int). The alignment
requirement for 'int' can be smaller than sizeof(int).

> So dividing the offset of an int member within its struct by
> sizeof(int), e.g., 'offsetof(struct tm, tm_year) / sizeof(int)',
> is well-defined, and should always result in a quotient with no
> remainder.

There's no particular reason for that. For example, an implementation
which has no alignment restrictions on 'int' could put one byte of
padding just about anywhere in the struct (except the beginning), and
your code wouldn't work.
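
Whether the offset of an int member is a multiple of sizeof(int) on a particular implementation can be probed directly. The following is a small sketch; struct align_probe and both function names are hypothetical. A result of 1 on one platform says nothing about others.

```c
#include <assert.h>
#include <stddef.h>

/* Probe type: a char followed by an int. */
struct align_probe { char pad; int member; };

/* Offset the compiler chose for an int that follows a single char. */
size_t int_offset_after_char(void)
{
    /* Sanity check: the struct must be able to hold both members. */
    assert(sizeof(struct align_probe) >= 1 + sizeof(int));
    return offsetof(struct align_probe, member);
}

/* Nonzero if that offset happens to be a multiple of sizeof(int) here. */
int offset_is_multiple_of_int_size(void)
{
    return int_offset_after_char() % sizeof(int) == 0;
}
```

On many mainstream compilers the offset equals sizeof(int), but the language only requires that it be at least 1 and that the member fit inside the struct.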

Steve Clamage

Aug 25, 1999
David R Tribble <da...@tribble.com> writes:

>Steve Clamage wrote:
>>
>> No, by definition, an ABI assures binary compatibility of
>> object code and data. The C ABI must include all of the C
>> standard library interfaces.

>This is surprising, since this means that the contents of struct tm,
>FILE, and others can't be improved upon by other compiler/library
>vendors. But after some thought, I see that makes sense - it
>allows programs compiled by various compilers to all make use of the
>same standard libraries.

But let's keep in mind that an ABI applies to a single platform --
which I'll define as a combination of instruction set
architecture and operating system.

Typically, the OS vendor specifies the ABI for each supported
platform. Sometimes there is a de facto ABI due to the dominance
of one compiler on a platform.

Any compiler or library implementor can choose to follow the
(formal or informal) ABI or not. Where there is an accepted ABI
it usually makes sense to follow it.

Where chaos reigns, as was the case with MSDOS, different compiler
implementors can provide different and incompatible ABIs.

--
Steve Clamage, stephen...@sun.com

Chris Torek

Aug 26, 1999
In article <m3so59t...@multivac.student.cwru.edu> Paul Jarc

<p...@po.cwru.edu> writes:
>I wonder - suppose I do:
> struct align { char pad; int align_int; };
>Now, if offsetof(struct align, align_int)==1, am I free to put ints
>anywhere? (I.e., *(int*)((char*)malloced_ptr+1)=42; and such?)
>More generally:
> struct align { char pad; T align_T; };
> *(T*)((char*)malloced_ptr+n*offsetof(struct align, align_T))=/*...*/;
>Is this strictly conforming?

I can offer a practical counterexample that depends on extensions.

Consider the following:

#include <stddef.h>

/* #pragma pack(1) */ /* or #pragma pack etc */
struct s {
    char c;
    int i;
} __attribute__((packed)); /* this example is for gcc */

int f(struct s *sp) { return sp->i; }
int g(void) { return offsetof(struct s, i); }

When we (that is the editorial "we" :-) ) compile this on a SPARC,
we get this code:

f:
        ldub    [%o0+1],%o1
        ldub    [%o0+2],%g3
        ldub    [%o0+3],%g2
        sll     %o1,24,%o1
        sll     %g3,16,%g3
        or      %g3,%o1,%g3
        sll     %g2,8,%g2
        ldub    [%o0+4],%o0
        or      %g2,%g3,%g2
        retl
        or      %o0,%g2,%o0

g:
        retl
        mov     1,%o0

In other words, the offset of "i" is 1, but the alignment required
for "int"s in general is 4 (hence the complicated code in f() to
read member "i" using four separate byte-loads).

In gcc's case, the "struct align { ... };" has to have the explicit
__attribute__ attached to it, so one could be sure that one's own
code does not fall afoul of this simply by avoiding __attribute__.
When one tries to apply the offsetof() trick to some system-supplied
"struct", however, if the system supplied that particular struct
as "packed", the offsetof trick might not work.

In other compilers, if some system header leaves "#pragma pack"
turned on, it is conceivable (barring wording in the Standard --
I have not checked) that even strictly conforming source code
could run into the same problem. That is, perhaps structure
members are packed, but ordinary objects are not.
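
Code that has to read an int from an offset that may not be suitably aligned can avoid the problem without any extension by copying the bytes. This is a minimal sketch with made-up function names; it relies only on memcpy() from <string.h>.

```c
#include <assert.h>
#include <string.h>

/* Read an int stored at any byte offset within a buffer, without forming
 * an int pointer to a possibly misaligned address. */
int load_int_unaligned(const void *base, size_t offset)
{
    int value;

    assert(base != NULL);
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}

/* Store an int at any byte offset within a buffer. */
void store_int_unaligned(void *base, size_t offset, int value)
{
    assert(base != NULL);
    memcpy((unsigned char *)base + offset, &value, sizeof value);
}
```

A compiler is free to turn the copy into a single load where the hardware allows it, or into byte loads like the ones shown for f() above where it does not.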

(A similar real-life example occurs with at least some SPARC
compilers and "double" parameters. All doubles must be 8-byte
aligned, except for some "double"s that are function parameters.
GCC and at least one Sun compiler have conflicting ideas about how
to accomplish this.)
--
In-Real-Life: Chris Torek, Berkeley Software Design Inc
El Cerrito, CA Domain: to...@bsdi.com +1 510 234 3167
http://claw.bsdi.com/torek/ (not always up) I report spam to abuse@.

Paul Jarc

Aug 26, 1999
to...@elf.bsdi.com (Chris Torek) writes:
> /* #pragma pack(1) */ /* or #pragma pack etc */
> struct s {
> char c;
> int i;
> } __attribute__((packed)); /* this example is for gcc */
>
> int f(struct s *sp) { return sp->i; }
> int g(void) { return offsetof(struct s, i); }
...

> In other words, the offset of "i" is 1, but the alignment required
> for "int"s in general is 4 (hence the complicated code in f() to
> read member "i" using four separate byte-loads).

So the int member is not aligned as ints are required to be? So
*(unoptimizeable_identity_function(&s.i))=42;
Would fail, since the pointer is bad? I think this renders the
implementation nonconforming. Unless I've missed something.


paul

David R Tribble

Aug 26, 1999
Nick Maclaren wrote:
>
> David R Tribble <da...@tribble.com> wrote:

> >Paul Jarc wrote:
> >>
> >> David R Tribble <da...@tribble.com> writes:
> >>> Kai Henningsen wrote:
> >>> > Better use char; I don't think the standard guarantees all known
> >>> > members are aligned to sizeof(int).
> >>>
> >>> It guarantees that they are all int, so they'd better be aligned
> >>> on int boundaries.
> >>

> >> They'd better be aligned however ints need to be aligned. Padding
> >> could consist of arbitrary numbers of bytes, if that wouldn't cause
> >> problems. (It would cause problems on some implementations, but
> >> others can use weird offsets.)
> >
> >A struct that contains int members must, by definition, be aligned
> >on a boundary at least as restrictive as int alignment. Its int
> >members must, by definition, be aligned on int boundaries.
>
> All right, here goes Nick the Nit Picker. Consider a machine
> where int is size 4 and alignment 0 mod 4. Is there anything in
> the standard that forbids
>
> typedef struct {char a; int b;} fred;
>
> to be size 8 and alignment 3 mod 4 with offsetof(fred,b) 1?
> Note that this does NOT break the array packing rules.

Lack of sleep due to the recent arrival of a newborn can do strange
things to one's mind. I misspoke. int struct members are indeed
required to be aligned on int boundaries, but of course this can be
as small as 1, and has no relationship, necessarily, to sizeof(int).

OTOH, my code example was simply designed to generate a unique
name (number) for "holes" in the tm struct, so I used
'offset/sizeof(int)' to produce just such a number. Which is just
as correct as using simply 'offset' as the number, in that both will
produce unique numbers.
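
For illustration, here is a rough sketch of locating such holes by byte offset. It is not the code referred to above; the function name count_unnamed_tm_bytes and the covered array are assumptions made for this example. It marks the bytes occupied by the nine members the standard names and counts whatever is left over, whether that is padding or implementation-specific members.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Sets covered[k] to 1 for every byte of struct tm that belongs to one of
   the nine standard members and to 0 for every other byte, then returns
   the number of bytes left at 0. covered must have room for
   sizeof(struct tm) bytes. */
static size_t count_unnamed_tm_bytes(unsigned char *covered)
{
    static const size_t offsets[] = {
        offsetof(struct tm, tm_sec),  offsetof(struct tm, tm_min),
        offsetof(struct tm, tm_hour), offsetof(struct tm, tm_mday),
        offsetof(struct tm, tm_mon),  offsetof(struct tm, tm_year),
        offsetof(struct tm, tm_wday), offsetof(struct tm, tm_yday),
        offsetof(struct tm, tm_isdst)
    };
    size_t i, unnamed = 0;

    memset(covered, 0, sizeof(struct tm));
    for (i = 0; i < sizeof offsets / sizeof offsets[0]; i++) {
        assert(offsets[i] + sizeof(int) <= sizeof(struct tm));
        memset(covered + offsets[i], 1, sizeof(int));
    }
    for (i = 0; i < sizeof(struct tm); i++)
        if (!covered[i])
            unnamed++;
    return unnamed;
}
```

The offset of each byte left at 0 is itself a unique number for naming a hole, and it does not assume anything about how sizeof(int) relates to the alignment of int.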

Sorry for starting a useless thread...

Mark Brader

Aug 27, 1999, 3:00:00 AM
Nick Maclaren writes:
> All right, here goes Nick the Nit Picker. Consider a machine
> where int is size 4 and alignment 0 mod 4. Is there anything in
> the standard that forbids
>
> typedef struct {char a; int b;} fred;
>
> to be size 8 and alignment 3 mod 4 with offsetof(fred,b) 1?

Yes, both in terminology and in fact.

In terminology because "alignment" is defined in terms of requiring
things to be on "particular multiples of a byte address"; so there's
no such thing as "alignment 3 mod 4". See signature quote.

And more importantly, in fact because the type seen in

union {fred f; int i;} fifi;

would not be possible. Given the specification that Nick puts forward,
the pointer conversion rules for structs and unions would require ints
fifi.i and fifi.f.b to overlap by 3 bytes, violating the alignment
requirement on int.
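
The argument can be checked on a given implementation with a short sketch. It assumes a C11 compiler for _Alignof, which postdates this discussion, and the names fifi_t and check_union_layout are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { char a; int b; } fred;
typedef union { fred f; int i; } fifi_t;

static void check_union_layout(void)
{
    fifi_t fifi;

    /* A pointer to a union, suitably converted, points to each of its
       members, and a pointer to a struct points to its first member. */
    assert((void *)&fifi == (void *)&fifi.i);
    assert((void *)&fifi == (void *)&fifi.f);
    assert((void *)&fifi.f == (void *)&fifi.f.a);

    /* fifi.i and fifi.f therefore start at the same address, so fifi.f.b
       can only be aligned for int if its offset within fred is a multiple
       of int's alignment. An offset of 1 fails when that alignment is 4. */
    assert(offsetof(fred, b) % _Alignof(int) == 0);
}
```

Passing on one compiler shows only that this compiler behaves as the argument predicts; it is not a proof about the standard.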
--
Mark Brader, Toronto      Well, somebody had to be the pedant here!
msbr...@interlog.com                          -- David Keldsen

My text in this article is in the public domain.

Nick Maclaren

Aug 27, 1999, 3:00:00 AM

In article <7q5jdl$4...@shell1.interlog.com>, msbr...@interlog.com (Mark Brader) writes:
|> Nick Maclaren writes:
|> > All right, here goes Nick the Nit Picker. Consider a machine
|> > where int is size 4 and alignment 0 mod 4. Is there anything in
|> > the standard that forbids
|> >
|> > typedef struct {char a; int b;} fred;
|> >
|> > to be size 8 and alignment 3 mod 4 with offsetof(fred,b) 1?
|>
|> Yes, both in terminology and in fact.
|>
|> In terminology because "alignment" is defined in terms of requiring
|> things to be on "particular multiples of a byte address"; so there's
|> no such thing as "alignment 3 mod 4". See signature quote.

Actually, we discussed that a while back (either on the reflector
or here), and it is unclear what that definition means, if anything.
For example, many architectures do not have the concept of "address
zero" (e.g. some segmented ones) and so it is perfectly possible to
claim that the base address for int alignment calculations is 3.

|> And more importantly, in fact because the type seen in
|>
|> union {fred f; int i;} fifi;
|>
|> would not be possible. Given the specification that Nick puts forward,
|> the pointer conversion rules for structs and unions would require ints
|> fifi.i and fifi.f.b to overlap by 3 bytes, violating the alignment
|> requirement on int.

Ah. Good point. Now, THIS one I accept!

Mark Brader

Aug 27, 1999, 3:00:00 AM
Nick Maclaren writes:
> Actually, we discussed that a while back (either on the reflector
> or here), and it is unclear what that definition means, if anything.
> For example, many architectures do not have the concept of "address
> zero" (e.g. some segmented ones)...

Er, well, if an alignment requirement as defined in the standard is
meaningless, then a requirement of "0 mod 4" or "3 mod 4" as per
Nick's previous posting must be meaningless as well.
--
Mark Brader            "Oh, I'm a programmer and I'm O.K....
Toronto                 I work all night and I sleep all day"
msbr...@interlog.com    -- Trygve Lode (after Monty Python)

Nick Maclaren

Aug 27, 1999, 3:00:00 AM

In article <7q5nrm$5...@shell1.interlog.com>, msbr...@interlog.com (Mark Brader) writes:
|> Nick Maclaren writes:
|> > Actually, we discussed that a while back (either on the reflector
|> > or here), and it is unclear what that definition means, if anything.
|> > For example, many architectures do not have the concept of "address
|> > zero" (e.g. some segmented ones)...
|>
|> Er, well, if an alignment requirement as defined in the standard is
|> meaningless, then a requirement of "0 mod 4" or "3 mod 4" as per
|> Nick's previous posting must be meaningless as well.

Oh, yes, I quite agree. But that's not what I said. I said that
it was unclear what it means, if anything, and some of the possible
interpretations include "3 mod 4" as being a valid alignment
constraint. Others don't.

I could go into the arguments again, but the executive summary is
that the definitions in the standard are effectively meaningless
and what meaning is given to "alignment" is given by common practice
rather than the standard.
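
As an illustration of what that common practice usually looks like, here is a minimal sketch. It assumes a C11 compiler with the optional uintptr_t type, and the helper name looks_aligned is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Common-practice test: treat the integer value of the pointer as its
   "byte address" and reduce it modulo the alignment. The result of the
   pointer-to-integer conversion is implementation-defined, so this is a
   convention that works on mainstream flat-address machines, not
   something the standard promises. */
static int looks_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}

static void check_int_object(void)
{
    int x = 0;

    /* Expected to hold on typical implementations for any int object. */
    assert(looks_aligned(&x, _Alignof(int)));
}
```

On an implementation where the converted value is not a flat byte address, for example one that counts from a base other than zero, the same test could report a perfectly valid int object as misaligned.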

Saroj Mahapatra

Aug 29, 1999, 3:00:00 AM
Mark Brader wrote:
>
> Nick Maclaren writes:
> > All right, here goes Nick the Nit Picker. Consider a machine
> > where int is size 4 and alignment 0 mod 4. Is there anything in
> > the standard that forbids
> >
> > typedef struct {char a; int b;} fred;
> >
> > to be size 8 and alignment 3 mod 4 with offsetof(fred,b) 1?
>
> Yes, both in terminology and in fact.
>
> In terminology because "alignment" is defined in terms of requiring
> things to be on "particular multiples of a byte address"; so there's
> no such thing as "alignment 3 mod 4". See signature quote.
>
> And more importantly, in fact because the type seen in
>
> union {fred f; int i;} fifi;
>
> would not be possible. Given the specification that Nick puts forward,
> the pointer conversion rules for structs and unions would require ints
> fifi.i and fifi.f.b to overlap by 3 bytes, violating the alignment
> requirement on int.

Can somebody please describe the standard rules for alignment
(and their relation to 'array packing rule')?

I know that the first member of a structure/union must have offset 0.
There may be holes between members or at the end. There cannot
be holes between array elements (is that the exact rule?).

Did I leave out anything?

What can I infer from these declarations (about alignment)?

(Those who have seen Knuth vol. 1 Boundary Tag method Dynamic Memory
Allocation will understand this better.)

typedef double Align;   // we are considering machines with double
                        // word alignment

union MemoryWord {
    MemoryWord *link;
    long tsize;         // tag & size
};

union Header {
    MemoryWord words[2];
    Align s;
};

// On many machines Align is 8 bytes but MemoryWord is 4 bytes, so
// I packed two MemoryWords and one Align into one Header.

void *my_malloc(int nbytes) // nbytes > 0
{
    // Free blocks have 3 words for tsize, rlink and llink at the front
    // of the block and one word for tag at the end of the block.

    // Reserved blocks have 1 word for tsize at the front of the block
    // and one word for tag at the end of the block.

    int nunits = (nbytes + sizeof(Header) - sizeof(MemoryWord)
                  + sizeof(Header) - 1) / sizeof(Header)
                 + 1 /* one control unit at the front */;

    ...
}

Can anybody find any flaw in the calculation of nunits? The reason
for asking in this group is that people in lang.* groups will have
no clue on the exact standard rules for alignment.
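
To make the arithmetic concrete, here is a small sketch in C that evaluates the same expression with the two sizes passed in as parameters, so that assumed values can be tried regardless of the host. The function name units_for and its parameters are hypothetical. It reproduces the formula as posted and does not judge whether the block layout is right.

```c
#include <assert.h>
#include <stddef.h>

/* The nunits expression from my_malloc, with sizeof(Header) and
   sizeof(MemoryWord) replaced by parameters. It is equivalent to
   ceil((nbytes + header_size - word_size) / header_size) + 1. */
static size_t units_for(size_t nbytes, size_t header_size, size_t word_size)
{
    assert(nbytes > 0 && header_size >= word_size);
    return (nbytes + header_size - word_size + header_size - 1) / header_size
           + 1; /* one control unit at the front */
}
```

Assuming sizeof(Header) is 8 and sizeof(MemoryWord) is 4, requests of 1 to 4 bytes give 2 units, 5 to 12 bytes give 3 units, and 13 bytes give 4 units. Note that header_size - word_size equals word_size only when a Header is exactly two MemoryWords, which the declarations above suggest but do not guarantee.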

Thanks,
Saroj Mahapatra

Francis Glassborow

Aug 30, 1999, 3:00:00 AM
In article <37C9A8...@worldnet.att.net>, Saroj Mahapatra
<saroj-tam...@worldnet.att.net> writes

>Can somebody please describe the standard rules for alignment
>(and their relation to 'array packing rule')?
>
>I know that the first member of a structure/union must have offset 0.
>There may be holes between members or at the end. There can not
>be holes between array elements(is it the exact rule???).
>
>Did I leave out anything?
>
>What can I infer from these declarations (about alignment)?

All you can infer is that there must be sufficient padding in any object
so that if the first of two contiguous instances is correctly aligned, so
will the second. Note this imposes requirements on the size of some
objects. E.g.

struct X {
    int i;
    char c;
};

on a machine with 4-byte alignment for int must add padding so that
sizeof(X) is a multiple of 4.
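
A minimal sketch of that inference, assuming a C11 compiler for _Alignof (the function name check_contiguous_instances is made up for the example):

```c
#include <assert.h>
#include <stddef.h>

struct X {
    int i;
    char c;
};

static void check_contiguous_instances(void)
{
    struct X pair[2];

    /* Array elements are contiguous: the second instance starts exactly
       sizeof(struct X) bytes after the first. */
    assert((char *)&pair[1] - (char *)&pair[0] == (ptrdiff_t)sizeof(struct X));

    /* For pair[1].i to be aligned whenever pair[0].i is, the size has to
       be a multiple of int's alignment. */
    assert(sizeof(struct X) % _Alignof(int) == 0);

    /* Whatever the two members do not account for is padding. */
    assert(sizeof(struct X) >= sizeof(int) + sizeof(char));
}
```

With a 4-byte int aligned on 4, sizeof(struct X) would typically come out as 8, that is, three bytes of trailing padding, though nothing stops an implementation from adding more.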


Francis Glassborow Journal Editor, Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Nick Maclaren

Aug 30, 1999, 3:00:00 AM
In article <37C9A8...@worldnet.att.net>,
Saroj Mahapatra <saroj-...@worldnet.att.net> wrote:
>
>Can somebody please describe the standard rules for alignment
>(and their relation to 'array packing rule')?

No. Seriously. We discussed this, and came to that conclusion.

The rule is ROUGHLY that each data type has an alignment, that is
a strictly positive integer decided no later than program loading
time, and is the same for each datum of a single type. It also
(in my best guess - I dare not say opinion) obeys the following
rules:

The alignment of an array of type is the same as that of type,
which also implies that the size of a type is a multiple of its
alignment.

The alignment of a struct or union of types is a multiple of
the LCM of the alignments of all the member types.

The alignment of char is 1.

Qualifiers are ignored when considering alignments, though I am
not certain whether 'unsigned int *' and 'int *' need have the same
alignment just because 'unsigned int' and 'int' do.

A pointer to type1 can be converted to a pointer to type2
and back again without loss of information only if the alignment
of type1 is a multiple of that of type2.

There is NO requirement for the value of a pointer converted to
an integer by type punning to be the same as that of one converted
by a cast, or for either to be the same as that used for calculating
alignment.

There is no requirement for an implementation to document the
alignments of any datum, what the rules are, or even to keep them
the same in different runs of the same program.
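
Several of these rules can be spot-checked on a particular implementation. The sketch below assumes a C11 compiler, since _Alignof did not exist when this was written, and the types mixed and either are invented for the example. It covers only the rules about arrays, structs, unions, char and qualifiers.

```c
#include <assert.h>
#include <stddef.h>

struct mixed { char c; double d; int i; };
union either { long l; short s; };

static void check_alignment_rules(void)
{
    /* An array has the alignment of its element type... */
    assert(_Alignof(int[7]) == _Alignof(int));
    /* ...which implies that a type's size is a multiple of its alignment. */
    assert(sizeof(double) % _Alignof(double) == 0);
    assert(sizeof(struct mixed) % _Alignof(struct mixed) == 0);

    /* A struct or union is aligned to a multiple of each member's
       alignment, hence to a multiple of their LCM. */
    assert(_Alignof(struct mixed) % _Alignof(double) == 0);
    assert(_Alignof(struct mixed) % _Alignof(int) == 0);
    assert(_Alignof(union either) % _Alignof(long) == 0);
    assert(_Alignof(union either) % _Alignof(short) == 0);

    /* char has alignment 1, and qualifiers do not change alignment. */
    assert(_Alignof(char) == 1);
    assert(_Alignof(const volatile int) == _Alignof(int));
}
```

A clean run says only that this implementation is consistent with the rough rules as stated, not that the standard of the time spelled them out.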
