64 bit porting

Mohanasundaram

Jul 9, 2004, 8:58:53 AM

Hi All,

We are working on porting a product written in C and C++ from 32 bit
to 64 bit. We need to maintain both 32 bit and 64 bit versions in the
future. We took the 32 bit source code, compiled it using a 64 bit
compiler and fixed all the compilation warnings. Compilation went
through fine, but the product breaks in lots of places. We understand
now that porting 32 bit code to a 64 bit platform is not just a matter
of compilation: we have to handle the problems that the compiler will
not catch after the change from ILP32 to LP64. So we are trying
to list all the problems that might occur when the size of
long and pointer changes from 4 bytes to 8 bytes with int
still being 4 bytes. We have listed a few of the places
where bugs might creep in. We want to validate the correctness of these
points and would like to add more to this list. Please help us.

1. Change all the long to integers blindly. But not integers to long.
We think this might solve the following problems
(a) Code written using bitwise operators assuming that the size of
long is 4 will create problems
(b) Getting the offsets of the fields in structures by not using
OFFSET macro will create problems when the structure has longs
(c) Manipulating the long data bytewise by breaking them using
pointers like long i = 1; char a = ((char *)&i)[0];
etc.....

2. Check all the library functions which return long, like atol,
and make sure no code is written assuming that the return value
is 4 bytes. Or consider changing atol to atoi or similar functions.

3. If C style memory allocation is used instead of "new" then there is a
possibility of bugs.
long *ptr = malloc(4*2);
in 32 bit compilation the above statement will allocate 8 bytes of
memory and ptr can be used as an array of two elements.
But in 64 bit compilation it will allocate 8 bytes and the number of
elements in the array is one. So if code is written
assuming that the number of elements is two then it will break. So
all "malloc"s "calloc"s and "realloc"s should be checked.

4. If pointers are cast to integers anywhere it has to be
checked.
For example
int a = 10;
int *ptr = &a;
int b = reinterpret_cast<int>(ptr);
The above code will create problems in 64 bit compilation since
pointer is 8 bytes and int is 4 bytes. So it has to be
changed to
long b = reinterpret_cast<long>(ptr);

5. Getting the offsets of the structure fields by assuming the size of
the fields and not using OFFSET macro. How does this stuff
work in case of unions or classes

6. size_t is a 32 bit quantity in 32 bit compilation wherein it grows
to 64 in 64 bit compilation.

7. The format specifiers should be checked for example
in printfs and scanfs
long i = <some expression>;
printf("%d",i);
will not be a big problem as far as the result is concerned but it
will print wrong values when the value of i is very
big and exceeds the limit of integer.

Thanks a lot for your time.

Regards,
Mohan.

Dan Pop

Jul 9, 2004, 9:38:49 AM

> We are working on porting a product written in C and C++ from 32 bit
>to 64 bit. We need to maintain both 32 bit and 64 bit versions in the
>future.

If you do your job right, you will have only one version to maintain, that
will work on both 32 and 64-bit platforms. This is usually called
64-bit clean code.

>1. Change all the long to integers blindly. But not integers to long.

Don't change *anything* blindly. Try to understand *all* the implications
of *each and every* change you make.

> We think this might solve the following problems
> (a) Code written using bitwise operators assuming that the size of
>long is 4 will create problems

Fix such code instead.
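
One possible shape of such a fix, as a minimal sketch. It assumes the original code really wanted 32-bit behaviour, and the helper names (low32, rotl32) are hypothetical. Spelling the mask out makes the result independent of whether unsigned long is 32 or 64 bits wide.

```c
/* Hypothetical helper: keep only the low 32 bits of a value.
   The mask is explicit, so the result is the same whether
   unsigned long is 32 or 64 bits wide. */
unsigned long low32(unsigned long v)
{
    return v & 0xFFFFFFFFUL;
}

/* Hypothetical helper: rotate a 32-bit quantity left by n bits.
   Masking after the shift discards the high bits that a 64-bit
   unsigned long would otherwise keep. */
unsigned long rotl32(unsigned long v, unsigned int n)
{
    v = low32(v);
    n &= 31u;
    if (n == 0)
        return v;
    return low32((v << n) | (v >> (32u - n)));
}
```

Code that instead wrote `(v << n) | (v >> (32 - n))` without the mask would happen to work with a 32-bit long and silently produce extra high bits with a 64-bit one.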

> (b) Getting the offsets of the fields in structures by not using
>OFFSET macro will create problems when the structure has longs

Doing that is sheer stupidity in the first place. If you need such
offsets, offsetof() or pointer arithmetic are the ONLY ways to go.
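
A minimal sketch of the offsetof() approach. The struct and function names are hypothetical; the point is that the compiler, not the programmer, supplies the offset, so the same source is correct under both ILP32 and LP64.

```c
#include <stddef.h>

/* Hypothetical record whose layout differs between ILP32 and LP64,
   because the size and alignment of long change. */
struct record {
    char tag;
    long count;
    int  flags;
};

/* Offset of 'flags' as the compiler actually laid it out. */
size_t flags_offset(void)
{
    return offsetof(struct record, flags);
}

/* Reach the same member through a byte pointer plus offsetof,
   instead of adding up assumed field sizes by hand. */
int *flags_ptr(struct record *r)
{
    return (int *)((unsigned char *)r + offsetof(struct record, flags));
}
```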

> (c) Manipulating the long data bytewise by breaking them using
>pointers like long i = 1; char a = ((char *)&i)[0];
> etc.....

This is not affected by 32 vs 64 bit issues, but may be affected by byte
order issues. Switching from long to int buys you nothing. And you
really want to use unsigned char for this purpose.
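
As a sketch of the difference: reading bytes through a pointer exposes the platform's byte order, while extracting them with shifts does not. The function names are hypothetical, and byte_of assumes 8-bit bytes and n < sizeof(unsigned long).

```c
#include <stddef.h>

/* Returns 1 if the least significant byte of an unsigned long is
   stored first in memory (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    unsigned long one = 1UL;
    const unsigned char *bytes = (const unsigned char *)&one;
    return bytes[0] == 1;
}

/* Extract byte n (0 = least significant) by value, using shifts.
   The result depends neither on byte order nor on sizeof(long),
   provided n < sizeof(unsigned long). */
unsigned char byte_of(unsigned long v, size_t n)
{
    return (unsigned char)((v >> (8 * n)) & 0xFFu);
}
```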

>2. Check all the library functions which return long, like atol,
>and make sure no code is written assuming that the return value
>is 4 bytes. Or consider changing atol to atoi or similar functions.

Much better, remove *all* the dependencies of the C types sizes in the
code, if reasonably possible.
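
One way to drop such a dependency, sketched under the assumption that the caller ultimately needs an int: parse with strtol and check the range explicitly, rather than relying on long and int having the same width. parse_int is a hypothetical helper, not something from the original product.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a decimal integer into *out.
   Returns 0 on success, -1 on malformed input or if the value
   does not fit in an int. */
int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;                      /* no digits, or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                      /* out of range for long or int */
    *out = (int)v;
    return 0;
}
```

Unlike atol or atoi, this reports overflow instead of leaving the behaviour undefined, and the range check is written against INT_MIN and INT_MAX rather than a byte count.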

>3. If C style memory allocation is used instead of "new" then there is a
>possibility of bugs.
> long *ptr = malloc(4*2);
> in 32 bit compilation the above statement will allocate 8 bytes of
>memory and ptr can be used as an array of two elements.
> But in 64 bit compilation it will allocate 8 bytes and the number of
>elements in the array is one. So if code is written
> assuming that the number of elements is two then it will break. So
>all "malloc"s "calloc"s and "realloc"s should be checked.

Indeed, and the proper fix is:

long *ptr = malloc(2 * sizeof *ptr);

which is correct *everywhere*.
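
Wrapped in a function, the same idiom might look like the sketch below. alloc_longs is a hypothetical name, and the overflow check is an addition that the thread does not discuss.

```c
#include <stdlib.h>

/* Allocate room for n longs, sizing the request from the pointee
   (sizeof *p) rather than from a hard-coded byte count.
   Returns NULL on allocation failure or if n * sizeof *p would
   overflow size_t. The caller frees the result with free(). */
long *alloc_longs(size_t n)
{
    long *p;

    if (n > (size_t)-1 / sizeof *p)
        return NULL;
    p = malloc(n * sizeof *p);
    return p;
}
```

Because the element size comes from `*p`, changing the declared type of p later does not require touching the malloc call.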

>4. If pointers are cast to integers anywhere it has to be
>checked.
> For example
> int a = 10;
> int *ptr = &a;
> int b = reinterpret_cast<int>(ptr);
> The above code will create problems in 64 bit compilation since
>pointer is 8 bytes and int is 4 bytes. So it has to be
> changed to
> long b = reinterpret_cast<long>(ptr);

^^^^^^^^^^^^^^^^^^^^^^
This is not valid C syntax, so I don't know what you're talking about.
If you need to convert pointers to integers, the type unsigned long is
the best choice on both 32 (ILP32) and 64-bit (I32LP64) platforms.
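
For completeness, a sketch that assumes a C99 implementation providing the optional uintptr_t type from <stdint.h>. It is an alternative to unsigned long on platforms where long is narrower than a pointer (64-bit Windows, for instance, keeps long at 32 bits); on ILP32 and LP64 either choice works. The function names are hypothetical.

```c
#include <stdint.h>

/* Convert a pointer to an integer type that is guaranteed wide
   enough to hold it (when the implementation provides uintptr_t). */
uintptr_t ptr_to_int(void *p)
{
    return (uintptr_t)p;
}

/* Convert the integer back; the round trip yields a pointer that
   compares equal to the original. */
void *int_to_ptr(uintptr_t v)
{
    return (void *)v;
}
```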

>5. Getting the offsets of the structure fields by assuming the size of
>the fields and not using OFFSET macro.

Deja vu (point 1b above).

>How does this stuff work in case of unions or classes

It is not needed for unions (all members have offset 0) and there are
no classes in C.

>6. size_t is a 32 bit quantity in 32 bit compilation wherein it grows
>to 64 in 64 bit compilation.

Why should your code care about the size of size_t?

>7. The format specifiers should be checked for example
> in printfs and scanfs
> long i = <some expression>;
> printf("%d",i);
> will not be a big problem as far as the result is concerned but it
>will print wrong values when the value of i is very
> big and exceeds the limit of integer.

This code is already broken and it works by pure accident. If i has type
long, %d is NOT an option. %ld will correctly work on both 32 and 64-bit
platforms.
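
A small sketch of matching specifiers to types. The helper names are hypothetical, and snprintf assumes a C99 library. For size_t, C99 also offers %zu; the cast to unsigned long with %lu shown here is the C89-compatible route, and it is adequate only where unsigned long is at least as wide as size_t.

```c
#include <stdio.h>

/* Format a long with %ld, which matches the argument type on both
   32-bit and 64-bit platforms. Returns what snprintf returns. */
int format_long(char *buf, size_t size, long value)
{
    return snprintf(buf, size, "%ld", value);
}

/* Format a size_t without C99's %zu: convert explicitly to
   unsigned long and print with %lu. */
int format_size(char *buf, size_t size, size_t value)
{
    return snprintf(buf, size, "%lu", (unsigned long)value);
}
```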

It looks like your code was severely broken even on 32-bit platforms and
it worked by luck/accident. Once you fix it, if you do the job right,
it will work equally well on both 32 and 64-bit platforms, without needing
separate versions.

If you need to share binary files between 32 and 64-bit platforms, pay
extra attention to the definition of the data that gets written into the
files.
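
As an illustration of what that attention could mean in practice, the sketch below stores a 32-bit value as four bytes in a fixed order, so the file contents do not depend on sizeof(long) or on the host's byte order. The function names and the choice of big-endian order are assumptions made for the example.

```c
/* Store the low 32 bits of v as 4 bytes, most significant first. */
void put_u32_be(unsigned char out[4], unsigned long v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuild the value from 4 bytes written by put_u32_be. */
unsigned long get_u32_be(const unsigned char in[4])
{
    return ((unsigned long)in[0] << 24)
         | ((unsigned long)in[1] << 16)
         | ((unsigned long)in[2] << 8)
         |  (unsigned long)in[3];
}
```

Writing the four bytes with fwrite, rather than writing the long object itself, keeps the record size at 4 bytes on both ILP32 and LP64 builds.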

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan...@ifh.de

Tim Prince

Jul 10, 2004, 12:02:17 AM

"Dan Pop" <Dan...@cern.ch> wrote in message
news:ccm759$6mn$1...@sunnews.cern.ch...

> >6. size_t is a 32 bit quantity in 32 bit compilation wherein it grows
> >to 64 in 64 bit compilation.
>
> Why should your code care about the size of size_t?
>

Some "C/C++" (sic) customers are adamant that there has to be a way to bury
size_t stuff in the middle of a struct without padding or breaking
alignments between platforms, or that any compiler which barfs at storing
int and size_t interchangeably is broken. When that comes up, it's a
probable sign that I should go back to projects which don't have C++ in
them.


Igmar Palsenberg

Jul 10, 2004, 12:20:31 PM

Mohanasundaram wrote:

> 1. Change all the long to integers blindly. But not integers to long.
> We think this might solve the following problems
> (a) Code written using bitwise operators assuming that the size of
> long is 4 will create problems
> (b) Getting the offsets of the fields in structures by not using
> OFFSET macro will create problems when the structure has longs
> (c) Manipulating the long data bytewise by breaking them using
> pointers like long i = 1; char a = ((char *)&i)[0];
> etc.....

If you're on a *NIX system, use sys/types.h, and use things like
u_int32_t, int32_t, etc, etc. If you're on Windows, use something that
looks like it.
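
A sketch of the same idea using the C99 header <stdint.h>, which is an assumption here: the advice above names sys/types.h and u_int32_t, whereas C99 spells the types int32_t and uint32_t. The struct and function are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header whose field widths are the same on ILP32
   and LP64, because none of them is declared as long. */
struct packet_header {
    uint32_t length;
    int32_t  sequence;
    uint16_t kind;
    uint16_t flags;
};

/* Sum that wraps at 2^32 on every platform, which a sum held in an
   unsigned long would not do on LP64. */
uint32_t sum32(const uint32_t *words, size_t n)
{
    uint32_t total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += words[i];
    return total;
}
```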



> 2. Check all the library functions which return long, like atol,
> and make sure no code is written assuming that the return value
> is 4 bytes. Or consider changing atol to atoi or similar functions.

Fix the library I would say.

> 3. If C style memory allocation is used instead of "new" then there is a
> possibility of bugs.
> long *ptr = malloc(4*2);
> in 32 bit compilation the above statement will allocate 8 bytes of
> memory and ptr can be used as an array of two elements.
> But in 64 bit compilation it will allocate 8 bytes and the number of
> elements in the array is one. So if code is written
> assuming that the number of elements is two then it will break. So
> all "malloc"s "calloc"s and "realloc"s should be checked.

use malloc(sizeof(long) * 2) for that.

> 4. If pointers are cast to integers anywhere it has to be
> checked.
> For example
> int a = 10;
> int *ptr = &a;
> int b = reinterpret_cast<int>(ptr);
> The above code will create problems in 64 bit compilation since
> pointer is 8 bytes and int is 4 bytes. So it has to be
> changed to
> long b = reinterpret_cast<long>(ptr);

Fix the code. Casting pointers to ints is a sign of problems in the design.



> 5. Getting the offsets of the structure fields by assuming the size of
> the fields and not using OFFSET macro. How does this stuff
> work in case of unions or classes

The compiler knows the offset. Since you can use members of structs
directly, I hardly see a reason to use them.

> 6. size_t is a 32 bit quantity in 32 bit compilation wherein it grows
> to 64 in 64 bit compilation.

size_t can be a 64 bits variable in 32 bits platforms. It is common
these days, since offsets need 64 bits when dealing with large files.


> 7. The format specifiers should be checked for example
> in printfs and scanfs
> long i = <some expression>;
> printf("%d",i);
> will not be a big problem as far as the result is concerned but it
> will print wrong values when the value of i is very
> big and exceeds the limit of integer.

Replace long by a type that indicates what kind of variable and what
length you actually mean. That saves tons of headaches, and makes the
code easier to read.


> Thanks a lot for your time.
>
> Regards,
> Mohan.

Igmar

Stephen Sprunk

Jul 10, 2004, 5:59:19 PM

"Igmar Palsenberg" <ig...@non-existant.local> wrote in message
news:40f01837$0$48959$e4fe...@news.xs4all.nl...

> > 6. size_t is a 32 bit quantity in 32 bit compilation wherein it grows
> > to 64 in 64 bit compilation.
>
> size_t can be a 64 bits variable in 32 bits platforms. It is common
> these days, since offsets need 64 bits when dealing with large files.

Why would size_t be 64b on a 32b platform? size_t is the maximum size of a
single allocated object _in memory_, so where is the need for it to exceed
the size of the address space?

There's at least one well-known case where size_t is smaller than the
address space size, but on what implementations can it be larger?

S

--
Stephen Sprunk "Those people who think they know everything
CCIE #3723 are a great annoyance to those of us who do."
K5SSS --Isaac Asimov

jacob navia

Jul 10, 2004, 6:35:27 PM

"Stephen Sprunk" <ste...@sprunk.org> a écrit dans le message de
news:0af4f619f23b619d...@news.teranews.com...

> "Igmar Palsenberg" <ig...@non-existant.local> wrote in message
> There's at least one well-known case where size_t is smaller than the
> address space size, but on what implementations can it be larger?

In the bloated ones :-)

Igmar Palsenberg

Jul 11, 2004, 6:31:58 AM

Stephen Sprunk wrote:

> Why would size_t be 64b on a 32b platform? size_t is the maximum size of a
> single allocated object _in memory_, so where is the need for it to exceed
> the size of the address space?

Never mind the size_t remark : That should be off_t.

> There's at least one well-known case where size_t is smaller than the
> address space size, but on what implementations can it be larger?

None it seems :)

Igmar

Randy Howard

Jul 11, 2004, 1:15:25 PM

In article <40f1180d$0$93324$e4fe...@news.xs4all.nl>, ig...@non-existant.local
says...

I suppose if you had a magic compiler that supported PAE extension for
Intel, it would be possible to have "magic pointers" and offsets that
were outside the range of 32-bit registers by themselves. As may, or
may not be known, 32-bit operating systems such as the W2K and W2K3
server platforms plus Linux distributions can see much more than the
expected 4GB limitation, in some cases as much as 64GB of RAM using
the "PAE" hack Intel came up with. However, most of the time, there
is still a 2GB per process address space limitation, so I'm not sure
how this magical compiler could get around such a limit in all cases,
perhaps using something akin to the MS "AWE API" on your behalf.

I suspect such a compiler would always be buggy, and cost more than
anyone cares to imagine.

It's quite a bit easier to buy a motherboard and AMD 64-bit CPU
for < $500 and go on your merry way. :-)

--
Randy Howard
To reply, remove FOOBAR.

Rupert Pigott

Jul 14, 2004, 10:14:30 AM

Dan Pop wrote:

[SNIP]

>>3. If C style memory allocation is used instead of "new" then there is a
>>possibility of bugs.
>> long *ptr = malloc(4*2);

Weird. Since my very first malloc program I've been using sizeof() to
work out how big I want stuff.

/* Single ptr */
long* ptr = malloc( sizeof( long* ) );

>> in 32 bit compilation the above statement will allocate 8 bytes of
>>memory and ptr can be used as an array of two elements.
>> But in 64 bit compilation it will allocate 8 bytes and the number of
>>elements in the array is one. So if code is written
>> assuming that the number of elements is two then it will break. So
>>all "malloc"s "calloc"s and "realloc"s should be checked.
>
>
> Indeed, and the proper fix is:
>
> long *ptr = malloc(2 * sizeof *ptr);

Erm... Wouldn't something like the following be a little safer and
easier to explain for an array of two or more pointers ?

/* Array of 2 ptrs */
long* ptr = malloc( sizeof( long*[2] ) );

The rationale for this approach is that you're taking into account
any weird array element alignment stuff that the compiler might
want to do. I did come unstuck with this many moons ago when I
wrote the following :

short* array = malloc( sizeof( short ) * 42 );

... The compiler liked to pad shorts up to a word boundary (so
there was practically zero point in them). Therefore I had not
allocated enough memory. When I wrote to that array I ended up
corrupting other stuff and the program died a fiery death after
some entertaining but wrong results... After much hair pulling
I changed the line to take this into account, it became :

short* array = malloc( sizeof( short[42] ) );

> which is correct *everywhere*.

The big question is : Does ANSI C permit compilers to pad array
elements up to some other size like it does with structures ?
My guess is *no* given the amount of code that runs fine with
the more intuitive "sizeof( type ) * n" approach.

Cheers,
Rupert

Richard Bos

Jul 14, 2004, 10:27:39 AM

Rupert Pigott <r...@try-removing-this.darkboong.demon.co.uk> wrote:

> Dan Pop wrote:
>
> > Indeed, and the proper fix is:
> >
> > long *ptr = malloc(2 * sizeof *ptr);
>
> Erm... Wouldn't something like the following be a little safer and
> easier to explain for an array of two or more pointers ?
>
> /* Array of 2 ptrs */
> long* ptr = malloc( sizeof( long*[2] ) );
>
> The rationale for this approach is that you're taking into account
> any weird array element alignment stuff that the compiler might
> want to do.

The implementation is not allowed to do any weird array alignment stuff,
unless it also does it in an array of one element, aka the base type.
I.e., sizeof (long*[2]) _must_ be 2*sizeof (long*).

> I did come unstuck with this many moons ago when I
> wrote the following :
>
> short* array = malloc( sizeof( short ) * 42 );
>
> ... The compiler liked to pad shorts up to a word boundary (so
> there was practically zero point in them). Therefore I had not
> allocated enough memory.

If it did that for the array, but _not_ for individual shorts, it was
not a C compiler.

> The big question is : Does ANSI C permit compilers to pad array
> elements up to some other size like it does with structures ?

No. Not more so than the individual elements.

Richard

Eric Sosman

Jul 14, 2004, 10:37:34 AM

Rupert Pigott wrote:
> Dan Pop wrote:
>
> [SNIP]
>
>>> 3. If C style memory allocation is used instead of "new" then there is a
>>> possibility of bugs.
>>> long *ptr = malloc(4*2);
>
> Weird. Since my very first malloc program I've been using sizeof() to
> work out how big I want stuff.
>
> /* Single ptr */
> long* ptr = malloc( sizeof( long* ) );

If this sample is representative, you've been using
it incorrectly ...

>>> in 32 bit compilation the above statement will allocate 8 bytes of
>>> memory and ptr can be used as an array of two elements.
>>> But in 64 bit compilation it will allocate 8 bytes and the number of
>>> elements in the array is one. So if code is written
>>> assuming that the number of elements is two then it will break. So
>>> all "malloc"s "calloc"s and "realloc"s should be checked.
>>
>> Indeed, and the proper fix is:
>>
>> long *ptr = malloc(2 * sizeof *ptr);
>
> Erm... Wouldn't something like the following be a little safer and
> easier to explain for an array of two or more pointers ?
>
> /* Array of 2 ptrs */
> long* ptr = malloc( sizeof( long*[2] ) );

Same error as in the first sample. The snippet described
as "the proper fix" may be less easy to explain (to some), but
it has the virtue of being correct. "Things should be as
simple as possible, and no simpler."

> The rationale for this approach is that you're taking into account
> any weird array element alignment stuff that the compiler might
> want to do. I did come unstuck with this many moons ago when I
> wrote the following :
>
> short* array = malloc( sizeof( short ) * 42 );

This one's correct.

> ... The compiler liked to pad shorts up to a word boundary (so
> there was practically zero point in them). Therefore I had not
> allocated enough memory.

The snippet you've shown allocates enough memory for
forty-two `short's, padding or no. If `sizeof(short)'
failed to include the padding, the compiler was broken --
and broken so badly that it's hard to imagine it surviving
even the most rudimentary set of tests. Although you were
there and I wasn't, it seems more likely that you've mis-
remembered some aspect of the problem than that the compiler
could be so seriously and obviously defective.

> When I wrote to that array I ended up
> corrupting other stuff and the program died a fiery death after
> some entertaining but wrong results... After much hair pulling
> I changed the line to take this into account, it became :
>
> short* array = malloc( sizeof( short[42] ) );

This has exactly the same meaning as the previous line.
If the implementation behaved differently, it was broken.

>> which is correct *everywhere*.
>
> The big question is : Does ANSI C permit compilers to pad array
> elements up to some other size like it does with structures ?
> My guess is *no* given the amount of code that runs fine with
> the more intuitive "sizeof( type ) * n" approach.

The `sizeof' an array of N elements is N times the
`sizeof' a single element. The `sizeof' an array element
of type T is equal to the `sizeof' a free-standing object
of that type. If there's any padding involved, it's part
of each and every T object.
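
The identity can be checked directly. The sketch below uses hypothetical names; COUNT_OF is a common idiom for recovering an element count from an array (not from a pointer).

```c
#include <stddef.h>

/* Number of elements in an array object; not valid for pointers. */
#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])

/* Returns 1 when sizeof(short[42]) equals 42 * sizeof(short),
   which must hold on any conforming implementation. */
int array_size_matches(void)
{
    return sizeof(short[42]) == 42 * sizeof(short);
}

/* Element count recovered from the sizes alone. */
size_t demo_count(void)
{
    short values[42];
    return COUNT_OF(values);
}
```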

--
Eric....@sun.com

Message has been deleted

Rupert Pigott

Jul 14, 2004, 12:43:24 PM

Arthur J. O'Dwyer wrote:

> On Wed, 14 Jul 2004, Richard Bos wrote:
>
>>Rupert Pigott <r...@try-removing-this.darkboong.demon.co.uk> wrote:
>>
>>>Dan Pop wrote:
>>>
>>>>Indeed, and the proper fix is:
>>>>
>>>> long *ptr = malloc(2 * sizeof *ptr);
>>>
>>>Erm... Wouldn't something like the following be a little safer and
>>>easier to explain for an array of two or more pointers ?
>>>
>>> /* Array of 2 ptrs */
>>> long* ptr = malloc( sizeof( long*[2] ) );
>
>
> Nope. Two problems. The obvious one is that you now have one more
> type dependency in your program, and one more place that will need to
> be changed if you decide that really '*ptr' ought to be a 'long long'
> or a 'ptrdiff_t' or something.
> The less obvious mistake is the more serious: you're allocating the
> wrong amount of space! You meant
>
> long* ptr = malloc(sizeof (long[2]));

Bugger. My fault for trusting on-the-hoof thinking and not double
checking what I was *actually* typing. I was thinking of some code
I fixed back in 97 that allocated an array of pointers. :/

> ...Well, it *could*, but it would have to very carefully hide that
> fact from the programmer. In your case, I'd say that 'malloc' was
> buggy --- it ought to have realized that when you said you wanted
> room for an array of a funny size, it needed to give you a little
> extra to account for that invisible padding.

Nah, the malloc implementation was correct (one of the things I
checked). What wasn't correct was my concept of an array of
shorts being 'packed' (in PASCAL parlance) and the actual reality
of them being padded.

Thanks for the correction.

More proof that code-review works. :)

Cheers,
Rupert

Rupert Pigott

Jul 14, 2004, 12:54:36 PM

Eric Sosman wrote:
> Rupert Pigott wrote:

[SNIP]

>> /* Array of 2 ptrs */
>> long* ptr = malloc( sizeof( long*[2] ) );
> Same error as in the first sample. The snippet described

The extra asterisk in the sizeof( long*[2] ) has been pointed
out to me. Combination of thinko and typo I'm afraid. :(

[SNIP]

> forty-two `short's, padding or no. If `sizeof(short)'
> failed to include the padding, the compiler was broken --
> and broken so badly that it's hard to imagine it surviving
> even the most rudimentary set of tests. Although you were

It was nearly 15 years ago, standards were different back
then. C compilers have come on a long way during that time.

> there and I wasn't, it seems more likely that you've mis-
> remembered some aspect of the problem than that the compiler
> could be so seriously and obviously defective.

It was a "one-off" completed before the ink of C89 had time
to dry.

[SNIP]

> The `sizeof' an array of N elements is N times the
> `sizeof' a single element. The `sizeof' an array element
> of type T is equal to the `sizeof' a free-standing object
> of that type. If there's any padding involved, it's part
> of each and every T object.

That's what I thought, it seemed like "common sense" to me.

Cheers,
Rupert

Eric Sosman

Jul 14, 2004, 2:26:31 PM

Rupert Pigott wrote:
> Eric Sosman wrote:
> [...]

>> forty-two `short's, padding or no. If `sizeof(short)'
>> failed to include the padding, the compiler was broken --
>> and broken so badly that it's hard to imagine it surviving
>> even the most rudimentary set of tests. Although you were
>
> It was nearly 15 years ago, standards were different back
> then. C compilers have come on a long way during that time.

"Nearly 15 years," eh?

      2004     or since       2004
    -   15     you said     -   15-
      ====     "nearly"       ====
      1989                    1989+

Something about that date seems vaguely familiar ;-)

My own experience of C started in 1978, and I'm quite
aware that things were pretty wild and wooly before, oh,
about 1992 or so. (For all its peculiarities, most of
them probably historical, the Standard has made things
far easier for C programmers than beforehand. Sometimes
we forget just how bad it was.) But even in the Bad Old
Days it would have been passing strange to find a C compiler
for which sizeof(T[N]) != N * sizeof(T). It would have been
akin to finding a C compiler that didn't support arrays (in
fact, s/akin/equivalent/ might state the case better).

--
Eric....@sun.com

Rupert Pigott

Jul 14, 2004, 5:20:24 PM

Eric Sosman wrote:
> Rupert Pigott wrote:
>
>> Eric Sosman wrote:
>> [...]
>>
>>> forty-two `short's, padding or no. If `sizeof(short)'
>>> failed to include the padding, the compiler was broken --
>>> and broken so badly that it's hard to imagine it surviving
>>> even the most rudimentary set of tests. Although you were
>>
>>
>> It was nearly 15 years ago, standards were different back
>> then. C compilers have come on a long way during that time.
>
>
> "Nearly 15 years," eh?
>
>       2004     or since       2004
>     -   15     you said     -   15-
>       ====     "nearly"       ====
>       1989                    1989+
>
> Something about that date seems vaguely familiar ;-)

Does the phrase "draft-ANSI" come to mind ? ;)

> about 1992 or so. (For all its peculiarities, most of
> them probably historical, the Standard has made things
> far easier for C programmers than beforehand. Sometimes

I welcomed the standard myself. However, some time later I
actually got hold of a copy of the standard and read it
with dismay...

Being a head-strong young utopian I felt that it gave far
too much wiggle room to vendors. That said I did know *why*
it did that...

> we forget just how bad it was.) But even in the Bad Old
> Days it would have been passing strange to find a C compiler
> for which sizeof(T[N]) != N * sizeof(T). It would have been

Compilers targeted at machines with vector units might
have had that 'feature'.

> akin to finding a C compiler that didn't support arrays (in
> fact, s/akin/equivalent/ might state the case better).


Cheers,
Rupert

Mohanasundaram

Jul 15, 2004, 3:31:04 AM

Hi All,

Thanks a lot for your wonderful inputs. Can you suggest some other
possible problems which I have not listed?

Regards,
Mohan.
