How big is an unsigned long in Linux compiled by GCC for a 64-bit machine?

Alan Mackenzie

Mar 28, 2021, 9:40:48 AM
Hello, c.l.c.

I'm trying to hack a bit of the Linux kernel. In some places, an
unsigned long gets cast to a pointer.

This is fine as long as an unsigned long is 64-bits. It's less fine if
it's 32-bits, and the casting zero fills the top 32 bits of the pointer.

I've tried looking for the answer in the GCC manual, but not found the
information.

So, would somebody please tell me how many bits there are in an unsigned
long in GCC, or alternatively tell me where I should have looked to find
the answer.

Thanks!

--
Alan Mackenzie (Nuremberg, Germany).

Lew Pitcher

Mar 28, 2021, 10:14:38 AM
On Sun, 28 Mar 2021 13:40:40 +0000, Alan Mackenzie wrote:

> Hello, c.l.c.
[snip]
> So, would somebody please tell me how many bits there are in an unsigned
> long in GCC, or alternatively tell me where I should have looked to find
> the answer.

Look in /usr/include/limits.h for the definition of ULONG_MAX

It may be conditional on the word size (32bit or 64bit) of your compile
target, and look something like...
# if __WORDSIZE == 64
# define ULONG_MAX 18446744073709551615UL
# else
# define ULONG_MAX 4294967295UL
# endif

HTH
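
A small sketch of how ULONG_MAX can be put to work once <limits.h> is included, both at compile time and at run time. The names ULONG_IS_WIDER_THAN_32 and ulong_value_bits are made up for this illustration; they do not come from glibc or the kernel.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* ULONG_MAX */

/* The C standard guarantees at least 32 value bits in unsigned long. */
static_assert(ULONG_MAX >= 0xFFFFFFFFUL, "unsigned long narrower than 32 bits");

/* Compile-time test: the preprocessor can compare ULONG_MAX directly. */
#if ULONG_MAX > 0xFFFFFFFFUL
#define ULONG_IS_WIDER_THAN_32 1
#else
#define ULONG_IS_WIDER_THAN_32 0
#endif

/* Run-time test: count value bits by shifting ULONG_MAX down to zero. */
static unsigned ulong_value_bits(void)
{
    unsigned bits = 0;
    for (unsigned long v = ULONG_MAX; v != 0; v >>= 1)
        bits++;
    return bits;
}
```

Because the answer comes from the compiler itself, it reflects whatever target options were in effect for that compilation, which is the thing that actually matters.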
--
Lew Pitcher
"In Skills, We Trust"

David Brown

Mar 28, 2021, 10:28:18 AM
It is not generally a good idea to cast pointers back and forth to
"unsigned long". If you need to cast pointers to an integer type,
prefer to use C99 standard types "uintptr_t" (or possibly "intptr_t") -
that's what those types are for. Linux headers may also provide other
types for the purpose, in which case they may be preferable.

However, the ABI's for Linux always require "unsigned long" to match the
size of a pointer, AFAIK. I believe that applies to all 64-bit Linux
systems. Be slightly wary of unusual ABI's like x32 which provide for
full 64-bit arithmetic but have 32-bit pointers and long int.

For systems other than Linux, details may vary. 64-bit Windows has
32-bit long, for example. This is one of the reasons the gcc manual
doesn't give details here (though I would prefer it if it did) - the sizes
depend on the target ABI, not just the processor.
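
The assumption that "unsigned long" matches the size of a pointer can be stated in the code itself, so that a target where it fails (64-bit Windows, say) is rejected at compile time. This is a minimal user-space sketch using C11 static_assert; the helper name is hypothetical. Kernel code has its own compile-time assertion macros (BUILD_BUG_ON, for example), which are not used here.

```c
#include <assert.h>   /* static_assert (C11) */

/* Compilation stops here on any target where the two sizes differ. */
static_assert(sizeof(unsigned long) == sizeof(void *),
              "unsigned long and void * differ in size on this target");

/* Run-time form of the same check, for logging or tests (hypothetical helper). */
static int ulong_matches_pointer_size(void)
{
    return sizeof(unsigned long) == sizeof(void *);
}
```

Note that the check also passes on x32, where both are 32 bits wide; it only says the two sizes agree, not what they are.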


James Kuyper

Mar 28, 2021, 10:47:13 AM
The total number of bits in an unsigned long object is sizeof(unsigned
long)*CHAR_BIT. This expression works on every conforming implementation
of C. That includes both value bits and padding bits, which means it's
generally not the most directly relevant thing to check. You should be
checking ULONG_MAX instead.

However, even that's not right. What you really need to know is whether
or not conversion between pointers and integers of a given size is
guaranteed to be reversible, in the sense that conversion of a pointer
value to the integer type and back again is guaranteed to result in a
pointer value that will compare equal to the original.
That's guaranteed to be the case for intptr_t and uintptr_t; it's not
guaranteed for any other type, not even uintmax_t. Those typedefs are
optionally defined in <stdint.h>. Any implementation where they aren't
defined is one where such conversions are not guaranteed to be
reversible for ANY integer type. You can determine whether or not
they've been defined by checking #ifdef INTPTR_MAX.
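
The round trip described above might be sketched like this. UINTPTR_MAX is tested here because the example uses uintptr_t; the function names are invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>   /* uintptr_t, UINTPTR_MAX (both optional) */

#ifndef UINTPTR_MAX
#error "uintptr_t is not provided by this implementation"
#endif

/* Convert a pointer to uintptr_t and back again. */
static void *round_trip(void *p)
{
    uintptr_t as_integer = (uintptr_t)p;
    return (void *)as_integer;
}

/* Hypothetical self-check: the result must compare equal to the original. */
static void check_round_trip(void)
{
    int object = 0;
    assert(round_trip(&object) == (void *)&object);
    assert(round_trip((void *)0) == (void *)0);
}
```

The same code with "unsigned long" in place of uintptr_t would very likely behave identically on Linux, but only the uintptr_t version is backed by the standard.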

Alan Mackenzie

Mar 28, 2021, 11:07:16 AM
David Brown <david...@hesbynett.no> wrote:
> On 28/03/2021 15:40, Alan Mackenzie wrote:
>> Hello, c.l.c.

>> I'm trying to hack a bit of the Linux kernel. In some places, an
>> unsigned long gets cast to a pointer.

>> This is fine as long as an unsigned long is 64-bits. It's less fine if
>> it's 32-bits, and the casting zero fills the top 32 bits of the pointer.

>> I've tried looking for the answer in the GCC manual, but not found the
>> information.

>> So, would somebody please tell me how many bits there are in an unsigned
>> long in GCC, or alternatively tell me where I should have looked to find
>> the answer.

>> Thanks!


> It is not generally a good idea to cast pointers back and forth to
> "unsigned long". If you need to cast pointers to an integer type,
> prefer to use C99 standard types "uintptr_t" (or possibly "intptr_t") -
> that's what those types are for. Linux headers may also provide other
> types for the purpose, in which case they may be preferable.

This isn't something I'd do myself, except under very severe provocation.
This is part of Linux which has needed some love for quite a long time.
I think the unsigned long came in when 32-bit machines were the latest
fashion. Maybe the author wanted to be able to do arithmetic on it
without the danger of the offset being multiplied by the element size,
which could have happened if he'd declared it u16 *.

> However, the ABI's for Linux always require "unsigned long" to match the
> size of a pointer, AFAIK. I believe that applies to all 64-bit Linux
> systems. Be slightly wary of unusual ABI's like x32 which provide for
> full 64-bit arithmetic but have 32-bit pointers and long int.

OK, thanks, that's good to know. In any amendments I succeed in making,
I certainly won't be introducing any new "unsigned long"s which are
really pointers.

> For systems other than Linux, details may vary. 64-bit Windows has
> 32-bit long, for example. This is one of the reasons the gcc manual
> doesn't give details here (though I would prefer it if did) - the sizes
> depend on the target ABI, not just the processor.

OK, <sigh>. It's just one of these historical things, I suppose, where
it's a lot easier to criticise in hindsight than to do the right thing at
the right time.
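
The guess about offsets being multiplied by the element size can be illustrated with a short sketch. It uses uint16_t as a user-space stand-in for the kernel's u16, and both function names are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* ptrdiff_t */
#include <stdint.h>   /* uint16_t, uintptr_t */

/* The sketch assumes 8-bit bytes, so that a uint16_t occupies two of them. */
static_assert(sizeof(uint16_t) == 2, "unexpected uint16_t size");

/* Bytes moved when n is added to a uint16_t pointer: scaled by the element size. */
static ptrdiff_t bytes_moved_as_pointer(uint16_t *p, ptrdiff_t n)
{
    return (char *)(p + n) - (char *)p;
}

/* Bytes moved when n is added to the same address held as an integer: no scaling. */
static uintptr_t bytes_moved_as_integer(uint16_t *p, uintptr_t n)
{
    uintptr_t address = (uintptr_t)p;
    return (address + n) - address;
}
```

Adding 3 to the pointer moves 6 bytes, while adding 3 to the integer moves 3, which is presumably the behaviour the original author was after.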

Alan Mackenzie

Mar 28, 2021, 11:23:14 AM
Thanks! Then, of course, I need to find __WORDSIZE, which is

#define __WORDSIZE (__SIZEOF_LONG__ * 8)

. That's where the trail runs cold. I can't find any definition of
__SIZEOF_LONG__, neither in a .h file nor in a manual.

All these definitions of one thing in terms of another are great for
maintaining consistency, but utterly unhelpful when one needs the actual
information, for whatever reason.
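
The trail runs cold because __SIZEOF_LONG__ is not defined in any header: GCC predefines it internally for each target, and the GNU cpp manual lists it among the common predefined macros. Running "gcc -dM -E - </dev/null" with the same target options as the real build (-m32 or -mx32, for instance) should print its value. The sketch below checks the macro against sizeof from inside a program; the helper name is hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */

#ifndef __SIZEOF_LONG__
#error "__SIZEOF_LONG__ is not predefined by this compiler"
#endif

/* The macro should agree with what sizeof reports for the same target. */
static_assert(__SIZEOF_LONG__ == sizeof(long),
              "__SIZEOF_LONG__ disagrees with sizeof(long)");

/* Hypothetical helper mirroring the __WORDSIZE definition quoted above. */
static int wordsize_from_sizeof_long(void)
{
    return __SIZEOF_LONG__ * 8;
}
```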

> --
> Lew Pitcher
> "In Skills, We Trust"

David Brown

Mar 28, 2021, 11:47:22 AM
There are certainly reasons for wanting to cast a pointer to an integer
type, or the reverse. Sometimes there are even /good/ reasons for doing
so. But /if/ you are going to do that, in portable code, it is best to
use uintptr_t (or an OS-specific type, if there is one) rather than
using unsigned long.

>> However, the ABI's for Linux always require "unsigned long" to match the
>> size of a pointer, AFAIK. I believe that applies to all 64-bit Linux
>> systems. Be slightly wary of unusual ABI's like x32 which provide for
>> full 64-bit arithmetic but have 32-bit pointers and long int.
>
> OK, thanks, that's good to know. In any amendments I succeed in making,
> I certainly won't be introducing any new "unsigned long"s which are
> really pointers.
>
>> For systems other than Linux, details may vary. 64-bit Windows has
>> 32-bit long, for example. This is one of the reasons the gcc manual
>> doesn't give details here (though I would prefer it if did) - the sizes
>> depend on the target ABI, not just the processor.
>
> OK, <sigh>. It's just one of these historical things, I suppose, where
> it's a lot easier to criticise in hindsight than to do the right thing at
> the right time.
>

Yes. In particular, there was no "right" type for converting pointers
to an arithmetic type until C99. Prior to C99, the "right" thing would
be for the OS to provide a type in its compatibility layers (along with
things like the endianness of the system, and anything else that needs
to be adjusted for particular targets). But it was certainly common to
use "unsigned long" for the job.

The disadvantages of using "unsigned long" here were clear when moving
to 64-bit systems. For Linux (and indeed, *nix in general, AFAIK), the
size of long was picked to be 64-bit on the basis that the most common
unwarranted assumption about "long" is that it is the same size as a
pointer. In 64-bit Windows, "long" was set at 32-bit on the basis of
the common unwarranted assumption that "long" is always 32 bits.
Neither of these was always correct, of course, and some code was
written assuming "long" was /both/ 32-bit /and/ the size of a pointer.

As you say, it is a historical thing - all we can do is try to avoid
repeating the same mistakes.

Richard Damon

Mar 28, 2021, 12:53:05 PM
I think the issue here is that the Linux code base predates C99 where
uintptr_t was created, so they did what they could and specified that
the ABI required that long be big enough for a pointer.

While it might make sense to go back and change the usage to now be
uintptr_t or intptr_t, this might create a possible backwards
incompatibility, as intptr_t might be some other type than long (maybe
int or long long) if that type was also the size of long, and then
pointers to these types would be incompatible.

David Brown

Mar 28, 2021, 1:01:35 PM
Yes.

> While it might make sense to go back and change the usage to now be
> uintptr_t or intptr_t, this might create a possible backwards
> incompatibility, as intptr_t might be some other type than long (maybe
> int or long long) if that type was also the size of long, and then
> pointers to these types would be incompatible.
>

Indeed. One must always be careful in dealing with existing code, and
"fixing" old "mistakes" does not necessarily fix things. uintptr_t
might be a typedef for "unsigned long", or for "unsigned long long" on a
64-bit system. (It is unlikely to be anything else for 64-bit gcc.)
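
Which type actually sits behind the uintptr_t typedef can be queried with C11 _Generic. The point matters because "unsigned long *" and "unsigned long long *" are incompatible pointer types even when both integer types are 64 bits wide. This is only a sketch; the function name is made up, and the list of candidate types is an assumption about what a given implementation might choose.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uintptr_t */

/* True on mainstream targets, though the standard does not promise it. */
static_assert(sizeof(uintptr_t) >= sizeof(void *),
              "uintptr_t smaller than a pointer");

/* Hypothetical helper: name the type behind the uintptr_t typedef. */
static const char *uintptr_underlying_type(void)
{
    return _Generic((uintptr_t)0,
                    unsigned int:       "unsigned int",
                    unsigned long:      "unsigned long",
                    unsigned long long: "unsigned long long",
                    default:            "some other type");
}
```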

Vir Campestris

Mar 28, 2021, 4:20:01 PM
Check sizeof(unsigned long), which will give you the size in chars.

char is almost certainly 8 bits, but you can check that by looking at
UCHAR_MAX == 255 (or possibly SCHAR_MAX == 127).

I don't think I've ever used C on a machine that doesn't have 8 bit
chars, although I've used them with other languages. They're quite rare.

Andy

Lew Pitcher

Mar 28, 2021, 4:42:21 PM
/*
** this should help
*/
#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("unsigned long is %ld bits wide\n",
           sizeof(unsigned long) * CHAR_BIT);
    return 0;
}

Barry Schwarz

Mar 28, 2021, 5:28:01 PM
On Sun, 28 Mar 2021 20:42:13 -0000 (UTC), Lew Pitcher
<lew.p...@digitalfreehold.ca> wrote:

>/*
>** this should help
>*/
>#include <stdio.h>
>#include <limits.h>
>
>int main(void)
>{
> printf("unsigned long is %ld bits wide\n",
> sizeof(unsigned long) * CHAR_BIT);
> return 0;
>}

Wouldn't %zu be better than %ld?

--
Remove del for email

Lew Pitcher

Mar 28, 2021, 5:52:32 PM
Good point.
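
For reference, a variant with the matching conversion might look like this. sizeof yields a size_t, and %zu is the C99 conversion for that type, whereas %ld expects a long. The function name is invented for this sketch, and snprintf is used only so the result can be inspected.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stdio.h>    /* snprintf, size_t */

/* Format the width of unsigned long into buf; returns snprintf's result. */
static int describe_ulong_width(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    /* sizeof yields a size_t, so %zu is the matching conversion. */
    return snprintf(buf, len, "unsigned long is %zu bits wide",
                    sizeof(unsigned long) * CHAR_BIT);
}
```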

Keith Thompson

Mar 28, 2021, 6:32:14 PM
Alan Mackenzie <a...@muc.de> writes:
> I'm trying to hack a bit of the Linux kernel. In some places, an
> unsigned long gets cast to a pointer.
>
> This is fine as long as an unsigned long is 64-bits. It's less fine if
> it's 32-bits, and the casting zero fills the top 32 bits of the pointer.
>
> I've tried looking for the answer in the GCC manual, but not found the
> information.
>
> So, would somebody please tell me how many bits there are in an unsigned
> long in GCC, or alternatively tell me where I should have looked to find
> the answer.

The Linux kernel is not written in portable C. It's intended to be
compiled with gcc (or possibly clang or icc) with a certain set of
options.

In effect, it's written in a dialect that imposes some requirements in
addition to those defined by the C standard.

I believe there is code in the kernel that relies on the assumption that
pointers are the same size as unsigned long, and that converting between
them does not lose information. I believe it may also assume that all
pointers are the same size.

The Linux kernel supports both 32-bit and 64-bit systems, so both
unsigned long and pointer types can be 32 or 64 bits.

I suggest looking at existing code in the kernel, particularly in the
section you're working on, and try to follow its existing conventions
(unless there's a serious existing problem with them).

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

anti...@math.uni.wroc.pl

Mar 29, 2021, 9:49:56 AM
It is usually much easier to create a small program that prints
the quantities of interest (ULONG_MAX in this case). Just compile
the program and run it. The compiler is much better than you at
getting through the maze of definitions.

BTW: If you are cross compiling and cannot run the program,
you can still use the preprocessor and look at the expansion of
your program.

--
Waldek Hebisch

Alan Mackenzie

Mar 29, 2021, 1:08:07 PM
But that may involve a compile time directive setting ULONG_MAX, whose
name I don't know. How would I know I'd compiled it the same way as
Linux compiles? It's a long trawl through the Makefiles. :-(

> BTW: If you are cross compiling and can not run the program
> you still may use preprocessor and look at expansion of
> your program.

Yes.

> --
> Waldek Hebisch

Siri Cruise

Mar 29, 2021, 2:15:17 PM
In article <s3q10o$umu$1...@news.muc.de>,
Alan Mackenzie <a...@muc.de> wrote:

> This is fine as long as an unsigned long is 64-bits. It's less fine if
> it's 32-bits, and the casting zero fills the top 32 bits of the pointer.

In your makefile you can do something like

longwidth:
	(echo '#include <stdio.h>'; echo '#include <limits.h>'; \
	 echo 'int main(int n,char**p) {'; \
	 echo 'printf("%d", (int)(sizeof(long unsigned)*CHAR_BIT));'; \
	 echo 'return 0;}') > /tmp/longwidth.c
	cc -o longwidth /tmp/longwidth.c

xyz.o: xyz.c longwidth
	cc -Dlongwidth=$$(./longwidth) -c -o xyz.o xyz.c

This also allows you to refer to longwidth in #ifs.


Knowing is better than guessing.

--
:-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted. @
'I desire mercy, not sacrifice.' /|\
Discordia: not just a religion but also a parody. This post / \
I am an Andrea Doria sockpuppet. insults Islam. Mohammed

Alan Mackenzie

Apr 2, 2021, 5:07:42 PM
Alan Mackenzie <a...@muc.de> wrote:
> David Brown <david...@hesbynett.no> wrote:

[ .... ]

>> It is not generally a good idea to cast pointers back and forth to
>> "unsigned long". If you need to cast pointers to an integer type,
>> prefer to use C99 standard types "uintptr_t" (or possibly "intptr_t") -
>> that's what those types are for. Linux headers may also provide other
>> types for the purpose, in which case they may be preferable.

> This isn't something I'd do myself, except under very severe provocation.
> This is part of Linux which has needed some love for quite a long time.
> I think the unsigned long came in when 32-bit machines were the latest
> fashion. Maybe the author wanted to be able to do arithmetic on it
> without the danger of the offset being multiplied by the element size,
> which could have happened if he'd declared it u16 *.

>> However, the ABI's for Linux always require "unsigned long" to match the
>> size of a pointer, AFAIK. I believe that applies to all 64-bit Linux
>> systems. Be slightly wary of unusual ABI's like x32 which provide for
>> full 64-bit arithmetic but have 32-bit pointers and long int.

> OK, thanks, that's good to know. In any amendments I succeed in making,
> I certainly won't be introducing any new "unsigned long"s which are
> really pointers.

And I'm now eating my words, having done precisely that. ;-) The
reason these things are defined as unsigned longs is because they are
frequently subtracted from each other, then divided by 2, and things like
that. With pointers declared as pointers, they have to be cast
repeatedly to UL just to do the arithmetic. I hate casts! So implicit
conversions it is.

At least the amended SW is working, after a fashion.

[ .... ]

James Kuyper

Apr 2, 2021, 7:47:32 PM
On 4/2/21 5:07 PM, Alan Mackenzie wrote:
> Alan Mackenzie <a...@muc.de> wrote:
>> David Brown <david...@hesbynett.no> wrote:
>
> [ .... ]
>
>>> It is not generally a good idea to cast pointers back and forth to
>>> "unsigned long". If you need to cast pointers to an integer type,
>>> prefer to use C99 standard types "uintptr_t" (or possibly "intptr_t") -
>>> that's what those types are for. Linux headers may also provide other
>>> types for the purpose, in which case they may be preferable.
...
>> OK, thanks, that's good to know. In any amendments I succeed in making,
>> I certainly won't be introducing any new "unsigned long"s which are
>> really pointers.
>
> And I'm now eating my words, having done precisely that. The
> reason these things are defined as unsigned longs is because they are
> frequently subtracted from eachother, then divided by 2, and things like
> that. With pointers declared as pointers, they have to be cast
> repeatedly to UL just to do the arithmetic. I hate casts!. So implicit
> conversions it is.

As he said above, that's still not a justification for using unsigned
long. You should use uintptr_t for such work.
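
The kind of arithmetic mentioned above (subtracting two addresses and halving the result) needs no repeated casts if the values are simply held as uintptr_t rather than as pointers. A minimal sketch, with hypothetical function names that have nothing to do with the kernel code under discussion:

```c
#include <assert.h>
#include <stdint.h>   /* uintptr_t */

/* Midpoint of the address range [start, end], computed in plain integers. */
static uintptr_t address_midpoint(uintptr_t start, uintptr_t end)
{
    assert(start <= end);
    return start + (end - start) / 2;
}

/* The only cast sits at the boundary between pointers and addresses. */
static uintptr_t address_of(const void *p)
{
    return (uintptr_t)p;
}
```

The design is the same as with "unsigned long": the arithmetic is done on integers and conversion happens once at the edges. Only the spelling of the type changes.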