Passing a uint64_t integer to a function that takes uint64_t converts it to 32 bits


Sarvi Shanmugham

Oct 5, 2012, 8:36:10 PM
to pytho...@googlegroups.com

This is what I tried

import nova.cffitest as cffi
>>>
>>> x=cffi.ffi.new('mytype_t *',0x100000000)  # create uint64_t integer with 0x100000000
>>> print '%x'%x[0]    #try to print it and see, prints fine
100000000
>>> print '\n%x'%cffi.test_int64(x[0])    # calling the test function with x[0] passing by value

Data=0, size=8
0
>>>

>>> print '\n%x'%cffi.test_int64(x)    # pass by reference causes this
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() not supported on cdata 'unsigned long long *'

cffitest.h
--------------
#include <stdint.h>
typedef uint64_t u_int64_t;
typedef u_int64_t mytype_t;
mytype_t test_int64(mytype_t);

cffitest.c
-------------
#include <stdio.h>
#include "cffitest.h"

mytype_t test_int64(mytype_t data)
{
    printf("\nData=%llx, size=%d", (unsigned long long)data, (int)sizeof(data));
    return data;
}

I notice that the generated code contains the following. Shouldn't this be "unsigned long long" or "uint64_t" instead of "unsigned long"?
unsigned long _cffi_f_test_int64(unsigned long x0)
{
  return test_int64(x0);
}

Thx,
Sarvi

Armin Rigo

Oct 6, 2012, 3:14:16 AM
to pytho...@googlegroups.com
Hi Sarvi,

On Sat, Oct 6, 2012 at 2:36 AM, Sarvi Shanmugham <sarv...@gmail.com> wrote:
>>>> print '\n%x'%cffi.test_int64(x) pass by reference causes this
> TypeError: int() not supported on cdata 'unsigned long long *'

That's correct (note the '*' at the end of the error message): you're
trying to pass a pointer when the function expects a uint64_t (which
is the same as "unsigned long long").
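
cffi is the library under discussion, but the same value-vs-pointer distinction can be sketched with the stdlib ctypes module so it runs anywhere (the names here are illustrative, not the cffi API):

```python
import ctypes

# ffi.new('mytype_t *', 0x100000000) hands back a pointer-like object;
# a function taking a plain uint64_t wants the pointed-to *value*,
# so it must be dereferenced first.
box = ctypes.c_uint64(0x100000000)   # plays the role of x, the cdata pointer
value = box.value                    # plays the role of x[0], the dereference
print('%x' % value)                  # 100000000
```

Passing `box` itself where a plain integer is expected is the analogue of the TypeError above.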


A bientôt,

Armin.

Armin Rigo

Oct 6, 2012, 3:22:44 AM
to pytho...@googlegroups.com
Hi Sarvi,

On Sat, Oct 6, 2012 at 2:36 AM, Sarvi Shanmugham <sarv...@gmail.com> wrote:
> I notice that the code generated has, shouldnt this be "unsigned long long"
> OR "uint64_t" instead of "unsigned long"
> unsigned long _cffi_f_test_int64(unsigned long x0)

Ah, then the issue is likely a mistake when declaring "test_int64" in
ffi.cdef(...). You have likely written something like:

ffi.cdef("""
unsigned long test_int64(unsigned long);
""")

I tried to reproduce your bug with a correct declaration in the cdef()
but failed. If you really have a correct cdef, then maybe it's a
platform-specific issue. It works for me on Linux32 and Linux64.


A bientôt,

Armin.

Sarvi Shanmugham

Oct 6, 2012, 10:24:01 AM
to pytho...@googlegroups.com
The cdef contains exactly what you see in the cffitest.h file.

So mytype_t is defined as u_int64_t which is defined as uint64_t

Yet I see the generated wrapper code use unsigned long instead of unsigned long long

Could this be due to a cross build environment?

How do I get cffi to generate the right code?

Thx
Sarvi

Sarvi Shanmugham

Oct 6, 2012, 1:21:40 PM
to pytho...@googlegroups.com
OK, I tried this in a non-cross build environment directly on my build host.
Here are my results:
>>> import cheese
>>> x=cheese.ffi.new('mytype_t *', 0x100000000)
>>> cheese.test_int64(x[0])

Data=100000000, size=8

Sizes int=4, long=8, long long=8
4294967296L
>>>                     

It seems like everything works fine in a non-cross build environment.
I had my library print out the sizes of the different types, and long is the same size as "long long" on the build host.

But I noticed that the generated code is still:
unsigned long _cffi_f_test_int64(unsigned long x0)
{
  return test_int64(x0);
}
But since long is the same size as long long, which is 64 bits, we are good.

This same code in the cross build environment produces the following
>>> print '\n%x'%cffi.test_int64(x[0])

Data=0, size=8
Sizes int=4, long=4, long long=8

0
>>>

So as I understand it, using 'unsigned long' in the generated code is not guaranteed to be 64 bits and is machine-dependent.
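
The truncation itself is easy to sketch in plain Python: assigning a 64-bit value to a 32-bit unsigned parameter keeps only the low 32 bits, which is exactly why test_int64 printed Data=0 in the cross build:

```python
x = 0x100000000              # the test value from the session above
truncated = x & 0xFFFFFFFF   # what a 32-bit 'unsigned long' parameter keeps
print('%x -> %x' % (x, truncated))   # 100000000 -> 0
```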

My build machine is 64 bit
bash-3.00$ uname -a
Linux sjc-ads-1549 2.6.9-89.0.11.ELlargesmp #1 SMP Mon Aug 31 11:05:24 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

My target machine is also 64 bit
[128:~]$ uname -a
Linux 128.107.163.149 2.6.32.39-x8664-fsm.cge #1 SMP PREEMPT Thu Jul 5 12:49:31 PDT 2012 x86_64 GNU/Linux


But I suspect the reason for this difference is that all the libraries are being compiled for 32 bit:
[128:~]$ file /usr/lib/libcffitest.so
/usr/lib/libcffitest.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped

I use the -m32 flag to gcc for all the libraries that we compile, so the same flag is being passed when compiling the cffi extension as well.

Sarvi

Sarvi Shanmugham

Oct 6, 2012, 1:46:26 PM
to pytho...@googlegroups.com
To summarize, I think the way to reproduce the problem is:

1. Use a 64-bit build host and a Python built for 64 bit.
2. The target machine is 64 bit too.
3. The target Python is compiled for 32 bit; the target library is also compiled for 32 bit.
4. Compile the CFFI library for 32 bit as well.
5. Run the 32-bit-compiled Python on the 64-bit machine.

Sarvi

Sarvi Shanmugham

Oct 6, 2012, 4:32:19 PM
to pytho...@googlegroups.com
http://www.ibm.com/developerworks/library/l-port64/index.html

From what I understand uint64_t should always be mapped to "long long" to get 64 bits consistently.
What do you think?


Sarvi

Armin Rigo

Oct 7, 2012, 4:57:55 AM
to pytho...@googlegroups.com
Hi Sarvi,

I'm a bit confused by the settings you're describing. I think that
you're using the host "_cffi_backend" module in your
cross-compilation, which might appear to work at first but won't work
all the way:

On Sat, Oct 6, 2012 at 10:32 PM, Sarvi Shanmugham <sarv...@gmail.com> wrote:
> From what I understand uint64_t should always be mapped to "long long" to
> get 64 bits consistently.

This would only solve the current issue but not the more general mess.
I'd recommend that you change the way you do cross-compilation. You
should run on the host machine a "_cffi_backend" compiled for the
target. This probably requires running an emulator. For the little I
understand about cross-compilation, having an emulator is anyway more
or less required for other kinds of projects, like "configure; make;
make install"-style projects: you really need to run "configure" in an
emulator.

It may be possible to hack differently, e.g.
_cffi_backend.c:b_nonstandard_integer_types() and
b_new_primitive_types(), but you're going to end up with a delicate
mess...


A bientôt,

Armin.

Sarvi Shanmugham

Oct 7, 2012, 12:10:49 PM
to pytho...@googlegroups.com
Yes, I am using the _cffi_backend built for the build host during compilation.
That is why a test load of the shared .so cross-built for the target host generates a traceback, which I have to ignore during cross builds.

We compile some 12 other Python extensions without any emulators, the same way I am compiling these cffi extensions.

I can't find anything through Google on autoconf and emulators for cross-compilation.
And we don't use one for any of the more than 100 components that we cross-compile.


Maybe I don't understand the issue with data types and sizes:
http://en.wikipedia.org/wiki/C_data_types

From what I can tell, int and long sizes vary across platforms. If you really want fixed-length data types, it looks like you need to use intN_t from stdint.h. And considering that cffi has basic data types that are in fact intN_t, shouldn't that be the right choice?
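
A quick stdlib illustration of that point: struct's standard-size codes are fixed-width like the stdint.h intN_t types, while native codes follow the platform ABI like plain C int/long:

```python
import struct

# Standard size (with the '=' prefix): fixed on every platform,
# like the stdint.h intN_t types.
assert struct.calcsize('=q') == 8    # 64 bits everywhere, like int64_t
assert struct.calcsize('=l') == 4    # always 4, like int32_t

# Native size (no prefix): follows the ABI, like plain C 'long',
# so this prints 8 on LP64 Linux but 4 on a 32-bit (-m32) build.
print(struct.calcsize('l'))
```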

Sarvi

Sarvi Shanmugham

Oct 7, 2012, 12:15:57 PM
to pytho...@googlegroups.com
For cross-compiling, autoconf takes options like --host=<host-type> --target=<target-type> during the configure stage.
I haven't heard of the emulation you refer to.

Sarvi

Sarvi Shanmugham

Oct 7, 2012, 10:51:56 PM
to pytho...@googlegroups.com
Though I think cffi's intN_t should map to stdint.h's intN_t.

I understand you will still have a problem with basic int/long and what size they should map to on the target.

Is there a simpler way, without resorting to emulators?
http://www.mesa3d.org/autoconf.html

According to the above link, we should be using ./configure --build=..... --host=... --target=... --enable-32-bit
In autoconf, --enable-32-bit lets you force the -m32 flag to gcc to get 32-bit-compatible binaries even though the target is 64 bit.

Shouldn't we have something equivalent to --target and --enable-32-bit for cffi code generation and compilation?

Sarvi

Sarvi Shanmugham

Oct 9, 2012, 12:34:46 AM
to pytho...@googlegroups.com
I did a bit of further digging into the _cffi_backend.c code.

Correct me if I am wrong here.

I realize there should be no problem when the cdefs use char, short, int, long, or long long, as the sizes are figured out when the code actually runs on the target via _cffi_backend.so. And since _cffi_backend.so is compiled for the target in 64- or 32-bit mode, those types map to all the right sizes for the target platform and mode.

That leaves the problem at hand limited to the use of intN_t and uintN_t, and to what the generated C code looks like.
This generated C code is where the problem is, because it bakes in the sizes of these types on the target.

But here, if the cdef used int or long or other standard types, then the generated code could use those same exact types without any problem, I would think.
The compiler and compiler flags such as -m32 will take care of it when that code gets compiled.

It is only for intN_t and uintN_t that you need to figure out what the right size/type would be on the target.
And that only because you are trying to map intN_t to one of the basic types, like char, short, int/long, or long/long long.

I see two ways to make this work; I don't know what the collateral issues of these different approaches would be.

Option 1: For intN_t and uintN_t, the generated code uses intN_t and uintN_t from stdint.h, and everything else stays the same.
Option 2: Have an environment variable or a setup.py/build command-line option like --enable-32-bit that tells the cffi code generator how intN_t should be mapped to basic types.

If you can give me some pointers on which way you prefer, I could give it a shot and see how it turns out.
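
For reference, the mapping step in question can be sketched like this (simplified from the by_size loop in cffi's api.py; the size tables here are hypothetical stand-ins for sizeof() results on each machine):

```python
def pick_by_size(sizeof):
    # Later entries overwrite earlier ones, so the iteration order
    # decides which basic type ends up representing each size.
    by_size = {}
    for cname in ['long long', 'long', 'int', 'short', 'char']:
        by_size[sizeof[cname]] = cname
    return by_size

# hypothetical sizeof() tables for a 64-bit host and a 32-bit target
lp64  = {'char': 1, 'short': 2, 'int': 4, 'long': 8, 'long long': 8}
ilp32 = {'char': 1, 'short': 2, 'int': 4, 'long': 4, 'long long': 8}

print(pick_by_size(lp64)[8])    # 'long'      -> uint64_t becomes 'unsigned long'
print(pick_by_size(ilp32)[8])   # 'long long' -> uint64_t becomes 'unsigned long long'
```

Running the mapping with the host's table while generating code for the target is exactly the mismatch seen in the cross build.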

Sarvi

Sarvi Shanmugham

Oct 10, 2012, 2:11:55 AM
to pytho...@googlegroups.com
I realized that fixing/using just intN_t would not address the sizes of pointers, size_t, etc.
So I went with an approach of defining a cross-target-driven dictionary of primitive types and sizes.

Here is a patch proposal to fix the cross-compilation issue.
It uses a CFFI_CROSSTARGET environment variable to specify what the target platform is,
and defines the dictionary nonstandard_integer_types according to the target.

This cross dictionary can be expanded to more CPU architectures in the future,
since this is the portion that determines which primitive types are used in the generated code.

I have talked to people who understand cross-compilation better than me, and as far as I can tell, none of them use emulators.
I have heard of cases where they use target-specific headers or other files organized by target, chosen based on the specific target being built.
But nothing about emulators.

The solution below fits that category.

It solves my cross-compilation problem and seems like the right solution for cross compilation.
Hopefully this will help others doing cross compilation as well.

Let me know if you see any holes in this.

Thx,
Sarvi

bash-3.00$ hg diff
diff -r 4f6eec10c1b2 cffi/api.py
--- a/cffi/api.py       Thu Sep 27 19:02:00 2012 +0200
+++ b/cffi/api.py       Tue Oct 09 22:40:47 2012 -0700
@@ -1,5 +1,55 @@
+import os
 import types

+cross_by_size={
+    'linux-x86_64': {
+        1 : 'char',
+        2 : 'short',
+        4 : 'int',
+        8 : 'long',
+    },
+    'linux-i686':   {
+        1 : 'char',
+        2 : 'short',
+        4 : 'int',
+        8 : 'long long',
+    },
+}
+
+UNSIGNED = 0x1000
+cross_nonstandard_integer_types ={
+    'linux-x86_64': {
+        'int8_t'    : 1,
+        'uint8_t'   : 1 | UNSIGNED,
+        'int16_t'   : 2,
+        'uint16_t'  : 2 | UNSIGNED,
+        'int32_t'   : 4,
+        'uint32_t'  : 4 | UNSIGNED,
+        'int64_t'   : 8,
+        'uint64_t'  : 8 | UNSIGNED,
+        'intptr_t'  : 8,
+        'uintptr_t' : 8 | UNSIGNED,
+        'ptrdiff_t' : 8,
+        'size_t'    : 8 | UNSIGNED,
+        'ssize_t'   : 8,
+    },
+    'linux-i686':   {
+        'int8_t'    : 1,
+        'uint8_t'   : 1 | UNSIGNED,
+        'int16_t'   : 2,
+        'uint16_t'  : 2 | UNSIGNED,
+        'int32_t'   : 4,
+        'uint32_t'  : 4 | UNSIGNED,
+        'int64_t'   : 8,
+        'uint64_t'  : 8 | UNSIGNED,
+        'intptr_t'  : 4,
+        'uintptr_t' : 4 | UNSIGNED,
+        'ptrdiff_t' : 4,
+        'size_t'    : 4 | UNSIGNED,
+        'ssize_t'   : 4,
+    },
+}
+
 class FFIError(Exception):
     pass

@@ -59,10 +109,16 @@
                 setattr(self, name, getattr(backend, name))
         #
         lines = []
-        by_size = {}
-        for cname in ['long long', 'long', 'int', 'short', 'char']:
-            by_size[self.sizeof(cname)] = cname
-        for name, size in self._backend.nonstandard_integer_types().items():
+        crosstarget=os.environ.get('CFFI_CROSSTARGET', None)
+        if crosstarget:
+            by_size=cross_by_size[crosstarget]
+            nonstandard_integer_types=cross_nonstandard_integer_types[crosstarget]
+        else:
+            by_size = {}
+            for cname in ['long', 'long long', 'int', 'short', 'char']:
+                by_size[self.sizeof(cname)] = cname
+            nonstandard_integer_types=self._backend.nonstandard_integer_types()
+        for name, size in nonstandard_integer_types.items():
             if size & 0x1000:   # unsigned
                 equiv = 'unsigned %s'
                 size &= ~0x1000

Hakan Ardo

Oct 10, 2012, 3:52:44 AM
to pytho...@googlegroups.com
Hi,
cross compilation support would be nice :) For what it's worth, the
hack used by configure to figure out the sizes from a cross compiler
is to try to compile snippets like:

int main() {
static int test_array [1 - 2 * !(sizeof(long) >= 8)];
test_array [0] = 0;
return 0;
}

and check if the compilation was successful. It will fail to compile
if sizeof(long) is less than 8 as the length of test_array then
becomes negative. Configure performs a binary search to find the exact
size from such compilation tests.
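
That search loop can be sketched in Python, with the "does this snippet compile?" step abstracted into a predicate (a real implementation would invoke the cross-gcc on a snippet like the one above; the simulated probe here is purely for illustration):

```python
def sizeof_via_probes(compiles, hi=64):
    """Find the largest n for which 'sizeof(T) >= n' still compiles;
    that n is sizeof(T). `compiles(n)` stands in for compiling:
        int a[1 - 2 * !(sizeof(T) >= n)];
    with the cross compiler and checking the exit status."""
    lo = 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if compiles(mid):
            lo = mid        # sizeof(T) >= mid, search higher
        else:
            hi = mid - 1    # sizeof(T) < mid, search lower
    return lo

# simulate a target where sizeof(long) == 8
print(sizeof_via_probes(lambda n: 8 >= n))   # 8
```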



--
Håkan Ardö

Sarvi Shanmugham

Oct 10, 2012, 11:12:27 AM
to pytho...@googlegroups.com
Is the patch I provided a good way of doing this, or do you suggest these kinds of compile-time snippets?

Hakan Ardo

Oct 11, 2012, 2:49:01 AM
to pytho...@googlegroups.com
On Wed, Oct 10, 2012 at 5:12 PM, Sarvi Shanmugham <sarv...@gmail.com> wrote:
> Is the patch I have provided a good way of doing this or do you suggest these kind
> of compile time snippets?

I'm not familiar with the internals of cffi so I can't judge your
patch from that angle, but in general hardcoding the sizes sounds
dangerous and not very general to me. I don't know if the configure
approach is the right way to go. It's quite a hack, but at least it
circumvents the need for an emulator.

--
Håkan Ardö

Sarvi Shanmugham

Oct 11, 2012, 2:26:59 PM
to pytho...@googlegroups.com, ha...@debian.org
The patch hard-codes the data sizes according to the target.
But I agree: the fewer places these target data sizes are hard-coded, the better.

Since the target-specific cross compiler already knows the target host's data sizes,
I am not sure why extracting this information from it would be considered a hack.

And among the extensive number of components we cross-compile,
I have yet to come across one that we use an emulator to compile.

I will try implementing a cross compiling patch that uses the approach you have suggested.

Thx,
Sarvi

Sarvi Shanmugham

Oct 11, 2012, 2:32:26 PM
to pytho...@googlegroups.com, ha...@debian.org
Armin,
Apart from defining the dictionaries nonstandard_integer_types and by_size based on the cross-compilation data,
is there any other dictionary or function that needs modification in this way?

Thx,
Sarvi

Sarvi Shanmugham

Oct 12, 2012, 5:27:06 AM
to pytho...@googlegroups.com, ha...@debian.org
I have created an issue for this and provided my patch through it.
Let me know if you need me to do it differently.

Thx,
Sarvi

Armin Rigo

Oct 13, 2012, 4:06:58 AM
to pytho...@googlegroups.com, ha...@debian.org
Hi Sarvi,

Sorry for the delay in answering this thread.

An "emulator" is sometimes used when cross-compiling to some devices.
See for example http://www.scratchbox.org/ . It gives an environment
in which we can not only compile for a foreign target architecture,
but also run snippets. For example it avoids the hacky solution of
'configure' presented by Hakan above: with an emulator, it is possible
to compile and then run anything, as opposed to just check if it
compiles.

The current set of issues comes from the fact that, without an
emulator, we have to run with _cffi_backend.so compiled for the local
architecture. So the primitive types have the wrong size for the
target architecture. Any solution to that problem looks like a hack
somehow, or at best a workaround. If you use a table of sizes that
matches the target architecture, however you compute it, then you get
a half-broken local version that probably only works as far as doing
the cross-compilation --- but would for example crash randomly if the
Python code actually tries to do anything more on the local machine,
e.g. initialize some module globals with cffi (or of course actually
calling some setup function in C).

I don't know of a clean solution. Maybe we could run on the local
machine using a different backend (not _cffi_backend) which would be
very partial: it would have the tables of primitive sizes, and enough
to run a typical verify(), but not more --- e.g. not enough to
actually build data structures or do any sort of call. You would get
clean tracebacks if you try to do anything unsupported.
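
A minimal sketch of that idea, with hypothetical names: a backend object that knows only the target's primitive sizes and refuses everything else with an explicit message instead of crashing:

```python
class PartialCrossBackend:
    """Knows only the target's primitive type sizes; any other backend
    operation fails loudly with a helpful error rather than misbehaving."""

    def __init__(self, sizes):
        self._sizes = sizes   # e.g. sizes probed from the cross compiler

    def sizeof(self, cname):
        return self._sizes[cname]

    def __getattr__(self, name):
        # Triggered for any operation this partial backend doesn't define.
        raise NotImplementedError(
            "%r is not available when cross-compiling; only size "
            "queries are supported" % name)

# a hypothetical 32-bit target's size table
target = PartialCrossBackend({'int': 4, 'long': 4, 'long long': 8})
print(target.sizeof('long'))   # 4: the target's answer, not the host's
```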


A bientôt,

Armin.

Sarvi Shanmugham

Oct 13, 2012, 4:51:19 AM
to pytho...@googlegroups.com, ha...@debian.org, ar...@tunes.org
Hi Armin,
Most embedded Linux distributions, such as redhat, riverbed, and montavista, do not use emulators for cross-compiling.
So I think having a non-emulator-based cross-compilation solution would be very useful.

Regarding the solution, I have opened an issue and attached a proposed patch based on Hakan's proposal.
Interestingly, my proposed implementation does use a new type of backend, called cross_backend,
but it builds on the existing C backend.

If the CFFI_CROSSTARGET environment variable is set,
then cross_backend essentially updates the C backend, augmenting some of its functions.

Namely,
    backend.nonstandard_integer_types = nonstandard_integer_types
    backend.sizeof = sizeof

It uses Hakan's approach of compiling a set of C test programs to figure out the sizes of the different primitive data types on the target.
It then uses this information to populate a type_size table, which it uses to populate the backend.nonstandard_integer_types() and
backend.sizeof() functions. For the rest it falls back to the C backend.

Yes, I agree the limitation will be that, on the build host, things will be limited to code generation and compilation.
But that is a reasonable thing to expect in a cross-compilation environment anyway.

Can you take a look at the patch and let me know what you think?
I have kept it as non-intrusive to the regular non-cross-compilation workflow as possible.

Thx,
Sarvi 

Hakan Ardo

Oct 17, 2012, 5:39:31 AM
to pytho...@googlegroups.com, ar...@tunes.org
On Sat, Oct 13, 2012 at 10:51 AM, Sarvi Shanmugham <sarv...@gmail.com> wrote:
> It uses Hakan's approach to compile a set of C test programs to figure out
> the sizes of different primitive data types on the target.
> It then uses this information to populate a type_size table which it uses to
> populate the backend.nonstandard_integer types() and
> the backend.sizeof() functions. For the rest it falls back to the C backend.

I think the idea was to not fall back to the C backend in situations
where it would misbehave or crash with some strange error message.
Instead some helpful error message could be generated.

> On Saturday, October 13, 2012 1:07:39 AM UTC-7, Armin Rigo wrote:
>> An "emulator" is sometimes used when cross-compiling to some devices.
>> See for example http://www.scratchbox.org/ . It gives an environment
>> in which we can not only compile for a foreign target architecture,
>> but also run snippets. For example it avoids the hacky solution of
>> 'configure' presented by Hakan above: with an emulator, it is possible
>> to compile and then run anything, as opposed to just check if it
>> compiles.

Scratchbox is a great approach and it works mostly out of the box for
arm targets. However I have not had much luck setting it up for less
common targets.

--
Håkan Ardö

Sarvi Shanmugham

Oct 17, 2012, 7:36:45 PM
to pytho...@googlegroups.com, ar...@tunes.org, ha...@debian.org


On Wednesday, October 17, 2012 2:39:32 AM UTC-7, Hakan Ardo wrote:
> On Sat, Oct 13, 2012 at 10:51 AM, Sarvi Shanmugham <sarv...@gmail.com> wrote:
>> It uses Hakan's approach to compile a set of C test programs to figure out
>> the sizes of different primitive data types on the target.
>> It then uses this information to populate a type_size table which it uses to
>> populate the backend.nonstandard_integer types() and
>> the backend.sizeof() functions. For the rest it falls back to the C backend.
>
> I think the idea was to not fall back to the C backend in situations
> where it would misbehave or crash with some strange error message.
> Instead some helpful error message could be generated.

I didn't see a reason to rewrite or duplicate everything already in the c_backend.
So I override only the pieces that need to change in a cross-compilation environment,
just to avoid code duplication.

Right now, I have focused on the sizes of primitive data types, which were the ones the c_backend was getting wrong.

If I have missed portions that need rewriting, do let me know and I will look into them.


 

> On Saturday, October 13, 2012 1:07:39 AM UTC-7, Armin Rigo wrote:
>> An "emulator" is sometimes used when cross-compiling to some devices.
>> See for example http://www.scratchbox.org/ .  It gives an environment
>> in which we can not only compile for a foreign target architecture,
>> but also run snippets.  For example it avoids the hacky solution of
>> 'configure' presented by Hakan above: with an emulator, it is possible
>> to compile and then run anything, as opposed to just check if it
>> compiles.

Scratchbox is a great approach and it works mostly out of the box for
arm targets. However I have not had much luck setting it up for less
common targets.

I am sure it is. But emulators are not available for all new CPU architecture variants, and it is difficult to keep up.
Most of the embedded industry and the mainstream Linux distributions, like debian, redhat,
ubuntu, windriver and montavista, don't seem to be using emulators for cross-compilation.

Well, what I am saying applies to PyPy as well.
Just as CPython does not need an emulator to cross-compile for a target,
if PyPy wants to target the embedded environment, it would help if it were cross-buildable without emulators.

My 2c,
Sarvi

--
Håkan Ardö

Armin Rigo

Oct 19, 2012, 4:31:48 AM
to pytho...@googlegroups.com, ha...@debian.org
Hi Sarvi,

On Thu, Oct 18, 2012 at 1:36 AM, Sarvi Shanmugham <sarv...@gmail.com> wrote:
> Well, what I am saying applies for pypy as well.
> Just as CPython does not need an emulator to cross compile for a target,
> If PyPy wants to target the embedded environment, it would help if its cross
> buildable without emulators.

About PyPy: it depends on someone who wants to step up and do the
(possibly major) work needed. In the meantime, we are happily
building for ARM using emulators, PowerPC 64 is definitely big enough
to not require cross-compilation, and for any less common CPU, we
don't have a serious story anyway.


A bientôt,

Armin.

Armin Rigo

Oct 19, 2012, 4:37:07 AM
to pytho...@googlegroups.com
Hi,

On Wed, Oct 17, 2012 at 11:39 AM, Hakan Ardo <ha...@debian.org> wrote:
>> the backend.sizeof() functions. For the rest it falls back to the C backend.
>
> I think the idea was to not fall back to the C backend in situations
> where it would misbehave or crash with some strange error message.
> Instead some helpful error message could be generated.

Yes: a sane solution looks like it would not use the existing
_cffi_backend.c at all, but instead use a pure Python equivalent.
Something similar to how backend_ctypes.py is an alternative to
_cffi_backend.c using ctypes. But much smaller: it would only have a
limited subset of functionality. The idea is to assume that in a
cross-compilation setting you cannot safely do anything at all, apart
from what you specifically support.


A bientôt,

Armin.