CFFI and Pandas module: array integration


M P

Feb 15, 2017, 4:09:54 PM
to python-cffi
I am part of a small team using CFFI to bridge C code to Python so that developers can use the pandas module to calculate statistics. Pandas arrays are not pure arrays, but rather complex 'array-like' objects called DataFrames.

We can be clever and convert between C arrays and NumPy arrays with a little help from CFFI, like so:

import numpy as np
from cffi import FFI

ffi = FFI()

# Our DLL contains only one function that sets all array elements to 1
ffi.cdef('void ones(float *in)')
C_Lib = ffi.dlopen('/mylib.dll')

# create a numpy array
np_arr = np.array([4.0, 2.0, 3.0], dtype=np.float32)

# convert numpy array to CData (a pointer into np_arr's memory; no copy is made)
cffi_arr = ffi.cast('float*', np_arr.ctypes.data)

# call C function
C_Lib.ones(cffi_arr)

But there is one thing about pandas objects that is not easily mimicked in C: string-keyed indexing. For example:

bars['Close'][-1:]

In Python this is legitimate because Python sequences support negative indexing (C arrays do not) and this object has a string key 'Close'. In C we cannot mix and match data types in a single array, so we have to design it differently. DataFrames are designed to look like tables when you print them, and most pandas functions expect a format like this:

  A B C
0 1 2 3
1 4 5 6
2 7 8 9

Again, the accessors mix character or string keys with integer keys:

# Using `iloc[]`
print(df.iloc[0][0])

# Using `loc[]`
print(df.loc[0]['A'])

# Using `at[]`
print(df.at[0,'A'])

# Using `iat[]`
print(df.iat[0,0])

# Using `get_value(index, column)` (deprecated in pandas 0.21 and removed in 1.0; `at[]` replaces it)
print(df.get_value(0, 'A'))

So we have a few questions to help us figure this out:

- Will CFFI ever have support for mixed-key arrays?
- What can we do to bridge this gap between the types?

Armin Rigo

Feb 19, 2017, 7:17:25 AM
to pytho...@googlegroups.com
Hi,

On 15 February 2017 at 22:09, M P <bmpe...@gmail.com> wrote:
> - Will CFFI ever have support for mixed-key arrays?
> - What can we do to bridge this gap between the types?

In C, there are no string keys and no negative indexes. So that means
that CFFI does not support them either, and never will. For example,
there is no single standard for how string-keys can work in a C-like
setting.

The usual way is to add a pure Python class around the raw cdata
objects, and have your own ``__getitem__()`` method that does whatever
is needed. In general, you should consider cffi's cdata objects as
"internal", and not expose them to the user of the library; instead,
you expose an API that is more Pythonic. In this case, supporting
more methods, string keys, etc.
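
A minimal sketch of that wrapper idea, under stated assumptions: the names `Table`, `Column` and `make_float_buffer` are hypothetical and not part of CFFI or pandas, and each column is assumed to live in its own C `float` array. The `__getitem__()` methods translate string keys, negative indexes and slices into the plain non-negative integer indexes that cdata understands. When cffi is not installed, the sketch falls back to the standard-library `array` module purely so that it still runs.

```python
# cffi is used when available; the stdlib `array` module is only a stand-in
# so that this sketch still runs where cffi is not installed.
try:
    from cffi import FFI

    _ffi = FFI()

    def make_float_buffer(n):
        """Allocate a zero-initialised C `float[n]` owned by Python."""
        return _ffi.new("float[]", n)

except ImportError:
    from array import array

    def make_float_buffer(n):
        """Fallback buffer with the same integer-indexing behaviour."""
        return array("f", [0.0] * n)


class Column:
    """Hypothetical wrapper giving one C float array Python-style indexing."""

    def __init__(self, data, length):
        self._data = data      # raw buffer, kept internal
        self._length = length  # a bare C pointer does not know its length

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slice.indices() resolves negative and omitted bounds for us
            return [self._data[i] for i in range(*index.indices(self._length))]
        if index < 0:
            index += self._length  # emulate Python's negative indexing
        if not 0 <= index < self._length:
            raise IndexError("column index out of range")
        return self._data[index]


class Table:
    """Hypothetical wrapper mapping string keys to C-backed columns."""

    def __init__(self, columns):
        self._columns = {}
        for name, values in columns.items():
            buf = make_float_buffer(len(values))
            for i, value in enumerate(values):
                buf[i] = value
            self._columns[name] = Column(buf, len(values))

    def __getitem__(self, name):
        return self._columns[name]  # raises KeyError for unknown columns


bars = Table({"Open": [1.0, 2.0, 3.0], "Close": [4.0, 5.0, 6.0]})
print(bars["Close"][-1:])  # [6.0]
print(bars["Close"][-1])   # 6.0
```

The raw buffers never leave the wrapper, so C code can still receive them (for example through an accessor you add yourself), while users of the library only see the string-keyed, negative-index-friendly interface.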


A bientôt,

Armin.