Use of static arrays in Cython

Ian Bell

Jul 7, 2012, 7:30:32 PM
to cython...@googlegroups.com
I have a static double array defined in a c++ file that I am trying to port over to Cython.

I followed the recommendation from http://stackoverflow.com/questions/8320951/can-i-create-a-static-c-array-with-cython to use a hard-coded array of coefficients since I don't want to have to index a list if there is any way I can avoid it (very performance-sensitive segment of code so speed is quite important).  Otherwise I could just do a list or something like that.  Simple but too slow.

My array code implementation (in the .PXD file) looks like

cdef double *a_radial = [25932.1070099 , 0.914825434095 , -177.588568125 , -0.237052788124 , -172347.610527 , -12.0687599808 , -0.0128861161041 , -151.202604262 , -0.999674457769 , 0.0161435039267 , 0.825533456725]

and when I go to reference an element of the array (in the pure-python-mode .py file) like a_radial[0] I get a crash and no exception is produced.  Is this a bug? Or am I doing something stupid?

Thanks, Ian

Robert Bradshaw

Jul 8, 2012, 1:11:15 AM
to cython...@googlegroups.com
It looks like this is not supported in pure mode, the declaration
needs to be in the pyx (or py) file.

- Robert

Ian Bell

Jul 8, 2012, 1:47:28 AM
to cython...@googlegroups.com
For fun I have finally fully converted my module over to Cython code (not so happy about saying goodbye to the pure-Python-mode file for debugging purposes), but even when I do that I cannot create a static array and then index it.  For now I have just created a module scoped python list instead of the static array, but I surmise that the performance penalty for indexing is probably pretty strong?  In c++, compiler optimization with std::vector can bring std::vector indexing performance to very near the c-array solution, but since I can't get the static array working, I can't profile its indexing performance in Cython :-/

In theory, should the following code work and print '2.0'?:

    cdef double *dbls = [1.0, 2.0, 3.0, 4.0]
    print dbls[1]

Thanks,
Ian
   

Stefan Behnel

Jul 8, 2012, 1:54:38 AM
to cython...@googlegroups.com
Ian Bell, 08.07.2012 07:47:
> In theory, should the following code work and print '2.0'?:
>
> cdef double *dbls = [1.0, 2.0, 3.0, 4.0]
> print dbls[1]

Yes, it definitely works (we have a test for it).

What version of Cython are you using?

Stefan

Ian Bell

Jul 8, 2012, 2:45:42 AM
to cython...@googlegroups.com
Ok, so there are a few issues with static arrays.  It seems that in pure-Python mode you cannot use static arrays at all.  Definitely not in the implementation .py file, and not in the .pxd file either.

In Cython mode, you are correct that local (function-scope) definitions of static arrays work fine.

But it seems impossible to define a static array at module-level scope of a Cython .pyx file. When I try to put a static array at module scope, I get the following compilation error:

Error compiling Cython file:
------------------------------------------------------------
...

"""
This is the module that contains all the flow models
"""

cdef double *a_radial = [25932.1070099, 0.914825434095, -177.588568125, -0.237052788124, -172347.610527, -12.0687599808, -0.0128861161041, -151.202604262, -0.999674457769, 0.0161435039267, 0.825533456725]
    ^
------------------------------------------------------------

PDSim\flow\flow_models.pyx:14:5: Literal list must be assigned to pointer at time of declaration

Since I am so hamstrung performance-wise, I could go with an array.array or something like that, but it would be far and away the simplest to just get the static array working.

Many thanks,
Ian
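
For reference, the array.array fallback mentioned above could look like the sketch below in plain Python. The names A_RADIAL and radial_coefficient are made up for illustration and do not appear in the thread. array("d", ...) stores its items as contiguous C doubles, but indexing it from untyped Python code still boxes each value into a Python float, so this alone is not expected to match a C array in speed.

```python
from array import array

# Coefficients copied from the post above; "d" means C double.
A_RADIAL = array("d", [
    25932.1070099, 0.914825434095, -177.588568125, -0.237052788124,
    -172347.610527, -12.0687599808, -0.0128861161041, -151.202604262,
    -0.999674457769, 0.0161435039267, 0.825533456725,
])


def radial_coefficient(i):
    """Return the i-th coefficient (hypothetical helper, bounds-checked)."""
    return A_RADIAL[i]
```

An out-of-range index raises IndexError here instead of reading past the end of the buffer, which is the main behavioural difference from a raw double pointer.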

Robert Bradshaw

Jul 8, 2012, 3:54:39 AM
to cython...@googlegroups.com
How about just doing

from libc.stdlib cimport malloc, free

py_data = [25932.1070099, 0.914825434095, -177.588568125,
-0.237052788124, -172347.610527, -12.0687599808, -0.0128861161041,
-151.202604262, -0.999674457769, 0.0161435039267, 0.825533456725]
ptr = <double*> malloc(sizeof(double) * len(py_data))
py_data[:len(py_data)] = py_data

Is there a reason it needs to be static?

- Robert

Stefan Behnel

Jul 8, 2012, 6:09:39 AM
to cython...@googlegroups.com
Ian Bell, 08.07.2012 08:45:
> Ok, so there are a few issues with static arrays. It seems that in
> pure-Python mode you cannot use static arrays at all. Definitely not in
> the implementation .py file, and not in the .pxd file either.
>
> In Cython mode, you are correct that local (function-scope) definitions of
> static arrays work fine.
>
> But it seems impossible to define a static array at module-level scope of a
> Cython .pyx file.

Correct.

http://trac.cython.org/cython_trac/ticket/113

Stefan

Stefan Behnel

Jul 8, 2012, 6:10:51 AM
to cython...@googlegroups.com
Robert Bradshaw, 08.07.2012 09:54:
> On Sat, Jul 7, 2012 at 11:45 PM, Ian Bell wrote:
>> Since I am so hamstrung performance-wise, I could go with an array.array or
>> something like that, but it would be far and away the simplest to just get
>> the static array working.
>
> How about just doing
>
> from libc.stdlib cimport malloc, free
>
> py_data = [25932.1070099, 0.914825434095, -177.588568125,
> -0.237052788124, -172347.610527, -12.0687599808, -0.0128861161041,
> -151.202604262, -0.999674457769, 0.0161435039267, 0.825533456725]
> ptr = <double*> malloc(sizeof(double) * len(py_data))
> py_data[:len(py_data)] = py_data

The last line should read

ptr[:len(py_data)] = py_data


> Is there a reason it needs to be static?

Very good question.

Stefan

Ian Bell

Jul 8, 2012, 7:15:43 PM
to cython...@googlegroups.com

Thanks for the malloc code - I'd like to avoid going that route since malloc/calloc is a quick route to memory leaks.  I don't have much control over when the memory is freed because I need to keep the instances of the classes in my module active as long as the module is imported.  And the class instances reference these constants, so the constants need to stay in memory as well.  Cython will not garbage-collect these malloc-ed arrays, will it?  Is there a hook somewhere else that I can employ to free the malloc-ed array when the module goes out of scope and is garbage collected?  You have __dealloc__ for extension types; is there anything similar for modules?

But all this said, I would still prefer to just use a static array.  So much simpler - and no risk of memory leaks.

Thanks,
Ian

Feng Yu

Jul 9, 2012, 12:03:37 AM
to cython...@googlegroups.com

Set it up as an ndarray and borrow the data pointer. Memory management is delegated to python/cython. But they are probably leaked anyways.

The simplifying assumption is a module is never unloaded.

Another catch I saw was that you can write initialized variables in pxd files but the initialization is never performed. No warnings from the compiler. You have to initialize them in the corresponding pyx files and it magically works.

Stefan Behnel

Jul 9, 2012, 3:16:55 AM
to cython...@googlegroups.com
Ian Bell, 09.07.2012 01:15:
You can use the atexit module for cleaning up, but apart from that,
extension modules currently cannot be unloaded, so the system exit is the
only time where this gets interesting.

If we ever support reloading extension modules in Py3 (which has an
infrastructure for it), then module global variables would become module
instance local and we'd have to find a way to let users register module
cleanup code. But as long as no-one actually puts money into such a
project, I don't think this is going to happen any time soon.

Stefan

Robert Bradshaw

Jul 9, 2012, 12:12:29 PM
to cython...@googlegroups.com
If you're talking about 11*8 bytes, allocated once at startup, I
really don't see how this is a significant issue. A statically
allocated array would be exactly the same amount of memory, just stuck
in a slightly different place (and guaranteed not to be freed until
(if ever) the module is unloaded).

For anything dynamic, I would suggest, e.g., numpy arrays and letting
Python do the memory management.

- Robert
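
The 11*8 figure above can be checked in plain Python, assuming the platform's C double is 8 bytes (true on mainstream platforms, but an assumption here). The constant names are illustrative only.

```python
import struct

N_COEFFICIENTS = 11                      # length of a_radial in the thread
BYTES_PER_DOUBLE = struct.calcsize("d")  # size of a native C double
TOTAL_BYTES = N_COEFFICIENTS * BYTES_PER_DOUBLE
```

With an 8-byte double, TOTAL_BYTES comes out at 88, which is the same footprint whether the array is static or allocated once with malloc.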

Ian Bell

Jul 9, 2012, 12:45:19 PM
to cython...@googlegroups.com

Robert,

Will Cython take care of free-ing the memory when the module is unloaded when the python process is killed?  88 bytes is not a large memory leak, but I'd rather avoid memory leaks if at all possible.  If cython will free the memory, I have my solution.  Indexing numpy arrays isn't super fast.

Regards,
Ian

Chris Barker

Jul 9, 2012, 1:20:01 PM
to cython...@googlegroups.com
On Mon, Jul 9, 2012 at 9:45 AM, Ian Bell <ian.h...@gmail.com> wrote:
>> For anything dynamic, I would suggest, e.g., numpy arrays and letting
>> Python do the memory management.

> Will Cython take care of free-ing the memory when the module is unloaded
> when the python process is killed?

When the process is killed, memory is freed - wouldn't it be an OS bug if not?

> Indexing numpy arrays isn't super fast.

It can be, depending on the array and how you declare it -- and grabbing
the pointer from it will then be as fast as C ('cause it is C...)

see the recent thread:

Best Practices for passing numpy data pointer to C


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris....@noaa.gov

Robert Bradshaw

Jul 9, 2012, 1:40:47 PM
to cython...@googlegroups.com
Yes, when a process exits all memory it ever requested is returned to
the OS. Allocating some memory when the module loads and releasing it
when the process exits is precisely what happens for a static array
(though through slightly different mechanisms).

> If cython will free the
> memory, I have my solution. Indexing numpy arrays isn't super fast.

As Chris Barker mentioned, it can be.

- Robert