CPython and a C extension using Boehm GC

malkarouri

Dec 25, 2007, 6:34:28 AM
Hi everyone,

Is it possible to write a Python extension that uses the Boehm garbage
collector?
I have a C library written that makes use of boehm-gc for memory
management. To use that, I have to call GC_INIT() at the start of the
program that uses the library. Now I want to encapsulate the library
as a CPython extension. The question is really: is that possible? And
will there be conflicts between the boehm-gc and Python memory
management? And when should I call GC_INIT?

Best Regards,

Muhammad Alkarouri

MrJean1

Dec 25, 2007, 3:19:52 PM
Perhaps, you can pre-load the extension library when Python is
invoked. It is probably trying.

Pre-loading is commonly done for memory management and profiling
libraries and it may (or may not) work for libraries including the
Boehm-GC. And if it does work, call GC_INIT inside the initialization
function of the extension library. The latter will be called just
before Python's main is.

If you are using GNU C, writing the initialization function could
be as simple as

#include <gc.h> /* Boehm GC header, provides GC_INIT */

void __attribute__((constructor))
_initializer (void) /* any name */
{
    GC_INIT();
}

For more details, see
<http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html> under
'constructor'. Other compilers may support a
#pragma like init for this purpose.
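
The ordering claim (the constructor runs before main) can be checked
with a small stand-alone sketch. It assumes GCC or Clang. The names
fake_gc_init, gc_ready and library_entry are hypothetical stand-ins
invented for illustration, so that the file builds without libgc; a
real library would call GC_INIT() from <gc.h> instead.

```c
#include <assert.h>

/* Hypothetical flag, for illustration only. */
static int gc_ready = 0;

/* Hypothetical stand-in for GC_INIT(); a real build would use <gc.h>. */
static void fake_gc_init(void)
{
    gc_ready = 1;
}

/* Runs when the object is loaded, before main() is entered
   (GCC/Clang extension, not standard C). */
static void __attribute__((constructor))
library_initializer(void)
{
    fake_gc_init();
}

/* Hypothetical library entry point that relies on the constructor
   having already run. */
int library_entry(void)
{
    assert(gc_ready);   /* aborts if called before initialization */
    return gc_ready;
}
```

Calling library_entry() from main() in the same program should return
1, since the constructor has already set the flag by then.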

Pre-loading a (shared) library on Linux is typically done using the
env command, e.g.:

$ env LD_PRELOAD=<path_to_the_library> python ....

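Some command shells support other ways of setting the variable, and
the name LD_PRELOAD may be different on other operating systems (on
Mac OS X, for example, the counterpart is DYLD_INSERT_LIBRARIES).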

HTH, /Jean Brouwers

MrJean1

Dec 25, 2007, 3:40:38 PM
Correction. The second line should be ... It is probably worth
trying.

/Jean Brouwers

Andrew MacIntyre

Dec 25, 2007, 10:35:23 PM
to pytho...@python.org
malkarouri wrote:

It probably should be possible with some caveats:
- memory allocated by Python is never passed into the library such that
it also ends up being subject to boehm-gc;
- memory allocated by the library is never used by Python objects.

So memcpy()ing between library allocated and Python allocated memory
would seem to be a way to achieve this.

I would call GC_INIT in the extension's import routine
(init<module_name>()) for a C extension, and immediately after loading
the library if using ctypes.
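
The copy-at-the-boundary idea can be sketched without Python or libgc
at all. In the sketch below, lib_alloc stands in for GC_MALLOC and
host_alloc stands in for Python's allocator. Those names, along with
lib_make_greeting and export_to_host, are hypothetical and only there
for illustration. In a real extension the copy would more likely go
straight into a Python-owned object (PyBytes_FromStringAndSize, for
instance, copies the buffer it is given).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for GC_MALLOC(); a real build would call libgc. */
static void *lib_alloc(size_t n) { return malloc(n); }

/* Hypothetical stand-in for the host (Python) allocator. */
static void *host_alloc(size_t n) { return malloc(n); }

/* Library side: builds a result in library-owned memory. */
static char *lib_make_greeting(size_t *len_out)
{
    static const char text[] = "hello from the library";
    char *buf;

    assert(len_out != NULL);
    buf = lib_alloc(sizeof text);
    if (buf == NULL)
        return NULL;
    memcpy(buf, text, sizeof text);
    *len_out = sizeof text;     /* length includes the trailing NUL */
    return buf;
}

/* Boundary: copy the bytes into host-owned memory, so that no host
   object ever holds a pointer into the library's heap. */
char *export_to_host(void)
{
    size_t len = 0;
    char *host_buf;
    char *lib_buf = lib_make_greeting(&len);

    if (lib_buf == NULL)
        return NULL;
    host_buf = host_alloc(len);
    if (host_buf != NULL)
        memcpy(host_buf, lib_buf, len);
    /* Stand-in only: with a real collector you would just drop the
       pointer and let the collector reclaim the block. */
    free(lib_buf);
    return host_buf;
}
```

The caller owns the returned buffer and releases it with the host's
own deallocator (plain free() in this stand-in).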

--
-------------------------------------------------------------------------
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: and...@bullseye.apana.org.au (pref) | Snail: PO Box 370
and...@pcug.org.au (alt) | Belconnen ACT 2616
Web: http://www.andymac.org/ | Australia

MrJean1

Dec 26, 2007, 10:14:47 AM
It depends on how the GC inside the extension is built. If it is a
drop-in replacement for malloc, then GC *must* be loaded and
initialized upfront if possible. There is no need to memcpy anything
between Python and the extension.

However, if GC does not replace malloc, etc., then GC-ed memory is
only used within the extension. GC_INIT can be called when the
extension is loaded and memcpy-ing between Python and the extension is
mandatory.

There are other details to consider. For example, on some platforms,
GC *must* be initialized from the main executable. That may preclude
both scenarios, altogether.

/Jean Brouwers

MrJean1

Dec 26, 2007, 3:28:24 PM
FWIW, I built GC 6.7 on a RHEL 3 (Opteron) system using

./configure --prefix=... --enable-redirect-malloc --enable-threads=posix --enable-thread-local-alloc
make; make check; make install

Then, I tried running a few examples with 3 different, existing Python
binaries each pre-loaded with the libgc.so library

env LD_PRELOAD=.../libgc.so <python> ....

One is the Python 2.2.3 included in RHEL 3, one is a Python 2.5.1 build
and one is a Python 3.0a2 build, all 64-bit. All seemed to work OK.

These are 3 existing Python binaries without any call to GC_INIT().
AFAICT, on Linux, GC_INIT is a no-op anyway.

/Jean Brouwers

