Safe to call Py_Initialize() frequently?

roschler

Mar 20, 2009, 1:20:10 PM
I've created a Python server that embeds Python 2.5 and runs Python
jobs. I want to be able to completely "flush" the interpreter between
each job. That means resetting all variables, stopping all user
created threads, and resetting the interpreter sys module path. If it
does not cause memory leaks, slowdowns, or other problems I would like
to call Py_Initialize() before running each job. I expect to run a
job about once a second. Are there any known issues with doing this
or anything else that would make this a bad approach?

If it is a safe approach, do I have to pair each Py_Initialize() call
with a Py_Finalize() call?

If it is not a safe approach, is there another way to get what I want?

Thanks.

Mark Hammond

Mar 20, 2009, 7:27:13 PM
to roschler, pytho...@python.org
On 21/03/2009 4:20 AM, roschler wrote:
> I've created a Python server that embeds Python 2.5 and runs Python
> jobs. I want to be able to completely "flush" the interpreter between
> each job. That means resetting all variables, stopping all user
> created threads, and resetting the interpreter sys module path. If it
> does not cause memory leaks, slowdowns, or other problems I would like
> to call Py_Initialize() before running each job. I expect to run a
> job about once a second. Are there any known issues with doing this
> or anything else that would make this a bad approach?

Calling Py_Initialize() multiple times has no effect. Calling
Py_Initialize and Py_Finalize multiple times does leak (Python 3 has
mechanisms so this need to always be true in the future, but it is true
now for non-trivial apps.

>
> If it is a safe approach, do I have to pair each Py_Initialize() call
> with a Py_Finalize() call?
>
> If it is not a safe approach, is there another way to get what I want?

Start a new process each time?

Cheers,

Mark
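
A minimal sketch of the "start a new process each time" suggestion, written as a Python 3 supervisor rather than the C embedding host described in the original question. The helper name run_job and the 30-second timeout are illustrative assumptions, not anything from the thread. Each job gets a brand-new interpreter, so its variables, threads, and sys.path changes disappear when the child process exits.

```python
import subprocess
import sys


def run_job(source, timeout=30):
    """Run Python source text in a fresh interpreter process.

    Returns (exit_code, stdout, stderr). Nothing the job does to its own
    globals, threads, or sys.path can reach the supervising process.
    """
    result = subprocess.run(
        [sys.executable, "-c", source],  # same Python binary, new process
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired on a hung job
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # The first job defines a global and edits sys.path; the second job
    # starts clean and sees neither change.
    print(run_job("import sys; sys.path.append('/tmp/job-only'); x = 1; print(x)"))
    print(run_job("print('x' in globals())"))
```

Whether interpreter start-up cost is acceptable at about one job per second is worth measuring in the real deployment.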

roschler

Mar 20, 2009, 11:35:28 PM
On Mar 20, 7:27 pm, Mark Hammond <skippy.hamm...@gmail.com> wrote:
> On 21/03/2009 4:20 AM, roschler wrote:
>
> Calling Py_Initialize() multiple times has no effect.  Calling
> Py_Initialize and Py_Finalize multiple times does leak (Python 3 has
> mechanisms so this need to always be true in the future, but it is true
> now for non-trivial apps.
>
> > If it is not a safe approach, is there another way to get what I want?
>
> Start a new process each time?
>
> Cheers,
>
> Mark

Hello Mark,

Thank you for your reply. I didn't know that Py_Initialize worked
like that.

How about using Py_NewInterpreter() and Py_EndInterpreter() with each
job? Any value in that approach? If not, is there at least a
reliable way to get a list of all active threads and terminate them
before starting the next job? Starting a new process each time seems
a bit heavy-handed.

Robert.
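
On the thread question: in the standard threading module, live threads can be listed with threading.enumerate(), but there is no supported call to kill one, so threads have to cooperate in their own shutdown. A sketch in Python 3, assuming each job accepts a stop_event and that its threads check it; that convention and the helper name are hypothetical.

```python
import threading


def run_job_and_stop_threads(job, timeout=1.0):
    """Run job(stop_event), then ask the threads it started to stop.

    Returns the names of job threads still alive after the timeout.
    Threads that ignore stop_event cannot be killed from here.
    """
    baseline = set(threading.enumerate())  # threads that predate the job
    stop_event = threading.Event()
    job(stop_event)
    stop_event.set()  # cooperative shutdown request
    stragglers = []
    for thread in threading.enumerate():
        if thread in baseline:
            continue  # not started by this job
        thread.join(timeout)
        if thread.is_alive():
            stragglers.append(thread.name)
    return stragglers


if __name__ == "__main__":
    def job(stop_event):
        # A well-behaved worker that exits once stop_event is set.
        threading.Thread(target=stop_event.wait, name="worker").start()

    print(run_job_and_stop_threads(job))
```

A non-empty result means some thread did not honour the request, which is the situation where discarding the whole process is the only dependable cleanup.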

Graham Dumpleton

Mar 22, 2009, 6:33:53 AM

Using Py_EndInterpreter() is even more fraught with danger. The first
problem is that some third party C extension modules will not work in
sub interpreters because they use the simplified GIL state API. The
second problem is that third party C extensions often don't cope well
with the idea that the interpreter they were initialised in may be
destroyed, with the module then being used again in a new sub
interpreter instance.

Given that it is one operation per second, creating a new process, be
it a completely fresh one or one forked from an existing Python
process, would be simpler.

Graham
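
The forked variant might look like the following on a Unix-like system, sketched in Python 3. The helper name run_job_in_fork is hypothetical, and the sketch assumes the parent is single-threaded at the moment of the fork, since forking a multi-threaded process has its own pitfalls. The child starts with a copy of the parent's already-initialised interpreter, and whatever the job changes is discarded when the child exits.

```python
import os
import sys


def run_job_in_fork(job):
    """Run job() in a forked child and return the child's exit code.

    Returns 0 on success, 1 if the job raised, or a negative signal
    number if the child was killed by a signal.
    """
    sys.stdout.flush()  # avoid duplicating buffered output in the child
    pid = os.fork()
    if pid == 0:
        # Child: run the job, then leave without running parent cleanup.
        status = 1
        try:
            job()
            sys.stdout.flush()
            status = 0
        except BaseException:
            status = 1  # a real host would log the failure here
        finally:
            os._exit(status)
    # Parent: wait for the child and decode its exit status.
    _, wait_status = os.waitpid(pid, 0)
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return -os.WTERMSIG(wait_status)


if __name__ == "__main__":
    counter = {"n": 0}

    def bump():
        counter["n"] += 1  # only changes the child's copy

    print(run_job_in_fork(bump), counter["n"])
```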

Graham Dumpleton

Mar 22, 2009, 9:14:38 PM
On Mar 21, 10:27 am, Mark Hammond <skippy.hamm...@gmail.com> wrote:
> Calling
> Py_Initialize and Py_Finalize multiple times does leak (Python 3 has
> mechanisms so this need to always be true in the future, but it is true
> now for non-trivial apps.

Mark, can you please clarify this statement you are making. The
grammar used makes it a bit unclear.

Are you saying, that effectively by design, Python 3.0 will always
leak memory upon Py_Finalize() being called, or that it shouldn't leak
memory and that problems with older versions of Python have been fixed
up?

I know that some older versions of Python leaked memory on
Py_Finalize(), but if this is now guaranteed to always be the case and
nothing can be done about it, then the final death knell will have
been rung on mod_python and also the embedded mode of mod_wsgi. This is
because both
those systems rely on being able to call Py_Initialize()/Py_Finalize()
multiple times. At best they would have to change how they handle
initialisation of Python and defer it until sub processes have been
forked, but this will have some impact on performance and memory
usage.

So, more information appreciated.

Related link on mod_wsgi list about this at:

http://groups.google.com/group/modwsgi/browse_frm/thread/65305cfc798c088c?hl=en

Graham


Mark Hammond

Mar 23, 2009, 7:00:11 AM
to Graham Dumpleton, pytho...@python.org
On 23/03/2009 12:14 PM, Graham Dumpleton wrote:
> On Mar 21, 10:27 am, Mark Hammond<skippy.hamm...@gmail.com> wrote:
>> Calling
>> Py_Initialize and Py_Finalize multiple times does leak (Python 3 has
>> mechanisms so this need to always be true in the future, but it is true
>> now for non-trivial apps.
>
> Mark, can you please clarify this statement you are making. The
> grammar used makes it a bit unclear.

Yes, sorry - s/this need to/this need not/

> Are you saying, that effectively by design, Python 3.0 will always
> leak memory upon Py_Finalize() being called, or that it shouldn't leak
> memory and that problems with older versions of Python have been fixed
> up?

The latter - kind of - py3k provides an enhanced API that *allows*
extensions to be 'safe' in this regard, but it doesn't enforce it.
Modules 'trivially' ported from py2k will not magically get this ability
- they must explicitly take advantage of it. pywin32 is yet to do so
(ie, it is a 'trivial' port...)

I hope this clarifies...

Mark

Graham Dumpleton

Mar 23, 2009, 7:09:39 AM

Yes, but ...

There still may be problems. The issue is old, but I suspect that the
comments in the issue:

http://bugs.python.org/issue1856

may still hold true.

That is, that there are some things that Python doesn't free up which
are related to Python's simplified GIL state API. Normally this wouldn't
matter, as another call to Py_Initialize() would see the existing data
and reuse it. So it doesn't strictly leak memory in that sense.

In mod_wsgi however, Apache will completely unload the mod_wsgi module
on a restart. This would also mean that the Python library is also
unloaded from memory. When it reloads both, the global static
variables where information was left behind have been lost and nulled
out. Thus Python when initialised again, will recreate the data it
needs.

So, for the case where the Python library is unloaded, it looks like
it may well suffer a memory leak regardless.

As to third party C extension modules, they aren't really an issue,
because all that is done in the Apache parent process is Py_Initialize()
and Py_Finalize() and nothing else really. That is just done to get the
interpreter set up before forking child processes.

There is more detail on this analysis in that thread on mod_wsgi list
at:

Graham

Aahz

Mar 29, 2009, 1:35:09 PM
[p&e]

In article <e97efd52-4868-47a5...@z16g2000prd.googlegroups.com>,


Graham Dumpleton <Graham.D...@gmail.com> wrote:
>
>In mod_wsgi however, Apache will completely unload the mod_wsgi module
>on a restart. This would also mean that the Python library is also
>unloaded from memory. When it reloads both, the global static
>variables where information was left behind have been lost and nulled
>out. Thus Python when initialised again, will recreate the data it
>needs.
>
>So, for case where Python library unloaded, looks like may well suffer
>a memory leak regardless.
>
>As to third party C extension modules, they aren't really an issue,
>because all that is done in Apache parent process is Py_Initialize()
>and Py_Finalize() and nothing else really. Just done to get
>interpreter setup before forking child processes.
>
>There is more detail on this analysis in that thread on mod_wsgi list
>at:

Missing reference?
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it." --Brian W. Kernighan

Graham Dumpleton

Mar 29, 2009, 4:00:16 PM
On Mar 30, 4:35 am, a...@pythoncraft.com (Aahz) wrote:
> [p&e]
>
> In article <e97efd52-4868-47a5-91ec-657bba5f0...@z16g2000prd.googlegroups.com>,

> Graham Dumpleton  <Graham.Dumple...@gmail.com> wrote:
>
> >In mod_wsgi however, Apache will completely unload the mod_wsgi module
> >on a restart. This would also mean that the Python library is also
> >unloaded from memory. When it reloads both, the global static
> >variables where information was left behind have been lost and nulled
> >out. Thus Python when initialised again, will recreate the data it
> >needs.
>
> >So, for case where Python library unloaded, looks like may well suffer
> >a memory leak regardless.
>
> >As to third party C extension modules, they aren't really an issue,
> >because all that is done in Apache parent process is Py_Initialize()
> >and Py_Finalize() and nothing else really. Just done to get
> >interpreter setup before forking child processes.
>
> >There is more detail on this analysis in that thread on mod_wsgi list
> >at:
>
> Missing reference?

It was in an earlier post. Yes, I know I forgot to add it again, but
figured people would read the whole thread.

http://groups.google.com/group/modwsgi/browse_frm/thread/65305cfc798c088c

Graham
