__main__ block called unlimited times when spawning process, using --embed option


Czarek

Nov 29, 2011, 5:39:08 AM
to cython-users
Hello, I tried cython --embed to create an exe, and something goes
wrong when spawning processes with the multiprocessing module.
I first compiled the .py to .cpp using the command
"cython --embed --cplus --no-docstrings -w ./ ./src/hello.py". Then I
compiled that .cpp to an exe using Visual Studio 2008. In my app, the
__main__ block creates a Tkinter window and also creates a subprocess
by calling multiprocessing.Process(). When I run the Cython-built exe,
it creates an endless number of windows, and in Task Manager I see lots
of processes being spawned. The app works as expected when executing
the .py file: only 1 process is created. It also works when compiled
with pyinstaller. The strange behavior happens only with cython --embed.
I debugged the code, and what happens is that each spawned process runs
the 'if __name__ == "__main__"' block in the main file. It shouldn't;
it should only call the function that is passed as the 2nd parameter
(target) to multiprocessing.Process().

Czarek.

MinRK

Nov 29, 2011, 1:44:54 PM
to cython...@googlegroups.com
This is because Windows doesn't have a proper fork, so each subprocess will execute from the beginning, including the if __name__ == '__main__' block.
If this block spawns subprocesses, then it will be called ad infinitum, as each subprocess calls it again.

The way I have found to deal with this is to protect subsequent spawns by checking whether we are in the main process:

import sys
import multiprocessing

p = multiprocessing.current_process()
# the main process has name 'MainProcess'
# subprocesses will have names like 'Process-1'

if sys.platform == 'win32' and p.name != 'MainProcess':
    # in a subprocess, don't spawn
    pass
else:
    # main process, spawn
    pass

-MinRK
 


Czarek

Nov 29, 2011, 2:45:50 PM
to cython-users
@MinRK

I tried your suggestion, but it does not work.

These are the lines that spawn process:

process = multiprocessing.Process(name="finderprocess",
                                  target=finderprocess_start, args=(queue,))
process.daemon = True

And the main block:

if (__name__ == "__main__"
        and multiprocessing.current_process().name == "MainProcess"):
    multiprocessing.freeze_support()
    print "PROCESS NAME: " + multiprocessing.current_process().name
    spawn_process()

After compiling, I run and get these messages in the output:

PROCESS NAME: MainProcess
PROCESS NAME: MainProcess
PROCESS NAME: MainProcess
PROCESS NAME: MainProcess

Czarek.

MinRK

Nov 29, 2011, 3:18:08 PM
to cython...@googlegroups.com
Hm, it has definitely worked for me, but mine is not a Cython application.

What if you put the check closer to the instantiation of the subprocess?

Robert Kern

Nov 30, 2011, 5:27:42 AM
to cython...@googlegroups.com

Since Windows does not have a reasonable fork() function, multiprocessing starts
up a new sys.executable process and imports the main module such that it can
locate the other code that it needs to load. I suspect that running a new
sys.executable process is the problem here. Your embedded code is, of course,
always going to run the __main__ block. I suspect that manually written embedded
code is going to do the same thing (pyinstaller does not embed the interpreter
in another C++ program; just repackages it a bit to look nice). There might be
some multiprocessing API that you can call that can determine if you are in a
child process or not. There might be a specific flag in sys.argv that you can
look for.

You may also try having a plain .py file as your main script that just contains
a __main__ block that imports the main function from your Cython extension module.
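
A rough sketch of that launcher idea, assuming the Cython code is built as an extension module that exposes a main() function. The module name hello_ext and the function name main are hypothetical; substitute whatever your build actually produces.

```python
import importlib

# Hypothetical name of the compiled Cython extension (e.g. hello_ext.pyd).
COMPILED_MODULE = "hello_ext"


def load_main(module_name=COMPILED_MODULE):
    """Import the compiled module and return its main() function."""
    module = importlib.import_module(module_name)
    return module.main


if __name__ == "__main__":
    # Only this plain .py file carries a __main__ block; the compiled
    # module is imported like any other module.
    try:
        main = load_main()
    except ImportError:
        main = None
        print("compiled module %r not found" % COMPILED_MODULE)
    if main is not None:
        main()
```

The launcher would be run with the regular interpreter, so the child processes started by multiprocessing would import it as a module rather than execute its __main__ block.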

Or you can stop trying to abuse Cython as a code obfuscator. ;-)

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Czarek

Nov 30, 2011, 9:18:34 AM
to cython-users
@Robert Kern:

Thank you, checking sys.argv worked. There are arguments that make it
possible to distinguish the main process from a subprocess:

PROCESSNAME: MainProcess
sys.argv: ['cythonembed.exe']
sys.executable: cythonembed.exe

PROCESSNAME: MainProcess
sys.argv: ['cythonembed.exe', '-c', 'from multiprocessing.forking import main; main()', '--multiprocessing-fork', '1700']
sys.executable: cythonembed.exe
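
The two listings above suggest a simple check. This is a minimal sketch, assuming the '--multiprocessing-fork' marker shows up in the child's sys.argv exactly as in the output above (that is what Python 2.7 on Windows produced here; other versions may differ). The helper name is_multiprocessing_child is made up for illustration.

```python
import sys

# Marker seen in the child's sys.argv in the output above.
FORK_FLAG = "--multiprocessing-fork"


def is_multiprocessing_child(argv=None):
    """Return True if argv looks like a multiprocessing child's command line."""
    if argv is None:
        argv = sys.argv
    return FORK_FLAG in argv


if __name__ == "__main__":
    if is_multiprocessing_child():
        # Child process: skip main-process-only work (window, spawning).
        print("child process detected")
    else:
        # Main process: create the window and spawn workers here.
        print("main process")
```

This only detects the child; what the child should do next inside an embedded exe is not settled in this thread.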

But now I get a win32 unhandled exception :) I'm probably going to give
up. I tried debugging in VS, but that is too much for my C++ skills. I
will just go with pyinstaller + cython, compiling the obfuscated
sources to .pyd; that should be enough.

Here is the win32 unhandled exception info:

"Unhandled exception at 0x1e0a8ce0 (python27.dll) in cythonembed.exe:
0xC0000005: Access violation reading location 0x00000004."

Screenshot of VS debugger's stack trace:
img689. imageshack .us/img689/4486/cythonembedwin32excepti.jpg (remove
the dot)

I am using a global exception handler that overrides sys.excepthook
(with my errorhandler.excepthook). I suspected that I was doing
something wrong in that handler, so I changed it to only print an "asd"
message and call os._exit(1), but that didn't help: no message was
written to the output, and the win32 unhandled exception still occurs.

Czarek.

Robert Kern

Nov 30, 2011, 5:08:18 PM
to cython...@googlegroups.com

sys.excepthook only handles Python exceptions. You are seeing a segmentation
fault in the C or C++ code.
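
A small sketch of that distinction, run in child interpreters so that the crash does not take down the caller. os.abort() stands in for a native crash here (an assumption for illustration; it is not the access violation from the post), and run_child is a made-up helper.

```python
import subprocess
import sys

# Code prepended to each child: install a custom sys.excepthook.
HOOK = (
    "import sys, os\n"
    "def hook(exc_type, exc_value, tb):\n"
    "    print('hook ran: ' + exc_type.__name__)\n"
    "    sys.stdout.flush()\n"
    "sys.excepthook = hook\n"
)


def run_child(body):
    """Run HOOK + body in a fresh interpreter; return (returncode, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", HOOK + body],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


# Uncaught Python exception: the interpreter calls the hook before exiting.
python_error = run_child("raise ValueError('python-level error')\n")

# Native-level abort: the process dies without the hook ever being called.
native_abort = run_child("os.abort()\n")

print(python_error)
print(native_abort)
```

In the first child the hook's message appears on stdout; in the second, stdout stays empty and the process ends with a nonzero exit status.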
