performance running code via source versus pyinstaller binaries


Michael Hale Ligh

Jun 3, 2015, 3:19:31 AM6/3/15
to pyins...@googlegroups.com
Hello,

I was wondering if anyone has experienced significant performance degradation after compiling a program with pyinstaller. When I run a program from source, it's approximately 10 times faster than the exact same code in binary form. I'm aware of the minimal startup delay that is known/expected with pyinstaller binaries, and I don't believe that is the issue here.

Here's an example:

1) check out the volatility source

$ git clone https://github.com/volatilityfoundation/volatility.git

2) run the "filescan" plugin with volatility as source

$ time python volatility/vol.py -f memdump.mem filescan > /dev/null

real    1m31.799s
user    1m25.953s
sys    0m5.660s

3) check out the latest dev pyinstaller

$ git clone https://github.com/pyinstaller/pyinstaller.git

4) compile volatility (this is being done on a 64-bit Debian Linux system)

$ python pyinstaller/pyinstaller.py -F volatility/pyinstaller.spec

5) run the "filescan" plugin with volatility as a pyinstaller binary

$ time ./dist/volatility -f memdump.mem filescan > /dev/null

real    14m31.405s
user    14m23.970s
sys    0m5.700s

As you can see, the exact same code took 14m31s after being compiled, but only 1m31s in source form. If you're not familiar with volatility (or the filescan plugin), it essentially scans through a large memory dump file (several gigabytes in size) looking for specific signatures/byte patterns and then interprets the data at the matching addresses as C data structures (in short, memory forensics).
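For readers unfamiliar with that kind of workload, a toy sketch of chunked signature scanning might look like the following. This is purely illustrative and is not Volatility code; the function name and chunk size are made up.

```python
def find_signature(path, sig=b"File", chunk=1 << 20):
    """Scan a large file in fixed-size chunks for a byte signature,
    carrying an overlap between chunks so matches spanning a chunk
    boundary are not missed. Returns absolute file offsets of hits."""
    hits = []
    overlap = len(sig) - 1
    prev_tail = b""
    offset = 0  # total bytes consumed before the current chunk
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            buf = prev_tail + data
            start = 0
            while True:
                i = buf.find(sig, start)
                if i == -1:
                    break
                # Map the index within buf back to an absolute offset.
                hits.append(offset - len(prev_tail) + i)
                start = i + 1
            prev_tail = buf[-overlap:] if overlap else b""
            offset += len(data)
    return hits
```

A real scanner like filescan also validates and parses the structures found at each hit, but the I/O pattern is broadly similar.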

What would you suggest for troubleshooting this type of performance problem? Also, are there any known types of activities (i.e. disk I/O, network I/O, GUI interactions) or specific modules/APIs that result in severe slowdowns when compiled or should speeds theoretically be pretty similar between the source and binary versions (minus the tiny initial startup delay)?

Thank you!
MHL

Michael Hale Ligh

Jun 3, 2015, 10:15:03 AM6/3/15
to pyins...@googlegroups.com
I should add that over the past couple of days, we've done more testing to try to isolate the issue. In particular, I created a simple script that built a 100-million-character string one byte at a time (just an example of a memory- and CPU-intensive task) and compared the execution times of that script in source form versus as a pyinstaller exe. There was practically no difference, so this issue isn't necessarily generic/widespread; it's caused by something more specific we're doing in our application.
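The original test script isn't shown in the thread; a minimal sketch of that kind of benchmark might look like this (names and the scaled-down size are assumptions):

```python
import time

def build_string(n):
    """Build an n-character string by appending one character at a time."""
    s = ""
    for _ in range(n):
        s += "x"  # CPython's in-place string-append keeps this roughly linear
    return s

start = time.time()
result = build_string(1_000_000)  # scaled down from 100 million
print(len(result), "chars in", round(time.time() - start, 3), "s")
```

Comparing wall-clock time for the same script run from source and as a bundled exe (after subtracting the one-time startup cost) is exactly the kind of A/B test described above.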

Since our program involves a lot of file I/O (reading, seeking), I tested another small script that wrote 100 million characters one byte at a time to a file and then used random.randrange(0, 100000000) to seek to a random location in that same file and read one byte (100 million times). Once again, there was no significant difference in execution time between the source and exe versions.
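Again as a sketch only (the original script isn't shown; sizes here are scaled down and the names are illustrative), that seek/read benchmark could look like:

```python
import os
import random
import tempfile

def run_seek_benchmark(size=1_000_000, reads=10_000):
    """Write `size` bytes one at a time, then do `reads` random
    one-byte seek-and-read operations against the same file."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        for _ in range(size):
            f.write(b"x")  # one byte per write, as described above
    try:
        with open(path, "rb") as f:
            for _ in range(reads):
                f.seek(random.randrange(0, size))
                assert f.read(1) == b"x"
    finally:
        os.unlink(path)

run_seek_benchmark()
```

This exercises the same seek-heavy I/O pattern as the memory-dump scanning, just without the structure parsing.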

Next I tried importing a number of modules (PyQt, SQLAlchemy, PyCrypto, etc.) into the simple test script until the size of the exe file was about 40 MB. That did not cause a slowdown in execution time either.

I did manage to find an older post here with a similar-sounding issue: https://groups.google.com/forum/#!topic/pyinstaller/_iX2NjXckRI. The suggestion was to look at pyi_importers.py and pass -v to python to trace imports. I'm not quite sure how that would help diagnose the issue. What should we be looking for, specifically, in that file and/or output?

Thanks,
MHL

davecortesi

Jun 3, 2015, 10:25:34 AM6/3/15
to pyins...@googlegroups.com
This is really interesting (to an observer; maybe that's not the word you'd use). PyInstaller doesn't do anything that should cause this. It collects all imported modules into a special archive that is part of the executable. It makes a local copy of the Python interpreter, also part of the executable. At startup, the bootloader launches the embedded Python and sets up its import mechanism so that all imports are satisfied by pulling from the embedded archive. That's it! Those imports should take no more time to execute than a normal import, and in any case importing is normally done only once, at startup.

When dealing with a file that big, and with large memory allocations, some small change in the mode of operation could have a big impact. Perhaps the cache alignment goes wrong and suddenly you are doing 10x the number of memory accesses. Or some buffer is smaller and you are doing 10x as many file reads. Or the program has a smaller virtual memory allocation and is suddenly "thrashing" to the backing store. Something like that. Not much help, I know.

davecortesi

Jun 3, 2015, 10:23:56 PM6/3/15
to pyins...@googlegroups.com
Regarding imports: in the discussion on an issue at github[1], it was pointed out how the bootloader sets up sys.path,

# Append lib directory at the end of sys.path and not at the beginning.
# Python will first try necessary libraries from the system and fall back
# to the lib directory.

This would mean that your program imports whatever is locally installed, if anything, in preference to what was embedded in the executable. In a dev environment they should be the same: whatever PyInstaller found while bundling should still be there. However, is there any possibility that, when you are testing the bundled app, it is picking up a different version of some library?
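One quick way to check that would be to log each suspect module's origin from inside the bundled app. A sketch (the module names are placeholders; substitute the app's real dependencies):

```python
import importlib

def report_origin(names):
    """Map each module name to the path it was actually loaded from,
    or a marker when it has no backing file (built-in/frozen)."""
    origins = {}
    for name in names:
        mod = importlib.import_module(name)
        origins[name] = getattr(mod, "__file__", "<built-in or frozen>")
    return origins

for name, origin in report_origin(["json", "sys"]).items():
    print(name, "->", origin)
```

If a module's path points into site-packages rather than the bundle's own directory, the bundled app is shadowing the embedded copy with a system install.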

Michael Hale Ligh

Jun 11, 2015, 11:35:53 AM6/11/15
to pyins...@googlegroups.com
Hi Dave,

Thanks for the references. We identified the cause of the slowdown, so I wanted to report back the findings.

We were emitting a debug statement from inside a loop that processed about 35K items. The debug calls themselves are not the issue per se, since they are also emitted when run from source. However, instead of placing logging.getLogger('xyz') in each module, we have a central debug facility that uses inspect.getmodule() to determine ("guess") the name of the module that emitted the debug message. getmodule() calls os.path.isfile() or os.path.exists(), which in turn call stat() through the OS's API. The catch is that it calls stat() on a file that only exists in the pyinstaller build environment (build/pyinstaller/out00-PYZ.pyz/volatility.debug). Thus, as a pyinstaller exe, the program calls stat() 35K times on a file that doesn't exist.

We were able to significantly speed up the binary's runtime just by creating (via touch) an empty file named build/pyinstaller/out00-PYZ.pyz/volatility.debug, since stat() appears to be significantly faster on files that exist than on files that don't. Of course, that was just for testing. It turns out inspect.getfile() returns similar information to inspect.getmodule() but doesn't look for a non-existent file in the pyinstaller build directory, so we will probably use that instead, and/or disable the debug call inside the processing loop.
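To illustrate the difference between the two lookups (a sketch, not Volatility's actual debug facility): inspect.getmodule() may probe the filesystem (os.path.isfile()/stat()) to map a frame back to a module, while inspect.getfile() simply reads the frame's co_filename attribute.

```python
import inspect

def name_via_getmodule(frame):
    mod = inspect.getmodule(frame)  # may stat() candidate source paths
    return mod.__name__ if mod else "<unknown>"

def path_via_getfile(frame):
    return inspect.getfile(frame)  # just reads co_filename, no filesystem access

frame = inspect.currentframe()
print(name_via_getmodule(frame))
print(path_via_getfile(frame))
```

In a bundled app, the stat()ed candidate paths point at the build environment, which is exactly the behavior described above.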

Cheers,
MHL

David Cortesi

Jun 11, 2015, 12:23:42 PM6/11/15
to pyins...@googlegroups.com
Interesting. I wonder if inspect.getmodule() can even work in a bundled app: imported modules are not stored in files but in an archive embedded in the executable[1], so they are invisible to the OS. Perhaps the bootloader sets up the import machinery so well that getmodule() can do its job, but I would not assume so.

If it is a matter of a single missing file, you can ensure it exists where the bundled app runs[2].

But in any case, if this central debug facility is being called 35K times in a typical run, to me that suggests memoization. Given the typical locality of code execution, it probably gets called from the same module many times in succession, then from another, and so on, so even a small LRU cache of module names could be a big win.
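One possible shape for that cache, as a sketch (all names here are illustrative, not Volatility's actual debug facility): key an lru_cache on the caller's source filename, so the expensive lookup runs once per module rather than once per debug call.

```python
import functools
import sys

@functools.lru_cache(maxsize=64)
def module_name_for(filename):
    """Expensive lookup, executed once per distinct source filename."""
    for mod in list(sys.modules.values()):
        if getattr(mod, "__file__", None) == filename:
            return mod.__name__
    return "<unknown>"

def debug(msg):
    caller = sys._getframe(1)
    # Cached after the first call from any given module's file.
    name = module_name_for(caller.f_code.co_filename)
    print(f"[{name}] {msg}")

debug("processing item")
```

With the typical locality Dave describes, nearly every call after the first per module is a cache hit and never touches the filesystem.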

Parkway

Jun 18, 2015, 7:35:18 AM6/18/15
to pyins...@googlegroups.com
@davecortesi
Not related to the OP's question. Above, you said, "PyInstaller doesn't do anything that should cause this. It collects all imported modules into a special archive that is part of the executable. It makes a local copy of the Python interpreter, also part of the executable. At startup, the bootloader launches the embedded Python and sets up its import mechanism so all imports are executed by pulling from the embedded archive."

Is the embedded archive stored on the hard drive in a temporary folder and then removed after the program exits?

