python application startup times with long PYTHONPATH


Sebastian Elsner

Apr 7, 2017, 7:40:51 AM
to rez-config
Hi all,

When a lot of packages add to PYTHONPATH and are released to a central repository on a network share, I am seeing startup times of python applications increase (a lot on Windows, not so much on Linux). The rez-env resolved PYTHONPATH is PREPENDED to the builtin module paths, so even for builtin modules of python (sys, os, etc...) the python interpreter will hit the network. If there are a lot of paths, one import will walk a lot of network directories, so it gets very slow. For example, importing "site" will hit each network path for "site.so", "site.py", "site.pyc", "sitemodule.so". The same goes for custom python modules. Are you seeing this issue, too?
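
To make the cost concrete, here is a minimal sketch that lists the filesystem probes such a search would make. The suffix list is taken from the "site" example above and is an assumption: the real list depends on the Python version and platform, and newer Python 3 releases cache directory listings instead of probing every name. The function name is made up for illustration.

```python
import os

# Suffixes from the "site" example in this post (assumed, not exhaustive).
SUFFIXES = (".so", "module.so", ".py", ".pyc")


def probes_until_found(module, search_paths, suffixes=SUFFIXES):
    """List every candidate file checked until `module` is found.

    Directories are searched in order, so a module that lives in a late
    entry costs one check per suffix for every earlier directory.
    """
    probes = []
    for directory in search_paths:
        for suffix in suffixes:
            candidate = os.path.join(directory, module + suffix)
            probes.append(candidate)
            if os.path.isfile(candidate):
                return probes
    return probes  # not found: every directory was probed for every suffix
```

Calling `len(probes_until_found("site", sys.path))` (after `import sys`) inside a resolved environment gives a rough upper bound on the lookups for one import; with N network paths in front of the builtin paths, that is about N times the number of suffixes.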

Cheers,

Sebastian

Allan Johns

Apr 7, 2017, 5:18:39 PM
to rez-c...@googlegroups.com
This can happen, yes. I've covered this in detail in an earlier thread (somewhere). I would like to introduce the idea of 'context filesystems', where a context is not just an rxt file, but a directory. Packages would then have the opportunity to copy parts of all of their payloads into this "context filesystem". This would be the mechanism we'd use to avoid long PYTHONPATHs etc. It would interact with the existing caching system so context dirs would be cached, avoiding disk writes a lot of the time.

A

--
You received this message because you are subscribed to the Google Groups "rez-config" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rez-config+unsubscribe@googlegroups.com.
To post to this group, send email to rez-c...@googlegroups.com.
Visit this group at https://groups.google.com/group/rez-config.
For more options, visit https://groups.google.com/d/optout.

Fede Naum

Apr 10, 2017, 12:56:05 AM
to rez-config
Hi,

This sounds like a good general multiplatform solution!

Have you explored the symlink option on Linux/Mac?
We currently have symlinking implemented for LD_LIBRARY_PATH. We create a tmp folder and flatten the original LD_LIBRARY_PATH by symlinking all the libs in there. Works like a charm.
I cannot remember what issue we had with PYTHONPATH.
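
The flattening described above could look roughly like this. It is a sketch under assumptions, not the actual implementation: the function name is hypothetical, and it assumes the first directory in the list should win when two directories provide the same name, which matches normal search path order.

```python
import os
import tempfile


def flatten_search_path(directories, target=None):
    """Symlink every entry of `directories` into a single folder.

    Earlier directories take priority, mirroring search path order.
    Returns the folder to use in place of the original long path list.
    """
    target = target or tempfile.mkdtemp(prefix="flat_libs_")
    for directory in directories:
        if not os.path.isdir(directory):
            continue  # skip entries that do not exist
        for name in sorted(os.listdir(directory)):
            link = os.path.join(target, name)
            if os.path.lexists(link):
                continue  # an earlier directory already provided this name
            os.symlink(os.path.join(os.path.abspath(directory), name), link)
    return target
```

The search path then shrinks to the single returned folder, so a lookup touches one directory instead of one per package.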

Fede



Allan Johns

Apr 10, 2017, 5:38:53 PM
to rez-c...@googlegroups.com
Hey Fede,

Haven't tried generating symlinks, but I'm sure it works for many cases. What I'm talking about is basically a generalisation of that anyway: each package would be able to specify whether it wants to copy part of its payload into the context filesystem, or hard/softlink it. That way you'd be able to tailor the behavior for whatever cases you have. I did hear that there were cases with PYTHONPATH where links were problematic; if you do find out what they were, please let me know.
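
As a rough illustration of the copy versus hard/softlink choice, here is a minimal sketch. This is not a rez feature or API: the function name, the mode names and the directory layout are all assumptions made up for the example.

```python
import os
import shutil


def stage_file(source, context_dir, mode="symlink"):
    """Place `source` into `context_dir` by copying or linking it.

    `mode` stands in for a hypothetical per-package setting:
    "copy", "hardlink" or "symlink".
    """
    os.makedirs(context_dir, exist_ok=True)
    destination = os.path.join(context_dir, os.path.basename(source))
    if mode == "copy":
        shutil.copy2(source, destination)  # independent copy, costs disk space
    elif mode == "hardlink":
        os.link(source, destination)  # only works within one filesystem
    elif mode == "symlink":
        os.symlink(os.path.abspath(source), destination)
    else:
        raise ValueError("unknown mode: %r" % (mode,))
    return destination
```

A copy to local disk takes the network share out of the picture entirely, while a symlink still resolves to the network file once the name has been found.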

Thx
A





Sebastian Elsner

Apr 20, 2017, 3:40:25 AM
to rez-config
If you say this can happen, I read that as: it probably won't for most people. It does not seem to be a big issue for anyone else, but for us this is currently a blocker for using rez on a broader scale. Here Nuke takes minutes to start up when a lot of network paths are put in PYTHONPATH. Even if your server is super fast with metadata queries (which ours is not, I have to admit), this should be an issue, especially if you have hundreds of packages (as I suspect Method et al. have)?



Allan Johns

Apr 20, 2017, 6:04:44 PM
to rez-c...@googlegroups.com
Yeah, we have hundreds of packages, but it also depends on how many of those end up in the one env at once, and that depends on the granularity you've chosen for your environments. I suspect we are suffering slowdown, but I haven't made attempts to quantify that as yet.

As I say though, I'm keen on solving this problem and I think context filesystems are the way to do it. It's a matter of finding the time to work on this feature (and the myriad of other features I would like to add to rez!)

Thx
A



b.f...@toonboxent.com

May 2, 2017, 3:30:23 PM
to rez-config
We've seen that having any kind of binary on a network share on Windows will make it super slow. This is especially true if rez's python lives there, and each sub-context slows things down further.

As far as I can tell this is not related to the filesystem, just some magic Microsoft security hot sauce. I could not find the right documentation, so we haven't found the right registry key to disable it.

Here's an easy proof: install python locally, then install python on a network share.
Compare launch times against data transfer benchmarks. You'll see that (even considering runtime costs) the launch times do not track the transfer speeds at all.
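
A small script along these lines can time the launches for that comparison. It is only a sketch: the function name is made up, and the interpreter paths are whatever you pass on the command line (for example one local and one network install).

```python
import subprocess
import sys
import time


def average_launch_seconds(python_exe, runs=5):
    """Average wall-clock time to start `python_exe` and exit immediately."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" does no work, so this measures interpreter startup only.
        subprocess.run([python_exe, "-c", "pass"], check=True)
        total += time.perf_counter() - start
    return total / runs


if __name__ == "__main__":
    # Usage: python launch_bench.py <local python> <network python>
    for exe in sys.argv[1:] or [sys.executable]:
        print(exe, round(average_launch_seconds(exe), 3), "s")
```

The first run on each machine may be slower because of cold caches, so it is worth running the script more than once.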

We tried disabling windows security essentials and similar. While I felt it helped a bit, it did not solve the issue.
I did not investigate further since we barely use Windows here.

I was about to try this, maybe this is related to your case:
https://support.microsoft.com/en-us/help/829700/slow-network-performance-when-you-open-a-file-that-is-located-in-a-shared-folder-on-a-remote-network-computer

Cheers,
Blazej

Allan Johns

May 2, 2017, 6:33:27 PM
to rez-c...@googlegroups.com
Wrt local package installations, you should be able to support that in rez with some funky commands() code. For example, the python package could check localhost for the same versioned python install, and use that instead.
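
The check itself is simple. Below is a minimal sketch of the selection logic, with a commented, untested outline of how it might sit inside a package's commands(). The local base directory and the function name are assumptions for illustration only.

```python
import os


def pick_install_root(network_root, local_base, version):
    """Prefer a local install of the same version, else the network root."""
    local_root = os.path.join(local_base, version)
    return local_root if os.path.isdir(local_root) else network_root


# Untested outline of the same logic inside a rez package.py. The logic is
# inlined, and "/opt/local_installs/python" is a made-up location:
#
# def commands():
#     import os
#     local_root = os.path.join("/opt/local_installs/python", str(this.version))
#     root = local_root if os.path.isdir(local_root) else "{root}"
#     env.PATH.prepend(os.path.join(root, "bin"))
```

Matching on the exact version string keeps the local install interchangeable with the released package; a looser match would risk running a different python than the one the resolve asked for.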

That doesn't address long PYTHONPATHs, but I thought it worth mentioning.

Hth
A

