How to get path to a data file from Python?

lika...@gmail.com

Jan 2, 2017, 5:55:17 PM
to bazel-discuss
Or equivalently, how do I get the path to runfiles? The best way I can think of right now is to find a .py file, get its module object, and join its path with a couple of ..'s to get to the root, then join that with the path of the data file. But isn't that weird and very inconvenient? Also, the number of ..'s has to change if the package is moved. What do people usually do to get the path?

BTW: I see pkgutil.get_data can get the content of the data file, but I need the path. Also, pkgutil.get_data has the limitation that the path can't contain .., so I have to find a .py file in some parent directory of the data. What if there is no .py file in any parent directory of the data, e.g. the data is in some //data directory under the project root?
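
For reference, the approach described above can be sketched roughly as follows. The function name and the `levels_up` parameter are made up for illustration; `levels_up` is exactly the number that has to change whenever the package moves.

```python
import os


def data_path_from_module(module_file, levels_up, relative_data_path):
    """Walk `levels_up` directories up from a module's file, then join the data path.

    Hypothetical helper: `module_file` is typically some_module.__file__.
    """
    module_dir = os.path.dirname(os.path.abspath(module_file))
    # Append one ".." per level and collapse them to get the assumed root.
    root = os.path.normpath(os.path.join(module_dir, *([os.pardir] * levels_up)))
    return os.path.join(root, relative_data_path)
```

A call would look like `data_path_from_module(my_module.__file__, 2, "data/x.txt")`, where `my_module` stands for whatever .py file happens to be importable.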

Doug Greiman

Jan 3, 2017, 6:28:02 PM
to bazel-discuss, lika...@gmail.com
This is a surprisingly messy question.

1) Inside a test, look at the environment variable $TEST_SRCDIR.  This will get you the runfiles directory or possibly a sandboxed analogue.

2) For "bazel run my-python-target" or "bazel-bin/my-python-target", there is no useful environment variable, so do as you describe.  Import your project's top-level package (if it has one) and look at its __file__ attribute.  You may have to implement some kind of heuristic search based on directory names (*.runfiles/) and/or well-known file names (./WORKSPACE or some_dir/some_package/__init__.py).  This will be complicated if your program is built as a zip/.par/.pex/.egg file.
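
A minimal sketch of how the two cases above might be combined, under stated assumptions: the function name is hypothetical, the heuristic only looks for a parent directory whose name ends in ".runfiles", and it does not handle zip/.par/.pex/.egg builds.

```python
import os


def find_runfiles_root(start_file, environ=None):
    """Return a best-guess runfiles directory, or None if nothing matches.

    Hypothetical helper: `start_file` is typically some_package.__file__.
    """
    environ = os.environ if environ is None else environ

    # Case 1: inside a test, TEST_SRCDIR points at the runfiles directory.
    test_srcdir = environ.get("TEST_SRCDIR")
    if test_srcdir:
        return test_srcdir

    # Case 2: walk upward from the file until a "*.runfiles" directory appears.
    current = os.path.dirname(os.path.abspath(start_file))
    while True:
        if os.path.basename(current).endswith(".runfiles"):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent
```

Passing `environ` explicitly is only there to make the function easy to exercise; in a real program it would default to `os.environ`.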

Internally, we have a complicated and fragile set of helper functions to encapsulate this, based on assumptions about directory structure and project organization.  There doesn't seem to be a better solution without changing Python itself. 

As you mentioned, pkgutil.get_data has its own limitations.  It also doesn't work with Python3 "namespace packages".

lika...@gmail.com

Jan 3, 2017, 6:39:07 PM
to bazel-discuss, lika...@gmail.com
For case 2), can we modify the Python template file to set an environment variable, e.g. RUNFILES_DIR, before invoking the actual python script? I can find some time to send a patch, but I'd like to know if I overlooked any corner cases of this approach.
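
To make the proposal concrete, here is a rough sketch of what the consuming side could look like, assuming the template exported a variable named RUNFILES_DIR. That variable is only the proposal above, not something the template currently sets, and the function name is hypothetical.

```python
import os


def resolve_data_file(relative_path, environ=None):
    """Join a runfiles-relative path onto the first runfiles root that is set.

    Assumes RUNFILES_DIR would be exported by the (modified) Python template;
    falls back to TEST_SRCDIR, which is available inside tests.
    """
    environ = os.environ if environ is None else environ
    for var in ("RUNFILES_DIR", "TEST_SRCDIR"):
        root = environ.get(var)
        if root:
            return os.path.join(root, relative_path)
    raise RuntimeError("neither RUNFILES_DIR nor TEST_SRCDIR is set")
```

With this, a program would not need to know how many directories sit between its .py files and the runfiles root.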