Airbnb airflow with pex

Matt Hite

Jul 21, 2016, 2:33:28 PM
to pants...@googlegroups.com
I'm new to Pex. I've had success using pex to build my own applications, but I'm running into a roadblock trying to get a more complicated application turned into a functioning pex file. Specifically, I'm trying to turn airflow (from Airbnb) into a pex file.

The pex creation process seems to succeed but when I attempt to launch airflow I receive an error.

Here is my attempt:

root@a27e175e913a:/# pex airflow -c airflow -o airflow4.pex -v
  Jinja2 2.8
  Flask 0.10.1
  dill 0.2.5
  Flask-Login 0.2.11
  chartkick 0.4.2
  setproctitle 1.1.10
  six 1.10.0
  itsdangerous 0.24
  Flask-Cache 0.13.1
  WTForms 2.1
  lockfile 0.12.2
  SQLAlchemy 1.1.0b2
  future 0.15.2
  setuptools 24.3.0
  thrift 0.9.3
  gunicorn 19.3.0
  Markdown 2.6.6
  funcsigs 0.4
  Babel 1.3
  MarkupSafe 0.23
  pytz 2016.6.1
  python-editor 1.0.1
  pandas 0.18.1
  requests 2.10.0
  docutils 0.12
  alembic 0.8.6
  python-dateutil 2.5.3
  croniter 0.3.12
  Pygments 2.1.3
  numpy 1.11.1
  python-daemon 2.1.1
  Flask-WTF 0.12
  Werkzeug 0.11.10
  Mako 1.0.4
  airflow 1.7.1.3
  Flask-Admin 1.4.0
pex: Building pex: 482359.3ms
pex:   Resolving distributions: 479120.6ms
pex:       Packaging airflow: 1972.2ms
pex:       Packaging dill: 211.9ms
pex:       Packaging Markdown: 755.8ms
pex:       Packaging setproctitle: 497.9ms
pex:       Packaging Flask: 284.3ms
pex:       Packaging Babel: 1329.2ms
pex:       Packaging croniter: 190.0ms
pex:       Packaging pandas: 326731.8ms
pex:       Packaging SQLAlchemy: 992.0ms
pex:       Packaging thrift: 507.8ms
pex:       Packaging future: 459.8ms
pex:       Packaging chartkick: 182.9ms
pex:       Packaging alembic: 276.1ms
pex:       Packaging Mako: 248.7ms
pex:       Packaging docutils: 405.1ms
pex:       Packaging python-editor: 183.9ms
pex:       Packaging itsdangerous: 185.4ms
pex:       Packaging numpy: 136437.2ms
pex:       Packaging MarkupSafe: 312.7ms
pex:       Packaging WTForms: 292.0ms
Saving PEX file to airflow4.pex
root@a27e175e913a:/# ./airflow4.pex -h
Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 326, in execute
  File ".bootstrap/_pex/pex.py", line 258, in _wrap_coverage
  File ".bootstrap/_pex/pex.py", line 290, in _wrap_profiling
  File ".bootstrap/_pex/pex.py", line 367, in _execute
  File ".bootstrap/_pex/pex.py", line 394, in execute_script
  File ".bootstrap/_pex/finders.py", line 284, in get_script_from_distributions
  File ".bootstrap/_pex/finders.py", line 268, in get_script_from_distribution
  File ".bootstrap/_pex/finders.py", line 241, in get_script_from_egg
  File ".bootstrap/pkg_resources/__init__.py", line 1490, in metadata_listdir
  File ".bootstrap/pkg_resources/__init__.py", line 1574, in _listdir
OSError: [Errno 2] No such file or directory: '/root/.pex/install/Flask_Login-0.2.11-py2.7.egg.bd7c1959e5c1f6a8c34c070cc93c67b55a45b9b9/Flask_Login-0.2.11-py2.7.egg/EGG-INFO/scripts'

I also tried it with --no-wheel just for fun and got a different error when trying to launch the built .pex.

root@a27e175e913a:/# pex airflow -c airflow -o airflow3.pex -v --no-wheel
  Flask-Login 0.2.11
  Werkzeug 0.11.10
  thrift 0.9.3
  funcsigs 0.4
  setproctitle 1.1.10
  Flask-Cache 0.13.1
  python-daemon 2.1.1
  gunicorn 19.3.0
  MarkupSafe 0.23
  airflow 1.7.1.3
  Flask-WTF 0.12
  chartkick 0.4.2
  pytz 2016.6.1
  itsdangerous 0.24
  python-editor 1.0.1
  future 0.15.2
  setuptools 24.3.0
  requests 2.10.0
  Markdown 2.6.6
  six 1.10.0
  Jinja2 2.8
  Mako 1.0.4
  SQLAlchemy 1.1.0b2
  docutils 0.12
  dill 0.2.5
  numpy 1.11.1
  Flask 0.10.1
  alembic 0.8.6
  croniter 0.3.12
  Babel 1.3
  Pygments 2.1.3
  pandas 0.18.1
  lockfile 0.12.2
  WTForms 2.1
  Flask-Admin 1.4.0
  python-dateutil 2.5.3
pex: Building pex: 515067.0ms
pex:   Resolving distributions: 510907.4ms
pex:       Packaging airflow: 1362.6ms
pex:       Packaging dill: 208.2ms
pex:       Packaging Markdown: 235.6ms
pex:       Packaging funcsigs: 170.6ms
pex:       Packaging setproctitle: 437.6ms
pex:       Packaging Pygments: 1198.4ms
pex:       Packaging Flask-Login: 149.9ms
pex:       Packaging Flask: 349.3ms
pex:       Packaging Babel: 1279.0ms
pex:       Packaging croniter: 169.8ms
pex:       Packaging pandas: 334469.6ms
pex:       Packaging SQLAlchemy: 1508.0ms
pex:       Packaging gunicorn: 304.2ms
pex:       Packaging Flask-Admin: 698.0ms
pex:       Packaging python-dateutil: 204.6ms
pex:       Packaging python-daemon: 2238.5ms
pex:       Packaging thrift: 587.5ms
pex:       Packaging Flask-WTF: 183.5ms
pex:       Packaging future: 743.5ms
pex:       Packaging Jinja2: 271.9ms
pex:       Packaging requests: 499.1ms
pex:       Packaging chartkick: 161.6ms
pex:       Packaging alembic: 317.7ms
pex:       Packaging Werkzeug: 350.5ms
pex:       Packaging Mako: 255.6ms
pex:       Packaging six: 185.3ms
pex:       Packaging docutils: 637.9ms
pex:       Packaging python-editor: 151.3ms
pex:       Packaging itsdangerous: 148.5ms
pex:       Packaging numpy: 139877.5ms
pex:       Packaging MarkupSafe: 260.5ms
pex:       Packaging lockfile: 1148.6ms
pex:       Packaging WTForms: 299.9ms
pex:       Packaging setuptools: 422.8ms
Saving PEX file to airflow3.pex
root@a27e175e913a:/# ./airflow3.pex --help
Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 326, in execute
  File ".bootstrap/_pex/pex.py", line 258, in _wrap_coverage
  File ".bootstrap/_pex/pex.py", line 290, in _wrap_profiling
  File ".bootstrap/_pex/pex.py", line 367, in _execute
  File ".bootstrap/_pex/pex.py", line 394, in execute_script
  File ".bootstrap/_pex/finders.py", line 284, in get_script_from_distributions
  File ".bootstrap/_pex/finders.py", line 268, in get_script_from_distribution
  File ".bootstrap/_pex/finders.py", line 241, in get_script_from_egg
  File ".bootstrap/pkg_resources/__init__.py", line 1490, in metadata_listdir
  File ".bootstrap/pkg_resources/__init__.py", line 1574, in _listdir
OSError: [Errno 2] No such file or directory: '/root/.pex/install/alembic-0.8.6-py2.7.egg.0e6d2814c6032451e79e81ba52a759d635d5e556/alembic-0.8.6-py2.7.egg/EGG-INFO/scripts'

Can anyone point me in the right direction? 

Thanks -- your help is much appreciated!

-M

John Sirois

Jul 21, 2016, 3:22:00 PM
to Matt Hite, pants-devel
I can't repro this with pex 1.1.14 / python 2.7.12.  Running `pex airflow -c airflow -o airflow.pex` (takes a _wicked_ long time to build pandas and numpy!) nets me:

$ head -1 airflow.pex
#!/usr/bin/env python2.7

$ ./airflow.pex 
[2016-07-21 13:19:27,413] {__init__.py:36} INFO - Using executor SequentialExecutor
usage: airflow [-h]
               {resetdb,render,variables,pause,version,initdb,test,unpause,run,list_tasks,backfill,list_dags,kerberos,worker,webserver,flower,scheduler,task_state,trigger_dag,serve_logs,clear,upgradedb}
               ...
airflow: error: too few arguments
$ ./airflow.pex version
[2016-07-21 13:21:47,712] {__init__.py:36} INFO - Using executor SequentialExecutor
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   v1.7.1.3

Can you first verify you're using the latest pex? If you are and it still fails, please open an issue with more details about your setup at https://github.com/pantsbuild/pex/issues/new .

Matt Hite

Jul 21, 2016, 4:59:56 PM
to John Sirois, pants-devel
Interesting. I'm doing this with python 2.7.3 and pex 1.1.14. My build environment is a docker container; I'm not sure if that is somehow contributing.

Here's how the container is set up:

$ more Dockerfile.wheezy
FROM debian:wheezy
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y build-essential ruby-dev python-dev wget ca-certificates libxml2-dev libxslt-dev libffi-dev
RUN pip install urllib3[secure]
RUN pip install requests
RUN pip install pex
ENV RUBYOPT -E utf-8
RUN gem install fpm
CMD "/bin/bash"

We're going to try this in just a plain wheezy VM (with the same bootstrapping specified in the Dockerfile) and see if we hit the issue.

-M

Kris Wilson

Jul 21, 2016, 5:17:44 PM
to Matt Hite, John Sirois, pants-devel
is /root/.pex writable in the docker container? for wheels (which must be "installed") and zip_safe=False deps, pex has to explode these somewhere on disk.

it defaults to $HOME/.pex, so if that target isn't writable by you, you can control this location at runtime using the PEX_ROOT env variable:

$ PEX_ROOT=/tmp/some_writable_dir ./the.pex
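The same redirection can be done from a launcher script. The following is a minimal Python 3 sketch, assuming only that the pex honors the PEX_ROOT variable as described above; `pex_env` and `run_pex` are hypothetical helper names, not part of pex.

```python
import os
import subprocess
import tempfile


def pex_env(pex_root, base_env=None):
    """Return a copy of the environment with PEX_ROOT pointing at pex_root."""
    env = dict(os.environ if base_env is None else base_env)
    env["PEX_ROOT"] = pex_root
    return env


def run_pex(pex_path, args, pex_root=None):
    """Run a .pex file with its unpack cache redirected to a writable directory."""
    if pex_root is None:
        # A fresh private directory sidesteps any stale or unwritable cache.
        pex_root = tempfile.mkdtemp(prefix="pex_root_")
    return subprocess.call([pex_path] + list(args), env=pex_env(pex_root))
```

Calling `run_pex("./the.pex", ["-h"])` would then be roughly equivalent to the one-line shell form above, with a throwaway cache directory.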

Matt Hite

Jul 21, 2016, 6:17:30 PM
to Kris Wilson, John Sirois, pants-devel
Yes, /root/.pex is writeable. I tried it in a normal wheezy VM with the same problem, unfortunately.

Regarding this error:

OSError: [Errno 2] No such file or directory: '/root/.pex/install/Flask_Login-0.2.11-py2.7.egg.bd7c1959e5c1f6a8c34c070cc93c67b55a45b9b9/Flask_Login-0.2.11-py2.7.egg/EGG-INFO/scripts'

I can see that '/root/.pex/install/Flask_Login-0.2.11-py2.7.egg.bd7c1959e5c1f6a8c34c070cc93c67b55a45b9b9/Flask_Login-0.2.11-py2.7.egg/EGG-INFO' exists on disk, but there is no 'scripts' inside it.

root@0bbba945bf70:~/.pex/install/Flask_Cache-0.13.1-py2.7.egg.57c4257163765d8d5a7be01c059b291c0880ed47/Flask_Cache-0.13.1-py2.7.egg/EGG-INFO# ls
PKG-INFO  SOURCES.txt  dependency_links.txt  not-zip-safe  requires.txt  top_level.txt

This is a real head scratcher for me. We also tried it on an Ubuntu 16.04 LTS VM and hit the exact same issue. We bootstrapped the environment in the same manner specified in the Dockerfile we were using for wheezy.

What is your build environment like? I'm excited you got it to work, but now I'm just trying to figure out what I'm doing wrong/different! :(

John Sirois

Jul 21, 2016, 6:23:30 PM
to Matt Hite, Kris Wilson, pants-devel
On Thu, Jul 21, 2016 at 4:17 PM, Matt Hite <li...@beatmixed.com> wrote:
What is your build environment like? I'm excited you got it to work, but now I'm just trying to figure out what I'm doing wrong/different! :(

My test was on my bare arch linux development machine, no containers.
The pex venv I use to run pex is as such:
$ pyenv activate pex
(pex) $ pip list
pex (1.1.14)
pip (8.1.2)
requests (2.10.0)
setuptools (24.3.0)
wheel (0.29.0)

Brian Wickman

Jul 21, 2016, 6:35:05 PM
to Matt Hite, Kris Wilson, John Sirois, pants-devel
Curious which setuptools is in your environment (or which setuptools built that egg).  There is no real published standard for eggs (yay, wheels!) so we rely upon sketchy-at-best reverse engineering that may not be consistent from version to version or across certain combinations of features.  'scripts' is probably being searched in all packages to find something matching PEX_SCRIPT, but pex is missing a failsafe (try/except) that is usually unnecessary with eggs built by most versions of setuptools.  A good reason to migrate to wheels :-)
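To illustrate the kind of failsafe being described, here is a small sketch of a directory listing that tolerates a missing EGG-INFO/scripts directory. It is not pex's or pkg_resources' actual code; `list_egg_scripts` is a hypothetical helper, and returning an empty list for "no scripts" is an assumption about the desired behavior.

```python
import errno
import os


def list_egg_scripts(egg_info_dir):
    """List names under EGG-INFO/scripts, or [] if that directory is absent."""
    scripts_dir = os.path.join(egg_info_dir, "scripts")
    try:
        return sorted(os.listdir(scripts_dir))
    except OSError as e:
        # Swallow only "does not exist" / "not a directory". Anything else,
        # such as a permission error, is re-raised so real problems stay visible.
        if e.errno in (errno.ENOENT, errno.ENOTDIR):
            return []
        raise
```

With a guard like this, an egg that ships no scripts is treated as having an empty script list instead of aborting the search across all distributions.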

Matt Hite

Jul 21, 2016, 6:55:59 PM
to John Sirois, Kris Wilson, pants-devel
Ah, my pip list looks to be the same except setuptools is older:

root@c229323db42f:/data# pip list
pex (1.1.14)
pip (8.1.2)
requests (2.10.0)
setuptools (20.10.1)
wheel (0.29.0)

It looks like this because 'pip install pex' downgrades (?) setuptools? 

root@c229323db42f:/data# pip install pex
Collecting pex
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading pex-1.1.14-py2.py3-none-any.whl (105kB)
    100% |################################| 112kB 4.1MB/s
Collecting setuptools<20.11,>=2.2 (from pex)
  Downloading setuptools-20.10.1-py2.py3-none-any.whl (509kB)
    100% |################################| 512kB 2.8MB/s
Installing collected packages: setuptools, pex
  Found existing installation: setuptools 24.3.0
    Uninstalling setuptools-24.3.0:
      Successfully uninstalled setuptools-24.3.0
Successfully installed pex-1.1.14 setuptools-20.10.1
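The downgrade follows from the requirement pip reports above, `setuptools<20.11,>=2.2`: 24.3.0 is outside that range, so pip replaces it with the newest release inside it. A simplified sketch of that range check is below. It handles only plain dotted integer versions of up to four components (so not something like 1.1.0b2) and is not how pip actually evaluates version specifiers; `parse_version` and `satisfies` are hypothetical names.

```python
def parse_version(text, width=4):
    """Turn '24.3.0' into (24, 3, 0, 0). Plain dotted integers only."""
    parts = tuple(int(p) for p in text.split("."))
    # Pad with zeros so that '2.2' and '2.2.0' compare as equal.
    return parts + (0,) * (width - len(parts))


def satisfies(version, lower, upper):
    """True if lower <= version < upper, compared component by component."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


print(satisfies("24.3.0", "2.2", "20.11"))   # the version that was uninstalled
print(satisfies("20.10.1", "2.2", "20.11"))  # the version pip installed instead
```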

When I installed pip prior to installing pex, I can see the recent version (the same one you have) installed:

root@c229323db42f:/data# wget https://bootstrap.pypa.io/get-pip.py -O - | python
--2016-07-21 22:30:53--  https://bootstrap.pypa.io/get-pip.py
Resolving bootstrap.pypa.io (bootstrap.pypa.io)... 151.101.40.175
Connecting to bootstrap.pypa.io (bootstrap.pypa.io)|151.101.40.175|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1524722 (1.5M) [text/x-python]
Saving to: `STDOUT'

100%[=================================================================>] 1,524,722   5.37M/s   in 0.3s

2016-07-21 22:30:54 (5.37 MB/s) - written to stdout [1524722/1524722]

Collecting pip
/tmp/tmp4jXUSg/pip.zip/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
/tmp/tmp4jXUSg/pip.zip/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  Downloading pip-8.1.2-py2.py3-none-any.whl (1.2MB)
    100% |################################| 1.2MB 1.1MB/s
Collecting setuptools
  Downloading setuptools-24.3.0-py2.py3-none-any.whl (442kB)
    100% |################################| 450kB 2.4MB/s
Collecting wheel
  Downloading wheel-0.29.0-py2.py3-none-any.whl (66kB)
    100% |################################| 71kB 6.4MB/s
Installing collected packages: pip, setuptools, wheel
Successfully installed pip-8.1.2 setuptools-24.3.0 wheel-0.29.0
/tmp/tmp4jXUSg/pip.zip/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
root@c229323db42f:/data# pip list
pip (8.1.2)
setuptools (24.3.0)
wheel (0.29.0)

Interestingly enough, if I then upgrade setuptools AFTER building pex itself (which as I mentioned downgrades setuptools), I can build airflow successfully.

I don't know enough about the inner workings to know whether or not this is a bug or an operator error? I'm guessing your virtualenv probably shielded you from my issue?

-M


Matt Hite

Jul 21, 2016, 6:57:25 PM
to Brian Wickman, Kris Wilson, John Sirois, pants-devel
This was a great hint. See my previous email -- pex seemed to downgrade my setuptools. If I do a pip install setuptools --upgrade after 'pip install pex' downgraded setuptools, I was able to use pex to build the airflow.pex file successfully and launch it.

Can't say I understand the how/what/why of all this, but we're on to something. ;)

John Sirois

Jul 21, 2016, 7:05:07 PM
to Matt Hite, Brian Wickman, Kris Wilson, pants-devel
On Thu, Jul 21, 2016 at 4:57 PM, Matt Hite <li...@beatmixed.com> wrote:
> This was a great hint. See my previous email -- pex seemed to downgrade my setuptools. If I do a pip install setuptools --upgrade after 'pip install pex' downgraded setuptools, I was able to use pex to build the airflow.pex file successfully and launch it.
>
> Can't say I understand the how/what/why of all this, but we're on to something. ;)

If you're game to send the PR I see no issues with a bump of the upper bound.  Although this does not solve the problem in general, it will solve cases like yours, and that's forward progress.

Brian Wickman

Jul 21, 2016, 7:07:01 PM
to John Sirois, Matt Hite, Kris Wilson, pants-devel
I don't think that changing setuptools is the fix.  Changing the code that assumes 'scripts' will be in every .egg is the right thing to do (unless that code is in setuptools, then, welp.)

John Sirois

Jul 21, 2016, 7:11:21 PM
to Brian Wickman, Matt Hite, Kris Wilson, pants-devel
Yeah - I think it's in setuptools:
$ git grep 'scripts' *.py
setup.py:    'console_scripts': [

Brian Wickman

Jul 21, 2016, 7:25:36 PM
to John Sirois, Matt Hite, Kris Wilson, pants-devel

John Sirois

Jul 21, 2016, 7:27:49 PM
to Brian Wickman, Matt Hite, Kris Wilson, pants-devel
Aha, learn me a grep, should have been `git grep scripts **/*.py`!
Matt - are you willing to take on the proper fix Brian points out?

Matt Hite

Jul 21, 2016, 7:47:28 PM
to John Sirois, Brian Wickman, Kris Wilson, pants-devel
I'm afraid I'm a bit green on pex itself and a bit fuzzy on the proposed fix. 

Should the following code be changed to not raise an exception? What should it return in that case?


Thanks!

Matt Hite

Jul 26, 2016, 5:05:32 PM
to John Sirois, Brian Wickman, Kris Wilson, pants-devel
Hi, folks. It's me again. :)

I thought the downgrade/upgrade of setuptools was going to serve as a suitable workaround for the issue I was hitting. I'm afraid it has poked its head up in a new and unusual way. The pex file I have created for airflow actually works for some users on a system, but for other users it crashes with the Flask_Cache 'scripts' issue I hit before:

Traceback (most recent call last):
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 326, in execute
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 258, in _wrap_coverage
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 290, in _wrap_profiling
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 367, in _execute
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 394, in execute_script
  File "/usr/local/bin/airflow/.bootstrap/_pex/finders.py", line 284, in get_script_from_distributions
  File "/usr/local/bin/airflow/.bootstrap/_pex/finders.py", line 268, in get_script_from_distribution
  File "/usr/local/bin/airflow/.bootstrap/_pex/finders.py", line 241, in get_script_from_egg
  File "/usr/local/bin/airflow/.bootstrap/pkg_resources/__init__.py", line 1602, in metadata_listdir
  File "/usr/local/bin/airflow/.bootstrap/pkg_resources/__init__.py", line 1686, in _listdir
OSError: [Errno 2] No such file or directory: '/root/.pex/install/Flask_Cache-0.13.1-py2.7.egg.57c4257163765d8d5a7be01c059b291c0880ed47/Flask_Cache-0.13.1-py2.7.egg/EGG-INFO/scripts'

I am totally stumped. We build a new VM, run 'airflow -h' with one user and it works, then we switch to another user, launch with 'airflow -h' again, and it crashes with aforementioned exception. The users themselves have no differences of note that I can find. :(

I don't even know where to begin troubleshooting at this point.

Matt Hite

Jul 26, 2016, 6:10:16 PM
to John Sirois, Brian Wickman, Kris Wilson, pants-devel
Well, here's another data point and I'm not sure what it means.

I can get rid of the error if I run "PEX_ROOT=/tmp/.pex airflow -h" and don't let the cache default to ~/.pex.

Any ideas on why this might be? 

Kris Wilson

Jul 26, 2016, 6:22:07 PM
to Matt Hite, John Sirois, Brian Wickman, pants-devel
> "We build a new VM, run 'airflow -h' with one user and it works, then we switch to another user, launch with 'airflow -h' again, and it crashes with aforementioned exception. The users themselves have no differences of note that I can find."
>
> "I can get rid of the error if I run "PEX_ROOT=/tmp/.pex airflow -h" and don't let the cache default to ~/.pex."

this all still smells like a permissions issue to me.

in the "two different users on the same VM" case, does `~/.pex` resolve to `/root/.pex` in both cases? if you manually remove the `~/.pex` dir before switching to the second user, does it still repro?
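One way to answer that question on each account is to print where the cache resolves and whether it is writable. The sketch below is in Python 3 and assumes the default described earlier in the thread ($HOME/.pex unless PEX_ROOT is set); `default_pex_root` and `describe_pex_root` are hypothetical helpers and do not call into pex itself.

```python
import os


def default_pex_root(env=None):
    """Return the cache directory a pex would use under the thread's stated default."""
    env = os.environ if env is None else env
    if env.get("PEX_ROOT"):
        return env["PEX_ROOT"]
    # Fall back to $HOME/.pex, as described earlier in the thread.
    home = env.get("HOME") or os.path.expanduser("~")
    return os.path.join(home, ".pex")


def describe_pex_root(env=None):
    """Report the resolved root, whether it exists, and whether it can be written."""
    root = default_pex_root(env)
    probe = root
    # The cache may not exist yet, so test the nearest existing ancestor.
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:
            break
        probe = parent
    return {
        "root": root,
        "exists": os.path.isdir(root),
        "writable": os.access(probe, os.W_OK),
    }
```

Running `print(describe_pex_root())` as each user (for example as mhite and then under `sudo -u root -i`) would show whether the two accounts end up sharing a cache directory.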


Matt Hite

Jul 26, 2016, 6:48:49 PM
to Kris Wilson, John Sirois, Brian Wickman, pants-devel
Yeah, that's what I was suspecting too but I can't seem to pinpoint any difference.

Here's an example of a user (on the same new fresh VM) that works:

mhite@m0000587:~$ rm -rf .pex
mhite@m0000587:~$ sudo rm -rf /tmp/.pex
mhite@m0000587:~$ env
TERM=xterm-256color
SHELL=/bin/bash
SSH_CLIENT=10.16.8.26 60547 22
SSH_TTY=/dev/pts/0
USER=mhite
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:
SSH_AUTH_SOCK=/tmp/ssh-yOcEZoqSdV/agent.4033
MAIL=/var/mail/mhite
PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
PWD=/home/mhite
LANG=en_US.UTF-8
SHLVL=1
HOME=/home/mhite
LOGNAME=mhite
SSH_CONNECTION=10.16.8.26 60547 10.16.129.26 22
LC_CTYPE=en_US.UTF-8
_=/usr/bin/env
mhite@m0000587:~$ airflow -h
[2016-07-26 15:45:49,049] {__init__.py:36} INFO - Using executor SequentialExecutor
usage: airflow [-h]

               {resetdb,render,variables,pause,version,initdb,test,unpause,run,list_tasks,backfill,list_dags,kerberos,worker,webserver,flower,scheduler,task_state,trigger_dag,serve_logs,clear,upgradedb}
               ...

positional arguments:
  {resetdb,render,variables,pause,version,initdb,test,unpause,run,list_tasks,backfill,list_dags,kerberos,worker,webserver,flower,scheduler,task_state,trigger_dag,serve_logs,clear,upgradedb}
                        sub-command help
    resetdb             Burn down and rebuild the metadata database
    render              Render a task instance's template(s)
    variables           List all variables
    pause               Pause a DAG
    version             Show the version
    initdb              Initialize the metadata database
    test                Test a task instance. This will run a task without
                        checking for dependencies or recording it's state in
                        the database.
    unpause             Pause a DAG
    run                 Run a single task instance
    list_tasks          List the tasks within a DAG
    backfill            Run subsections of a DAG for a specified date range
    list_dags           List all the DAGs
    kerberos            Start a kerberos ticket renewer
    worker              Start a Celery worker node
    webserver           Start a Airflow webserver instance
    flower              Start a Celery Flower
    scheduler           Start a scheduler scheduler instance
    task_state          Get the status of a task instance
    trigger_dag         Trigger a DAG run
    serve_logs          Serve logs generate by worker
    clear               Clear a set of task instance, as if they never ran
    upgradedb           Upgrade metadata database to latest version

optional arguments:
  -h, --help            show this help message and exit
mhite@m0000587:~$ which airflow
/usr/local/bin/airflow

Now, I switch to root, who doesn't work:

mhite@m0000587:~$ sudo -u root -i
ro...@m0000587.dynosde.enops.net:~# env
SHELL=/bin/bash
TERM=xterm-256color
USER=root
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:
SUDO_USER=mhite
SUDO_UID=5165
USERNAME=root
MAIL=/var/mail/root
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/root
LANG=en_US.UTF-8
SHLVL=1
SUDO_COMMAND=/bin/bash
HOME=/root
LOGNAME=root
LC_CTYPE=en_US.UTF-8
SUDO_GID=5000
_=/usr/bin/env
ro...@m0000587.dynosde.enops.net:~# rm -rf .pex
ro...@m0000587.dynosde.enops.net:~# rm -rf /tmp/.pex
ro...@m0000587.dynosde.enops.net:~# airflow -h
Traceback (most recent call last):
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 326, in execute
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 258, in _wrap_coverage
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 290, in _wrap_profiling
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 367, in _execute
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 394, in execute_script
  File "/usr/local/bin/airflow/.bootstrap/_pex/finders.py", line 284, in get_script_from_distributions
  File "/usr/local/bin/airflow/.bootstrap/_pex/finders.py", line 268, in get_script_from_distribution
  File "/usr/local/bin/airflow/.bootstrap/_pex/finders.py", line 241, in get_script_from_egg
  File "/usr/local/bin/airflow/.bootstrap/pkg_resources/__init__.py", line 1602, in metadata_listdir
  File "/usr/local/bin/airflow/.bootstrap/pkg_resources/__init__.py", line 1686, in _listdir
OSError: [Errno 2] No such file or directory: '/root/.pex/install/Flask_Cache-0.13.1-py2.7.egg.57c4257163765d8d5a7be01c059b291c0880ed47/Flask_Cache-0.13.1-py2.7.egg/EGG-INFO/scripts'

ro...@m0000587.dynosde.enops.net:~# ls -al .pex
total 20
drwxr-xr-x  3 root root  4096 Jul 26 15:47 .
drwx------  5 root root  4096 Jul 26 15:47 ..
drwxr-xr-x 52 root root 12288 Jul 26 15:47 install
ro...@m0000587.dynosde.enops.net:~# ls -al /root/.pex/install/Flask_Cache-0.13.1-py2.7.egg.57c4257163765d8d5a7be01c059b291c0880ed47/Flask_Cache-0.13.1-py2.7.egg/EGG-INFO/
total 32
drwxr-xr-x 2 root root 4096 Jul 26 15:47 .
drwxr-xr-x 5 root root 4096 Jul 26 15:47 ..
-rw-r--r-- 1 root root    1 Jul 26 15:47 dependency_links.txt
-rw-r--r-- 1 root root    1 Jul 26 15:47 not-zip-safe
-rw-r--r-- 1 root root 1032 Jul 26 15:47 PKG-INFO
-rw-r--r-- 1 root root    5 Jul 26 15:47 requires.txt
-rw-r--r-- 1 root root  718 Jul 26 15:47 SOURCES.txt
-rw-r--r-- 1 root root   12 Jul 26 15:47 top_level.txt


Matt Hite

Jul 26, 2016, 7:09:18 PM
to Kris Wilson, John Sirois, Brian Wickman, pants-devel
I've got another (presumably unrelated) problem I'm trying to figure out. When pex packages up an application, where does it put the console scripts from a package? When I run "airflow webserver" I can see it crashes: it looks for the gunicorn executable (via os.execvp) and fails to find it.

I.e.:

        os.execvp(
            'gunicorn', run_args
        )

mhite@m0000587:~$ airflow webserver
[2016-07-26 15:57:07,427] {__init__.py:36} INFO - Using executor SequentialExecutor
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/

[2016-07-26 15:57:08,002] {models.py:154} INFO - Filling up the DagBag from /home/mhite/airflow/dags
Running the Gunicorn server with 4 syncworkers on host 0.0.0.0 and port 8080 with a timeout of 120...
Traceback (most recent call last):
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 326, in execute
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 258, in _wrap_coverage
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 290, in _wrap_profiling
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 367, in _execute
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 398, in execute_script
  File "/usr/local/bin/airflow/.bootstrap/_pex/pex.py", line 412, in execute_content
  File "<exec_function>", line 4, in exec_function
  File "/home/mhite/.pex/install/airflow-1.7.1.3-py2-none-any.whl.5b7337894333d23579dff75e56e0b101e6a9dca0/airflow-1.7.1.3-py2-none-any.whl/airflow-1.7.1.3.dist-info/airflow-1.7.1.3.data/scripts/airflow", line 15, in <module>
  File "/home/mhite/.pex/install/airflow-1.7.1.3-py2-none-any.whl.5b7337894333d23579dff75e56e0b101e6a9dca0/airflow-1.7.1.3-py2-none-any.whl/airflow/bin/cli.py", line 423, in webserver
    'gunicorn', run_args
  File "/usr/lib/python2.7/os.py", line 344, in execvp
    _execvpe(file, args)
  File "/usr/lib/python2.7/os.py", line 380, in _execvpe
    func(fullname, *argrest)
OSError: [Errno 2] No such file or directory


mhite@m0000587:~$ find .pex -type f -name gunicorn
mhite@m0000587:~$ find .pex -type d -name gunicorn
.pex/install/gunicorn-19.3.0-py2.py3-none-any.whl.e4667f7805c68660959765693e3ef8310806b2c7/gunicorn-19.3.0-py2.py3-none-any.whl/gunicorn
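The bare "No such file or directory" comes from os.execvp failing to find `gunicorn` on $PATH. A small Python 3 sketch of checking for the executable first, so the failure names the missing program, is shown below. `find_program` and `exec_or_explain` are hypothetical helpers, not airflow code, and `shutil.which` is not available on the Python 2.7 used in this thread.

```python
import os
import shutil


def find_program(program, path=None):
    """Return the full path of program as found on PATH, or None if it is absent."""
    return shutil.which(program, path=path)


def exec_or_explain(program, args, path=None):
    """Replace the current process with program, or fail with a readable message."""
    resolved = find_program(program, path=path)
    if resolved is None:
        raise RuntimeError(
            "%r was not found on PATH; a pex does not install console "
            "scripts of its dependencies onto PATH" % program
        )
    # argv[0] is conventionally the program name as invoked.
    os.execv(resolved, [program] + list(args))
```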

Kris Wilson

Jul 26, 2016, 7:48:41 PM
to Matt Hite, John Sirois, Brian Wickman, pants-devel
other console scripts won't get written freely to disk like they normally would via a `pip install` - everything is contained in the pex. so using pex in cases like this can break packaging assumptions on part of the package authors unless they've explicitly tested with pex.

thus, the easiest way to make this work today (short of upstream contributions) would be to generate two pexes, each with their own entrypoint.

like so:

$ pex airflow -c airflow -o /usr/local/bin/airflow
$ pex airflow -c gunicorn -o /usr/local/bin/gunicorn

then when airflow exec's `gunicorn`, it'll exec the pex with the console script entrypoint to run `gunicorn`. it's somewhat suboptimal because it requires two copies of the pex just to get two on-$PATH entry points - but this could probably be fixed by implementing some kind of "infer the console script to run from sys.argv[0]" + "symlinking symbolic names to a pex" feature.
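As a rough illustration of the "infer the console script from sys.argv[0]" idea, the sketch below picks an entry point from the name the program was invoked as, the way a symlinked multi-call binary would. This is not an existing pex feature; the table contents and the helper names `pick_entry_point` and `load_entry_point` are hypothetical.

```python
import importlib
import os


def pick_entry_point(argv0, table):
    """Map the invoked name (e.g. a symlink's basename) to a 'module:function' spec."""
    name = os.path.basename(argv0)
    try:
        return table[name]
    except KeyError:
        raise SystemExit("no entry point registered for %r" % name)


def load_entry_point(spec):
    """Import 'package.module:function' and return the function object."""
    module_name, _, attr = spec.partition(":")
    return getattr(importlib.import_module(module_name), attr)


# Hypothetical table: two names on PATH symlinked to one file.
ENTRY_POINTS = {
    "tool_a": "pkg_a.cli:main",
    "tool_b": "pkg_b.cli:main",
}
```

A launcher built this way would call `load_entry_point(pick_entry_point(sys.argv[0], ENTRY_POINTS))()`, so a single file could serve several on-$PATH names.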


btw, mind filing a pantsbuild/pex issue for the problem you're having above?


Kris Wilson

Jul 26, 2016, 7:58:19 PM
to Matt Hite, John Sirois, Brian Wickman, pants-devel
the other option around "one pex w/ multiple on-$PATH entrypoints" would be to create small wrappers targeting each entrypoint separately using PEX_ENTRYPOINT.

something like:

#!/bin/sh
PEX_ENTRYPOINT='gunicorn.app.pasterapp:run' ./the.pex "$@"

etc

Kris Wilson

Jul 26, 2016, 8:14:14 PM
to Matt Hite, John Sirois, Brian Wickman, pants-devel
oops - that should actually be `PEX_MODULE` vs `PEX_ENTRYPOINT`.
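For anyone who prefers the wrapper in Python over sh, here is a minimal Python 3 sketch using the corrected PEX_MODULE variable. The module spec is the one from the message above; `pex_module_command` and `exec_pex_module` are hypothetical helper names.

```python
import os


def pex_module_command(pex_path, module_spec, args, base_env=None):
    """Build the argv and environment that run a pex with a different entry point."""
    env = dict(os.environ if base_env is None else base_env)
    env["PEX_MODULE"] = module_spec
    argv = [pex_path] + list(args)
    return argv, env


def exec_pex_module(pex_path, module_spec, args):
    """Replace the current process with the pex, entering at module_spec."""
    argv, env = pex_module_command(pex_path, module_spec, args)
    os.execve(pex_path, argv, env)
```

A wrapper script would end with something like `exec_pex_module("./the.pex", "gunicorn.app.pasterapp:run", sys.argv[1:])`, passing each argument through unchanged.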

Matt Hite

Jul 26, 2016, 8:44:48 PM
to Kris Wilson, John Sirois, Brian Wickman, pants-devel
This worked great, thank you! 

Benjamin Yolken

Nov 28, 2016, 6:01:07 PM
to Pants Developers, john....@gmail.com, li...@beatmixed.com, kwi...@twitter.com
I'm running into a very similar issue when trying to pex-ify Airbnb Superset (https://github.com/airbnb/superset). Are there any plans to fix the handling of eggs without scripts? If not, should I put together a pull request? Thanks!

-Ben

Kris Wilson

Nov 29, 2016, 2:08:33 PM
to Benjamin Yolken, Pants Developers, John Sirois, Matt Hite
A pull request (+ probably a github issue covering the crux of the problem) would be awesome if you're up for it.

Benjamin Yolken

Dec 7, 2016, 5:42:57 PM
to Kris Wilson, Pants Developers, John Sirois, Matt Hite
Created a pull request with a fix: https://github.com/pantsbuild/pex/pull/328

Thanks,

-Ben

