For our production Azkaban environment, I'd like every call to python in an Azkaban job to use a Python virtual environment. This creates a problem because we have many development Azkaban jobs that do not use a virtual environment, and we would like to deploy them to prod as-is.
So I would prefer NOT to do something like:
command=source path/to/venv/bin/activate && python myjob.py
type=run_over_ssh_on_different_server   # just a sample job type
in the .job files, since we already have many, many .job files running various kinds of myjob.py scripts in development. We also cannot create a custom job type, because many of these jobs already use custom job types.
Instead, I'd like to start the Azkaban executor server from within a virtual environment. In our startup shell scripts for the server, this would look something like:
#!/bin/bash
# Shell script that starts the azkaban execution server on startup
# Executed as root
cd /opt/azkaban/azkaban-exec-server/build/install/azkaban-exec-server
source path/to/venv/bin/activate
. bin/azkaban-executor-start.sh   # should inherit the virtual environment?
deactivate                        # should not affect the child process
My understanding is that the executor server will be started as a child process that inherits the Python virtual environment's environment variables.
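For context on why I think this works: activate is just a shell script that exports a couple of variables (it prepends the venv's bin directory to PATH and sets VIRTUAL_ENV), so this boils down to ordinary environment inheritance. Here is a minimal sketch of that reasoning that runs on any box, with no Azkaban involved; the backgrounded subshell is purely a stand-in for the executor process:

#!/bin/bash
# Stand-alone inheritance check (no Azkaban involved).
source path/to/venv/bin/activate                    # exports PATH (venv/bin prepended) and VIRTUAL_ENV
( sleep 1; which python; echo "$VIRTUAL_ENV" ) &    # child forked while the venv is active
deactivate                                          # undoes the exports in this shell only
wait                                                # child still prints the venv python and VIRTUAL_ENV

The child gets a copy of the environment at fork time, so running deactivate in the parent afterwards does not reach into it.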
I'm asking here whether you think this will work, because it is very expensive for us to test. It would require a pull request and a review, then we would need to bounce the server, wait roughly 30 minutes to an hour to verify it comes back up, and finally run a test job in Azkaban to see whether the virtualenv is working.
So, will the executor server pick up the changes that the call to activate makes? I have seen child processes end up with masked environment variables in the past.
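For reference, once the executor is bounced, I'm planning to verify with a throwaway smoke-test job along these lines (just the built-in command job type printing which interpreter the job resolves):

type=command
command=which python

If that prints a path inside the venv, the activate-before-start approach worked.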