I currently have a setup where one machine runs the airflow webserver and airflow scheduler, and three other machines run airflow worker, all on the same default queue. Each worker has Docker installed and can run containers locally. The problem I have is that building a Docker image works, but the image is only accessible to the worker that built it. So when another worker tries to run that image, it can't find it because it lives on another machine. Is there a way to say, for all tasks in this DAG, run on the same Celery worker?
The DAG:

    import datetime as dt
    import os

    import airflow
    import dock  # custom helper module wrapping the Docker client

    dag = airflow.DAG(
        'local_docker',
        default_args=default_args,
        schedule_interval=dt.timedelta(minutes=15))

    def build_docker_image(**kwargs):
        docker = dock.DockerObject()
        image = docker.build_image(
            docker_dir=os.path.dirname(os.path.abspath(__file__)))
        return image

    def run_docker_image(**kwargs):
        ti = kwargs['ti']
        image = ti.xcom_pull(task_ids='build_docker_image')
        docker = dock.DockerObject()
        docker.run_image(image)

    t1 = airflow.operators.PythonOperator(
        task_id='build_docker_image',
        dag=dag,
        provide_context=True,
        python_callable=build_docker_image)

    t2 = airflow.operators.PythonOperator(
        task_id='run_docker_image',
        dag=dag,
        provide_context=True,
        python_callable=run_docker_image)

    t2.set_upstream(t1)
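For context, one way this is commonly addressed (not from the original post) is Airflow's `queue` argument on operators: each worker is started listening on its own dedicated queue (e.g. `airflow worker -q docker_worker_1`), and every task in the DAG is assigned to that queue, so they all land on the same machine. A minimal sketch, assuming a hypothetical queue name `docker_worker_1` and the same `dock` helper module as above; this is deployment configuration and only runs inside an Airflow environment:

```python
import datetime as dt

import airflow

dag = airflow.DAG(
    'local_docker',
    default_args=default_args,
    schedule_interval=dt.timedelta(minutes=15))

# Both tasks declare the same queue, so only the worker started with
# `airflow worker -q docker_worker_1` will pick them up. The queue name
# is hypothetical; use one queue per worker machine.
t1 = airflow.operators.PythonOperator(
    task_id='build_docker_image',
    dag=dag,
    provide_context=True,
    python_callable=build_docker_image,
    queue='docker_worker_1')

t2 = airflow.operators.PythonOperator(
    task_id='run_docker_image',
    dag=dag,
    provide_context=True,
    python_callable=run_docker_image,
    queue='docker_worker_1')

t2.set_upstream(t1)
```

The trade-off is that the queue is fixed at DAG-definition time, so this pins the tasks to one specific worker rather than "whichever worker built the image". The alternative that avoids pinning entirely is pushing the built image to a shared Docker registry so any worker can pull and run it.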
armando