Running a Docker container within a DAG


francesc...@weightwatchers.com

Jul 1, 2018, 3:09:33 PM7/1/18
to cloud-composer-discuss
Hi,
I am looking for best practices around running a Docker container as a task within my DAG.
I am aware that there is a DockerOperator in the Airflow codebase, but using it would mean that the container is executed on the Airflow worker itself.
My main concern is the container consuming the worker's resources.
Is it better to create a custom operator, or to use a Python or Bash operator to spin up a GCE instance running the container? Is that the recommended way?

Francesco

Trevor Edwards

Jul 2, 2018, 1:05:27 PM7/2/18
to cloud-composer-discuss
Hi Francesco,

We're working on a better experience for running Docker containers. In Composer currently, you can use the DockerOperator, but that has the resource issues you mentioned. You may also be able to write a PythonOperator that spawns Kubernetes Jobs.
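For illustration, a minimal sketch of what such a PythonOperator callable could submit. The job and image names below are placeholders, not anything from this thread; the actual submission call via the official `kubernetes` client is shown only in comments, since it needs in-cluster credentials:

```python
def build_job_manifest(job_name, image, command):
    """Build a batch/v1 Job manifest as a plain dict.

    A PythonOperator callable could pass this to the Kubernetes API so the
    container runs in the cluster rather than on the Airflow worker.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name},
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": job_name,
                            "image": image,
                            "command": command,
                        }
                    ],
                    # Jobs require Never or OnFailure; Never keeps retries
                    # under the Job's backoffLimit instead of the kubelet's.
                    "restartPolicy": "Never",
                }
            },
            "backoffLimit": 2,
        },
    }


def run_job_callable(**context):
    # Inside a real PythonOperator callable, submission would look like:
    #   from kubernetes import client, config
    #   config.load_incluster_config()
    #   client.BatchV1Api().create_namespaced_job(
    #       namespace="default", body=manifest)
    manifest = build_job_manifest(
        "my-task",                             # placeholder job name
        "gcr.io/my-project/my-task-image",     # placeholder image
        ["python", "main.py"],                 # placeholder command
    )
    return manifest
```

The manifest-building step is kept separate from the API call so the callable stays easy to test without a cluster.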

In the future, we plan to enable you to either (1) spawn a Kubernetes pod in your Composer cluster, by including the default credentials in the Airflow workers and backporting the KubernetesPodOperator, or (2) spawn the pod in a separate cluster via a GKEPodOperator: https://github.com/apache/incubator-airflow/pull/3532.

We are also working on making KubernetesExecutor available as an alpha feature, which would run every worker task inside a Docker container.

francesc...@weightwatchers.com

Jul 3, 2018, 11:22:55 AM7/3/18
to cloud-composer-discuss
Hi Trevor,
thanks for the reply. I will look forward to those features. In the meantime, I went with the more conservative approach and used the beta `compute create-with-container` functionality to spin up the container on GCE. Unfortunately, this seems to be possible only through a BashOperator, as the command does not appear to be available in the Compute REST API.
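For reference, the `gcloud` invocation such a BashOperator's `bash_command` might wrap looks roughly like the sketch below; the instance name, zone, and image are placeholders, and the exact flags of the beta command may change:

```shell
# Sketch only: launch a GCE VM that runs a single container
# (placeholder names; requires an authenticated gcloud with the beta component).
gcloud beta compute instances create-with-container my-task-vm \
    --zone=us-central1-a \
    --container-image=gcr.io/my-project/my-task-image
```

Note the VM is not deleted when the container exits, so a real DAG would also need a teardown task (e.g. `gcloud compute instances delete`).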