What is the best way to run a non-standard Linux executable under control of Cloud Composer?


Norbert Kremer

Jul 26, 2018, 2:44:36 PM7/26/18
to cloud-composer-discuss
I need to decompress some large files that are compressed with the RAR algorithm.

I would like to trigger a Cloud Composer task as each RAR file is copied to a bucket. (I have a Cloud Function listening on the bucket that triggers a DAG; this part works fine already.)

I tried BashOperator. This fails because neither the unrar nor the unar program is available on the CC k8s nodes and, AFAICT, they can't be installed there. In any event, I would like to use another operator to run this task on a resource outside of the CC k8s cluster.

I'm open to anything that works, whether that's SSHHook, KubernetesExecutor, or anything else.

So far I have tried SSHHook. But when I add the import for SSHExecuteOperator to my DAG, like this:
from airflow.contrib.operators.ssh_execute_operator import SSHExecuteOperator
I get this in the Airflow dashboard:
Broken DAG: [/home/airflow/gcs/dags/gcs_response_dag.py] No module named ssh_execute_operator
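
For what it's worth, Airflow 1.10 replaced SSHExecuteOperator with SSHOperator, so if my environment is running 1.10 I'm guessing (untested) the import would instead need to be:
from airflow.contrib.operators.ssh_operator import SSHOperator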

What am I missing here? What's the best way forward? I'm looking for the simplest solution that works; nothing fancy is needed.
thanks

Tim Swast

Jul 26, 2018, 6:16:57 PM7/26/18
to Norbert Kremer, cloud-composer-discuss
With Composer environments version 1.0.0 and later you can use the KubernetesPodOperator to run tasks in a container. We have an example at https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/composer/workflows/kubernetes_pod_operator.py

Docs coming soon.
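
Until then, here's a rough sketch of what such a task might look like. The image, names, and command below are placeholders (they are not from the sample above), and it assumes a DAG object named dag is in scope:

from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

unrar_task = KubernetesPodOperator(
    task_id='unrar-file',
    name='unrar-file',
    namespace='default',
    # Any image that ships an unrar tool would do; this image name is a placeholder.
    image='gcr.io/YOUR_PROJECT/unrar:latest',
    cmds=['bash', '-c'],
    arguments=['unrar x /work/archive.rar /work/out/'],
    dag=dag,
)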

--
  •  Tim Swast
  •  Software Friendliness Engineer
  •  Google Cloud Developer Relations
  •  Seattle, WA, USA

Norbert Kremer

Jul 31, 2018, 1:32:35 PM7/31/18
to cloud-composer-discuss
While I'm excited about using Kubernetes in the future, how useful is kubernetes_pod_operator in the absence of documentation? We're working against deadlines to deliver code this week, and this seems more complex than what I need, both for the devs and for training our ops team to run it in production.

Isn't there a simple way to run a Cloud Composer task on a GCE VM using SSHExecuteOperator?  I've tried this and I'm unable to get it to work so far.

thanks, 

Feng Lu

Aug 1, 2018, 12:35:23 AM8/1/18
to norbert....@gmail.com, cloud-composer-discuss
Have you tried wrapping the "gcloud compute ssh" command in a BashOperator or PythonOperator?
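
For example, something along these lines (the VM name, zone, and command are placeholders; it assumes the environment's service account can SSH to the VM and that a DAG object named dag is in scope):

from airflow.operators.bash_operator import BashOperator

ssh_unrar = BashOperator(
    task_id='ssh_unrar',
    # Runs the unrar command on the remote VM over SSH.
    bash_command=(
        'gcloud compute ssh YOUR-VM --zone YOUR-ZONE '
        '--command "unrar x /data/archive.rar /data/out/"'
    ),
    dag=dag,
)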


Feng Lu

Aug 1, 2018, 12:40:14 AM8/1/18
to norbert....@gmail.com, cloud-composer-discuss
On second thought, you could run "sudo apt-get install unrar-free && unrar YOUR-FILE" in a BashOperator, which installs the executable and unrars the file.
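
As a sketch (untested; YOUR-FILE is a placeholder, sudo may not be permitted inside the worker container, and it assumes a DAG object named dag is in scope):

from airflow.operators.bash_operator import BashOperator

install_and_unrar = BashOperator(
    task_id='install_and_unrar',
    # -y keeps apt-get non-interactive; the rest mirrors the command above.
    bash_command='sudo apt-get install -y unrar-free && unrar YOUR-FILE',
    dag=dag,
)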

Ashish Gauli

Aug 16, 2018, 12:16:29 AM8/16/18
to cloud-composer-discuss
Were you able to find a solution? I'm also stuck on the same issue.

Tim Swast

Aug 16, 2018, 2:41:11 PM8/16/18
to Norbert Kremer, cloud-composer-discuss