How to use gcloud compute ssh from a BashOperator?

Rick Otten

Sep 14, 2018, 9:49:44 AM
to cloud-composer-discuss
When I try to use "gcloud compute ssh" from a BashOperator in a task, I get "no ssh keys available".

task1 = BashOperator(
    task_id='mytask',
    bash_command='gcloud compute ssh some_user@my_vm --command=something.bash',
    dag=dag,
)

I can't find documentation that shows, if I use "--ssh-key-file", where I should put that file so Cloud Composer can find it.

The documentation for how to use gcloud commands from Composer is rather scant. Other than hinting that you should use the BashOperator, there isn't much else.

I'm not sure how to generate a gcloud ssh key.  Is that different from a regular ssh key?  When I run 'gcloud compute ssh' from my laptop (after running init), it doesn't seem to have any special requirements about keys and takes me right over to my vm.  Should I run 'gcloud init' in the composer instance somehow and then it will magically work?

Why wouldn't I just use plain old 'ssh -i' from the bash operator instead of gcloud's ssh?  

My understanding from reading comments is that we can't use the ssh operator/ssh hook if we require key based login instead of password based login.


Rick Otten

Sep 14, 2018, 1:16:52 PM
to cloud-composer-discuss
Digging deeper into the logs, I can see that after it fails to find an ssh key it generates one, and then waits about 3 minutes for it to propagate. I just hadn't read through every log entry. It was easier/faster for me to read the logs by opening the logfile in the Google bucket than it was to use the Stackdriver Logging interface.

Then it was failing because it didn't have a default zone set, so I configured --zone=myzone and got further.
It seems to fail to connect the first time gcloud compute ssh is used, even with the wait, and even when I run the gcloud command from the command line on my laptop; the second time I run the job it works.

I answered my own question about whether to just use plain ssh by observing what is reported back when I ran "--dry-run", which shows it is actually generating and running plain old ssh under the covers.
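
For reference, that check is just the same invocation with the flag added (the instance and zone names here are the placeholders from my earlier example):

gcloud compute ssh some_user@my_vm --zone=myzone --dry-run

Instead of connecting, it prints the ssh command it would have run.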

Because I'm using a custom port for ssh, I have to pass that as a custom ssh option, which breaks the "--command" argument.  So I have to add the command itself as a custom option rather than via the "--command" argument.

And then, if I indent the arguments in a large text block, the BashOperator complains that the bash_command is too long. So I get rid of the indents and line-ending escapes, which ends up with a BashOperator that looks like this and _almost_ works:


sshCommand = "gcloud compute ssh some_user@my_vm --zone=myzone -- -p 12345"

task1 = BashOperator(
    task_id='snapshot_all_disks',
    bash_command=sshCommand + " something.bash",
    dag=dag,
)

Rick Otten

Sep 14, 2018, 5:06:58 PM
to cloud-composer-discuss


On Friday, September 14, 2018 at 1:16:52 PM UTC-4, Rick Otten wrote:

And then, if I indent the arguments in a large text block, the BashOperator complains that the bash_command is too long. So I get rid of the indents and line-ending escapes, which ends up with a BashOperator that looks like this and _almost_ works:



To follow up on what I mean by "almost": sometimes the connection fails, but most of the time it works. I'm not sure why that is yet.

In a DAG with a bunch of ssh connections like this, I have to run it repeatedly: once for each ssh connection to initialize itself and then fail. After a connection has failed the first time, it will work, and we can move on to the next one.

At the moment I think it is good enough, and hopefully my research and discoveries today will help someone else get theirs working a little faster than it took me.


Trevor Edwards

Sep 17, 2018, 1:50:01 PM
to cloud-composer-discuss
I don't recommend generating new SSH keys with each task execution; you are likely to run out of GCE metadata space this way (see https://cloud.google.com/compute/docs/storing-retrieving-metadata).

Composer mounts a few folders from GCS through which you could mount a static SSH key: https://cloud.google.com/composer/docs/concepts/cloud-storage. I'd recommend putting it under "dags" or "plugins", as those have the most reliable sync method.

Rick Otten

Sep 17, 2018, 2:56:11 PM
to cloud-composer-discuss

On Monday, September 17, 2018 at 1:50:01 PM UTC-4, Trevor Edwards wrote:
I don't recommend generating new SSH keys with each task execution; you are likely to run out of GCE metadata space this way (see https://cloud.google.com/compute/docs/storing-retrieving-metadata).

Composer mounts a few folders from GCS through which you could mount a static SSH key: https://cloud.google.com/composer/docs/concepts/cloud-storage. I'd recommend putting it under "dags" or "plugins", as those have the most reliable sync method.


It is not actually generating new ssh keys with each task execution.  It generated them once, the first time it ran. After that it is finding them somewhere.  I'm not sure where it stored them after generating them.  I don't see anything obvious in the storage bucket.

It is generating new keys the first time each new task runs, though. So there must be a few of them around somewhere. If they are just ssh keys, I'll see if I can generate some, drop them in the plugins folder, and use /home/airflow/gcs/plugins for the path to them. I'll delete the keys it auto-generated if I can find them.
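
Something along these lines is what I have in mind (the key file names and bucket name are just placeholders):

# generate a keypair locally, with no passphrase
ssh-keygen -t rsa -f my_gcloud_key -C some_user -N ''

# copy both halves into the Composer environment's bucket
gsutil cp my_gcloud_key my_gcloud_key.pub gs://my-composer-bucket/plugins/

# on the workers they should then show up under
#   /home/airflow/gcs/plugins/my_gcloud_key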

 

Rick Otten

Sep 18, 2018, 1:59:47 PM
to cloud-composer-discuss
I found the keys it was generating in the metadata and removed them.
I hand-generated a public/private keypair, dropped them in the plugins folder, and then used the --ssh-key-file option.
Now I get this error, and it isn't obvious how to fix it. Should I run a "chmod" from the DAG before I try to invoke ssh? There is nothing obvious in the bucket setup that will let me do a chmod 400 on the keyfiles.

[2018-09-18 17:49:12,312] {base_task_runner.py:98} INFO - Subtask: [2018-09-18 17:49:12,312] {bash_operator.py:101} INFO - @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2018-09-18 17:49:12,314] {base_task_runner.py:98} INFO - Subtask: [2018-09-18 17:49:12,313] {bash_operator.py:101} INFO - @         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
[2018-09-18 17:49:12,314] {base_task_runner.py:98} INFO - Subtask: [2018-09-18 17:49:12,314] {bash_operator.py:101} INFO - @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2018-09-18 17:49:12,315] {base_task_runner.py:98} INFO - Subtask: [2018-09-18 17:49:12,315] {bash_operator.py:101} INFO - Permissions 0644 for '/home/airflow/gcs/plugins/my_gcloud_key' are too open.
[2018-09-18 17:49:12,316] {base_task_runner.py:98} INFO - Subtask: [2018-09-18 17:49:12,316] {bash_operator.py:101} INFO - It is recommended that your private key files are NOT accessible by others.
[2018-09-18 17:49:12,316] {base_task_runner.py:98} INFO - Subtask: [2018-09-18 17:49:12,316] {bash_operator.py:101} INFO - This private key will be ignored.
[2018-09-18 17:49:12,317] {base_task_runner.py:98} INFO - Subtask: [2018-09-18 17:49:12,317] {bash_operator.py:101} INFO - key_load_private_type: bad permissions

Rick Otten

Sep 18, 2018, 4:25:09 PM
to cloud-composer-discuss

On Tuesday, September 18, 2018 at 1:59:47 PM UTC-4, Rick Otten wrote:

Now I get this error, and it isn't obvious how to fix it. Should I run a "chmod" from the DAG before I try to invoke ssh? There is nothing obvious in the bucket setup that will let me do a chmod 400 on the keyfiles.

I hacked in a chmod, and it worked.  In short it looks like this now:

 
sshCommand = "chmod 400 /home/airflow/gcs/plugins/my_gcloud_key; \
gcloud compute ssh some_user@my_vm \
--zone=us-central1-f --ssh-key-file=/home/airflow/gcs/plugins/my_gcloud_key \
-- -p 22333"

snapshots = BashOperator(
    task_id='mytask',
    bash_command=sshCommand + ' "some_script.bash"',
    dag=dag,
)

It no longer generates keys and loads them into metadata every time it hits a new task & worker combination. I decided to run chmod every time the job runs because I don't know how likely the permissions are to revert between runs.
