Accelerated Mode Unexpected Disconnect


Christopher O'Connell

Feb 21, 2014, 9:47:11 PM
to ansible...@googlegroups.com
We're having a problem when accelerated mode unexpectedly disconnects for whatever reason. This happens often when running test playbooks from poor connections, e.g. in-flight wifi. It can also happen when running a big playbook and you just hit the dropped-connection lottery. In either case, the hosts fail out saying "unable to connect to port 5099". In future runs, it will continue to refuse the accelerated port until the hanging python daemon has been killed manually.

Thus far the only way I've found to kill the daemon is to killall python (I haven't discovered a way of definitively identifying its PID). I can run something like

ansible GROUP -m command -a "killall python"

but this has the unwanted side effect of killing all other python processes on the system.
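
One way to narrow the kill down to the daemon itself might be to look up which process is listening on the accelerated port. The following is a minimal sketch, not a tested fix: it assumes Linux targets with /proc mounted, that the daemon is still listening on TCP port 5099, and that the script runs with enough privileges to read other processes' file descriptors. The function names and the file name find_accel_pid.py are made up for illustration.

```python
import os

LISTEN = "0A"  # state code for a listening socket in /proc/net/tcp


def listening_inodes(proc_net_tcp_text, port):
    """Return the socket inodes of listeners on `port`.

    `proc_net_tcp_text` is the raw content of /proc/net/tcp or /proc/net/tcp6.
    """
    wanted = "%04X" % port  # ports are stored as 4-digit uppercase hex
    inodes = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10 or ":" not in fields[1]:
            continue
        local_port = fields[1].rsplit(":", 1)[1].upper()
        if fields[3] == LISTEN and local_port == wanted:
            inodes.add(fields[9])  # tenth column is the socket inode
    return inodes


def pids_holding(inodes):
    """Return the PIDs that have any of the given socket inodes open."""
    targets = set("socket:[%s]" % inode for inode in inodes)
    pids = set()
    if not targets:
        return pids
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            for fd in os.listdir(fd_dir):
                if os.readlink(os.path.join(fd_dir, fd)) in targets:
                    pids.add(int(entry))
                    break
        except OSError:  # process exited, or we lack permission to look
            continue
    return pids


def pids_listening_on(port):
    """Combine the IPv4 and IPv6 tables and map listeners to PIDs."""
    inodes = set()
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(path) as handle:
                inodes |= listening_inodes(handle.read(), port)
        except (IOError, OSError):  # table not available on this host
            continue
    return pids_holding(inodes)


if __name__ == "__main__":
    # 5099 is the port from the error message above
    for pid in sorted(pids_listening_on(5099)):
        print(pid)
```

Saved as find_accel_pid.py, this could be pushed to the hosts with the script module, e.g. ansible GROUP -m script -a "find_accel_pid.py", and the printed PIDs killed individually instead of every python process. Where psmisc is installed, fuser -k 5099/tcp should have a similar effect in one step.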

Advice?

All the best,

~ Christopher

Michael DeHaan

Feb 24, 2014, 2:24:29 PM
to ansible...@googlegroups.com
"This happens often when running test playbooks from poor connections, e.g. in-flight wifi."

I'm pretty sure almost nothing works over in-flight WiFi :)

"In future runs, it will continue to refuse the accelerated port until the hanging python daemon has been killed manually."

This part seems more interesting and I haven't seen this.   If you can find a way to replicate the problem without an airplane that would be helpful and we could take a look :)







--
You received this message because you are subscribed to the Google Groups "Ansible Project" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ansible-proje...@googlegroups.com.
To post to this group, send email to ansible...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ansible-project/67a69b65-ea3d-47c9-97f6-9109621bf1d2%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Peter Gehres

Feb 24, 2014, 3:32:21 PM
to ansible...@googlegroups.com


> "In future runs, it will continue to refuse the accelerated port until the hanging python daemon has been killed manually."
>
> This part seems more interesting and I haven't seen this.   If you can find a way to replicate the problem without an airplane that would be helpful and we could take a look :)

We used to see this as well on occasion before we ditched accelerate for ssh_alt.  Despite our best efforts we were unable to find a way to reproduce it reliably. The only solution was logging onto the box and killing python (or waiting for the timeout).  I believe it had something to do with not having the key from the previous accelerate session while the previous session was still waiting for tasks.

 
--
Peter Gehres
Site Reliability Engineer | AppDynamics, Inc.

Michael DeHaan

Feb 24, 2014, 3:43:01 PM
to ansible...@googlegroups.com
Yes, if you are in a rekey situation that could very well be it.

We have an open feature idea for making each new connection attempt "add" a new key, which would resolve that one particular issue better.

Pipelining (ssh_alt is no longer called that, BTW; it is now enabled with pipelining=True in ansible.cfg) is going to be better in most cases.

I'm about to adapt the docs on accelerate to strongly emphasize this.
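
For reference, a minimal ansible.cfg fragment for this might look like the following. It assumes the setting sits in the [ssh_connection] section and that requiretty is disabled in sudoers on the managed hosts when sudo is in use, since pipelining does not work with it enabled.

```ini
# ansible.cfg (sketch; section placement is an assumption to check against your version's docs)
[ssh_connection]
pipelining = True
```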




