parallel execution


Istdb

Jul 16, 2013, 10:38:53 AM
to ansible...@googlegroups.com
Dear all!

I'm in the process of exploring ansible and already found it pretty cool.

There is one thing, however, which I could not figure out: parallel execution.

I have a simple play:

- hosts: all
#  serial: 5
  tasks:
  - name: parallel
    command: sleep 10

Which I try to run with 'ansible-playbook -f 20 -i infra ./paralell-test.yml'

It seems that the commands are executed sequentially on all hosts in the inventory, regardless of whether I set the -f or the serial: parameter.

Any clues how to enable parallel task execution?

I'm using 1.2.1.

Thanks!

Michael DeHaan

Jul 16, 2013, 11:00:29 AM
to ansible...@googlegroups.com
--forks 20 is indeed parallel.

You should see all of the hosts returning at the same time.

If you have serial set it will say "do all machines in this batch at the same time, then move on", so if serial is set to 5, and you have --forks 20, it will still only do 5 at a time.

serial says "complete the playbook entirely on these hosts before moving on to the next few".

Thus, serial: 1 will make things non-parallel.  The default is non-serial.
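As a concrete illustration (a sketch built on the playbook from the original post; the playbook name below is just a placeholder):

- hosts: all
  serial: 5     # finish the whole play on 5 hosts before starting the next 5
  tasks:
  - name: parallel
    command: sleep 10

Run with 'ansible-playbook -f 20 -i infra playbook.yml': even with 20 forks available, serial: 5 caps each batch at 5 hosts. Remove the serial line and the same command runs the task on up to 20 hosts at once.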







Michael DeHaan

Jul 16, 2013, 11:00:40 AM
to ansible...@googlegroups.com
BTW, the default value for --forks is 5.


Istdb

Jul 16, 2013, 11:37:07 AM
to ansible...@googlegroups.com
Hi,

Thanks for the answers.

Still, I don't get something right, because when I execute the playbook I get (without -f option and without serial var):

7 hosts, command sleep 10: real    1m22.282s
7 hosts, command sleep 1: real    0m25.313s

Based on the time difference it seems that the execution is sequential.

What could I be missing?

Thanks!

user@host:/work/mysql-ansible/mysql-cluster$ time ansible-playbook -i infra ./paralell-test.yml

PLAY [all] ********************************************************************

GATHERING FACTS ***************************************************************
ok: [ec2-54-216-187-11.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-216-223-17.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-228-130-185.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-228-154-89.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-216-199-59.eu-west-1.compute.amazonaws.com]
ok: [ec2-79-125-51-180.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-228-46-230.eu-west-1.compute.amazonaws.com]

TASK: [parallel] **************************************************************
changed: [ec2-54-216-187-11.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-228-130-185.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-216-223-17.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-228-154-89.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-216-199-59.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-228-46-230.eu-west-1.compute.amazonaws.com]
changed: [ec2-79-125-51-180.eu-west-1.compute.amazonaws.com]

PLAY RECAP ********************************************************************
ec2-54-216-187-11.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-216-199-59.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-216-223-17.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-228-130-185.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-228-154-89.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-228-46-230.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-79-125-51-180.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  


real    0m25.313s
user    0m0.680s
sys    0m0.532s

user@host:/work/mysql-ansible/mysql-cluster$ time ansible-playbook -i infra ./paralell-test.yml

PLAY [all] ********************************************************************

GATHERING FACTS ***************************************************************
ok: [ec2-54-216-187-11.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-216-223-17.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-216-199-59.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-228-130-185.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-228-154-89.eu-west-1.compute.amazonaws.com]
ok: [ec2-54-228-46-230.eu-west-1.compute.amazonaws.com]
ok: [ec2-79-125-51-180.eu-west-1.compute.amazonaws.com]

TASK: [parallel] **************************************************************
changed: [ec2-54-216-187-11.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-216-223-17.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-228-130-185.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-216-199-59.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-228-154-89.eu-west-1.compute.amazonaws.com]
changed: [ec2-54-228-46-230.eu-west-1.compute.amazonaws.com]
changed: [ec2-79-125-51-180.eu-west-1.compute.amazonaws.com]

PLAY RECAP ********************************************************************
ec2-54-216-187-11.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-216-199-59.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-216-223-17.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-228-130-185.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-228-154-89.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-54-228-46-230.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  
ec2-79-125-51-180.eu-west-1.compute.amazonaws.com : ok=2    changed=1    unreachable=0    failed=0  


real    1m22.282s
user    0m0.568s
sys    0m0.476s

Michael DeHaan

Jul 16, 2013, 1:33:42 PM
to ansible...@googlegroups.com
So with 7 hosts and --forks 5 you should wait about 20 seconds on a sleep 10 (two batches: the first 5 hosts, then the remaining 2).

Perhaps you have set serial to 1 in your ansible.cfg.

You have also misspelled "parallel", so there is a chance you have two playbooks and serial: 1 is set in the one with the other spelling.


--
Michael DeHaan <mic...@ansibleworks.com>
CTO, AnsibleWorks, Inc.
http://www.ansibleworks.com/

Istdb

Jul 17, 2013, 5:45:20 AM
to ansible...@googlegroups.com

It turned out that the host key checking feature and my local SSH setup introduced the serial behaviour when using ssh as the transport:

in ssh.py:

        if C.HOST_KEY_CHECKING and not_in_host_file:
            # lock around the initial SSH connectivity so the user prompt about whether to add
            # the host to known hosts is not intermingled with multiprocess output.
            fcntl.lockf(self.runner.process_lockfile, fcntl.LOCK_EX)
            fcntl.lockf(self.runner.output_lockfile, fcntl.LOCK_EX)

Now I managed to change my environment, so that everything is run in parallel. Cool.

Maybe a vvv message before the actual locking would be useful (as I did not get the 'add to known hosts' prompt).

Michael DeHaan

Jul 17, 2013, 7:19:11 AM
to ansible...@googlegroups.com
Yep, host key checking only introduces serial locking until all host keys are approved.

So it seems your configuration is using a /different/ known_hosts location and Ansible wasn't picking up that location? I'd be interested in hearing more about your configuration.

Thanks!

Istdb

Jul 18, 2013, 6:22:27 AM
to ansible...@googlegroups.com
Hello!

I have ssh configured with StrictHostKeyChecking=no for EC2, so my known_hosts does not grow forever ;) The downside, besides being less secure, is that Ansible will not find the host in the known_hosts file. Fortunately, the ANSIBLE_HOST_KEY_CHECKING=no option works well to disable this feature.
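For reference, a one-off run with the check disabled looks like this (the playbook name is just a placeholder; putting host_key_checking = False under [defaults] in ansible.cfg does the same thing persistently):

ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -i infra playbook.yml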

Istvan

Michael DeHaan

Jul 18, 2013, 7:48:11 AM
to ansible...@googlegroups.com
Is there a problem I need to help with above?

FYI, if set, ANSIBLE_HOST_KEY_CHECKING=no also passes along StrictHostKeyChecking=no, so there's no extra reason to define it in ansible.cfg or your SSH config if you don't want to.

Michael Blakeley

Sep 11, 2013, 7:42:14 PM
to ansible...@googlegroups.com
On Thursday, July 18, 2013 4:48:11 AM UTC-7, Michael DeHaan wrote:
Is there a problem I need to help with above?

FYI, if set, ANSIBLE_HOST_KEY_CHECKING=no also passes along StrictHostKeyChecking=no, so there's no extra reason to define it in ansible.cfg or your SSH config if you don't want to.

Reviving because I was recently bitten by this myself, and found it difficult to debug. Like the OP I assumed ansible forking was simply broken. I finally solved the problem via the discussion at http://stackoverflow.com/questions/17958949/how-do-i-drive-ansible-programmatically-and-concurrently - and then I found this thread.

More and more folks are using ansible with AWS and EC2, so I expect this to become a common problem. Folks with large numbers of hosts in EC2, and large churn of those hosts, won't want ever-growing known_hosts files. Yet they may still want host key checking when talking to non-EC2 hosts. I know some consider it dangerous to disable host key checking under any circumstances, but imagine typing 'yes' 50 times for an inventory group that will only exist for a day or two.

The ansible serialization happens automatically, so the first step might be to emit a warning. Could ansible warn when --forks is set to a non-default value, the command or playbook applies to more than one host, and forking is not possible? For bonus points the warning could suggest ANSIBLE_HOST_KEY_CHECKING=no. But I would be happy as long as there is a warning, and searching for the warning text turns up something useful. As it is, the problem is difficult to debug.

It would be even nicer to do something like ssh_config does, and allow ansible.cfg to disable host_key_checking for hosts that match a regex or wildcard pattern. That way I could set up my environment so that *.amazonaws.com never does host key checking, while all other hosts do.

Michael DeHaan

Sep 12, 2013, 1:58:19 PM
to ansible...@googlegroups.com
That SO thread describes an old bug with hashed hosts that should no longer apply in recent versions of Ansible.

We can and do fork once host keys are accepted so there's no need for a warning.   

Also Ansible already reads in ssh_config, so you can just put that setting there if you like.  





Michael Blakeley

Sep 12, 2013, 4:47:13 PM
to ansible...@googlegroups.com
On Thursday, September 12, 2013 10:58:19 AM UTC-7, Michael DeHaan wrote:
That SO thread describes an old bug with hashed hosts that should no longer apply in recent versions of Ansible.

We can and do fork once host keys are accepted so there's no need for a warning.   

Also Ansible already reads in ssh_config, so you can just put that setting there if you like.  

The point isn't any particular bug or feature. The point is that I'm specifically asking ansible to run concurrently using --forks, and it can't, but it doesn't let me know about the problem. If I specifically set --forks, that's a signal that I'm serious about it and want to know if it can't be done.

My ssh_config definitely triggers the global lock, using ansible installed from the dev branch, current as of a few minutes ago. Here's what I have in my ~/.ssh/config file.

Host *.amazonaws.com
     PasswordAuthentication no
     StrictHostKeyChecking no
     UserKnownHostsFile /dev/null
     User ec2-user

I'm using this configuration because I create and discard many EC2 instances: I never want to see chatter from ssh about them, and I never want to remember their key info. Like the OP in this thread, I can easily demonstrate that this single-threads, whatever value of --forks I set. But the cause of this behavior was utterly mysterious, and until I happened across that old stackoverflow thread I was at a loss to debug it.

But again the point isn't any particular bug or feature that triggers the lock. The point is that if the lock is triggered, the user experiences poor performance without knowing why. Even if there is a better configuration option for my use-case, I still think a nice, clear warning is in order. This is a situation where I've specifically asked ansible to run concurrently, using the --forks option. It can't do that, understandably. But equally I need to know when ansible can't do what I'm asking it to do.

If I create a pull request that patches the ssh connection plugin to do something like this, what are the chances you would accept it? Looking at the code it may be a bit tricky: the ssh connection plugin only seems to know about the current host, so it doesn't have the full context of the command. But it does have the runner, which should provide enough context to decide whether or not --forks was used.

Michael DeHaan

Sep 12, 2013, 6:00:40 PM
to ansible...@googlegroups.com
"The point is that I'm specifically asking ansible to run concurrently using --forks, and it can't, but it doesn't let me know about the problem."

This self resolves after you approve the hosts.

Again, the hashing host key problem is no longer applicable.

The lock should not be a point of contention if there are no questions to ask.

I'll have James look into it though.





James Cammarata

Sep 12, 2013, 6:21:23 PM
to ansible...@googlegroups.com
I believe the initial iteration through the hosts is single-threaded, as that occurs before the forks are created, however can you demonstrate that your configuration is causing single-threaded behavior after the forks are running? The only thing that would cause that would be if each task acquires a global lock at the start of its processing and didn't release it until it was done, which definitely does not happen in the code anywhere that I can see. The only lock in the ssh connection plugin occurs only when user input is requested - we rely on ssh's built-in file locking around known_hosts (which occurs even if you're using /dev/null). Incidentally, we have heard reports of that slowing things down even compared to strict host key checking, so that might be something worth looking into.
James Cammarata <jcamm...@ansibleworks.com>
Sr. Software Engineer, AnsibleWorks, Inc.
http://www.ansibleworks.com/

Michael Blakeley

Sep 12, 2013, 6:34:33 PM
to ansible...@googlegroups.com
On Thursday, September 12, 2013 3:21:23 PM UTC-7, James Cammarata wrote:
I believe the initial iteration through the hosts is single-threaded, as that occurs before the forks are created, however can you demonstrate that your configuration is causing single-threaded behavior after the forks are running?

Yes, I think so. I observe single-threading for every command throughout long playbooks. Setting ANSIBLE_HOST_KEY_CHECKING=no resolves that.

Does the output from this single command help?

$ ansible -i ec2.py tag_Name_test -f 9 -a date
ec2-54-200-43-114.us-west-2.compute.amazonaws.com | success | rc=0 >>
Thu Sep 12 21:23:33 UTC 2013
ec2-54-200-40-223.us-west-2.compute.amazonaws.com | success | rc=0 >>
Thu Sep 12 21:23:35 UTC 2013
ec2-54-200-33-219.us-west-2.compute.amazonaws.com | success | rc=0 >>
Thu Sep 12 21:23:36 UTC 2013
ec2-54-200-40-249.us-west-2.compute.amazonaws.com | success | rc=0 >>
Thu Sep 12 21:23:38 UTC 2013
ec2-54-200-43-44.us-west-2.compute.amazonaws.com | success | rc=0 >>
Thu Sep 12 21:23:40 UTC 2013
ec2-54-200-43-42.us-west-2.compute.amazonaws.com | success | rc=0 >>
Thu Sep 12 21:23:42 UTC 2013
ec2-54-200-40-224.us-west-2.compute.amazonaws.com | success | rc=0 >>
Thu Sep 12 21:23:41 UTC 2013
ec2-54-200-42-181.us-west-2.compute.amazonaws.com | success | rc=0 >>
Thu Sep 12 21:23:43 UTC 2013
ec2-54-200-42-164.us-west-2.compute.amazonaws.com | success | rc=0 >>
Thu Sep 12 21:23:44 UTC 2013

With ANSIBLE_HOST_KEY_CHECKING=no, the results return much more quickly and all nine hosts display the same time (within 1-2 sec anyway).

Michael Blakeley

Sep 12, 2013, 6:37:44 PM
to ansible...@googlegroups.com
On Thursday, September 12, 2013 3:00:40 PM UTC-7, Michael DeHaan wrote:
"The point is that I'm specifically asking ansible to run concurrently using --forks, and it can't, but it doesn't let me know about the problem."

This self resolves after you approve the hosts.

With respect, not for me. It doesn't self-resolve because I never add these EC2 instances to my known_hosts file. They are ephemeral, and I don't want thousands of them clogging up my known_hosts file. The effect of my ssh_config is that amazonaws.com host keys end up in /dev/null.

Again, the point is to make it obvious why --forks isn't doing anything, whatever might cause that to happen in any given situation. Today it is difficult to debug, and has bitten at least three users. There are probably more who simply wondered why ansible feels so slow, but didn't raise the matter.
 
I've got a crude patch that does more or less what I want:

diff --git a/lib/ansible/runner/connection_plugins/ssh.py b/lib/ansible/runner/connection_plugins/ssh.py
index 02d47e0..c000765 100644
--- a/lib/ansible/runner/connection_plugins/ssh.py
+++ b/lib/ansible/runner/connection_plugins/ssh.py
@@ -28,7 +28,7 @@ import pwd
 import gettext
 from hashlib import sha1
 import ansible.constants as C
-from ansible.callbacks import vvv
+from ansible.callbacks import vvv, display
 from ansible import errors
 from ansible import utils
 
@@ -170,8 +170,14 @@ class Connection(object):

             # the host to known hosts is not intermingled with multiprocess output.
             fcntl.lockf(self.runner.process_lockfile, fcntl.LOCK_EX)
             fcntl.lockf(self.runner.output_lockfile, fcntl.LOCK_EX)
-       
-
+            # If forks were specifically requested, warn about the situation.
+            # Would be best to warn at most once per command or playbook,
+            # but how to communicate that state?
+            if self.runner.forks != C.DEFAULT_FORKS:
+                display('Cannot use --forks %d: unknown host key.'
+                        ' Consider setting ANSIBLE_HOST_KEY_CHECKING=no'
+                        % (self.runner.forks),
+                        color='yellow')
 
         try:
             # Make sure stdin is a proper (pseudo) pty to avoid: tcgetattr errors

I'd be happy to turn that into a pull request, but I'm holding off because I don't think it's quite satisfactory. It's chattier than I'd prefer, because I haven't seen a good way to communicate that the warning has already been displayed. Fixing that might require plugin interface changes, which could get much uglier.

But as far as it goes, it seems to do the job. If the user explicitly sets --forks to a non-default value, and the lock is taken, the user knows that it's happening and why.

Michael DeHaan

Sep 12, 2013, 7:08:09 PM
to ansible...@googlegroups.com
Ok, so I understand now.

Because we do *not* see it in known_hosts *AND* you have host key checking enabled globally in Ansible but not disabled for these specific hosts, the locking code, anticipating a question from SSH, will engage in anticipation of needing to block for these hosts.

The answer seems to be adding an inventory variable that allows disabling host key checking on a per-inventory basis.

It wouldn't be able to tell what your SSH client would do in advance, but could easily not check for that particular host.

I disagree entirely that --forks should be disabled with unknown host keys, as that's simply not true -- most users will just approve the hosts they need.

However, I'm ok with being able to turn them off for a particular group, which you could set for the ec2_tag_foo group.





Michael Blakeley

Sep 12, 2013, 8:45:29 PM
to ansible...@googlegroups.com
On Thursday, September 12, 2013 4:08:09 PM UTC-7, Michael DeHaan wrote:
Ok, so I understand now.

Because we do *not* see it in known_hosts *AND* you have host key checking enabled globally in Ansible but not disabled for these specific hosts, the locking code, anticipating a question from SSH, will engage in anticipation of needing to block for these hosts.

The answer seems to be adding an inventory variable that allows disabling host key checking on a per-inventory basis.

That sounds interesting. So that means modifying the ec2.py script to supply that variable, probably as an ec2.ini option? That could work. If the ec2.py output could be set to automatically disable host key checks for each host in my ec2 inventory, then I could still use host key checking for my small inventory of static hosts. That would be nice.

It wouldn't be able to tell what your SSH client would do in advance, but could easily not check for that particular host.

I disagree entirely that --forks should be disabled with unknown host keys, as that's simply not true -- most users will just approve the hosts they need.

I don't think I proposed anything of the kind. That isn't what my crude patch does, either. The text of the message might be misleading: it's telling you that --forks isn't doing much of anything. But it isn't changing the existing functionality at all.

What I've observed is that --forks doesn't do any good if the hosts are unknown, and that it's difficult to debug the problem if the prompts are disabled. When this happens I think the user should see what's happening, and why. When users set --forks, I suspect they're usually targeting hosts that they've already accepted - or as in my situation they don't care about key checks. So it feels about right to display a warning when there are unknown hosts *and* --forks is set.

However, I'm ok with being able to turn them off for a particular group, which you could set for the ec2_tag_foo group.

For me those groups are also ephemeral. I'd want to disable host key checking for my entire ec2 inventory, probably as an option in ec2.ini.

I like this idea, but I don't think it addresses the root problem. When host key checks happen, --forks is effectively ignored. Because this happens without any feedback to that effect, the problem is difficult to debug.

With this inventory variable idea, the ec2.ini default would probably be normal host key checking. So if someone like me sets up ssh_config to ignore amazonaws.com hosts, then the same problem will show up and will be just as difficult to track down. All commands will be serialized, and the user won't have a clue why that's happening.

What about a more generic message? A single line like "Checking host key for unknown host %s" would not add much noise on top of ordinary host-key acceptance, while still showing some visible sign of a problem when host-key storage is suppressed. If I had had that extra message logged even when I knew that I had customized ssh_config, I might have found the solution more quickly.

James Cammarata

Sep 12, 2013, 8:59:12 PM
to ansible...@googlegroups.com
If you're setting the known hosts to /dev/null in your ssh config, you should disable strict host key checking - it makes no sense to leave that enabled in that situation. If you do that, I believe you said you did not see any performance degradation?



Michael DeHaan

Sep 12, 2013, 9:01:20 PM
to ansible...@googlegroups.com
Michael, 

The message you proposed is unnecessary when we can get better technical solutions.

The group variable is one of those.

One recent proposal by someone else was to make the ec2 plugin always put hosts in an ec2 group, in which case it would be as simple as

group_vars/ec2:

---
ansible_host_key_checking: 0

And that would easily make it settable on a per-group basis.






i iordanov

Feb 23, 2014, 4:31:32 PM
to ansible...@googlegroups.com
I looked at the code in ssh.py, and I don't understand why you don't take /etc/ssh/ssh_known_hosts2 into consideration when deciding if the host is in known_hosts. The only file that is taken into consideration is ~${USER}/.ssh/known_hosts.

Do you think that including standard locations is a good idea?

...
    def not_in_host_file(self, host):
        host_file = os.path.expanduser(os.path.expandvars("~${USER}/.ssh/known_hosts"))
        if not os.path.exists(host_file):
            print "previous known host file not found"
            return True
...
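For illustration, a broader check might look something like this (a rough sketch only - the helper name and path list are made up, it assumes the same module context with os imported, and it only handles unhashed entries; hashed entries would still need the HMAC comparison the real code performs):

    def not_in_any_host_file(self, host):
        # check the per-user file plus the common system-wide locations
        candidates = [
            "~${USER}/.ssh/known_hosts",
            "/etc/ssh/ssh_known_hosts",
            "/etc/ssh/ssh_known_hosts2",
        ]
        for candidate in candidates:
            host_file = os.path.expanduser(os.path.expandvars(candidate))
            if not os.path.exists(host_file):
                continue
            with open(host_file) as f:
                for line in f:
                    tokens = line.split()
                    # unhashed entries list the host in the first comma-separated field
                    if tokens and host in tokens[0].split(','):
                        return False
        return True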

Thanks!
iordan


Michael DeHaan

Feb 24, 2014, 2:21:58 PM
to ansible...@googlegroups.com
I've got nothing against these additions to look in both locations.

I'd be happy to see pull requests to this effect if you want to pass them along, or if not, you can file a ticket so we don't forget.

Thanks!

--Michael





i iordanov

Feb 24, 2014, 4:26:12 PM
to ansible...@googlegroups.com
Hi Michael,

On Mon, Feb 24, 2014 at 2:21 PM, Michael DeHaan <mic...@ansible.com> wrote:
> I'd be happy to see pull requests to this effect if you want to pass them
> along, or if not, you can file a ticket so we don't forget.

I added the functionality, tested it in my environment successfully,
and created a pull request for your consideration.

Many thanks!
iordan

--
The conscious mind has only one thread of execution.

i iordanov

Feb 24, 2014, 4:30:57 PM
to ansible...@googlegroups.com
Hi again,

I'm not sure how you want to deal with this pull request, so here is a
link to the actual patch:

https://github.com/ansible/ansible/pull/6156.patch

for you to have multiple options.

Cheers,
iordan

Trevor Hartman

Mar 3, 2014, 2:05:36 PM
to ansible...@googlegroups.com
I seem to be experiencing the same or similar issue.

My ansible.cfg:

[si-cluster-settings]
host_key_checking = False
hostfile = hosts
forks = 15

When I run a playbook on 10 nodes, they are definitely running serially as I see large delays between results coming back on each node. I also tried setting serial: 10 in the actual playbook.

Michael DeHaan

Mar 4, 2014, 10:25:43 AM
to ansible...@googlegroups.com
The section you have in your ansible.cfg called "si-cluster-settings" is the problem.

The Ansible config parser doesn't know anything about a section by that name, so those settings are ignored.

They go in a section called [defaults], like so: https://github.com/ansible/ansible/blob/devel/examples/ansible.cfg
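For example, the settings posted above would become (a sketch of the same values in the right section):

[defaults]
hostfile = hosts
forks = 15
host_key_checking = False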

  



Vincent Janelle

Sep 29, 2014, 10:35:43 AM
to ansible...@googlegroups.com
Just an update at Michael's request - seeing the exact same situation with EC2.

Setting the ANSIBLE_HOST_KEY_CHECKING environment variable fixes this.

Michael DeHaan

Sep 29, 2014, 10:39:20 AM
to ansible...@googlegroups.com
Any chance I can get a copy of your known_hosts file?

Off list would be preferred.

I'm not sure that's it, but I suspect it could be.




Vincent Janelle

Sep 29, 2014, 10:44:43 AM
to ansible...@googlegroups.com
Not sure how I'd send you a copy of /dev/null, unless ansible is attempting to parse the contents of ~/.ssh/known_hosts outside of ssh.

James Cammarata

Sep 29, 2014, 11:05:59 AM
to ansible...@googlegroups.com
Hi Vincent, could you share a sample of the playbook you're running as well as the results of running it with -f1, -f2 and -f4? That should determine if the playbook is indeed being serialized at some point. 

Do note, however, if you're doing something like this:

- local_action: ec2 ...
  with_items:
    - ...
    - ...
    - ...

you will see serialized performance. This is caused by the fact that each pass through a with_* loop must complete on all hosts before the next loop iteration begins, and with local_action you'd only be executing on a single host (localhost), so this would constrain the playbook to serial-like performance.

Thanks!

Vincent Janelle

Sep 29, 2014, 11:37:43 AM
to ansible...@googlegroups.com
Exactly like what was described at the start of this thread. :(  Setting the environment variable produces the desired parallel execution.

Michael DeHaan

Sep 29, 2014, 12:28:15 PM
to ansible...@googlegroups.com
Ansible does read ~/.ssh/known_hosts because it needs to know whether to lock itself down to 1 process to ask you the question about adding a new host to known_hosts.

This only happens when it detects a host isn't already there, because it must detect this before SSH asks.

And this only happens with -c ssh; -c paramiko has its own handling (and its own issues - I prefer the SSH implementation if folks have a new enough SSH to use ControlPersist).



Michael Blakeley

Sep 29, 2014, 12:29:07 PM
to ansible...@googlegroups.com

Vincent, I now use a slightly different workaround. Instead of routing known_hosts to /dev/null I route it to a temp file. This keeps the EC2 noise out of my default known_hosts file, and seems to play well with ansible.

From my ~/.ssh/config file:

Host *.amazonaws.com
     PasswordAuthentication no
     StrictHostKeyChecking no
     UserKnownHostsFile /tmp/ec2_known_hosts
     User ec2-user


Hope that helps you.

-- Mike

Michael DeHaan

Sep 29, 2014, 12:29:24 PM
to ansible...@googlegroups.com
Hi James, 

Each loop DOES happen within the host loop.

If you have 50 hosts and they are "with_items"'ing, that still happens 50 hosts at a time.





Michael DeHaan

Sep 29, 2014, 12:30:32 PM
to ansible...@googlegroups.com
So I'm confused - are you saying you are using a known_hosts file that is empty?

This seems to be a completely unrelated question.

The mention of /dev/null above seemed to be based on confusion that we didn't read it, not that it was actually symlinked to /dev/null.

Can each of you clarify?


Michael Blakeley

Sep 29, 2014, 12:45:23 PM
to ansible...@googlegroups.com
I took it that Vincent was referring to my message of 2013-09-12. In that post I mentioned using /dev/null for the ssh UserKnownHostsFile configuration key, scoped to Host *.amazonaws.com.

This configuration triggers single-threaded behavior from ansible because ssh never stores any record of connecting to the EC2 hosts: not the first time, not ever. Because known_hosts is /dev/null.

-- Mike

Michael DeHaan

Sep 29, 2014, 12:54:57 PM
to ansible...@googlegroups.com
Ansible does not find your known_hosts location from ~/.ssh/config on a per-host basis; it just reads your ~/.ssh/known_hosts.

It does this because it needs to know, in advance of SSH asking, whether it needs to lock.

Assume it's running at 50/200 forks and needs to ask a question interactively - that's why it needs to know.

So if you are pointing known_hosts at a different file, that may be EXACTLY the problem. With host key checking on, and the data going elsewhere, it can't be found, and Ansible locks pre-emptively.


Michael DeHaan

Sep 29, 2014, 12:57:35 PM
to ansible...@googlegroups.com
I'm wondering if we can detect configuration of alternative known_hosts locations in ~/.ssh/config and issue a warning, which should clue people in to turn off the checking feature.

This should close this out, I'd think.
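A rough sketch of what that detection might look like (illustrative only, not code from Ansible; the function name is made up, and the option name is per ssh_config(5)):

    import os

    def custom_known_hosts_in_ssh_config(path="~/.ssh/config"):
        # True if any UserKnownHostsFile directive points somewhere other
        # than the default ~/.ssh/known_hosts that Ansible checks.
        default = os.path.expanduser("~/.ssh/known_hosts")
        path = os.path.expanduser(path)
        if not os.path.exists(path):
            return False
        with open(path) as f:
            for line in f:
                parts = line.split(None, 1)
                if len(parts) == 2 and parts[0].lower() == "userknownhostsfile":
                    if os.path.expanduser(parts[1].strip()) != default:
                        return True
        return False

A warning printed once when this returns True, suggesting host_key_checking = False, would cover the cases reported in this thread.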


Matt Jaynes

Nov 10, 2014, 12:52:43 PM
to ansible...@googlegroups.com
Sounds like some great possible solutions. 

Either 

1) Reading the SSH config to pick up the correct known_hosts locations (and perhaps setting 'host_key_checking' to false if the location is '/dev/null', since that's a common pattern - for instance, Vagrant does this by default, see https://docs.vagrantup.com/v2/cli/ssh_config.html)

or

2) A simple warning message when serialization is triggered due to known_hosts in order to save folks from some really tough debugging

Just lost a few hours debugging this issue. For several environments, I have a client's known_hosts locations set to custom locations in their SSH config, so everything was running serially (a 3 minute process * 20 servers = 60 minutes!). Persistence and sweat finally led me to try "host_key_checking = False" and it finally ran in parallel - so nice to finally see, since I'd tried just about everything else I could imagine (forks, serial, ssh options, restructuring inventory, removing inventory groups, etc.).

Thanks,
Matt

Lorrin Nelson

Jun 17, 2015, 9:39:06 PM
to ansible...@googlegroups.com
Any updates on this? I took a gander through the GitHub issues but didn't see one that seemed related.

Steve Ims

Oct 3, 2015, 9:11:23 AM
to Ansible Project
We got burned by this too.

We use Ansible from a single Jenkins server to manage instances in multiple EC2 VPCs.  We use strict host checking for security and we have a custom known_hosts file per VPC (we've automated updates to known_hosts on each deploy).

"Reading the SSH config to pick up the correct known_hosts locations" (option #1 posted by Matt) seems the most intuitive solution.

Guess we are generally spoiled by Ansible :-)  Ansible fits so well into our workflows that we assumed it would also honor our ssh configuration.  And in fact Ansible mostly does honor our ssh configuration because our playbooks and adhocs do run with the custom known_hosts -- but the silent impact to performance (serial, never parallel) was unexpected.

Appreciate your work!

-- Steve

einar....@gmail.com

Jun 8, 2016, 10:57:56 PM
to Ansible Project
Ansible boasts the forks (process) approach, where you can run the same plays on multiple hosts. But what I'm looking for is true multi-threading, where multiple sections of the code can run asynchronously and in ANY order depending on what's ready, when you have to wait for, say, AWS modules that create resources (RDS and AMI). I'm not sure the async with poll and async_status options can give you this capability, but I seriously hope so; the documentation is rather vague:
http://docs.ansible.com/ansible/async_status_module.html
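For what it's worth, the fire-and-forget pattern those docs describe looks roughly like this (a sketch only; the command, timings and names are made up):

- name: start a long-running step without blocking
  command: /usr/local/bin/long_provisioning_step
  async: 600        # allow up to 10 minutes
  poll: 0           # do not wait here; move straight on to the next task
  register: long_job

- name: come back later and wait for it to finish
  async_status: jid={{ long_job.ansible_job_id }}
  register: job_result
  until: job_result.finished
  retries: 60
  delay: 10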
