Anyone seen a "Resource temporarily unavailable" error?


Kyle Heath

Jun 12, 2013, 10:07:03 PM
to ansible...@googlegroups.com
Hi Ansible folks,

I get an "IOError: [Errno 11] Resource temporarily unavailable" error whenever I run a playbook with forks > 1. Just wondering if anyone has seen this problem before or has troubleshooting advice.

Python 2.7.3 and Python 2.7.5
Ansible 1.1
Ubuntu 12.04 LTS

Console output below...

Cheers,
Kyle

PLAY [all] ********************* 

TASK: [Setup passwordless ssh from master to workers] ********************* 
11
Traceback (most recent call last):
  File "./cli.py", line 60, in <module>
    main(sys.argv)
  File "./cli.py", line 30, in main
    cluster.Resize(num_instances)
  File "/home/ubuntu/git/iwct/build/snap/cirrus/cluster/mapr.py", line 90, in Resize
    self.__AddWorkers(num_to_add)
  File "/home/ubuntu/git/iwct/build/snap/cirrus/cluster/mapr.py", line 505, in __AddWorkers
    self.__ConfigureWorkers(new_worker_instances)
  File "/home/ubuntu/git/iwct/build/snap/cirrus/cluster/mapr.py", line 714, in __ConfigureWorkers
    CHECK(util.RunPlaybookOnHosts(self.playbooks_path + '/worker.yml', hostnames, self.ssh_key, extra_vars))
  File "/home/ubuntu/git/iwct/build/snap/cirrus/util.py", line 84, in RunPlaybookOnHosts
    results = pb.run()      
  File "/home/ubuntu/git/iwct/build/snap/ansible/playbook/__init__.py", line 222, in run
    if not self._run_play(play):
  File "/home/ubuntu/git/iwct/build/snap/ansible/playbook/__init__.py", line 438, in _run_play
    if not self._run_task(play, task, False):
  File "/home/ubuntu/git/iwct/build/snap/ansible/playbook/__init__.py", line 303, in _run_task
    results = self._run_task_internal(task)
  File "/home/ubuntu/git/iwct/build/snap/ansible/playbook/__init__.py", line 277, in _run_task_internal
    results = runner.run()
  File "/home/ubuntu/git/iwct/build/snap/ansible/runner/__init__.py", line 660, in run
    results = self._parallel_exec(hosts)
  File "/home/ubuntu/git/iwct/build/snap/ansible/runner/__init__.py", line 573, in _parallel_exec
    job_queue = manager.Queue()
  File "/usr/lib/python2.7/multiprocessing/managers.py", line 667, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "/usr/lib/python2.7/multiprocessing/managers.py", line 565, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "/usr/lib/python2.7/multiprocessing/connection.py", line 175, in Client
    answer_challenge(c, authkey)
  File "/usr/lib/python2.7/multiprocessing/connection.py", line 413, in answer_challenge
    message = connection.recv_bytes(256)         # reject large message
IOError: [Errno 11] Resource temporarily unavailable
*** Aborted at 1371088581 (unix time) try "date -d @1371088581" if you are using GNU date ***
PC: @     0x7f3a63db6313 (unknown)
*** SIGTERM (@0x3e800005710) received by PID 22554 (TID 0x7f3a65406700) from PID 22288; stack trace: ***
    @     0x7f3a64fefcb0 (unknown)
    @     0x7f3a63db6313 (unknown)
    @           0x5560a1 (unknown)
    @           0x49890a (unknown)
    @           0x498602 (unknown)
    @           0x498602 (unknown)
    @           0x498602 (unknown)
    @           0x498602 (unknown)
    @           0x49f1c0 (unknown)
    @           0x4a8a92 (unknown)
    @           0x4e9f36 (unknown)
    @           0x499bc0 (unknown)
    @           0x498602 (unknown)
    @           0x498602 (unknown)
    @           0x49f1c0 (unknown)
    @           0x4a8960 (unknown)
    @           0x4e9f36 (unknown)
    @           0x4ec11a (unknown)
    @           0x4e9f36 (unknown)
    @           0x4eb39e (unknown)
    @           0x4db6a6 (unknown)
    @           0x4e9f36 (unknown)
    @           0x49846a (unknown)
    @           0x498602 (unknown)
    @           0x49f1c0 (unknown)
    @           0x4983b8 (unknown)
    @           0x498602 (unknown)
    @           0x498602 (unknown)
    @           0x498602 (unknown)
    @           0x498602 (unknown)
    @           0x498602 (unknown)
    @           0x498602 (unknown)

Brian Coca

Jun 12, 2013, 11:23:41 PM
to ansible...@googlegroups.com
check dmesg, that is normal when you run out of OS resources (# of open files, buffers, ram, etc).


--
Brian Coca
Stultorum infinitus est numerus
0110000101110010011001010110111000100111011101000010000001111001011011110111010100100000011100110110110101100001011100100111010000100001
Pedo mellon a minno

Kyle Heath

Jun 13, 2013, 1:46:49 AM
to ansible...@googlegroups.com
Hi Brian,

Thanks for the suggestion... I saw nothing posted to dmesg after I run the script that crashes.  The script uses very little ram, and file and process/thread limits are set high enough...  (ulimit -Hn -> 64000, ulimit -Hu -> 55457).  Not sure what resource it ran out of...
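The limits quoted above are hard limits (`ulimit -H`), while the soft limit is the one the kernel enforces on a running process. A minimal sketch, assuming a Unix host, for reading both values from inside the Python process that calls the Ansible API. The helper name `describe_limits` is made up for illustration and is not part of Ansible:

```python
import resource


def describe_limits():
    """Return {label: (soft, hard)} for the limits discussed in this thread."""
    names = {
        "open files (ulimit -n)": resource.RLIMIT_NOFILE,
        "processes/threads (ulimit -u)": resource.RLIMIT_NPROC,
    }
    limits = {}
    for label, rlimit in names.items():
        # getrlimit returns (soft, hard); the soft value is the enforced one.
        limits[label] = resource.getrlimit(rlimit)
    return limits


if __name__ == "__main__":
    for label, (soft, hard) in describe_limits().items():
        print("%s: soft=%s hard=%s" % (label, soft, hard))
```

If the soft values turn out to be much lower than the hard ones, that would be worth ruling out before looking elsewhere.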

Taking the error message at its word that it is "temporary", I wrapped the call to the multiprocessing manager Queue constructor*** in a retry loop...  The exception gets raised only on the first attempt, and always succeeds on the second attempt.

***File "/home/ubuntu/git/iwct/build/snap/ansible/runner/__init__.py", line 573, in _parallel_exec
    job_queue = manager.Queue()

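before hack:

    job_queue = manager.Queue()

after hack:

    # needs "import time" at the top of runner/__init__.py
    job_queue = None
    while job_queue is None:
      try:
        job_queue = manager.Queue()
      except IOError:
        # Errno 11 on the first attempt: report it, wait, then try again.
        print 'error... will retry...'
        time.sleep(2)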

My script doesn't crash now... I don't understand the root cause of the problem.  Has anyone else seen such an issue?  I've seen a few reports of this error outside of ansible when using multiprocessing and sockets... Anyone else had this problem before?
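For anyone who wants to reuse the retry idea, here is a bounded variant as a rough sketch. It retries only when the error is EAGAIN (Errno 11 on Linux) and gives up after a fixed number of attempts, so an unrelated failure is not swallowed. The helper name `retry_on_eagain` and its parameters are invented for illustration and are not part of Ansible or multiprocessing:

```python
import errno
import time


def retry_on_eagain(factory, attempts=5, delay=2.0, sleep=time.sleep):
    """Call factory() until it succeeds, retrying only on EAGAIN."""
    for attempt in range(1, attempts + 1):
        try:
            return factory()
        except (IOError, OSError) as exc:
            # Re-raise anything that is not EAGAIN, and the final failure too.
            if exc.errno != errno.EAGAIN or attempt == attempts:
                raise
            sleep(delay)
```

In the spot patched above, the call would then read `job_queue = retry_on_eagain(manager.Queue)`. This only masks the symptom; it does not explain why the first attempt fails.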

-Kyle





--
You received this message because you are subscribed to a topic in the Google Groups "Ansible Project" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/ansible-project/XMqAATHwB2w/unsubscribe?hl=en.
To unsubscribe from this group and all its topics, send an email to ansible-proje...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Michael DeHaan

Jun 13, 2013, 7:06:57 AM
to ansible...@googlegroups.com
An exceedingly large number of folks are running LTS and I've never seen this reported.

Seems like you are using the API instead of /usr/bin/ansible-* though, so not sure what may be going on.


--
Michael DeHaan <mic...@ansibleworks.com>
CTO, AnsibleWorks, Inc.
http://www.ansibleworks.com/

Kyle Heath

Jun 13, 2013, 1:40:31 PM
to ansible...@googlegroups.com
Michael,

Thanks for the reply...  I bet you are right... I've probably done something in my code before the ansible API gets called that causes the crash.  I use the multiprocessing and paramiko modules in other parts of my code before calling the ansible API, perhaps there is an interaction.

PS: I'm converting a bunch of my custom python scripts for launching MapR hadoop clusters on EC2 to ansible playbooks...  I'm very pleased with the results (much shorter and easier to maintain).   Thanks for making such a well designed tool!

Cheers,
Kyle

Dylan Martin

Jun 13, 2013, 2:44:41 PM
to ansible...@googlegroups.com
I think that is a network error, not a system resource error, i.e. the resource in question is a network host or service.

sibaprasad mahapatra

Dec 17, 2014, 2:48:18 AM
to ansible...@googlegroups.com
Did anybody find a solution to this error? I am having the same issue now.

I am using Ansible 1.7.2 with Eucalyptus cloud. 

msg: Instance creation failed => InternalFailure: Not enough resources: no cluster controller is currently available to run instances. 

Thanks,
Sp

Ritesh Shetty

Feb 1, 2015, 8:56:42 PM
to ansible...@googlegroups.com
I am getting this error too and could not find a way to fix it.
I am using Ubuntu 14.04 and calling playbook.run() through the Python API.

This used to work when I invoked my code from Apache. But now I am trying to run it with a simple Python command like "python consumer.py".
I don't know if I need to specify anything else?





2015-01-31 00:26:04,933 - root - ERROR - Error in executing playbook[Errno 11] Resource temporarily unavailable

Traceback (most recent call last):
  File "/opt/stack/venv/local/lib/python2.7/site-packages/attis-1.0.0a1-py2.7.egg/attis/engine/contentprocessor.py", line 100, in runPlaybook
    pb.run()
  File "/opt/stack/venv/local/lib/python2.7/site-packages/ansible-1.8.2-py2.7.egg/ansible/playbook/__init__.py", line 347, in run
    if not self._run_play(play):
  File "/opt/stack/venv/local/lib/python2.7/site-packages/ansible-1.8.2-py2.7.egg/ansible/playbook/__init__.py", line 674, in _run_play
    self._do_setup_step(play)
  File "/opt/stack/venv/local/lib/python2.7/site-packages/ansible-1.8.2-py2.7.egg/ansible/playbook/__init__.py", line 619, in _do_setup_step
    accelerate_port=play.accelerate_port,
  File "/opt/stack/venv/local/lib/python2.7/site-packages/ansible-1.8.2-py2.7.egg/ansible/runner/__init__.py", line 1458, in run
    results = self._parallel_exec(hosts)
  File "/opt/stack/venv/local/lib/python2.7/site-packages/ansible-1.8.2-py2.7.egg/ansible/runner/__init__.py", line 1349, in _parallel_exec
    job_queue = manager.Queue()
  File "/usr/lib/python2.7/multiprocessing/managers.py", line 667, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "/usr/lib/python2.7/multiprocessing/managers.py", line 565, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "/usr/lib/python2.7/multiprocessing/connection.py", line 175, in Client
    answer_challenge(c, authkey)
  File "/usr/lib/python2.7/multiprocessing/connection.py", line 428, in answer_challenge
    message = connection.recv_bytes(256)         # reject large message
IOError: [Errno 11] Resource temporarily unavailable


Ritesh Shetty

Feb 1, 2015, 9:30:41 PM
to ansible...@googlegroups.com
OK, so I did a couple of tests. I ran the Ansible API call from a regular Python file like this:

python rough.py

What rough.py does is simply call the Python API. This works perfectly fine.
The reason it was not working before is that I had an oslo-messaging listener running and then passed execution to the Python API. Somehow these two don't work together.

I don't know if this is related or not:
http://stackoverflow.com/questions/14736766/why-does-gevent-socket-break-multiprocessing-connections-auth

The question is: is there any workaround?
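The thread never confirms a root cause, but the error itself is well defined. Errno 11 on Linux is EAGAIN, which is what a read returns when the descriptor is in non-blocking mode and no data is ready yet. The linked question describes gevent putting sockets into that mode. Whether the oslo-messaging listener does the same here is an assumption, not something shown in the thread. A small sketch that reproduces the same errno with the standard library alone (the function name is made up for illustration):

```python
import errno
import socket


def eagain_from_nonblocking_recv():
    """Read from a non-blocking socket that has no data; return the errno."""
    left, right = socket.socketpair()
    try:
        left.setblocking(False)  # the mode an event-loop library may impose
        left.recv(256)           # nothing was sent, so nothing is readable
    except (IOError, OSError) as exc:
        return exc.errno
    finally:
        left.close()
        right.close()
    return None


if __name__ == "__main__":
    code = eagain_from_nonblocking_recv()
    print(code == errno.EAGAIN, errno.errorcode.get(code))
```

If that is what is happening, it would also fit the earlier observation that a retry a moment later succeeds, since by then the reply has arrived.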