EC2 module: state=present always returns 'wait for instances running timeout'


Dave Stern

Nov 17, 2014, 3:17:38 PM
to ansible...@googlegroups.com
We've had a task that creates EC2 instances running unchanged for weeks. Sometime after Thu Nov 13 04:11:43 UTC 2014, the module suddenly began timing out on every request to create new instances. This happens on both Mac OS X and Ubuntu. I've updated to the latest ansible, boto, awscli, etc. with no effect. We haven't touched this code in a very long time.

Was something changed during AWS re:Invent? Is there something else going on with the AWS CLI that the EC2 module now conflicts with?

ansible-playbook 1.7.2
Python 2.7.5+
aws-cli/1.6.2 Python/2.7.5+ Linux/3.11.0-12-generic


Michael DeHaan

Nov 17, 2014, 3:29:34 PM
to ansible...@googlegroups.com
Not that I'm aware of.

Anyone else seeing similar problems?




Dave Stern

Nov 17, 2014, 4:20:01 PM
to ansible...@googlegroups.com
Is there any way to get more debugging than `ansible-playbook -vvvv` provides? I'd like to know whether it's a network issue, ssh, or something else, but the instance is terminated before I can test it.

Dave Stern

Nov 18, 2014, 4:01:47 PM
to ansible...@googlegroups.com
I figured this out, and it prompts me to wonder if there's been a request to propagate boto debug info via ansible yet. If not, is there a clean way this could be included? I had to write a script to debug boto output to get the reason for the failure. Being able to propagate this error up to ansible might be beneficial in the future. Additionally, I'm wondering why so many volumes were left orphaned in our AWS account. I'm not blaming the EC2 module, but I haven't found that cause yet.

The cause: we had hit the limit on EBS volumes. For future debugging reference, I'll include my troubleshooting below.

I created a simple script based on a StackOverflow post (http://stackoverflow.com/a/20658354/1464556):

#!/usr/bin/python
# Print boto's debug log plus the instances in the first reservation.

import os
from pprint import pprint

import boto

print(boto.Version)

# Send boto's debug logging to the console (stderr).
boto.set_stream_logger('boto')

conn = boto.connect_ec2(
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'])

res = conn.get_all_reservations()
pprint(res[0].instances)

I ran the ansible-playbook command and took snapshots of the boto output every 15 seconds while it ran, then diff'ed the first and last one:

while true; do echo "running $(date)"; /tmp/test.py &> /tmp/test.py.output.$(date +%s); sleep 15; done
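For anyone repeating this, the comparison of the first and last snapshot can be sketched in Python using only the standard library. This is not the exact command I ran; the helper names `sort_snapshots` and `added_lines` are made up for illustration, and the file pattern assumes the `/tmp/test.py.output.<epoch>` names produced by the loop above. Expect some noise, since boto's debug lines carry timestamps that differ between runs.

```python
import difflib
import glob


def sort_snapshots(paths):
    """Order snapshot files by the numeric epoch suffix (hypothetical helper)."""
    numbered = [p for p in paths if p.rsplit(".", 1)[-1].isdigit()]
    return sorted(numbered, key=lambda p: int(p.rsplit(".", 1)[-1]))


def added_lines(first_text, last_text):
    """Return lines that appear in the last snapshot but not in the first."""
    diff = difflib.ndiff(first_text.splitlines(), last_text.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


if __name__ == "__main__":
    # Assumed location: the files written by the shell loop above.
    paths = sort_snapshots(glob.glob("/tmp/test.py.output.*"))
    if len(paths) >= 2:
        with open(paths[0]) as first, open(paths[-1]) as last:
            for line in added_lines(first.read(), last.read()):
                print(line)
```

Sorting on the integer suffix rather than the raw file name keeps the order correct even if the timestamps ever differ in length.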




This was the obvious difference:

>                     <stateReason>
>                         <code>Client.VolumeLimitExceeded</code>
>                         <message>Client.VolumeLimitExceeded: Volume limit exceeded</message>
>                     </stateReason>
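If you would rather not eyeball a diff, a rough sketch like the following pulls every `<stateReason>` code and message out of a captured snapshot. It assumes the raw XML appears in the file in the same shape as the fragment above (without the leading `>` that diff adds); the function name `state_reasons` is my own and not part of boto or Ansible.

```python
import re

# Matches a <stateReason> element containing <code> then <message>,
# as in the fragment above. Assumes that element order.
STATE_REASON = re.compile(
    r"<stateReason>\s*<code>(.*?)</code>\s*"
    r"<message>(.*?)</message>\s*</stateReason>",
    re.DOTALL,
)


def state_reasons(text):
    """Return a list of (code, message) pairs found in the captured output."""
    return STATE_REASON.findall(text)


if __name__ == "__main__":
    sample = """
    <stateReason>
        <code>Client.VolumeLimitExceeded</code>
        <message>Client.VolumeLimitExceeded: Volume limit exceeded</message>
    </stateReason>
    """
    for code, message in state_reasons(sample):
        print(code + " -> " + message)
```

A regular expression is used instead of an XML parser because the snapshot file mixes log lines with XML bodies, so it is not a well-formed document on its own.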