Powering EC2 instances on/off


C. S.

Apr 21, 2014, 12:29:03 PM
to ansible...@googlegroups.com
Hi folks,

We’re trying to implement a system where we can power environments on and off in AWS when they’re not in use. However, the EC2 inventory module excludes instances that are not in a running state. It seems like adding an option to the ec2 module to include stopped instances would work, but then Ansible would need a corresponding option to call the module with to include the stopped instances. Which seems a bit hacky…

Maybe ansible needs a notion of host state? Any thoughts?

Thx!

-cs
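
[Editor's note: since ec2.py won't report stopped hosts, one common workaround (until inventory grows a notion of host state) is to keep the environment's instance IDs somewhere static and start them with the ec2 module before the rest of the play runs. A minimal sketch; the vars file and the env_instance_ids variable are assumptions, not anything from this thread:]

# Hypothetical vars file (the dynamic inventory can't supply these
# while the instances are stopped):
#   env_instance_ids:
#     - i-0abc1234
#     - i-0def5678
- name: Power the environment on before managing it
  local_action:
    module: ec2
    state: running
    instance_ids: "{{ env_instance_ids }}"
    region: us-east-1
    wait: yes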

Scott Anderson

Apr 21, 2014, 1:10:38 PM
to ansible...@googlegroups.com
I use this module: https://github.com/ansible/ansible/pull/6349

Full disclosure: Michael believes all inventory should be done via inventory scripts; I respectfully disagree. :-) I find ec2.py to be very slow (20 seconds to refresh the cache with a small number of instances, for example) and prefer querying inventory directly in the script itself for many use cases.

Regards,
-scott

C. S.

Apr 21, 2014, 1:39:20 PM
to ansible...@googlegroups.com
Thanks!

That’s interesting: your module is the same as ec2_facts, just with filtering added. And the ec2_facts module says in its notes that it may add filtering. I think I’d agree with Michael’s point of view, but it looks like we’ve already gone down the path of facts being outside the inventory module, so maybe a pull request against ec2_facts with the filters would get accepted. Long run, it does seem like hosts and modules need to have some idea of state…


--
You received this message because you are subscribed to the Google Groups "Ansible Project" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ansible-proje...@googlegroups.com.
To post to this group, send email to ansible...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ansible-project/e10fe6c9-e2b9-4ddb-9797-71c0602ea7c2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Scott Anderson

Apr 21, 2014, 1:45:22 PM
to ansible...@googlegroups.com
Actually, it’s not the same as ec2_facts, other than that it returns facts about an instance.

ec2_facts only works when run on an actual AWS instance (it calls the Amazon ec2 metadata servers) and it only retrieves the facts for that instance alone.

ec2_instance_facts, on the other hand, can retrieve multiple instance facts at once from anywhere (I use it in a local action). It’s more like ec2.py run for specific instances from within a playbook.

Regards,
-scott
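
[Editor's note: to make the distinction concrete, a sketch of both. The ec2_instance_facts parameters follow Scott's longer example later in this thread and come from the unmerged PR, so treat them as assumptions:]

# ec2_facts must run ON the instance: it queries the EC2 metadata
# service at 169.254.169.254 and returns facts for that host only.
- name: Gather this instance's own metadata
  ec2_facts:

# ec2_instance_facts (unmerged, PR #6349) talks to the EC2 API, so it
# can run anywhere and return facts for many instances at once.
- name: Look up instances by tag from the control machine
  local_action:
    module: ec2_instance_facts
    tags:
      environment: dev
    region: us-east-1
  register: instance_facts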


C. S.

Apr 21, 2014, 1:54:39 PM
to ansible...@googlegroups.com
Thanks for the clarification. Right, the use case and implementation are a bit different. Seems like they could be combined, however.

Michael DeHaan

Apr 23, 2014, 9:04:46 AM
to ansible...@googlegroups.com
"so maybe a pull request against ec2_facts with the filters would get accepted. Long run it does seem like hosts and modules need to have some idea of state … "

Anything applying to more than one host definitely shouldn't be done by the facts module.




ghe...@gmail.com

Apr 25, 2014, 2:31:59 PM
to ansible...@googlegroups.com
So, I'm curious: for the case where you want to start "stopped" EC2 instances, what's the current recommended approach?

I've kind of ignored this task for now, managing it by hand (it's just our dev env, but it's still a couple of dozen instances at least). I'm almost about to pull Scott's branch in locally, since it looks so much better than manual management.

Scott Anderson

Apr 25, 2014, 2:59:23 PM
to ansible...@googlegroups.com

On Apr 25, 2014, at 2:31 PM, ghe...@gmail.com wrote:

> So, I'm curious, for the case where you want to start "stopped" EC2 instances, what's the current recommended approach?
>
> I've kind of ignored this task for now, managing that by hand (it's just our dev env, but it's still a couple of dozen instances at least). I'm almost about to pull Scott's branch in locally since it looks so much better than manual management.


Here’s an example in case you do use ec2_instance_facts. This example creates maintenance instances for updating AMIs.

Notes:
    * This is part of a set of scripts that will create an entire load balanced application environment (including DNS, VPC, centralized logging, and RDS) in a bare AWS account in about 20-30 minutes.
    * app_environment is dev, test, stage, or prod. The scripts will create the same setup in each environment with some differences such as RDS size, domain name, and so forth. 
    * I use a naming convention for AWS resources of ‘<product>-<environment>-<AWS type>-<purpose>’, e.g. foo-stage-ec2-logging or foo-prod-ami-web.

# The base image is created from a standard Ubuntu LTS instance. Then, packages common to all
# of the images (e.g. security, ansible, boto, etc.) are installed and configured.

# There’s a separate pull request (also rejected, hi Michael... ;-) for the ec2_ami_facts module.
- name: Obtain list of existing AMIs
  local_action:
    module: ec2_ami_facts
    description: "{{ ami_image_name }}"
    tags:
      environment: "{{ app_environment }}"
    region: "{{ vpc_region }}"
    aws_access_key: "{{ aws_access_key }}"
    aws_secret_key: "{{ aws_secret_key }}"
  register: ami_facts
  ignore_errors: yes

# If a version of the AMI exists, record this. Otherwise use the base Ubuntu image.
- set_fact:
    environment_base_image_id: "{{ ami_facts.images[0].id }}"
  when: ami_facts.images|count > 0
- set_fact:
    environment_base_image_id: "{{ ami_base_image_id }}"
  when: ami_facts.images|count == 0
    
# See if the maintenance image for this image type for this environment is running.    
- name: Obtain list of existing instances
  local_action:
    module: ec2_instance_facts
    name: "{{ ami_maint_instance_name }}"
    # Everything but terminated
    states:
      - pending
      - running
      - shutting-down
      - stopped
      - stopping
    tags:
      environment: "{{ app_environment }}"
    region: "{{ vpc_region }}"
    aws_access_key: "{{ aws_access_key }}"
    aws_secret_key: "{{ aws_secret_key }}"
  register: instance_facts
  ignore_errors: yes

- set_fact:
    environment_maint_instance: "{{ instance_facts.instances_by_name.get(ami_maint_instance_name) }}"
  when: instance_facts.instances|count > 0

# If there is no such instance, create one.
- name: Create an instance for managing the AMI creation
  local_action:
    module: ec2
    state: present
    image: "{{ environment_base_image_id }}"
    instance_type: t1.micro
    group: "{{ environment_public_ssh_security_group }}"
    instance_tags:
      Name: "{{ ami_maint_instance_name }}"
      environment: "{{ app_environment }}"
    key_name: "{{ environment_public_ssh_key_name }}"
    vpc_subnet_id: "{{ environment_vpc_public_subnet_az1_id }}"
    assign_public_ip: yes
    wait: yes
    wait_timeout: 600
    region: "{{ vpc_region }}"
    aws_access_key: "{{ aws_access_key }}"
    aws_secret_key: "{{ aws_secret_key }}"
  register: maint_instance
  when: environment_maint_instance is not defined

- set_fact:
    environment_maint_instance: "{{ maint_instance.instances[0] }}"
  when: maint_instance is defined and maint_instance.instances|count > 0

- name: Ensure instance is running
  local_action:
    module: ec2
    state: running
    instance_ids: "{{ environment_maint_instance.id }}"
    wait: yes
    wait_timeout: 600
    region: "{{ vpc_region }}"
    aws_access_key: "{{ aws_access_key }}"
    aws_secret_key: "{{ aws_secret_key }}"
  register: maint_instance
  when: environment_maint_instance is defined

# If we had to start the instance then the public IP will not have been defined when
# we gathered facts above, so get it again.
- name: Obtain public IP of newly running instance
  local_action:
    module: ec2_instance_facts
    name: "{{ ami_maint_instance_name }}"
    states:
      - running
    tags:
      environment: "{{ app_environment }}"
    region: "{{ vpc_region }}"
    aws_access_key: "{{ aws_access_key }}"
    aws_secret_key: "{{ aws_secret_key }}"
  register: instance_facts
  when: maint_instance|changed

- set_fact:
    environment_maint_instance: "{{ instance_facts.instances_by_name.get(ami_maint_instance_name) }}"
  when: maint_instance|changed

# Pass the collected facts on the new maintenance image host for configuration by role.
- name: Add new maintenance instance to host group
  local_action:
    module: add_host
    hostname: "{{ environment_maint_instance.public_ip }}"
    groupname: maint_instance
    app_environment: "{{ app_environment }}"
    # This passes the new/existing private key file to ansible for use in contacting the hosts. Better way to do this?
    ansible_ssh_private_key_file: "{{ environment_public_ssh_private_key_file }}"
    environment_maint_instance: "{{ environment_maint_instance }}"

- name: Wait for SSH on maintenance host
  local_action:
    module: wait_for
    host: "{{ environment_maint_instance.public_ip }}"
    port: 22
    # This is annoying as Hades. Sometimes the delay works, sometimes it's not enough.
    # The check fails if the port is open but the ssh daemon isn't yet ready to accept
    # actual traffic, right after the maintenance instance is started.
    #delay: 10
    timeout: 320
    state: started

# TODO fix the hardcoded user too
- name: Really wait for SSH on maintenance host
  local_action: command ssh -o StrictHostKeyChecking=no -i {{ environment_public_ssh_private_key_file }} ubuntu@{{ environment_maint_instance.public_ip }} echo Rhubarb
  register: result
  until: result.rc == 0
  retries: 20
  delay: 10

Regards,
-scott

James Carroll

Apr 25, 2014, 3:08:07 PM
to ansible...@googlegroups.com
I'm fairly new to Ansible.  How do I get your code into my Ansible install so I can use it?  I run from source.

Thanks!

James



Scott Anderson

Apr 25, 2014, 3:10:07 PM
to ansible...@googlegroups.com

On Apr 25, 2014, at 3:08 PM, James Carroll <james....@idmworks.com> wrote:

> I'm fairly new to Ansible. How do I get your code into my Ansible install so I can use it? I run from source.

I keep all of my new/modified modules in a library directory under where my playbooks are. Ansible will find the modules there and use them in preference to the ones in the Ansible install.

Regards,
-scott
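
[Editor's note: for reference, the layout Scott describes might look like this (file names here are illustrative); Ansible checks a ./library directory next to the playbook before its built-in module path:]

playbooks/
    site.yml
    library/
        ec2_instance_facts
        ec2_ami_facts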

Michael DeHaan

Apr 25, 2014, 4:57:54 PM
to ansible...@googlegroups.com
Using local ./library content is fine, but please don't run a fork with extra modules added if you are going to ask questions about them -- or at least identify that you are when you do.

It can make Q&A very confusing when people ask about things that aren't merged.


Gustavo Hexsel

Apr 25, 2014, 7:17:16 PM
to ansible...@googlegroups.com
Just as a side-note, I was able to get the wait_for module to work for SSH with a bit of fiddling (so you don't have to wait with two tasks):

- hosts: 127.0.0.1
  connection: local
  gather_facts: false
  vars_files:
    - env.yaml
  tasks:
    - name: Wait for SSH to come up after the reboot
      wait_for: host={{item}} port=22 delay=60 timeout=90 state=started
      with_items: groups.tag_env_{{pod}}_
      ignore_errors: yes
      register: result
      until: result.failed is not defined
      retries: 5


This seems to work for me all the time, but maybe I just got lucky. I create groups based on tags, e.g. "class_database", "class_monitoring", "env_qa1", which I register using add_host.
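
[Editor's note: a sketch of that registration step, assuming an earlier ec2 task registered its result as ec2_result; the variable and group names are illustrative:]

- name: Register new instances into tag-based groups
  local_action:
    module: add_host
    hostname: "{{ item.public_ip }}"
    groupname: "tag_env_{{ pod }}_"
  with_items: ec2_result.instances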




Michael DeHaan

Apr 26, 2014, 8:36:00 AM
to ansible...@googlegroups.com
I'm always a bit wary when so many keywords come together. It's usually a sign that something can be simplified and is not "Ansible-like" enough.

    - name: Wait for SSH to come up after the reboot
      wait_for: host={{item}} port=22 delay=60 timeout=90 state=started
      with_items: groups.tag_env_{{pod}}_
      ignore_errors: yes
      register: result
      until: result.failed is not defined
      retries: 5

Can likely be simplified to:

  - hosts: localhost
    tasks:
       - ec2: # provisioning step here with add_host...

  - hosts: tag_env_{{ pod }}_
    tasks:
      - name: Wait for SSH to come up after the reboot
        local_action: wait_for host={{ inventory_hostname }} port=22 delay=60 timeout=90

   
A few key concepts:

(A) Using the host loop is clearer than doing a "with_items" across the group

(B) You should only need to do one wait_for.  Consider increasing the timeout rather than looping over a retry

(C) You should not need to register the result of the retry since there is no loop

(D) You won't need to ignore errors because we're running wait_for off localhost, which we know we can connect to.





Gustavo Hexsel

Apr 26, 2014, 1:32:44 PM
to ansible...@googlegroups.com
Then I can consider this a bug report. Without retries, wait_for fails for every EC2 AMI I've tried (admittedly, they're all variations of CentOS).

Things I've seen:
- it reports the port open, then refuses to connect
- it times out even though I was able to log in manually prior to the timeout
- it fails with ssh errors while checking the port (this one is a bit rare)

This combination is less than ideal, but it seemed to work for all my cases. Also, a minor thing: you have an ec2 task and then you start using the groups.tag_xxx; is it implied you have an add_host there? Because my EC2 instances won't appear unless I add that.
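
[Editor's note: for the "port open but sshd not ready" race specifically, wait_for's search_regex parameter (added in Ansible 1.4) may help: instead of just accepting the TCP connection, it waits until the port returns data matching the pattern, i.e. the SSH banner. A sketch, assuming the AMI runs OpenSSH and reusing the group pattern from above:]

- name: Wait until sshd is actually answering, not just listening
  local_action: wait_for host={{ item }} port=22 search_regex=OpenSSH timeout=320
  with_items: groups.tag_env_{{ pod }}_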


Gustavo Hexsel

Apr 26, 2014, 1:33:44 PM
to ansible...@googlegroups.com
Nvm, saw the add_host in the comment.