modelling inventory variables


Serge van Ginderachter

Jan 21, 2014, 4:18:29 PM1/21/14
to ansible...@googlegroups.com
Hi list,


TL;DR: I'd like to know how people model their inventory data for a large set of hosts (500+ vm's) that are mostly given the same role, but with many varying application parameters, to the extent where a simple with_items list or even with_nested list doesn't suffice anymore.


I have been pondering this subject for some time, hesitant whether the way I started working with Ansible, and how it grew over time, is the best possible way. In particular, how to model the inventory variables, but obviously also how to implement and nest groups.

Rather than showing how I did it, let me explain some of the particulars of this environment, so I can ask the community "how would you do it?"

We're mostly a Java shop, and have a very standardized, if sometimes peculiar, setup:

* 75% of all hosts (vm's) are tomcat hosts (I'll focus on just those from here);
* every specific tomcat setup is deployed as two nodes (not a real cluster, but mostly stateless applications behind a loadbalancer);
* every cluster typically has 1 application (1 deployed war with 1 context path in tomcat speak, basically providing http://node/app ); 
* occasionally a node/cluster will have more than one such 'application' hosted. This can be on the same Tomcat instance (same tcp port 8080), but could also live on another port (which calls for a separate ip/port combination or pool on the load balancer)
* every application cluster is typically part of a larger application, which can span from one to several application clusters
* the big applications are part of a project, a project is part of an organisation
* every application has three instances in each environment: development, testing and production (clustered in the same way, everywhere)
* the loadbalancer typically performs one, but sometimes more, health checks per application (a basic GET, checking for a string in the response), and will automatically mark a node as down if that check fails
* some applications can communicate with other applications if need be, but only through the loadbalancer; this is also enforced by the network. So we need a configuration here that says 'node A may communicate with node B'; we do that on the load balancer, and every such set needs a separate LB config;
* every application is of course consumed in some way or another, and is defined on the load balancer (nodes and pools and virtual servers in F5 speak)

Yes, this means every tomcat application lives on, in total, 6 instances (2 cluster nodes x 3 environments), hence 6 virtual machines.

A basic inventory would hence show as:

all inventory
|_ organisation 1
   |_ project 1
      |_ application 1
         |_ dev
            |_ node 1
            |_ node 2
         |_ test
            |_ ..
         |_ prod
            |_ ..
      |_ application 2
            |_ ..
   |_ project 2
      |_ ..
|_ organisation 2
   |_ ..
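
In Ansible's INI inventory syntax with child groups, the top of that tree could be sketched like this (a minimal sketch; host and group names are illustrative):

[application1-dev]
node1
node2

[application1-test]
node3
node4

[application1-prod]
node5
node6

[application1:children]
application1-dev
application1-test
application1-prod

[project1:children]
application1

[organisation1:children]
project1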

Some other implemented groups are:

|_ development
   |_ organisation1-dev
      |_application1-dev 
|_ testing
|_ production

or

- tomcat
  |_ application1
  |_ application2
- <some_other_server_role_besides_tomcat>
  |_ application7
  |_ application9

Our environment has around 100 applications, hence 600 vm's at this moment, so keeping everything rigorously standard is very important.
Automating the load balancer from a per-application config has become a key issue. So when looking beyond the purely per-group and per-node inventory, on a node we get the following data, which is important to configure things on the load balancer:


* Within an application server:

node
|_ subapp1
   |_ healthcheck1
   |_ healthcheck2
|_ subapp2
   |_ ..


* We also need to define which application cluster may communicate with which other application cluster. Normally this is the same configuration for all environments, but on some occasions a node in environment X might need to communicate with a node in environment Y (e.g. a dev node that needs mail relayed, as we have just one smtp speaking node, a "prod" setup for all environments; these exceptions are rare, but I tend to think necessary exceptions should be automated as well.)

This cluster-to-cluster communication is actually something where I'm not sure about the best way to implement it in variables, as at this point it isn't just a host or group var any more, but data for multiple hosts (e.g. giving access from app A to app B requires network facts from both clusters).
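
One way I could imagine modelling that relation (purely a sketch, with invented variable names) is to hang it off the consuming cluster's group, and let the LB play resolve the target cluster's facts at run time through groups[] and hostvars[]:

# group_vars/application1-prod (hypothetical)
lb_allowed_destinations:
  - target_group: application2-prod    # the cluster we may talk to
    app:          web                  # its published application
    port:         8080

The load balancer play could then expand target_group via groups['application2-prod'] and pick each node's network facts out of hostvars, even across environments.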

Also, at this point, data gets nested very deeply, looping over separate applications with different paths, on different ports, with each instance having multiple healthchecks. Up to here I've managed it, but now combine this with the need of giving certain clusters access to one or more of those instances on one or more other clusters. Basically, I'm bumping against the limits of with_nested here.


So, given this, how would you design the inventory data, to implement all this? Am I overdoing it by wanting to put everything in a combined set of complex variables?


I look forward to different viewpoints :)


Thanks,



    Serge

Michael DeHaan

Jan 21, 2014, 4:30:53 PM1/21/14
to ansible...@googlegroups.com
"* 75% of all hosts (vm's) are tomcat hosts (I'll focus on just those from here);

ok

* every specific tomcat setup is deployed as two nodes (not a real cluster, but mostly stateless applications behind a loadbalancer);
* every cluster typically has 1 application (1 deployed war with 1 context path in tomcat speak, basically providing http://node/app ); 

this sounds somewhat like serving multiple customers or different variations on a project from a shared infrastructure?  i.e. AcmeCorp and BetaCorp?   This seems to imply groups here to me so far.

* occasionally a node/cluster will have more than one such 'application' hosted. This can be on the same Tomcat instance (same tcp port 8080), but could also live on another port (which calls for a separate ip/port combination or pool on the load balancer)

This seems to imply each node/cluster has a playbook that defines what groups get what roles.    If you want to generate those, that could be reasonable depending on use case.

* every application cluster is typically part of a larger application, which can span from one to several application clusters
* the big applications are part of a project, a project is part of an organisation

AWX is pretty useful for segregating things and permissions between organizations, if you're talking about access control.   Can be useful.  Just throwing that out there.

* every application has three instances in each environment: development, testing and production (clustered in the same way, everywhere)

This seems like you might want to maintain three separate inventories; that way "-i development" never risks managing production and there is no crossing of the streams (assuming people have seen Ghostbusters)

* the loadbalancer typically performs one, but sometimes more, health checks per application (a basic GET, checking for a string in the response), and will automatically mark a node as down if that check fails


* some applications can communicate with other applications if need be, but only through the loadbalancer; this is also enforced by the network. So we need a configuration here that says 'node A may communicate with node B'; we do that on the load balancer, and every such set needs a separate LB config;
* every application is of course consumed in some way or another, and is defined on the load balancer (nodes and pools and virtual servers in F5 speak)

Seems unrelated to the above bits (or at least not complicating it).

Summary of my suggestion:

* groups per "customer"
* separate inventories for QA/stage/prod
* define role to server mapping in playbooks, which you might generate if inventory is a source of such knowledge
* roles of course still written by hand




--



--
Michael DeHaan <mic...@ansibleworks.com>
CTO, AnsibleWorks, Inc.
http://www.ansibleworks.com/

Brian Coca

Jan 21, 2014, 4:44:29 PM1/21/14
to ansible...@googlegroups.com
I have a similar setup, just a bit smaller, but I've moved the looping and complexity into the configuration templates rather than the ansible tasks.

I don't know if this helps you much, but I found that a bit of complexity in the jinja templates goes a long way and executes much faster than putting it into tasks.
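
For example, something like this (sketched against a made-up publishedapps structure) renders a whole LB config in one template task, instead of many looped tasks:

{# one pool stanza per app, one monitor line per healthcheck #}
{% for app in publishedapps %}
pool {{ app.name }} port {{ app.port }}
{% for mon in app.monitors %}
    monitor {{ mon.name }} GET {{ mon.get_path }}
{% endfor %}
{% endfor %}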


--
Brian Coca
Stultorum infinitus est numerus
0110000101110010011001010110111000100111011101000010000001111001011011110111010100100000011100110110110101100001011100100111010000100001
Pedo mellon a minno

C. S.

Jan 22, 2014, 3:11:10 AM1/22/14
to ansible...@googlegroups.com
We’re somewhat similar to you in size and complexity… we don’t have org and proj layers though, we’re flatter with app-type-env, mostly… 
 
My 2c…

- Use Ansible roles (of course)
- Use the group_vars directory for vars, as opposed to passing the vars into the role directly; much easier to manage and track changes to envs. (also easy to parse for generating docs of what connects to what)
- Databases, loadbals, firewalls get their own groups too, just like your app servers.
- Deploying a new app means you need to link everything together by editing the correct group_vars files for the database, loadbal, app and firewall. Then run the playbooks in the right order. (Obviously there’s room for automation here)
- Little known feature: -i <directory> will cause ansible to use all the files and scripts in the dir for the inventory (very useful! see the sketch below)
- Lists of associative arrays in group_vars files are quite nice for managing accounts, ACLs and other things you need to keep on adding to. 
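
To illustrate the -i <directory> point above, such a directory could look like this (file names invented):

inventory/
|_ hosts         # static INI file
|_ ec2.py        # executable script, used as a dynamic inventory source
|_ more_hosts    # another static file; all sources get merged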

HTH

Serge van Ginderachter

Jan 22, 2014, 7:06:39 AM1/22/14
to ansible...@googlegroups.com

* occasionally a node/cluster will have more than one such 'application' hosted. This can be on the same Tomcat instance (same tcp port 8080), but could also live on another port (which calls for a separate ip/port combination or pool on the load balancer)

This seems to imply each node/cluster has a playbook that defines what groups get what roles.    If you want to generate those, that could be reasonable depending on use case.

In this case, there is just one playbook with one set of roles, to deploy tomcat and all.
The variations per application happen in inventory/group variables.
 
* every application has three instances in each environment: development, testing and production (clustered in the same way, everywhere)

This seems like you might want to maintain three separate inventories; that way "-i development" never risks managing production and there is no crossing of the streams (assuming people have seen Ghostbusters)

  ( /me puts on his proton pack )

Well, to defeat the marshmallow man, you need to cross them.

Avoiding running something on dev instead of production means you have to remember to target the right inventory; here I have to remember to run with the right --limit. Same issue, just a different cli option.
Also, in some cases I need to run things on hosts in different environments, so a total separation is not possible.

* the loadbalancer typically performs one, but sometimes more, health checks per application (a basic GET, checking for a string in the response), and will automatically mark a node as down if that check fails
* some applications can communicate with other applications if need be, but only through the loadbalancer; this is also enforced by the network. So we need a configuration here that says 'node A may communicate with node B'; we do that on the load balancer, and every such set needs a separate LB config;

* every application is of course consumed in some way or another, and is defined on the load balancer (nodes and pools and virtual servers in F5 speak)

Seems unrelated to the above bits (or at least not complicating it).

Well, actually, here is where it gets more complicated, and where I struggle the most. The above was just to give a clear idea of the environment. 

Putting the config for this loadbalancer in the inventory, and choosing a variable model to use with the tasks/modules I have, evolves into something too deeply nested.

So far I have this model, and I am able to configure things up to pools and monitors:

default_publishedapps:
- name:         "web"                                                           
  type:        "{{ default_apptype }}"                                         
  port:         8080                                                           
  lbport:       "{{ default_lbport }}"
  monitortype:  "{{ default_monitortype }}"                                    
  quorum:       0                                                              
  monitors:                                                                     
  - name:      "{{ default_monitor_appname }}"
    type:       http                                                            
    get_path:  "{{ default_get_path }}"                                         
    protocol:  "{{ default_protocol }}"                                        
    get_extra: "{{ default_get_extra }}"                                       
    receive:   "{{ default_receive_string }}"                                  
    monitorname: web
#- name: tcp                                                                    
#  type: tcp                                                                    
#  port: 1234                                                                   
#  lbport: 601234                                                               
#  monitortype: "{{ default_monitortype }}"                                     
#  quorum: 0                                                                    
#  monitors:                                                                    
#  - name:         "tcp"                                                        
#    type:         tcp_half_open                                                
#    send:         ""                                                           
#    receive:      ""    

This works with the subelements plugin.
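
For instance, iterating over every app/monitor pair goes like this (debug just shows the pairing here; the real task would be one of the BIG-IP modules):

- debug: msg="pool {{ item.0.name }}:{{ item.0.port }} monitor {{ item.1.name }}"
  with_subelements:
    - default_publishedapps
    - monitors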

At this point, I now need a way to say 

" App X needs to be defined in a loadbalancer virtual proxy, and be accessible to node Z "

And then I need to define these proxies, and for this I need to loop through 

- settings from the former list of applications on a host
- settings from the latter list of which applications to define and make available to which other hosts
- network settings from those other hosts
- and all this *could* cross environments.
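
In template form, a sketch of that combination could look like this (allowed_consumers is an invented variable naming which cluster groups may reach which published app; ansible_default_ipv4 is a gathered fact):

{% for acl in allowed_consumers %}
{% for host in groups[acl.group] %}
allow {{ hostvars[host]['ansible_default_ipv4']['address'] }} -> {{ acl.app }}:{{ acl.port }}
{% endfor %}
{% endfor %}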

I didn't implement this yet (still needs work on the virtual proxy module), but this is where I'm hesitant about how to move forward.
I feel I'd need some extras in ansible to get to this in a clean way, possibly:
- nesting lookup plugins
- a way to create new lists by doing things like:
  - set_fact: ....
    with_items: .....
  where the registered result is a list of all iterations.
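
The loop-plus-register behaviour that exists today comes close to that last idea: registering over with_items collects one entry per iteration under .results, which can then be reshaped (a sketch):

- command: echo {{ item }}
  with_items:
    - app1
    - app2
  register: loopout

# loopout.results now holds one result dict per iteration
- debug: var=loopout.results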

But I might be missing other solutions.


Summary of my suggestion:

* groups per "customer"
* separate inventories for QA/stage/prod
* define role to server mapping in playbooks, which you might generate if inventory is a source of such knowledge
* roles of course still written by hand


I think I may say this is the basic usage for most things ansible, and I'm well aware of these practices already :)


Thanks!


Serge

Serge van Ginderachter

Jan 22, 2014, 7:08:43 AM1/22/14
to ansible...@googlegroups.com

On 21 January 2014 22:44, Brian Coca <bria...@gmail.com> wrote:
I have a similar setup, just a bit smaller, but I've taken the looping and complexity into the configuration templates vs the ansible tasks. 

I don't know if this helps you much, but I found that a bit of complexity in the jinja templates goes a long way and executes much faster than putting it into tasks.

Yes! I agree 100%. As templates can contain quite some logic, one can leave a big part of the complexity to them.

Alas, not everything can be configured with a text file. Here I work with a proprietary load balancer with an API, and specific modules (the BIG-IP stuff).


Thanks,

Serge

Serge van Ginderachter

Jan 22, 2014, 7:16:58 AM1/22/14
to ansible...@googlegroups.com
On 22 January 2014 09:11, C. S. <cov...@yahoo.com> wrote:
- Use Ansible roles (of course)

Obviously :) But Ansible play syntax related things are not really an issue here (except perhaps how far I can iterate through things)

- Use the group_vars directory for vars, as opposed to passing the vars into the role directly, much easier to mange and track changes to envs. (also easy to parse for generating docs of what connects to what)

As our environment is mostly one application type, everything must be parametrized in inventory; I can't afford to hardcode things in playbooks here. So, yes.
- Databases, loadbals, firewalls get their own groups too, just like your app servers.
- Deploying a new app means you need to link everything together by editing the correct group_vars files for the database, loadbal, app and firewall. Then run the playbooks in the right order. (Obviously there’s room for automation here)

As of now, they are just delegated hosts, not really part of the inventory. As I see it, the config of the loadbalancer depends on data from the nodes, data that should be part of those nodes.
I don't really like the idea of having certain data about certain applications, which is part of a node, linked directly to a separate host.
But maybe that's part of the reason I complicate things? Not sure.
- Little known feature -i <directory> will cause ansible to use all the files and scripts in the dir for the inventory (very useful!)

I already split things up heavily into different subdirectories :) Which has drawbacks, however, but that's another story.
- Lists of associative arrays in group_vars files are quite nice for managing accounts, ACLs and other things you need to keep on adding to.

Can you elaborate on what exactly you mean by this? By associative arrays?


Thanks,


Serge


C. S.

Jan 24, 2014, 3:59:26 AM1/24/14
to ansible...@googlegroups.com
On Jan 22, 2014, at 04:16 , Serge van Ginderachter <se...@vanginderachter.be> wrote:


On 22 January 2014 09:11, C. S. <cov...@yahoo.com> wrote:
- Use Ansible roles (of course)

Obviously :) But Ansible play syntax related things are not really an issue here (except perhaps how far I can iterate through things)

- Use the group_vars directory for vars, as opposed to passing the vars into the role directly; much easier to manage and track changes to envs. (also easy to parse for generating docs of what connects to what)

As our environment is mostly one application type, everything must be parametrized in inventory; I can't afford to hardcode things in playbooks here. So, yes.

- Databases, loadbals, firewalls get their own groups too, just like your app servers.
- Deploying a new app means you need to link everything together by editing the correct group_vars files for the database, loadbal, app and firewall. Then run the playbooks in the right order. (Obviously there’s room for automation here)

As of now, they are just delegated hosts, not really part of the inventory. As I see it, the config of the loadbalancer depends on data from the nodes, data that should be part of those nodes.
I don't really like the idea of having certain data about certain applications, which is part of a node, linked directly to a separate host.
But maybe that's part of the reason I complicate things? Not sure.

I would think so; the data is still logically part of your node, even if it's split up between files so that it's located where it's being used.

- Little known feature -i <directory> will cause ansible to use all the files and scripts in the dir for the inventory (very useful!)

I already split things up heavily into different subdirectories :) Which has drawbacks, however, but that's another story.

We don’t actually split up our inventories, we just use one, and then always use --limit to control which hosts it gets applied to. Other than some base os type playbooks, we have no use case where we’d run all our playbooks over all hosts, we only do very specific playbook runs.

- Lists of associative arrays in group_vars files are quite nice for managing accounts, ACLs and other things you need to keep on adding to.

Can you elaborate on what exactly you mean by this? By associative arrays?


e.g. 
inventory/group_vars/tag_Role_my_db_cluster_01:
my_db_users:
   - db: database1
     login: app1
     pass: secret
     perms: rw
     ….

   - db: database2
     login: app2
     pass: secret
     perms: ro
     ...

role/dbcluster/tasks/main.yml:
  - dbmodule: database={{item.db}} name={{item.login}} password={{item.pass}} perms={{item.perms}} …
    with_items: my_db_users

Also, the above syntax for my_db_users scales nicely if you have long values and a lot of them per entry. 
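
If the cluster were MySQL, for instance, that dbmodule placeholder could be filled in with the stock mysql_user module (a sketch; mapping the rw/ro perms onto MySQL privileges is left out here):

- mysql_user: name={{item.login}} password={{item.pass}} priv={{item.db}}.*:ALL state=present
  with_items: my_db_users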

Serge van Ginderachter

Feb 10, 2014, 2:50:43 PM2/10/14
to ansible...@googlegroups.com
Hi,


For those further interested in this discussion, allow me to link to my presentation at http://cfgmgmtcamp.eu/ on this topic:


Without the talk itself, this presentation is not fully informative, but I'm happy to discuss it further, or to receive private mail on it, if you feel the need.


   Serge

Michael DeHaan

Feb 10, 2014, 3:21:02 PM2/10/14
to ansible...@googlegroups.com
Yeah, maybe start a new thread and fill in between the lines?

My understanding from private conversations was you had some possible ideas for upgrades.

I also think you are probably going to be interested in writing your own inventory classifier, because you have some cross-cutting and modelling concerns that might not be generic. You also had some ideas around small tweaks that could be made to the INI parser,
and some efficiency thoughts around vars_plugins (which are internals and not really intended to be user-serviceable), but I'm open to seeing if we can make that better -- (benchmarks might help?)

I was actually talking to a customer recently who had a similar set of concerns, but ultimately I think this gets into site-specifics very very quickly, as they basically had a 5-dimensional problem going on and might end up breaking out Neo4j :)



Michael DeHaan

Feb 10, 2014, 6:57:57 PM2/10/14
to ansible...@googlegroups.com

We don’t actually split up our inventories, we just use one, and then always use --limit to control which hosts it gets applied to. Other than some base os type playbooks, we have no use case where we’d run all our playbooks over all hosts, we only do very specific playbook runs.



I'm generally (theoretically) cautious about what happens if you leave off --limit in that case, if the playbook regularly targets everything.   Maybe make a wrapper script?

Maybe it's not been a thing.


 

Serge van Ginderachter

Feb 11, 2014, 2:54:25 AM2/11/14
to ansible...@googlegroups.com

On 11 February 2014 00:57, Michael DeHaan <mic...@ansible.com> wrote:
I'm generally (theoretically) cautious about what happens if you leave off --limit in that case, if the playbook regularly targets everything.   Maybe make a wrapper script?

Yes, I have a wrapper script that queries the inventory and presents a subset to be used as parameters to said script; it is then used by developers, who can run certain things (tags) on certain groups (development etc.)
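
A minimal sketch of such a wrapper (all names invented):

#!/bin/sh
# let a developer pick a group and a set of tags, then run the site play limited to those
GROUP="$1"
TAGS="$2"
exec ansible-playbook -i inventory/ site.yml --limit "$GROUP" --tags "$TAGS"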

