Hi Jon
This is a classic many-to-many problem (many servers, many roles). It can be represented via:
* Role "R1" runs on these servers: s1,s2,s3
* Server "S1" belongs to these roles: r1,r2,r3
* Intermediate "glue" entity: Glue G1 ties s1,r1, Glue G2 ties s1,r3, etc.
Each approach has pros and cons, mainly along these two dimensions:
* Clarity when writing the definition
* How easily the definition can be interrogated: "Which servers are in role X? What roles does server Y serve?"
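To make the three representations concrete, here is a minimal Python sketch (not Salt code). The sample data and the helper names `to_glue` and `invert` are made up for illustration. It shows that any one representation can be derived from another, so the real choice is which one you want to write by hand and which ones you want to query.

```python
from collections import defaultdict

# Hypothetical sample data: role -> servers (first representation).
role_to_servers = {
    "r1": ["s1", "s2", "s3"],
    "r2": ["s1"],
    "r3": ["s1", "s4"],
}


def to_glue(role_to_servers):
    """Flatten into sorted (server, role) pairs (the "glue" representation)."""
    return sorted(
        (server, role)
        for role, servers in role_to_servers.items()
        for server in servers
    )


def invert(role_to_servers):
    """Derive server -> roles (second representation) from role -> servers."""
    server_to_roles = defaultdict(list)
    for server, role in to_glue(role_to_servers):
        server_to_roles[server].append(role)
    return dict(server_to_roles)


server_to_roles = invert(role_to_servers)

# "Which servers are in role r3?"
print(role_to_servers["r3"])
# "What roles does server s1 serve?"
print(server_to_roles["s1"])
```

Keeping only one of these as the hand-written source and deriving the rest is what avoids duplication.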
In some cases, as you mentioned, the mapping may be implicitly available via consistent naming of your nodes.
In Salt, between environments, grains (automatic and custom), pillars, nodegroups, and Jinja templating with variables and looping, there are many ways to define the mapping explicitly. The "best" way depends on your requirements.
I have a similar use case in my installation, an ad serving infrastructure. There are many servers and many roles (delivery, assets, bidders, apps, dbs, etc.) that need to be provisioned asymmetrically across the servers, as well as a few "super-roles" that are composed of other roles (think role "delivery-all" composed of roles "delivery-customer1" and "delivery-customer2").
When choosing my solution, my criteria were:
* Having a single authoritative source for the definitions
* No duplication, within salt or outside it
* Being able to target all servers in role R:
    * On the command line for one-off runs
    * In the states for provisioning
    * In the pillars for secret visibility
* Given server S, provision onto it all assigned roles
* Provision configuration "about the role" to external systems (app config, data ETL, monitoring), including the list of servers in the role
* Said configuration must include the list of servers in the role regardless of their current reachability
Based on the above, I decided to stick to 1 environment (base) and to avoid custom grains (their values are not deterministically available while a server is unreachable).
My current solution for a medium-sized test cluster, as it stands now after a few iterations, is as follows:
* Define role attributes as pillars:
    * pillar["roles"]["foo"]["confkey1"] = "confval1"
* Define servers serving a given role as a nodegroup:
    * nodegroup "servers_role_foo": L@server1,server2,server3,server4,server5
* Define super-roles as compositions of other nodegroups:
    * nodegroup "servers_role_bar": N@servers_role_bar1 or N@servers_role_bar2
* States topfile targets a role nodegroup and points it to a dedicated state for the role:
      'servers_role_foo':
        - match: nodegroup
        - role_foo
* The role_foo state simply invokes the steps needed to set up that role on the server running it
* A state that provisions info *about* a role, for example to nagios monitoring, does this:
    * {% set servers = pillar["master"]["nodegroups"]["servers_role_foo"].split("@")[1].split(",") %}
    * This relies on the master config being exposed in pillar (pillar_opts: True on the master)
    * It's neither pretty nor robust: it doesn't check for L@, doesn't handle spaces around the commas, etc. It's a working proof-of-concept at the moment. Ideally I'd use a salt method to unpack the nodegroup definition
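For what a less fragile unpacking could look like, here is a rough standalone Python sketch. It is not a Salt API: `expand_nodegroup` is a hypothetical helper of my own, and it assumes nodegroup definitions are plain strings using only the two forms shown above (an `L@` list, or `N@` references joined by `or`). It checks the prefix, tolerates spaces around commas, and refuses anything it does not understand instead of guessing.

```python
import re


def expand_nodegroup(name, nodegroups, _seen=frozenset()):
    """Return the sorted minion IDs in nodegroup `name`.

    Only two definition forms are supported: a list matcher
    ("L@a,b, c") and "or"-joined nodegroup references ("N@x or N@y").
    Any other matcher raises ValueError.
    """
    if name in _seen:
        raise ValueError(f"circular nodegroup reference: {name}")
    seen = _seen | {name}

    servers = set()
    for term in re.split(r"\s+or\s+", nodegroups[name].strip()):
        if term.startswith("L@"):
            # Explicit list: split on commas and drop surrounding spaces.
            servers.update(s.strip() for s in term[2:].split(",") if s.strip())
        elif term.startswith("N@"):
            # Reference to another nodegroup (a super-role component).
            servers.update(expand_nodegroup(term[2:].strip(), nodegroups, seen))
        else:
            raise ValueError(f"unsupported matcher in {name!r}: {term!r}")
    return sorted(servers)


# Hypothetical nodegroup definitions mirroring the layout described above.
nodegroups = {
    "servers_role_bar1": "L@server1,server2",
    "servers_role_bar2": "L@server2, server3",
    "servers_role_bar": "N@servers_role_bar1 or N@servers_role_bar2",
}

print(expand_nodegroup("servers_role_bar", nodegroups))
```

Because the list comes from the definitions rather than from live minions, it stays the same whether or not the servers are currently reachable, which is the property I need for the monitoring config.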
The above (medium complexity, in my opinion) satisfies all my requirements. I've actually simplified it slightly here, as I have one more layer of indirection (many "roles" are duplicates save for their config keys, so that's handled with some looping), but the basics stay the same.
I hope it helps you evaluate your solution, which may end up being much simpler than the above if you have fewer requirements.
I'd love to hear from others who have tackled similar architectures in salt.