Managing environments


Jon Tegner

Apr 29, 2013, 3:26:02 AM
to salt-...@googlegroups.com
Hi, I've been looking into Salt for a few months now (actually, I stumbled over it just after taking a course in Puppet, and almost immediately decided to use Salt instead). I'm using it in an HPC environment where we have a few well-defined types of machines: a large bunch of compute nodes, a few storage nodes, and a few pre- and post-processing machines. All of the machines are named in a consistent manner (e.g., compute nodes are named r001, r002, etc.), so it is easy to define the roles, and when I run highstate all systems are taken care of.

Question: since it is working so well in this "HPC context", I'm thinking of extending it to also manage the workstations in our group. Here, however, there is no naming convention, and furthermore the machines may have different requirements: one machine needs the packages and configuration required for "fluidDynamics" and "solidMechanics", while another requires "fluidDynamics" and "combustion".

What would be the best way of doing this? Should I define the relevant lists of machines under environments like "fluidDynamics", "solidMechanics", etc. in top.sls, or is there a more elegant way of doing it (using pillar, maybe)?

Thanks!

/jon

Steve Butterworth

Apr 29, 2013, 1:00:50 PM
to salt-...@googlegroups.com
Hi Jon,

Short Version

Identify systems by assigning roles using pillar, and key configuration off of those roles.


Long Version

I won't claim that my way is optimal, but here is a fairly concrete example from a very heterogeneous department-level academic support environment. (You may find one server doing five things here, but you won't often find ten servers doing one thing, except for storage/file servers.)

 (1) Match on pillar values (I have chosen a key named 'roles' holding a list of role-name strings to select on).
 (2) Create a per-host pillar data file that supplements the global and group-oriented pillar data.
 (3) My package/service groupings reside in a subdirectory called bundles.
 (4) My role-based collections of packages/services reference bundles from a subdirectory called roles.

Every host specification consists of a set of:
 (1) pillar data
 (2) roles which imply bundles of packages and services
 (3) explicit extra bundles of packages and services

My (lightly edited) salt directory structure looks like:

.
|-- _grains
|-- _modules
|-- _runners
|-- _states
|-- bundles
|-- pkgrepos
|-- roles
`-- top.sls



So, my top.sls looks like:

base:
  '*':
    - bundles.core
    - bundles.localenv

  'os_family:RedHat':
    - match: grain
    - pkgrepos.redhat

  'os_family:Debian':
    - match: grain
    - pkgrepos.debian

  'roles:basic-server':
    - match: pillar
    - roles.basic-server

  'roles:web-server':
    - match: pillar
    - roles.web-server

  'roles:python-science':
    - match: pillar
    - bundles.python-science

  'roles:user-server':
    - match: pillar
    - bundles.latex-doc-env
    - bundles.user-server

  # bunch of other roles elided

My basic server roles file, roles/basic-server.sls, referenced in the top file as roles.basic-server, looks like:

include:
  - bundles.ssh.server
  - bundles.snmp
  - bundles.salt.minion
  # stuff elided

Other roles files include lower-level roles files, plus bundles specific to the role. For my very basic web server, I have:

include:
  - roles.basic-server
  - bundles.apache
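
A bundle itself is nothing special, just a state file grouping related packages and services. As a rough sketch (the package and service names are made up here, and a real bundle would branch on os_family), bundles/apache.sls might contain:

apache:
  pkg.installed:
    - name: apache2       # 'httpd' on RedHat-family systems
  service.running:
    - name: apache2
    - enable: True
    - require:
      - pkg: apache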

At this point, I am still missing the Pillar data that ties everything together. My pillar directory looks like:

.
|-- global
|-- hostinfo
|-- packages
|-- private
`-- top.sls

My pillar top.sls looks like:

base:
  '*':
    - global
    - global.defaults

  # messy stuff elided

  # per-host configuration data
  {{grains['id']}}:
    - hostinfo.{{grains['id']}}  # Real implementation not this simple due to dots in names
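
One way around the dots, as a sketch: name the per-host pillar files with underscores instead, and translate in the top file, since a dotted minion id such as ws1.example.edu (a made-up example) would otherwise be read as a nested directory path by the SLS dot notation:

  {{grains['id']}}:
    - hostinfo.{{grains['id']|replace('.', '_')}}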

Finally, the host-specific pillar file looks like:

# passwords and the like that we don't want in version control
include:
  - private.{{grains.id}}
  - private.other
# what does this system do?
roles:
  - gitlab-server

# other host-specific info
owner:
  - name-of-owner
location:
  building: Foobar MacDuff Labs
  room: 197
  rack: 4
  position: 21
ssh:
  port: {{pillar['ssh']['port']['internal-only']}}
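
The bundles then read this pillar data back when rendering configuration. As an illustration (the file paths and template name are made up), the bundles.ssh.server bundle referenced earlier could manage sshd_config from a jinja template:

sshd-config:
  file.managed:
    - name: /etc/ssh/sshd_config
    - source: salt://bundles/ssh/files/sshd_config.jinja
    - template: jinja

with the template picking up the host's pillar value:

Port {{ pillar['ssh']['port'] }}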

I haven't reached the stage of broad deployment, and I know that there are still inconsistencies in how functions are divided, but I hope this provides some useful ideas for structuring configuration to handle unique systems (common config + local config). Roles and bundles have an intrinsically non-unique relationship in this architecture -- you could build everything from bundles alone -- but roles serve both an organizational and a declarative purpose. My view of configuration management, and of how to organize configuration hierarchies, is deeply influenced by work I did with Bcfg2 before deciding to move to Salt.

If we are fortunate, someone will disagree strongly enough to put forward an elegant counter-proposal.

Mina Naguib

Apr 29, 2013, 1:21:11 PM
to salt-...@googlegroups.com

Hi Jon

This is a classical many(servers)-to-many(roles) problem.  It can be represented via:
* Role "r1" runs on these servers: s1, s2, s3
* Server "s1" belongs to these roles: r1, r2, r3
* An intermediate "glue" entity: glue g1 ties s1 to r1, glue g2 ties s1 to r3, etc.

Each approach has its pros and cons in terms of:
* Clarity in writing the definitions
* How amenable the definitions are to interrogation: "Which servers are in role X? What roles does server Y serve?"

In some cases, like you've mentioned, the mapping may be implicitly available via consistent naming of your nodes.

In salt, between environments, grains (automatic and custom), pillars, nodegroups, and jinja templating+variables+looping, there are many ways to explicitly define mappings.  The "best" way depends on your requirements.
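
For instance, once roles live in pillar they can be targeted ad hoc from the command line (the role and module names here are just examples); -I matches on pillar data, -G on grains:

salt -I 'roles:web-server' test.ping
salt -G 'os_family:Debian' pkg.list_upgrades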

I have a similar use case in my installation - an ad-serving infrastructure - with many servers and many roles (delivery, assets, bidders, apps, dbs, etc.) that must be provisioned asymmetrically over the servers, as well as a few "super-roles" composed of other roles (think a role "delivery-all" composed of roles "delivery-customer1" and "delivery-customer2").

When choosing my solution, my criteria were:
* Having a single authoritative source for the definitions
* No duplication, within salt or outside it
* Being able to target all servers in role R:
  * on the command line for one-off runs
  * in the states for provisioning
  * in the pillars for secret visibility
* Given server S, provisioning onto it all assigned roles
* Provisioning configuration "about the role" (app config, data ETL, monitoring) to external systems, including the list of servers in the role:
  * that list must stay correct independent of the servers' current reachability

Based on the above, I decided to stick to a single environment (base) and no custom grains (grains are not deterministic while a server is unreachable).

My current solution for a medium-sized test cluster, as it stands now after a few iterations, is as follows:

* Define role attributes as pillars:
    pillar["roles"]["foo"]["confkey1"] = "confval1"

* Define the servers serving a given role as a nodegroup:
    nodegroup "servers_role_foo": L@server1,server2,server3,server4,server5

* Define super-roles as compositions of other nodegroups:
    nodegroup "servers_role_bar": N@servers_role_bar1 or N@servers_role_bar2

* The states topfile targets a role nodegroup and points it to a dedicated state for the role:

    'servers_role_foo':
      - match: nodegroup
      - role_foo

* The role_foo state simply invokes the steps needed to set up that role on the server currently running it

* A state that provisions info *about* a role, e.g. to nagios monitoring, does this:

    {% set servers = pillar["master"]["nodegroups"]["servers_role_foo"].split("@")[1].split(",") %}

  It's not pretty, and it's fragile (it doesn't check for the L@ prefix, spaces after commas, etc. -- it's a working proof of concept at the moment; ideally I'd use a salt method to unpack the nodegroup definition).
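
To make that last step concrete, a sketch of the nagios piece (the paths, template name, and host-object layout are all made up for illustration):

nagios-role-foo-hosts:
  file.managed:
    - name: /etc/nagios/conf.d/role_foo_hosts.cfg
    - source: salt://monitoring/files/role_foo_hosts.cfg.jinja
    - template: jinja

with the template iterating over the unpacked nodegroup:

{% set servers = pillar["master"]["nodegroups"]["servers_role_foo"].split("@")[1].split(",") %}
{% for server in servers %}
define host {
    host_name  {{ server.strip() }}
    use        generic-host
}
{% endfor %}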

… and the above (medium complexity, in my opinion) satisfies all my requirements.  I've actually simplified things slightly: there is one more layer of indirection (many "roles" are duplicates save for their config keys, so that's handled with some looping), but the basics above stay the same.

I hope it helps you evaluate your solution, which may end up being much simpler than the above if you have fewer requirements.

I'd love to hear from others who have tackled similar architectures in salt.

Jon Tegner

May 3, 2013, 2:11:08 AM
to salt-...@googlegroups.com
Thanks a lot!

The information is very much appreciated!

Regards,
/jon