How to read a file into an array of hashes to build a start script template

James Perry

Jan 13, 2017, 4:15:05 PM
to Puppet Users
After spending most of the day digging around and researching, I find that Puppet's immutable variables are keeping me from properly handling what I'm trying to do, so I want to see if anyone else has suggestions on how to handle what I need to accomplish.

Goal: Ingest a CSV file provided by a user and generate a start / stop script, dynamically, for every server in scope, based on CSV file. 

CSV Format: 
SERVER,start command 

Example. 
SERVERA, /usr/local/bin/prog start databasea
SERVERA, /usr/local/bin/prog start databaseb
SERVER1, /usr/local/bin/prog start database123


The basic design I had in mind for the manifest is to: 
1. Read in the file as provided,
2. Convert <A>,<B> to downcase(A) => B
3. if $hostname == A 
       $my_server_script_lines = $my_hash[A][B]
       file { 
         ... 
         content => template("basic_start_script"),
         }

4. Create a template that runs through $my_server_script_lines and puts each start line under start) and, after replacing "start" with "stop" in B, each resulting stop line under stop).
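
To make the intended output of step 4 concrete, here is a minimal sketch written in Python rather than ERB. It is only an illustration: the function name render_init_script and the exact layout of the case statement are assumptions, not part of the real template.

```python
def render_init_script(start_lines):
    """Build a start/stop shell script from a list of start commands.

    The function name and script layout are hypothetical, for illustration only.
    """
    # Derive each stop command by replacing the first " start " with " stop ".
    # Matching the surrounding spaces leaves database names untouched.
    stop_lines = [line.replace(" start ", " stop ", 1) for line in start_lines]

    parts = ["#!/bin/sh", 'case "$1" in', "  start)"]
    parts += ["    " + line for line in start_lines]
    parts += ["    ;;", "  stop)"]
    parts += ["    " + line for line in stop_lines]
    parts += ["    ;;", "esac"]
    return "\n".join(parts) + "\n"


script = render_init_script([
    "/usr/local/bin/prog start databasea",
    "/usr/local/bin/prog start databaseb",
])
print(script)
```

An ERB template would follow the same shape: one loop over the start lines, and a second loop over the same lines with the substitution applied.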

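Code so far:
include stdlib
# Read the whole CSV file into one string.
$my_data = file("/home/me/database.csv")
# Split on commas and newlines, then lowercase every element.
$my_subst = downcase(split($my_data, '[,\n]'))
# Pair up consecutive elements as key => value; a repeated key keeps only its last value.
$my_hash = hash($my_subst)

notice($my_hash['servera'])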

$ puppet apply --verbose test.pp
Info: Loading facts
Notice: Scope(Class[main]): '/usr/local/bin/prog start databaseb'
Notice: Compiled catalog for myhost.net in environment production in 0.16 seconds
Info: Applying configuration version '1484340247'
Notice: Applied catalog in 0.03 seconds

Here are the values of the variables as it processes through

$my_data = "SERVERA,/usr/local/bin/prog start databasea
SERVERA,/usr/local/bin/prog start databaseb
SERVERB,/usr/local/bin/prog start database123"

$my_subst = [servera, '/usr/local/bin/prog start databasea' , servera, '/usr/local/bin/prog start databaseb' , serverb, '/usr/local/bin/prog start database123' ]
 
$my_hash = {servera => '/usr/local/bin/prog start databaseb' , serverb => '/usr/local/bin/prog start database123' }

I already know why the hash conversion dropped "start databasea" for the servera key: a repeated key keeps only its last value. What I can't figure out is how to have it convert the data into an array of values for each key, like this:

   { servera => ['/usr/local/bin/prog start databasea', '/usr/local/bin/prog start databaseb'], serverb => ['/usr/local/bin/prog start database123'] }
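
The structure above is a "group values by key" result. As a reference point only, here is a minimal Python sketch (not Puppet code; the function name group_commands is hypothetical) that builds it from the same raw data by appending to a list per key instead of overwriting:

```python
from collections import defaultdict


def group_commands(csv_text):
    """Group start commands by lowercased server name, keeping every command."""
    grouped = defaultdict(list)
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first comma only, so a command may itself contain commas.
        server, command = line.split(",", 1)
        grouped[server.strip().lower()].append(command.strip())
    return dict(grouped)


data = """SERVERA,/usr/local/bin/prog start databasea
SERVERA,/usr/local/bin/prog start databaseb
SERVERB,/usr/local/bin/prog start database123"""

print(group_commands(data))
```

Within Puppet itself, the iteration functions of the Puppet 4 language (reduce, for example) may allow the same accumulation without reassigning a variable, but that is not tested here.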

I tried various iterations of .each to try to create and fill the array pointed to by the hash, but Puppet doesn't permit that as it would be changing an already assigned variable / hash. 

I was able to use the $my_subst variable in an ERB template to create the start/stop lines. It worked OK for the 3-line example above, but once I had dozens of servers / start lines being applied to hundreds of servers on each check-in, it soon killed the CPU on my master server as it looped through checking whether $hostname == servername.

Is it possible to have Puppet parse the data in $my_subst, or even the raw file data, to do the following?
   1. Run through incoming data to fill start command array.   ['/usr/local/bin/prog start databasea', '/usr/local/bin/prog start databaseb']
   2. Assign that to the array of key-pairs.  { servera => ['/usr/local/bin/prog start databasea', '/usr/local/bin/prog start databaseb'], serverb => ['/usr/local/bin/prog start database123'] }

Thanks! 

Garrett Honeycutt

Jan 13, 2017, 4:26:50 PM
to puppet...@googlegroups.com
> *Notice: Scope(Class[main]): '/usr/local/bin/prog start databaseb'*
Hi James,

One approach would be to not do it within a puppet manifest and instead
transform that data with a language you are familiar with and have it
write to its own file in Hiera as YAML or JSON. Once the data structure
is there, you can use the create_resources() function to create the
resources from the data in Hiera.
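
A rough sketch of that transformation step, assuming Python and a JSON data file for Hiera. The function name csv_to_hiera_json and the "startup::" key prefix are assumptions made for this example, not anything Hiera requires:

```python
import csv
import io
import json


def csv_to_hiera_json(csv_text, prefix="startup::"):
    """Convert 'SERVER,command' rows into JSON text for a Hiera data file.

    The 'startup::' key prefix is an assumption made for this sketch.
    """
    data = {}
    # skipinitialspace drops the blank that follows each comma in the CSV.
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        server = row[0].strip().lower()
        # Rejoin extra fields in case the command itself contained commas
        # (spacing after those commas is not preserved).
        command = ",".join(row[1:]).strip()
        data.setdefault(prefix + server, []).append(command)
    return json.dumps(data, indent=2, sort_keys=True)


sample = (
    "SERVERA, /usr/local/bin/prog start databasea\n"
    "SERVERA, /usr/local/bin/prog start databaseb\n"
    "SERVER1, /usr/local/bin/prog start database123\n"
)
print(csv_to_hiera_json(sample))
```

Running this whenever the users deliver a new CSV would keep the parsing cost out of catalog compilation entirely.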

Another approach would be to write a custom function or ENC that uses
your CSV as the data store and for a given server respond with the start
command. If you are not familiar with ruby, the custom ENC would be
easier, since it can be in any language.
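
For the ENC route, a minimal sketch in Python might look like the following. An ENC is run with the node name as its first argument and prints a YAML document with classes and/or parameters keys. Everything else here is an assumption for illustration: the COMMANDS table stands in for data loaded from the CSV, and start_commands is a made-up parameter name.

```python
import json
import sys

# Hypothetical lookup table; in practice this would be built from the CSV.
COMMANDS = {
    "servera": [
        "/usr/local/bin/prog start databasea",
        "/usr/local/bin/prog start databaseb",
    ],
    "serverb": ["/usr/local/bin/prog start database123"],
}


def classify(node_name, commands=COMMANDS):
    """Return ENC-style node data for one node as a JSON string."""
    # Use the lowercased short host name as the lookup key.
    short_name = node_name.split(".")[0].lower()
    node = {
        "classes": {},
        # 'start_commands' is a made-up parameter name for this sketch.
        "parameters": {"start_commands": commands.get(short_name, [])},
    }
    # Plain JSON of this shape also parses as YAML, so no YAML library is needed.
    return json.dumps(node, indent=2)


if __name__ == "__main__":
    # An ENC is invoked with the node name as its first argument.
    print(classify(sys.argv[1] if len(sys.argv) > 1 else "servera"))
```

Each node only ever receives its own list, so nothing has to loop over every server during compilation.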

Instead of a CSV, you might want to look at Consul which can host
key/value pairs for you. You can then query it to see which databases
are associated with a given server.

Best regards,
-g

--
Garrett Honeycutt
@learnpuppet
Puppet Training with LearnPuppet.com
Mobile: +1.206.414.8658

John Gelnaw

Jan 13, 2017, 8:03:29 PM
to Puppet Users
Set up hiera correctly, add a yaml file to your hierarchy, and translate the CSV file to YAML.

I'm a perl geek, so:
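#!/bin/perl

my %hash;
while (<>) {
  chomp;
  # Split on the first comma only and lowercase just the server name.
  my @a = split(/,\s*/, $_, 2);
  push(@{$hash{lc($a[0])}}, $a[1]);
}
# Emit one YAML key per server, followed by its list of commands.
for my $srv (sort(keys(%hash))) {
  print "startup::${srv}:\n";
  for my $cmd (@{$hash{$srv}}) {
    print "  - $cmd\n";
  }
}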


... yes, that array syntax in the hash is hideous.  ;)

Also, I know I should be using CSV and YAML modules, but the example was simple enough.

That should produce something like:

startup::servera:
  - /usr/local/bin/prog start databasea
  - /usr/local/bin/prog start databaseb 
startup::serverb:
  - /usr/local/bin/prog start database123

Although I'd probably drop the "/usr/local/bin/prog start", since it seems to be common to all.

Then a class:

class startup {
  # Look up this host's command list; fall back to "none" if there is no entry.
  $array = hiera("startup::${hostname}", "none")

  if ($array != "none") {
    # < do stuff >
  }
}



I'm assuming serverb doesn't need to know servera's business (loading the entire thing on every server seems wasteful to me), but if it does, change the yaml to:

startup:
  servera:
    - command 1
    - command 2
  serverb:
    - command 1

And then just load the entire hash:

$hash = hiera("startup")

James Perry

Jan 20, 2017, 12:23:52 PM
to Puppet Users
Thanks for the code. 

What I am trying to find is the correct way to use what Puppet has already defined in the code base to handle processing everything into a hash of key/value pairs inside of the class, if possible. 

It seems that it should be able to do it; I am probably just looking at it wrong.

James Perry

Jan 20, 2017, 12:28:02 PM
to Puppet Users
Thanks.

The reason I have a CSV is that it is what the users provide from their own private database, where they keep this data. I have to take the detail as it is given. I could manually process the data into the form I think I want, but I'm trying to keep this as simple as possible for the other team members (KISS principle).

For the custom ENC, the new environment is Foreman on top of Puppet. Can I use a Puppet ENC when Foreman is set up to do that itself?

With respect to a custom function, would there be a performance impact from having Ruby run that block?

John Gelnaw

Jan 22, 2017, 10:56:46 PM
to Puppet Users
On Friday, January 20, 2017 at 12:28:02 PM UTC-5, James Perry wrote:
> Thanks.
>
> The reason I have a CSV is that it is what the users provide from their own private database, where they keep this data. I have to take the detail as it is given. I could manually process the data into the form I think I want, but I'm trying to keep this as simple as possible for the other team members (KISS principle).
>
> For the custom ENC, the new environment is Foreman on top of Puppet. Can I use a Puppet ENC when Foreman is set up to do that itself?

I have a very complex ENC myself, so the idea of merging the Foreman ENC with my own ENC appeals to me. Ultimately, they're both just spitting out YAML.

My current line of attack is to have my ENC (already configured within Puppet) call the Foreman node.rb script, merge the two data structures, and output the resulting YAML, but the migration to Puppet 4.x has priority at the moment.
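
The merge step could be sketched roughly as below, in Python. This assumes both ENC outputs have already been parsed into dictionaries (YAML parsing is left out, since it needs a third-party library), and the sample data and the merge_enc name are invented for the example:

```python
def merge_enc(base, extra):
    """Recursively merge two ENC-style dicts; values from 'extra' win on conflict."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides hold a mapping: merge them key by key.
            merged[key] = merge_enc(merged[key], value)
        else:
            # Otherwise the value from 'extra' replaces the one from 'base'.
            merged[key] = value
    return merged


# Invented sample data standing in for the two parsed ENC outputs.
foreman_out = {"classes": {"ntp": {}}, "parameters": {"owner": "ops"}}
custom_out = {
    "classes": {"startup": {}},
    "parameters": {"owner": "dba", "tier": "db"},
}

print(merge_enc(foreman_out, custom_out))
```

Which side should win on a conflicting key is a design decision; here the second argument takes precedence.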

James Perry

Jan 27, 2017, 9:58:12 AM
to Puppet Users
I am looking to see if I can make this work with a define or have to resort to an each loop.  Still hacking away to see what I can find. With the each I can still loop through to try to get to the goal of having a key/value pair to pass on to a template only if the current client matches one of the host names in scope. 

Worst case, I will just go ahead and split the variables up per client and hard-code them in the module / params file.

My goal was to have it so we could just use the CSV file we were given to dynamically build the data. But it may be less costly (from a CPU cycle level) to just go back to the good old days :) 