Hiera vs LDAP

Trevor Vaughan

Jul 18, 2012, 5:09:42 PM
to puppet...@googlegroups.com
So, I was following the thread "how to conditionally add users to a
virtualized group?" and had a bit of a realization that I'm not quite
sure why Hiera is a better backend than LDAP.

Hiera:

- Stores hierarchical data locally on your system
- Uses YAML
- Integrates with puppet

LDAP

- Stores hierarchical data across potentially multiple systems (think
puppet master scaling and data sync)
- Uses LDIFs
- Needs some glue code written

However, both are hierarchical databases based on 'read often/write
rarely' principles.

Besides the glue code to make LDAP do Hiera-like things, what are the
issues? It also seems that using a well known and supported system,
such as LDAP, would foster greater enterprise support (except in those
places where you have to spawn your own due to insane directory
admins).

And, yes, I know that a hiera back-end could be written to support
LDAP but that would just be an unnecessary data transference if I'm
reading it right.

If you wanted local "fast" copies of the data on all of your puppet
masters (and you do) then a simple LDAP slave would be spawned on each
master.

Thanks,

Trevor

--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvau...@onyxpoint.com

-- This account not approved for unencrypted proprietary information --

Christopher Wood

Jul 18, 2012, 5:39:36 PM
to puppet...@googlegroups.com
(inline, verbosely rhubarbing for the audience not the poster)

On Wed, Jul 18, 2012 at 05:09:42PM -0400, Trevor Vaughan wrote:
> So, I was following the thread "how to conditionally add users to a
> virtualized group?" and had a bit of a realization that I'm not quite
> sure why Hiera is a better backend than LDAP.
>
> Hiera:
>
> - Stores hierarchical data locally on your system
> - Uses YAML

YAML is... something that nearly everything can read/write. People who don't often write YAML (like me) write our YAML in our favourite scripting languages and then export via one of the many export modules. YAML is a simple text file, or four.

> - Integrates with puppet
>
> LDAP
>
> - Stores hierarchical data across potentially multiple systems (think
> puppet master scaling and data sync)
> - Uses LDIFs
> - Needs some glue code written

s/some/much/

The skills to deal with LDAP are much less commonplace than the skills to deal with YAML. YAML is much more standardized than all the different LDAP implementations out there (I have used several).

> However, both are hierarchical databases based on 'read often/write
> rarely' principles.
>
> Besides the glue code to make LDAP do Hiera-like things, what are the
> issues? It also seems that using a well known and supported system,
> such as LDAP, would foster greater enterprise support (except in those
> places where you have to spawn your own due to insane directory
> admins).

LDAP is quite verbose. Compare:

operations:
  bind: ssh HOST "hostname;ps -ef | grep named"
  date: ssh HOST "echo \`hostname\` \`date\`"
  df: ssh HOST "df -h -l"

Versus pseudo-ldif:

dn: dc=operations,dc=c,dc=scripts,dc=mycompany,dc=com
dc: operations
objectclass: yamlthing
item:: YmluZDogc3NoIEhPU1QgImhvc3RuYW1lO3BzIC1lZiB8IGdyZXAgbmFtZWQi
item:: ZGF0ZTogc3NoIEhPU1QgImVjaG8gXGBob3N0bmFtZVxgIFxgZGF0ZVxgIg==
item:: PW4gZGY6IHNzaCBIT1NUICJkZiAtaCAtbCIK
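
For readers unfamiliar with the `item::` lines above: in LDIF, a double colon marks a base64-encoded value. The sketch below shows how one line of the YAML example could be turned into such an attribute. The helper name `ldif_base64_attr` is made up for illustration, and the attribute name `item` simply follows the pseudo-LDIF above.

```python
import base64


def ldif_base64_attr(name: str, value: str) -> str:
    """Render one attribute in LDIF's base64 form: `name:: <base64 of value>`."""
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{name}:: {encoded}"


# One entry from the YAML example, stored as a single opaque string.
yaml_line = 'bind: ssh HOST "hostname;ps -ef | grep named"'
ldif_line = ldif_base64_attr("item", yaml_line)
print(ldif_line)

# Reading it back means splitting off the attribute name and decoding again.
decoded = base64.b64decode(ldif_line.split(":: ", 1)[1]).decode("utf-8")
print(decoded)
```

The point of the comparison stands either way: the YAML form is readable as-is, while the directory form has to be decoded before a human can check it.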

One big difference between LDAP and YAML: You can have ordered attributes (lists) in YAML. There is no guaranteed attribute ordering in LDAP; attributes may be returned in any order, on an implementation-specific basis. (To the best of my knowledge.)

Depending on your access method (and likely implementation details), you may have to implement client-side logic to get your hierarchical values working just right. For instance, which order will the following entries be returned in?

dc=three
dc=two,dc=three
dc=two,dc=two,dc=three
dc=one,dc=two,dc=three

It might be the above order, as I've seen from OpenLDAP. Then the entries' attributes will be returned in any order. Happy parsing.
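
A rough sketch of the kind of client-side logic meant here, assuming the client wants entries ordered from least to most specific regardless of what the server returns. It counts DN components naively by splitting on commas, so it assumes no escaped commas inside values; the function names are hypothetical.

```python
from typing import List


def dn_depth(dn: str) -> int:
    """Number of components in a DN (naive: assumes no escaped commas)."""
    return len(dn.split(","))


def order_general_to_specific(dns: List[str]) -> List[str]:
    """Sort DNs so that shallower (more general) entries come first.

    sorted() is stable, so entries at the same depth keep their input order.
    """
    return sorted(dns, key=dn_depth)


# The same four entries, deliberately shuffled as a server might return them.
entries = [
    "dc=one,dc=two,dc=three",
    "dc=three",
    "dc=two,dc=two,dc=three",
    "dc=two,dc=three",
]
ordered = order_general_to_specific(entries)
for dn in ordered:
    print(dn)
```

Entries at the same depth still have no defined relative order from the server, so a real client would need a second rule for breaking ties between siblings.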

LDAP also requires another client/server infrastructure, more daemons to patch and maintain, et cetera.

> And, yes, I know that a hiera back-end could be written to support
> LDAP but that would just be an unnecessary data transference if I'm
> reading it right.
>
> If you wanted local "fast" copies of the data on all of your puppet
> masters (and you do) then a simple LDAP slave would be spawned on each
> master.

Replicating between ldap instances isn't simple to set up. (Are you using single-master, hub-spoke, multimaster, etc.?) Some light reading:

http://www.openldap.org/doc/admin24/replication.html

http://directory.fedoraproject.org/wiki/Howto:MultiMasterReplication

Of course, I don't know the point at which the ldap advantages outweigh the setup costs, or how these compare to YAML.


Aaron Grewell

Jul 18, 2012, 5:42:40 PM
to puppet...@googlegroups.com
On Wed, Jul 18, 2012 at 2:09 PM, Trevor Vaughan <tvau...@onyxpoint.com> wrote:
> So, I was following the thread "how to conditionally add users to a
> virtualized group?" and had a bit of a realization that I'm not quite
> sure why Hiera is a better backend than LDAP.
>

In our environment at least, messing around with the LDAP schema is a
non-starter. I can change my Hiera setup any time. That alone makes
it better for me.

Trevor Vaughan

Jul 18, 2012, 10:00:12 PM
to puppet...@googlegroups.com
Thanks for the post, it was quite interesting.

For the attributes, I've seen some reasonable text about storing
serialized JSON in LDAP attributes which should work quite well for
preserving ordering, etc...
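
A minimal sketch of the serialized-JSON idea, assuming the whole list is stored as one string in a single attribute value. No directory is involved here; the host names are placeholders and the variable `attribute_value` just stands for whatever would be written to and read from LDAP.

```python
import json

# An ordered list that must come back in the same order.
ordered_servers = ["ntp1.example.com", "ntp2.example.com", "ntp3.example.com"]

# Stored as ONE attribute value, so the server has nothing to reorder.
attribute_value = json.dumps(ordered_servers)
print(attribute_value)

# The client restores the original structure, order included.
restored = json.loads(attribute_value)
print(restored)
```

The trade-off is that the directory can no longer search or index the individual items, since it only sees an opaque string.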

In terms of infrastructure, if you've got any significant number of
users, you're probably already using LDAP, whether or not it's bound
with Kerberos so this shouldn't be something that is rocket surgery
for an enterprise scenario.

Yes, I *completely* agree that for smaller infrastructures, Hiera
makes a lot of sense but I'm starting to feel people pushing at the
edges in such a way that we're going to start reinventing the wheel in
terms of needing multi-master replication, synchronization, localized
offline caching, etc....

LDAP is certainly verbose, but there are enough libraries to be able
to wrap it in pretty much any data structure that you like so that
shouldn't be too bad overall.

Interesting conversation.

Thanks,

Trevor

Jason Edgecombe

Jul 18, 2012, 8:03:44 PM
to puppet...@googlegroups.com
Along that line, it would be more flexible to write an LDAP backend for
hiera. That way, you could use company-wide attributes and overwrite
them or augment them in hiera, as needed.

Jason

jcbollinger

Jul 19, 2012, 9:06:43 AM
to puppet...@googlegroups.com


On Wednesday, July 18, 2012 4:09:42 PM UTC-5, Trevor Vaughan wrote:
> So, I was following the thread "how to conditionally add users to a
> virtualized group?" and had a bit of a realization that I'm not quite
> sure why Hiera is a better backend than LDAP.
>
> Hiera:
>
> - Stores hierarchical data locally on your system
> - Uses YAML
> - Integrates with puppet

That's not actually a very good description of Hiera.  It describes the combination of the built-in data backend shipped with hiera, whose use is optional, with the Puppet integration layer, which is packaged separately.  None of it describes the hiera core, which I would describe as a framework for selecting and combining data from various sources based on a user-defined priority hierarchy.
 

> LDAP
>
> - Stores hierarchical data across potentially multiple systems (think
> puppet master scaling and data sync)
> - Uses LDIFs
> - Needs some glue code written
>
> However, both are hierarchical databases based on 'read often/write
> rarely' principles.

No, I don't think they're actually very comparable.  The Hiera core is an hierarchically-oriented data access framework, compatible (in principle) with a wide variety of databases.  The two are not on the same level.  Indeed, as you acknowledged, it would be feasible to write a hiera back-end that accessed data from an LDAP directory.  That would be comparable to hiera's YAML back-end, but not to the hiera core.
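
To make the "data access framework" idea concrete, here is a toy sketch of a priority-ordered lookup. It is not Hiera's actual code or API; the level names, keys, and the `lookup` function are invented for illustration. The first level in the hierarchy that defines a key wins.

```python
from typing import Any, Dict, List, Optional

# Data per hierarchy level. In a real setup each level could live in a
# YAML file, a directory, a database, and so on.
sources: Dict[str, Dict[str, Any]] = {
    "node/web01": {"ntp_server": "10.0.0.5"},
    "common": {"ntp_server": "pool.ntp.org", "dns_domain": "example.com"},
}

# Most specific level first; "datacenter/east" has no data at all.
hierarchy: List[str] = ["node/web01", "datacenter/east", "common"]


def lookup(key: str, default: Optional[Any] = None) -> Any:
    """Return the value from the first hierarchy level that defines `key`."""
    for level in hierarchy:
        data = sources.get(level, {})
        if key in data:
            return data[key]
    return default


print(lookup("ntp_server"))   # defined at node level, which overrides common
print(lookup("dns_domain"))   # only defined in common
print(lookup("missing", "fallback"))
```

Nothing in `lookup` cares how a level's data was loaded, which is why the framework and the storage behind it are not on the same level.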
 

> Besides the glue code to make LDAP do Hiera-like things, what are the
> issues? It also seems that using a well known and supported system,
> such as LDAP, would foster greater enterprise support (except in those
> places where you have to spawn your own due to insane directory
> admins).

I'm not saying that LDAP wouldn't be a good data repository for Puppet.  You're quite right that it is well understood and widely deployed.  I am certain that there are LDAP backends out there for extlookup(), and perhaps also for Hiera.  There are probably also custom data access functions for LDAP in some shops.  I don't think there are any fundamental issues blocking its use with Puppet.

On the other hand, LDAP is far too heavyweight to be viable as the singular primary data source for Puppet, and Puppet needs such a thing to make an ecosystem of modules viable. 
 

> And, yes, I know that a hiera back-end could be written to support
> LDAP but that would just be an unnecessary data transference if I'm
> reading it right.

I'm not sure what you mean by that, but the advantages of accessing LDAP via hiera include:
  1. Abstraction.  Module authors don't need to care where data comes from.
  2. Pluggability.  The LDAP back-end could be swapped out for a different one. That's useful for testing, or for non-LDAP shops to use the same modules as LDAP shops.
  3. Combining and overriding data.  Data packaged with modules or added / overridden by admins don't need to be recorded in LDAP.
Depending on how you do your accounting, there may not even be any more layers in an hiera-based stack.  Any way around, getting the data into puppet means going through a Puppet custom function that wraps an LDAP client that gets data from your directory.
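
The three points above can be sketched in a few lines. This is an illustration only, not Hiera's real backend interface: the class names are invented, and `FakeDirectoryBackend` is a stand-in that holds data in memory where a real backend would query an LDAP server.

```python
from typing import Any, Dict, Optional, Sequence


class DictBackend:
    """Local data, e.g. values shipped with a module or set by an admin."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = dict(data)

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)


class FakeDirectoryBackend:
    """Stand-in for an LDAP-backed source (assumption: no real directory)."""

    def __init__(self, entries: Dict[str, Any]) -> None:
        self._entries = dict(entries)

    def get(self, key: str) -> Optional[Any]:
        return self._entries.get(key)


def lookup(key: str, backends: Sequence[Any], default: Any = None) -> Any:
    """Ask each backend in priority order; the first non-None answer wins."""
    for backend in backends:
        value = backend.get(key)
        if value is not None:
            return value
    return default


overrides = DictBackend({"smtp_relay": "relay.test.example.com"})
directory = FakeDirectoryBackend(
    {"smtp_relay": "relay.corp.example.com", "ldap_base": "dc=example,dc=com"}
)

# Callers only see lookup(); they do not know which backend answered.
print(lookup("smtp_relay", [overrides, directory]))  # admin override wins
print(lookup("ldap_base", [overrides, directory]))   # falls through to directory
print(lookup("smtp_relay", [directory]))             # backend list swapped out
```

Swapping the backend list changes where the data comes from without touching any caller, which is the abstraction and pluggability argument in miniature.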


John

R.I.Pienaar

Jul 19, 2012, 9:12:17 AM
to puppet...@googlegroups.com


----- Original Message -----
> From: "jcbollinger" <John.Bo...@stJude.org>
> To: puppet...@googlegroups.com
> Sent: Thursday, July 19, 2012 2:06:43 PM
> Subject: [Puppet Users] Re: Hiera vs LDAP
>
>
>> And, yes, I know that a hiera back-end could be written to support
>> LDAP but that would just be an unnecessary data transference if I'm
>> reading it right.
>
>
> I'm not sure what you mean by that, but the advantages of accessing
> LDAP via hiera include:
>
> 1. Abstraction. Module authors don't need to care where data
> comes from.
> 2. Pluggability. The LDAP back-end could be swapped out for a
> different one. That's useful for testing, or for non-LDAP shops
> to use the same modules as LDAP shops.
> 3. Combining and overriding data. Data packaged with modules or
> added / overridden by admins don't need to be recorded in LDAP.
>
> Depending on how you do your accounting, there may not even be any
> more layers in an hiera-based stack. Any way around, getting the
> data into puppet means going through a Puppet custom function that
> wraps an LDAP client that gets data from your directory.

And as of puppet 3 you will get built in integration of hiera into
parameterized classes - without the need for calling any custom
functions or assigning local variables. This means all parameterized
classes written to date will automatically become hiera enabled.

And by extension become LDAP enabled if you just wrote an LDAP backend
for hiera. The cost of an additional abstraction really just doesn't
feature when considering this and John's excellent points.
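
A rough model of what that integration amounts to, written as a sketch rather than Puppet's implementation. It assumes the lookup key is `<class name>::<parameter name>` and that an explicitly passed value beats the data lookup, which in turn beats the class default; neither detail is spelled out in this thread. The function name and sample data are hypothetical.

```python
from typing import Any, Dict


def resolve_class_params(
    class_name: str,
    defaults: Dict[str, Any],
    explicit: Dict[str, Any],
    data: Dict[str, Any],
) -> Dict[str, Any]:
    """Resolve each parameter: explicit value, then data lookup, then default."""
    resolved: Dict[str, Any] = {}
    for param, default in defaults.items():
        if param in explicit:
            resolved[param] = explicit[param]
        else:
            # Assumed key convention: "<class>::<parameter>".
            resolved[param] = data.get(f"{class_name}::{param}", default)
    return resolved


# `data` stands for whatever the configured backends return (YAML, LDAP, ...).
data = {"ntp::servers": ["10.0.0.5"]}
params = resolve_class_params(
    "ntp",
    defaults={"servers": ["pool.ntp.org"], "enable": True, "package": "ntp"},
    explicit={"enable": False},
    data=data,
)
print(params)
```

The class itself never names a backend, so pointing `data` at an LDAP-backed source would change the resolved values without any change to the class.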

Trevor Vaughan

Jul 19, 2012, 12:30:54 PM
to puppet...@googlegroups.com
All,

Thanks for the responses, I'm certainly seeing the benefits of Hiera
over a direct LDAP interface but I do think that an LDAP interface and
schema should be on the roadmap.

Also, you might want to summarize some of this and add it to the Hiera Wiki.

Thanks!

Trevor