Hiera should have a save API

Kelsey Hightower

May 1, 2012, 12:31:28 PM
to puppe...@googlegroups.com
I'm thinking of adding a new save API to Hiera. The idea is that Hiera should provide an interface for saving data, which should make it easy for front-end tools to interact with backends that support saving data.

An example of how this might work is shown in the commit message in the following pull request:

I have a ticket here:

Do people think this is a good idea? I see this as a foundational bit for building UIs on top of Hiera.

Daniel Pittman

May 1, 2012, 1:17:53 PM
to puppe...@googlegroups.com
On Tue, May 1, 2012 at 09:31, Kelsey Hightower <kel...@puppetlabs.com> wrote:

> I'm thinking of adding a new save API to Hiera. The idea is that Hiera
> should provide an iterface for saving data, which should make it easy for
> front-end tools to interact with backends that support saving data.

Why does it make it any easier than having tools which already know
about their back-end semantics directly managing that data? It has
substantial limitations (e.g. no user concept, no credentials, and no
way to determine the appropriate set based on the back-end).

It doesn't document anything of substance about the API: will save be
fast or slow? Can save deadlock? How does it differentiate different
operations on the same key? How do I determine the hierarchy - or do
I need to implicitly know that to use this?

I have no idea how it works across machines. Can I use this from the
dashboard when that is installed on a different machine to the master?
How do changes propagate after `save` is called when I have multiple
masters?

It also makes it impossible to use this in any meaningful UI: there is
absolutely no mechanism to determine what the failure was. Did we
fail because we got the hierarchy wrong, or the backend wrong, or
something else failed? Should I just retry, or give up?
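
To make the failure-reporting concern concrete, here is a minimal Ruby sketch of errors a save call could raise so that a UI can tell a wrong hierarchy level from a temporary backend problem. Every name in it (`SaveError`, `UnknownSourceError`, `ReadOnlyBackendError`, `TransientBackendError`, `describe_failure`) is hypothetical; nothing in the proposal or in Hiera defines them.

```ruby
# Hypothetical error hierarchy for a save call; none of these names exist in Hiera.
class SaveError < StandardError
  # Whether a caller could sensibly retry the same request.
  def retryable?
    false
  end
end

# The caller named a hierarchy level (source) the backend does not know.
class UnknownSourceError < SaveError; end

# The backend exists but does not accept writes.
class ReadOnlyBackendError < SaveError; end

# Something temporary went wrong, for example a lost connection.
class TransientBackendError < SaveError
  def retryable?
    true
  end
end

# Turn a failure into the decision a UI has to make: retry or give up.
def describe_failure(error)
  action = error.retryable? ? "retry" : "give up"
  "#{error.class.name}: #{error.message} (#{action})"
end

begin
  raise UnknownSourceError, "no source named 'datacenter'"
rescue SaveError => e
  puts describe_failure(e)
end
```

A front end would rescue `SaveError` once and branch on `retryable?`, instead of guessing from a bare true/false return value.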

> Do people think this is a good idea? I see this as a foundational bit for
> building UI's on top of Hiera.

The principle is reasonable, but this isn't even close to a proposal
for a save API that works in the real world.

--
Daniel Pittman
⎋ Puppet Labs Developer – http://puppetlabs.com
♲ Made with 100 percent post-consumer electrons

R.I.Pienaar

May 1, 2012, 1:32:05 PM
to puppe...@googlegroups.com


----- Original Message -----
> From: "Daniel Pittman" <dan...@puppetlabs.com>
> To: puppe...@googlegroups.com
> Sent: Tuesday, May 1, 2012 6:17:53 PM
> Subject: Re: [Puppet-dev] Hiera should have an save API
>
> On Tue, May 1, 2012 at 09:31, Kelsey Hightower
> <kel...@puppetlabs.com> wrote:
>
> > I'm thinking of adding a new save API to Hiera. The idea is that
> > Hiera
> > should provide an iterface for saving data, which should make it
> > easy for
> > front-end tools to interact with backends that support saving data.
>
> Why does it make it any easier than having tools with already know
> about their back-end semantics directly managing that data? It has
> substantial limitations (eg: no user concept, no credentials, and no
> way to determine the appropriate set based on the back-end.)

This has been my main concern too and why I never implemented anything like this
in the first place - I think the data being queried is best modelled elsewhere.
The data is best created at the time when you classify a node in that same UI -
hiera should query that data but not know too much about the visual aspects of
it.

This would be usable for small installs who just use the json/yaml backends and
have no node classification system (other than maybe hand editing these files
and using hiera_include or something). People who are already happy to just
hand hack JSON/YAML anyway.

Having to know the hierarchy on the CLI isn't that great an experience and neither
is typing complex data like hashes, arrays and such.

In mcollective I can type complex data on the CLI because the DDL describes the
data - I know when you typed "1" that it should be a number or a boolean and I
convert that for you. Hiera has no data description, it's free-form, so even with
a face or whatever it would just be of limited use - soon you'll be editing JSON
or YAML again to represent arrays of hashes, and that's wrong.
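
As a rough illustration of what a data description buys you on the command line, here is a small Ruby sketch that converts typed strings using a declared type per key. The `SCHEMA` table and the `coerce` helper are hypothetical; this is neither the mcollective DDL format nor anything Hiera provides.

```ruby
# Hypothetical type declarations for a few keys; Hiera itself has nothing like this.
SCHEMA = {
  "ntp_enabled" => :boolean,
  "max_clients" => :integer,
  "ntp_servers" => :array
}.freeze

# Convert a string typed on the command line using the declared type.
# Keys without a declaration stay plain strings.
def coerce(key, raw)
  case SCHEMA[key]
  when :boolean
    %w[1 true yes].include?(raw.downcase)
  when :integer
    Integer(raw)
  when :array
    raw.split(",").map(&:strip)
  else
    raw
  end
end

p coerce("ntp_enabled", "1")
p coerce("max_clients", "1")
p coerce("ntp_servers", "a.example.com, b.example.com")
p coerce("motd", "1")
```

The same input `"1"` becomes a boolean, an integer, or stays a string depending on the declaration. Without one there is no way to choose, and nested data such as arrays of hashes still has no comfortable one-line form.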

>
> It doesn't document anything of substance about the API: will save be
> fast or slow? Can save deadlock? How does it differentiate
> different operations on the same key? How do I determine the hierarchy - or do
> I need to implicitly know that to use this?

This is impossible to answer - the save API has no idea about the backends.

We *could* in theory extend backends to provide all these answers through some
kind of flag about the backend but I do not think we should.

Backends are easy to write and understand and so people do actually write them
vs some other plugins we might have like providers or types. It's a pretty thin
line. It's a case of could but IMO should not.

>
> I have no idea how it works across machines. Can I use this from the
> dashboard when that is installed on a different machine to the
> master?
> How do changes propagate after `save` is called when I have multiple
> masters?
>
> It also makes it impossible to use this in any meaningful UI: there
> is
> absolutely no mechanism to determine what the failure was. Did we
> fail because we got the hierarchy wrong, or the backend wrong, or
> something else failed? Should I just retry, or give up?
>
> > Do people think this is a good idea? I see this as a foundational
> > bit for
> > building UI's on top of Hiera.
>
> The principal is reasonable, but this isn't even close to a proposal
> for a save API that works in the real world.

I would love to see a solution for this but it's deceptively hard to do
and I think ultimately better solved by exposing a REST API into your
dashboard/foreman/etc where you have RBAC and the other points you
raised.

Daniel Pittman

May 1, 2012, 4:56:49 PM
to puppe...@googlegroups.com
On Tue, May 1, 2012 at 10:32, R.I.Pienaar <r...@devco.net> wrote:
> ----- Original Message -----
>> From: "Daniel Pittman" <dan...@puppetlabs.com>
>> To: puppe...@googlegroups.com
>> Sent: Tuesday, May 1, 2012 6:17:53 PM
>> Subject: Re: [Puppet-dev] Hiera should have an save API
>>
>> On Tue, May 1, 2012 at 09:31, Kelsey Hightower
>> <kel...@puppetlabs.com> wrote:
>>
>> > I'm thinking of adding a new save API to Hiera. The idea is that
>> > Hiera
>> > should provide an iterface for saving data, which should make it
>> > easy for
>> > front-end tools to interact with backends that support saving data.
>>
>> Why does it make it any easier than having tools with already know
>> about their back-end semantics directly managing that data?  It has
>> substantial limitations (eg: no user concept, no credentials, and no
>> way to determine the appropriate set based on the back-end.)
>
> This has been my main concern too and why I never implemented anything like this
> in the first place - I think the data being queried is best modelled elsewhere.
> The data is best created at the time when you classify a node in that same UI -
> hiera should query that data but not know too much about the visual aspects of
> it.

I am not certain they need to be tied together in time like that, but
I agree that Hiera does better as middleware for query between some
rich backend that owns "how to update me", and the consumers of the
data.

> This would be usable for small installs who just use the json/yaml backends and
> have no node classification system (other than maybe hand editing these files
> and using hiera_include or something).  People who are already happy to just
> hand hack JSON/YAML anyway.

...even if the "how to update me" is to use a text editor.

>> > Do people think this is a good idea? I see this as a foundational
>> > bit for building UI's on top of Hiera.
>>
>> The principal is reasonable, but this isn't even close to a proposal
>> for a save API that works in the real world.
>
> I would love to see a solution for this but its deceptively hard to do
> and I think ultimately better solved by exposing a REST API into your
> dasbhoard/foreman/etc where you have RBAC and the other points you
> raised

That is where I would generally lean - I think of the Puppet / Hiera /
"modify data for Hiera" thing as a cycle - three parts, talking to
each other, and each separate. Having a smarter back-end like
Dashboard or Foreman feels like a better fit, overall, than trying to
put abstraction in place over save.

Kelsey Hightower

May 1, 2012, 7:55:34 PM
to puppe...@googlegroups.com
On Tuesday, May 1, 2012 1:17:53 PM UTC-4, Daniel Pittman wrote:
> On Tue, May 1, 2012 at 09:31, Kelsey Hightower <kel...@puppetlabs.com> wrote:
>
>> I'm thinking of adding a new save API to Hiera. The idea is that Hiera
>> should provide an iterface for saving data, which should make it easy for
>> front-end tools to interact with backends that support saving data.
>
> Why does it make it any easier than having tools with already know
> about their back-end semantics directly managing that data?  It has
> substantial limitations (eg: no user concept, no credentials, and no
> way to determine the appropriate set based on the back-end.)

It seems that we can make the same argument for all of Hiera. The goal with the save API is to provide a simple interface for saving data, just like looking up data. When users write backends they can implement the save method however they see fit.

> It doesn't document anything of substance about the API: will save be
> fast or slow?  Can save deadlock?  How does it differentiate different
> operations on the same key?  How do I determine the hierarchy - or do
> I need to implicitly know that to use this?

The save method works for a specific backend + source (level in the hierarchy) combination. Basically the reverse of the lookup operation.
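
A minimal Ruby sketch of that idea, assuming a file-backed store with one YAML file per source. The `YamlStore` class and its method signatures are hypothetical and are not the code from the pull request or Hiera's backend API.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical file-backed store: one YAML file per source (hierarchy level).
# This is not Hiera's backend API, only an illustration of the idea.
class YamlStore
  def initialize(datadir)
    @datadir = datadir
  end

  # Read: walk the sources in order and return the first value found.
  def lookup(key, sources)
    sources.each do |source|
      data = load_source(source)
      return data[key] if data.key?(key)
    end
    nil
  end

  # Write: the caller names exactly one source, the reverse of lookup.
  def save(key, value, source)
    data = load_source(source)
    data[key] = value
    File.write(path_for(source), YAML.dump(data))
    value
  end

  private

  def path_for(source)
    File.join(@datadir, "#{source}.yaml")
  end

  def load_source(source)
    path = path_for(source)
    return {} unless File.exist?(path)
    YAML.load_file(path) || {}
  end
end

Dir.mktmpdir do |dir|
  store = YamlStore.new(dir)
  store.save("ntp_server", "ntp.example.com", "common")
  store.save("ntp_server", "ntp.dc1.example.com", "dc1")
  puts store.lookup("ntp_server", %w[dc1 common])
  puts store.lookup("ntp_server", %w[common])
end
```

The sketch deliberately has no locking, no credentials and no error reporting, so it leaves the questions about deadlocks, users and failures unanswered.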
 

> I have no idea how it works across machines.  Can I use this from the
> dashboard when that is installed on a different machine to the master?
> How do changes propagate after `save` is called when I have multiple
> masters?

Seems we have the same issues we have with the lookup method.

> It also makes it impossible to use this in any meaningful UI: there is
> absolutely no mechanism to determine what the failure was.  Did we
> fail because we got the hierarchy wrong, or the backend wrong, or
> something else failed?  Should I just retry, or give up?

Great suggestions, this can be added.

>> Do people think this is a good idea? I see this as a foundational bit for
>> building UI's on top of Hiera.
>
> The principal is reasonable, but this isn't even close to a proposal
> for a save API that works in the real world.

No problem, back to the drawing board.

Kelsey Hightower

May 1, 2012, 8:52:15 PM
to puppe...@googlegroups.com
On Tuesday, May 1, 2012 1:32:05 PM UTC-4, R.I. Pienaar wrote:

> ----- Original Message -----
>> From: "Daniel Pittman" <dan...@puppetlabs.com>
>> To: puppe...@googlegroups.com
>> Sent: Tuesday, May 1, 2012 6:17:53 PM
>> Subject: Re: [Puppet-dev] Hiera should have an save API
>>
>> On Tue, May 1, 2012 at 09:31, Kelsey Hightower
>> <kel...@puppetlabs.com> wrote:
>>
>> > I'm thinking of adding a new save API to Hiera. The idea is that
>> > Hiera
>> > should provide an iterface for saving data, which should make it
>> > easy for
>> > front-end tools to interact with backends that support saving data.
>>
>> Why does it make it any easier than having tools with already know
>> about their back-end semantics directly managing that data?  It has
>> substantial limitations (eg: no user concept, no credentials, and no
>> way to determine the appropriate set based on the back-end.)
>
> This has been my main concern too and why I never implemented anything like this
> in the first place - I think the data being queried is best modelled elsewhere.
> The data is best created at the time when you classify a node in that same UI -
> hiera should query that data but not know too much about the visual aspects of
> it.

If this is the true goal of Hiera then we should call this out. I agree with all these points
if the goal of Hiera is to be a read-only tool.

> This would be usable for small installs who just use the json/yaml backends and
> have no node classification system (other than maybe hand editing these files
> and using hiera_include or something).  People who are already happy to just
> hand hack JSON/YAML anyway.

This would be nice for other backends like the redis one as well. But I guess it would be
simple enough to delegate the responsibility to the end-user to build a full solution, kinda
like the ENC model :)

I think it would be helpful to give users the ability to use Hiera as an abstraction layer
for reads as well as writes. Hiera provides a really simple way of doing things based on
key/value pairs, backends, and sources (namespaces) in the hierarchy. I consider this
a big win for "small" or simple use cases. I think users with larger or complex installs
would prefer writing an ENC vs using Hiera at all.

> Having to know the hierarchy on the CLI isn't that great an experience and neither
> is typing complex data like hashes, arrays and such.
>
> In mcollective I can type complex data on the CLI because the DDL describes the
> data - I know when you typed "1" that it should be a number or a boolean and I
> convert that for you.  Hiera has no data description, its free form so even with
> a face or whatever it just would be limited use - soon you'll be editing JSON
> or YAML again to represent arrays of hashes, thats wrong.

This is a problem no matter what. Building a CLI tool for any data tool will have these
problems. I can't see how this is specific to Hiera having a save function. We do not
have to expose this via the CLI until we have a good solution. In the meantime people
can use it through the ruby API.

>>
>> It doesn't document anything of substance about the API: will save be
>> fast or slow?  Can save deadlock?  How does it differentiate
>> different operations on the same key?  How do I determine the hierarchy - or do
>> I need to implicitly know that to use this?
>
> This is impossible to answer - the save API has no idea about the backends.
>
> We *could* in theory extend backends to provide all these answers through some
> kind of flag about the backend but I do not think we should.
>
> Backends are easy to write and understand and so people do actually write them
> vs some other plugins we might have like providers or types.  It's a pretty thin
> line. Its a case of could but imo should not.

I vote to keep the save API as simple as the lookup one. We have the same issues
about speed and errors with lookup, but people still find value with it.

>>
>> I have no idea how it works across machines.  Can I use this from the
>> dashboard when that is installed on a different machine to the
>> master?
>>  How do changes propagate after `save` is called when I have multiple
>> masters?
>>
>> It also makes it impossible to use this in any meaningful UI: there
>> is
>> absolutely no mechanism to determine what the failure was.  Did we
>> fail because we got the hierarchy wrong, or the backend wrong, or
>> something else failed?  Should I just retry, or give up?
>>
>> > Do people think this is a good idea? I see this as a foundational
>> > bit for
>> > building UI's on top of Hiera.
>>
>> The principal is reasonable, but this isn't even close to a proposal
>> for a save API that works in the real world.
>
> I would love to see a solution for this but its deceptively hard to do
> and I think ultimately better solved by exposing a REST API into your
> dasbhoard/foreman/etc where you have RBAC and the other points you
> raised

The idea is that this simple save function would be behind a REST API like
the one you mention. Do the hard work of modeling and capturing data then
make a call to Hiera#save. If a REST API for Hiera is needed we can build
one.

Thanks for the feedback on this by the way!
 

Daniel Pittman

May 8, 2012, 5:59:05 PM
to puppe...@googlegroups.com
On Tue, May 1, 2012 at 5:52 PM, Kelsey Hightower <kel...@puppetlabs.com> wrote:
> On Tuesday, May 1, 2012 1:32:05 PM UTC-4, R.I. Pienaar wrote:
>> ----- Original Message -----
>> > From: "Daniel Pittman" <dan...@puppetlabs.com>
>> > To: puppe...@googlegroups.com
>> > Sent: Tuesday, May 1, 2012 6:17:53 PM
>> > Subject: Re: [Puppet-dev] Hiera should have an save API
>> >
>> > On Tue, May 1, 2012 at 09:31, Kelsey Hightower
>> > <kel...@puppetlabs.com> wrote:
>> >
>> > > I'm thinking of adding a new save API to Hiera. The idea is that Hiera
>> > > should provide an iterface for saving data, which should make it easy for
>> > > front-end tools to interact with backends that support saving data.
>> >
>> > Why does it make it any easier than having tools with already know
>> > about their back-end semantics directly managing that data?  It has
>> > substantial limitations (eg: no user concept, no credentials, and no
>> > way to determine the appropriate set based on the back-end.)
>>
>> This has been my main concern too and why I never implemented anything
>> like this in the first place - I think the data being queried is best modelled
>> elsewhere. The data is best created at the time when you classify a node in that same
>> UI - hiera should query that data but not know too much about the visual
>> aspects of it.
>
> If this is the true goal of Hiera then we should call this out. I agree with
> all these points if the goal of Hiera is to be a read-only tool.

...or, perhaps, we should say that *this query API* is a read-only
abstraction, inappropriate for writes, even if the Hiera project
includes mechanisms to update data that are different to the query
mechanism.

>> This would be usable for small installs who just use the json/yaml
>> backends and have no node classification system (other than maybe hand editing these
>> files and using hiera_include or something).  People who are already happy to
>> just hand hack JSON/YAML anyway.
>
> This would be nice for other backends like the redis one as well. But I
> guess it would be simple enough to delegate the responablity to the end-user to build a full
> solution, kinda like the ENC model :)
>
> I think it would be helpful to give users the ability to use Hiera as an
> abstraction layer for reads as well as writes. Hiera provides a really simple way of doing
> things based on key/value pairs, backends, and sources (namespaces) in the hierarchy. I
> consider this a big win for "small" or simple use cases. I think users with larger or
> complex installs would perfer writing an ENC vs using Hiera at all.

Longer term that is, I think, a false dichotomy: small and large sites
want to classify nodes, and to bind data to manifests.

They don't want different tools to do that, they just want it to work.

We can absolutely satisfy both, and should seek to do so in a way that
scales when someone moves from being "small" to being "large".


>> > The principal is reasonable, but this isn't even close to a proposal
>> > for a save API that works in the real world.
>>
>> I would love to see a solution for this but its deceptively hard to do
>> and I think ultimately better solved by exposing a REST API into your
>> dasbhoard/foreman/etc where you have RBAC and the other points you
>> raised
>
> The idea is that this simple save function would be behind a REST API like
> the one you mention. Do the hard work of modeling and capturing data then
> make a call to Hiera#save. If a REST API for Hiera is needed we can build
> one.

...but the save function proposed is too abstract from the reality of
data storage to be able to do that. Each backend needs additional
context - or someone to write a custom back-end for their site, every
time - to be effective.

This is a good idea, but at the wrong level of abstraction.

Jeff McCune

May 8, 2012, 7:04:52 PM
to puppe...@googlegroups.com
On Tue, May 8, 2012 at 2:59 PM, Daniel Pittman <dan...@puppetlabs.com> wrote:
>> The idea is that this simple save function would be behind a REST API like
>> the one you mention. Do the hard work of modeling and capturing data then
>> make a call to Hiera#save. If a REST API for Hiera is needed we can build
>> one.
>
> ...but the save function proposed is too abstract from the reality of
> data storage to be able to do that.  Each backend needs additional
> context - or someone to write a custom back-end for their site, ever
> time - to be effective.

What additional context is necessary?

Why would custom back ends be necessary if the default one we use supports writability?

> This is a good idea, but at the wrong level of abstraction.

I'm not yet convinced this is the wrong level of abstraction.  If I understand your original email, you mentioned building tools that understand the semantics of specific back end storage systems in order to write data into the system.  That seems to defeat the whole point of a robust plugin system.

If read operations are good enough to warrant abstraction, surely write operations are too.  Right?

-Jeff

Andrew Parker

May 8, 2012, 7:17:33 PM
to puppe...@googlegroups.com
I'll chime in on this now, I suppose.

You are right that both read and write operations are good for abstraction. The problem that comes into play is that read and write operations usually end up with completely different needs for their abstractions, so combining them in a single system can be problematic (this is the basis for the CQRS architectural design). So although you can combine the write model and the read model in the same application, they often have little to do with each other, and you might as well keep them separate.
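
As a small illustration of that separation, here is a Ruby sketch with a read side and a write side that share storage but expose unrelated interfaces. The `DataQuery` and `DataCommands` classes are hypothetical and not part of Hiera.

```ruby
# Hypothetical illustration of keeping reads and writes apart (the CQRS idea).
# Neither class is part of Hiera.

# Read side: answers questions, knows nothing about users or permissions.
class DataQuery
  def initialize(store)
    @store = store
  end

  def lookup(key, sources)
    sources.each do |source|
      level = @store.fetch(source, {})
      return level[key] if level.key?(key)
    end
    nil
  end
end

# Write side: needs context the read side never sees (who, where, why).
class DataCommands
  def initialize(store, audit_log)
    @store = store
    @audit_log = audit_log
  end

  def set_value(user:, source:, key:, value:)
    (@store[source] ||= {})[key] = value
    @audit_log << "#{user} set #{key} in #{source}"
    true
  end
end

store = {}
log = []
DataCommands.new(store, log).set_value(user: "alice", source: "common", key: "motd", value: "hello")
puts DataQuery.new(store).lookup("motd", %w[node1 common])
puts log.first
```

The write call carries a user and an explicit source, which the lookup never needs. That difference in required context is why the two interfaces tend to drift apart.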

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To post to this group, send email to puppe...@googlegroups.com.
To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en.

Jeff McCune

May 8, 2012, 7:26:01 PM
to puppe...@googlegroups.com
On Tue, May 8, 2012 at 4:17 PM, Andrew Parker <an...@puppetlabs.com> wrote:
> I'll chime in on this now, I suppose.
>
> You are right that both read and write operations are good for abstraction. The problem comes that comes into play is that read and write operations usually end up with completely different needs for their abstractions and so combining them together in a single system can be problematic (this is the basis for the CQRS architectural design). So although you can combine the write model and the read model in the same application, they often will have little to do with each other and so you might as well keep them separate.


Ah I see what you mean now.

It's not that we shouldn't have a write API, it's that it probably shouldn't be all tangled up in the query API.

Thanks for the reply Andrew.

-Jeff

Kelsey Hightower

May 9, 2012, 5:26:41 PM
to puppe...@googlegroups.com
On Tuesday, May 8, 2012 7:17:33 PM UTC-4, Andy Parker wrote:
> I'll chime in on this now, I suppose.
>
> You are right that both read and write operations are good for abstraction. The problem comes that comes into play is that read and write operations usually end up with completely different needs for their abstractions and so combining them together in a single system can be problematic (this is the basis for the CQRS architectural design). So although you can combine the write model and the read model in the same application, they often will have little to do with each other and so you might as well keep them separate.

Can you clarify "separate"? Hiera is the thing that has a plugin system and delegates lookups to the backend (plugin). The plugin returns a response. The plugin can be simple or very complex in how it goes about fetching the data. The only thing Hiera provides is a common interface for doing lookups. Based on your response, I'm still not clear why we cannot do the same thing for save. I can see your argument for why this is a bad thing in general, but why is it a bad thing for Hiera?

> On May 8, 2012, at 4:04 PM, Jeff McCune wrote:
>
>> On Tue, May 8, 2012 at 2:59 PM, Daniel Pittman <dan...@puppetlabs.com> wrote:
>>>> The idea is that this simple save function would be behind a REST API like
>>>> the one you mention. Do the hard work of modeling and capturing data then
>>>> make a call to Hiera#save. If a REST API for Hiera is needed we can build
>>>> one.
>>>
>>> ...but the save function proposed is too abstract from the reality of
>>> data storage to be able to do that.  Each backend needs additional
>>> context - or someone to write a custom back-end for their site, ever
>>> time - to be effective.
>>
>> What additional context is necessary?
>>
>> Why would custom back ends be necessary if the default one we use supports writability?
>>
>>> This is a good idea, but at the wrong level of abstraction.
>>
>> I'm not yet convinced this is the wrong level of abstraction.  If I understand your original email, you mentioned building tools that understand the semantics of specific back end storage systems in order to write data into the system.  That seems to defeat the whole point of a robust plugin system.
>>
>> If read operations are good enough to warrant abstraction, surely write operations are too.  Right?
>>
>> -Jeff


Andrew Parker

May 9, 2012, 7:30:42 PM
to puppe...@googlegroups.com
Jeff came over and asked Daniel and me about this. He might be able to give his understanding as well. My response is below.

On May 9, 2012, at 2:26 PM, Kelsey Hightower wrote:

> On Tuesday, May 8, 2012 7:17:33 PM UTC-4, Andy Parker wrote:
>> I'll chime in on this now, I suppose.
>>
>> You are right that both read and write operations are good for abstraction. The problem comes that comes into play is that read and write operations usually end up with completely different needs for their abstractions and so combining them together in a single system can be problematic (this is the basis for the CQRS architectural design). So although you can combine the write model and the read model in the same application, they often will have little to do with each other and so you might as well keep them separate.
>
> Can you clarify "separate"? Hiera is the thing that has a plugin system and delates lookups to the backend (plugin). The plugin returns a response. The plugin can be simple or very complex in how it goes about fetching the data. The only thing Hiera provides is a common interface for doing lookups. Based on your response, I'm still not clear why we cannot do the same thing for save. I can see your argument for why this is a bad thing in general, but why is it a bad thing for Hiera?
 

Hiera can do whatever it wants. If it adds an interface for how to write data back to all of its backends, however, it will be either creating a save interface that provides no abstraction or it will limit what is a valid backend to those that can conform to the save interface. Some backends may not be able to be written to at all, in which case you have the problem of backends refusing to implement a portion of the interface. If hiera wants to take a very specific, opinionated stance as to what a good data storage mechanism is for it, then it can, but it will do that at the expense of being able to reasonably implement backends for simply reading from anything that does not fit the specific model.
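
The case of a backend that cannot be written to at all can be sketched in Ruby as follows. The backend classes and the `try_save` helper are hypothetical names chosen for illustration, not Hiera's plugin interface.

```ruby
# Hypothetical backends; the class and method names are illustrative, not Hiera's.
class MemoryBackend
  def initialize
    @data = {}
  end

  def lookup(key)
    @data[key]
  end

  def save(key, value)
    @data[key] = value
  end
end

# A backend over a source that can only be read, e.g. a remote inventory.
class ReadOnlyBackend
  def lookup(key)
    { "region" => "eu-west" }[key]
  end
end

# A front end has to ask each backend whether it can write at all.
def try_save(backend, key, value)
  return :unsupported unless backend.respond_to?(:save)
  backend.save(key, value)
  :saved
end

p try_save(MemoryBackend.new, "motd", "hello")
p try_save(ReadOnlyBackend.new, "motd", "hello")
```

Once callers need a check like `respond_to?(:save)`, the common interface no longer hides which backend is in use, which is the loss of abstraction described above.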

If the puppet layer wants to stay independent of the opinions of the hiera system, then those two issues cannot be combined at the level of puppet.


To view this discussion on the web visit https://groups.google.com/d/msg/puppet-dev/-/6j67wPa8zBQJ.
