puppet module masquerading


Joachim Thuau

Jun 19, 2014, 9:01:40 PM
to puppe...@googlegroups.com
(Felix - my apologies, I'm just going to start a new thread :)

As requested, a new thread to discuss the possibility of having a module masquerading feature. 

Have a look at PUP-2811 -- I'm going to try to keep that updated with the output of this discussion.

What I am after is something akin to the Debian package "provides" property. When a Debian system needs an MTA, most packages shouldn't care which MTA is installed, just that the "MTA" feature is available. I can install any of exim, postfix, sendmail, or qmail. The package name is still "exim" or "postfix" or whatever, but they all provide an "mta" package (which doesn't really exist; it's there just so it can be used as a dependency). So when logwatch needs to make sure an MTA is installed to send mail about the logs, it doesn't care which one, as long as *something* provides the mail feature.

In the context of puppet modules, we could have competing modules offering similar features, and a module that needs that feature, but doesn't care which one is being used, as long as one *is*. 

The example I have is around this:
Both sssd and nslcd are systems for getting user information out of LDAP/Active Directory. They can't be installed together, and some other modules need to make sure some sort of auth has been set up prior to making their changes...

One module for sssd:
class sssd as auth {
  # setup sssd
}

One module for nslcd:
class nslcd as auth {
  # setup nslcd
}

And one module that needs one of them:
class ssh_authz {
  require auth
  # set up some ssh authz that requires one of the auth modules to be applied
}

Then we have some nodes:

node nodeSSSD {
  # do the right thing here... even if this is coming from an ENC in whatever order
  include sssd
  include ssh_authz
}

node nodeNSLCD {
  # still should be able to apply in the right sequence based on the require in ssh_authz
  include ssh_authz
  include nslcd
}

node broken-missing {
  # fails because nothing provides "auth" that's required...
  include ssh_authz
}

node broken-dups {
  # duplicate resources, as both sssd and nslcd show up as "auth"
  include sssd
  include ssh_authz
  include nslcd
}

I hope this could be useful, especially around modules that provide similar features. If I provide a module for installing a piece of software that requires a DB, we could simply say that we need some DB module (and maybe some default, if none exists, for "puppet module install"). Maybe I have a "mysql" module that's pretty simple, and maybe there is another one with much more control, and I just need one of them for that purpose...
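
For illustration only, here is how the DB case might look in the hypothetical "as" syntax sketched above (the module and class names are made up):

class mysql as db {
  # a simple MySQL setup
}

class postgresql as db {
  # a more elaborate alternative that also masquerades as "db"
}

class myapp {
  require db
  # install the application once *some* db module has been applied
}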

hoping for a lively discussion around this...
Thanks,
Jok

John Bollinger

Jun 20, 2014, 3:16:03 PM
to puppe...@googlegroups.com


I think there is potential for a feature such as this, but for the sake of argument, what are the advantages of doing it that way, as opposed to via classes?  For instance:

class ssh_authz {
  require 'auth'
}

class auth ($implementation = 'sssd') {
  require $implementation
}
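
As a usage sketch (the node name and parameter value are just placeholders), a node could then select the implementation when declaring the wrapper class:

node 'nodeNSLCD' {
  class { 'auth': implementation => 'nslcd' }
  include 'ssh_authz'
}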


Or, if a wrapper class is not a sufficient aliasing mechanism, is it really best to put the facility alias in the class definition?  For that to be reusable pretty much requires wide consensus on the facility aliases and their meanings.  Wouldn't it be better to associate classes with aliases at the point of declaration?  For example:

node nodeNSLCD {
  include 'ssh_authz'
  include 'nslcd' as 'auth'
}

(I suspect that particular syntax might be difficult for the parser to digest, but it conveys the idea.)


John

Henrik Lindberg

Jun 23, 2014, 8:45:05 AM
to puppe...@googlegroups.com
I see this as an essential feature to support maintainable modules.
And by "this", I mean a mechanism that decouples the wanted Element and
its Container (where the Element is the thing we want/need/require, and
the Container is the named thing it is in, typically a Module).

This can be achieved by modules describing their provided and required
capabilities. These are described as namespaced names, and they are
versioned. This creates an open-ended system that can describe
dependencies on modules (like now) by using a name in the Module
namespace, on a gem by using a name in a gem namespace, on a puppet class
by using a name in a puppet namespace; new types of dependencies can be
added, etc.

As a very nice side effect, the ability to describe provided capabilities
makes it possible for a described entity to list the same capability
multiple times with different versions, or version ranges, thus making it
possible to describe components as "backwards compatible", if not in
full, then at least for a smaller portion of the services they contain.

The difficulty with this (or any other similar solution) is the resolution
of the dependencies, as it requires something like a SAT solver.

I worked on the implementation of such a system for Eclipse (Eclipse p2)
which has been in use for a couple of years now as *the* software
update/configuration mechanism for the Eclipse ecosystem. (If you are a
Puppet Labs Geppetto user you have already used it, as it is what
updates Geppetto with new releases.)

While such a system (like p2) is very flexible and powerful, the main
problem is explaining why something is not installable (or complete /
updateable), i.e. when there are parts missing, or when there are
ambiguities. Even though p2 has such capabilities (thanks to the SAT
solver in use), it is often still a bit of a puzzle when facing
non-regular configurations (or tracking down the metadata that has bad
consequences).

If we do not attempt to solve the resolution, and simply validate the
constraints, the problem is much much simpler, but you also do not get
any help configuring a solution (except being slapped when the
configuration is wrong/incomplete).

It would be very interesting to conduct an experiment using p2 to
describe configurations in the puppet domain. The p2 system can be used
for other things than Java/OSGi/Eclipse, and it is supported in the Nexus
repository manager.

- henrik
--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/

John Bollinger

Jun 23, 2014, 1:18:31 PM
to puppe...@googlegroups.com


On Friday, June 20, 2014 2:16:03 PM UTC-5, John Bollinger wrote:

I think there is potential for a feature such as this, but for the sake of argument, what are the advantages of doing it that way, as opposed to via classes?


Also, have you investigated how far in this direction the existing 'alias' metaparameter can take you?


John

John Bollinger

Jun 23, 2014, 3:48:27 PM
to puppe...@googlegroups.com


On Monday, June 23, 2014 7:45:05 AM UTC-5, henrik lindberg wrote:

I see this as an essential feature to support maintainable modules.
And by "this", I mean a mechanism that decouples the wanted Element and
its Container (where the Element is the thing we want/need/require, and
the Container is the named thing it is in, typically a Module).


Agreed.

 

This can be achieved by modules describing their provided and required
capabilities. These are described as name spaced names and they are
versioned.


That's so easy to say, yet so fraught with difficulties.  In particular, the biggest issues I see revolve around the meanings of capability names.  Capability names defined and maintained by the modules that require them present a potential problem because capability provider modules can't be expected to know and declare all the capabilities that they in fact provide.

Running with the authentication example, if module 'ssh' requires capability ssh::authentication::service, then why should it be the responsibility of sssd and nslcd to know about that in order to declare that they provide it?  And then, what if I switch out puppetlabs-ssh for example42-ssh, which happens also to require a capability of the same name?  What system or procedure can make sure that puppetlabs and example42 agree on the meaning of that capability?

I think the capability consumer must declare what capabilities it requires, yes, but it must be up to the user to bind capability providers to the names.  At least, it must be within the user's power to do so.  Indeed, I'm having trouble seeing how the desired decoupling is achieved any other way.

 
This creates an open ended system that can describe
dependencies on either modules (like now) by using a name in the Module
namespace, a gem in a gem namespace, a puppet class in a puppet name
space, new types of dependencies can be added etc.

As a very nice side effect the ability to describe provided capabilities
makes it possible for a described entity to list the same capability
multiple times with different versions, or version ranges, thus making
it possible to describe components as "backwards compatible", if not in
full, for a smaller portion of the services it contains.



So far, this sounds a lot like the dependency features of software packaging formats such as RPM and DEB.  But look at what's happened in that space, especially with RPMs: despite the commonality of the format, RPMs are for the most part partitioned by family (RedHat vs. SUSE vs. ...) and distro version, with comparatively poor compatibility across those lines.  I think the fact that they have even that much coherency is related to the distro maintainer serving as a central heavyweight with strong influence on which capability names are used and what they mean.

 
The difficulty with this (any other similar solution) is the resolution
of the dependencies as it requires something like a SAT solver.



I think you're saying that given a set of available modules, one needs the equivalent of a SAT solver to find a self-consistent subset that provides a given collection of capabilities.  But that's a problem for the user to solve, so in addition to computing an answer (perhaps with the help of a tool), he has the alternative of redefining the problem to make it easy by creating new modules or modifying existing ones.

I guess that could be a problem that you want an enhanced module tool to be able to handle, but that seems a side issue to me.  The problem that the catalog builder has to solve is much simpler: whether a given collection of classes and resources (i.e. the contents of one catalog) has any unsatisfied requirements.

Moreover, I'm inclined to think that even though SAT is in general a hard problem (NP-complete, in fact), the instances likely to arise in a Puppet context are all fairly easily computable.  There will be few components -- usually just one -- providing any given capability, and few exclusion constraints.  I don't think you actually need a very clever SAT solver there.

Also, all that is moot if, as I suggested, it is the user's responsibility to bind provider components to capability names.

 
I worked on the implementation of such a system for Eclipse (Eclipse p2)
which has been in use for a couple of years now as *the* software
update/configuration mechanism for the Ecipse ecosystem. (If you are a
Puppet Labs Geppetto user you have already used it, as it is what
updates Geppetto with new releases).

While such a system (as p2) is very flexible and powerful, the main
problem is to explain why something is not installable (complete /
updateable) - i.e. when there are parts missing, or when there are
ambiguities. Even though p2 has such capabilities (thanks to the sat
solver in use) it is often still a bit of a puzzle when facing non
regular configurations (or tracking down the metadata that has bad
consequences).

If we do not attempt to solve the resolution, and simply validate the
constraints, the problem is much much simpler, but you also do not get
any help configuring a solution (except being slapped when the
configuration is wrong/incomplete).



At least as a first go, I think it would be fine to stop at validating.  That's a pretty natural extension of the current system, and it seems like it would present a fairly low barrier to entry.

 
It would be very interesting to conduct an experiment using p2 to
describe configurations in the puppet domain. The p2 system can be used
for other things than Java/OSGi/Eclipe and it is supported in the Nexus
repository manager.



It's nice to have a variety of interesting problems from which to choose.  :-)


John

Joachim Thuau

Jun 23, 2014, 7:54:38 PM
to puppe...@googlegroups.com
It appears as though you can't use 'alias' for resource dependencies, which is what I'm after. The best "workaround" I have so far is to use a class in the middle, where I shove all my logic about picking one or the other class that I need, based on a parameter. That does mean that my "meta class" needs to be aware of all the variations.
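
A minimal sketch of that middle-class workaround, with made-up class and parameter names:

class auth ($backend = 'sssd') {
  case $backend {
    'sssd':  { include sssd }
    'nslcd': { include nslcd }
    default: { fail("unknown auth backend: ${backend}") }
  }
}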

Having a given module provide a masquerading name would allow anyone to bring a new implementation for an existing module. Take puppetlabs-mysql, for example: what prevents me from creating jthuau-mysql-better (not that I can do better) and using that as a drop-in replacement for puppetlabs-mysql? I'd like to be able to say "I have this module over here that can replace this other one", without necessarily having to choose globally which one I use (maybe there are cases where I only want puppetlabs-mysql).

For the record, I'm liking where this is going. A simple "alias" feature (without pushing it too far) would be a great benefit, methinks, even if I need to spend some time scratching my head if I mess something up...

Henrik Lindberg

Jun 23, 2014, 10:32:36 PM
to puppe...@googlegroups.com
On 2014-23-06 21:48, John Bollinger wrote:
>
>
> On Monday, June 23, 2014 7:45:05 AM UTC-5, henrik lindberg wrote:
>
>
> I see this as an essential feature to support maintainable modules.
> And by "this", I mean a mechanism that decouples the wanted Element and
> its Container (where the Element is the thing we want/need/require, and
> the Container is the named thing it is in, typically a Module).
>
>
>
> Agreed.
>
>
>
> This can be achieved by modules describing their provided and required
> capabilities. These are described as name spaced names and they are
> versioned.
>
>
>
> That's so easy to say, yet so fraught with difficulties. In particular,
> the biggest issues I see revolve around the meanings of capability
> names. Capability names defined and maintained by the modules that
> require them present a potential problem because capability provider
> modules can't be expected to know and declare all the capabilities that
> they in fact provide.
>
Well, they do provide everything in the module. The publishing tool
would extract those as puppet capabilities i.e. class x, resource type
y, function f, etc. (each in the module's namespace).

A module that accepts "plugins" declares a required capability
(typically an optional requirement) - i.e. the module supports
additional things, special things if the optional requirement is
fulfilled. It may also have hard requirements.

This works well in practice in the Eclipse ecosystem with many thousands
of components and requirements spread across a large number of projects
(several hundred at Eclipse itself, and thousands elsewhere).

> Running with the authentication example, if module 'ssh' requires
> capability ssh::authentication::service, then why should it be the
> responsibility of nsssd and nslcd to know about that in order to declare
> that they provide it? And then, what if I switch out puppetlabs-ssh for
> example42-ssh, which happens also to require a capability of the same
> name? What system or procedure can make sure that puppetlabs and
> example42 agree on the meaning of that capability?
>

Someone owns the capability namespace. Capabilities should be named
after the publisher, and it is the publisher of the capability name that
defines what it means. This is no different from the definition of any
other API. If an API is a shared concern, the involved parties should
collaborate on its definition (again, no different from any other API).
Ideally they write a test that validates that something has the
capability it states it has.

> I think the capability consumer must declare what capabilities it
> requires, yes, but it must be up to the user to bind capability
> providers to the names. At least, it must be within the user's power to
> do so. Indeed, I'm having trouble how the desired decoupling is
> achieved any other way.
>
Picking one implementation over another typically means including one
"container of functionality" (in puppet's case a "module") instead of
another in the system's configuration (in puppet's case, putting it on
the module path). The difficulty comes when there are two modules that
happen to both provide the same capability, and the system wants to use
other, non-overlapping parts of each. In this case, the binding of the
capability that clashes is a problem the user would need to handle.

(In practice from Eclipse, I have not seen this happen. If I had this
problem I would break out the clashing part into a separate container if
I had no other means to control this). I can imagine several ways to
filter out capabilities from modules so only the wanted one is exposed.

The desire here seems to be to say: I need an x, and there is one called
your::x, and another called mine::x (two different capabilities, because
the publishers did not declare the two x's to be instances of the same
capability). Instead, a consumer figures out "hey, I can use either one
of those two x's, because I only set the parameter 'bar'" (there are lots
of other parameters, but all the consumer cares about is what happens
when it declares the following):

  x { name: bar => 10 }

In this case we basically just need to be able to bind 'x' to either
your::x or mine::x. This is something the binding system can do (i.e. map
one name to another). The configuration just needs to contain the target
of the binding.

A fancier approach is to declare that your::x and mine::x share a
capability and publish this in a module "our::x", make it bind our::x to
either your::x or mine::x, and then use our::x everywhere in the logic
(except in the "our" module itself, where we do the binding).

If our::x is a truly common shared concern then maybe it should become
an API in its own right, and the publishers of the two 'x's would agree
to declare that they do indeed publish this capability.

(long story on how it could work...)

> This creates an open ended system that can describe
> dependencies on either modules (like now) by using a name in the Module
> namespace, a gem in a gem namespace, a puppet class in a puppet name
> space, new types of dependencies can be added etc.
>
> As a very nice side effect the ability to describe provided
> capabilities
> makes it possible for a described entity to list the same capability
> multiple times with different versions, or version ranges, thus making
> it possible to describe components as "backwards compatible", if not in
> full, for a smaller portion of the services it contains.
>
>
>
> So far, this sounds a lot like the dependency features of software
> packaging formats such as RPM and DEB. But look at what's happened in
> that space, especially with RPMs: despite the commonality of the format,
> RPMs are for the most part partitioned by family (RedHat vs. SUSE vs.
> ...) and distro version, with comparatively poor compatibility across
> those lines. It think the fact that they have even that much coherency
> is related to the distro maintainer serving as central heavyweight with
> strong influence on which capability names are used and what they mean.
>
Yes, I am aware of such issues. It is a complex domain to start with.

> The difficulty with this (any other similar solution) is the resolution
> of the dependencies as it requires something like a SAT solver.
>
>
>
> I think you're saying that given a set of available modules, one needs
> the equivalent of a SAT solver to find a self-consistent subset that
> provides a given collection of capabilities. But that's a problem for
> the user to solve, so in addition to computing an answer (perhaps with
> the help of a tool), he has the alternative of redefining the problem to
> make it easy by creating new modules or modifying existing ones.
>
yes.

> I guess that could be a problem that you want an enhanced module tool to
> be able to handle, but that seems a side issue to me. The problem that
> the catalog builder has to solve is much simpler: whether a given
> collection of classes and resources (i.e. the contents of one catalog)
> has any unsatisfied requirements.
>

Yes (as I also pointed out later): at runtime there are only the cases
of fulfilled (ok), missing, or ambiguously resolved to consider (with a
mechanism to block out unwanted things to handle the ambiguities, and a
manual "go find something that satisfies the requirement" for what is
missing).

The finding of missing things is however a problem, especially if the
granularity of modules increases (which I think will happen, just like in
the Eclipse / OSGi ecosystem), where modules act as fragments, options,
extra conditional features on certain platforms, etc. Basically, it is
both easier and cleaner to handle them this way when there is a solver /
provisioning system that handles these cases, rather than having complex
conditional logic inside of larger modules that carry implementations of
all the 'what-ifs' and 'also does this on...' features.

> Moreover, I'm inclined to think that even though SAT is in general a
> hard problem (NP-complete, in fact), the instances likely to arise in a
> Puppet context are all fairly easily computable. There will be few
> components -- usually just one -- providing any given capability, and
> few exclusion constraints. I don't think you actually need a very
> clever SAT solver there.
>

That is what the Eclipse community thought at first, and the so-called
"update manager" then plagued thousands of users for 10 years until it
was replaced.
The fact that the SAT problems are typically simple means that a solver
works in practice at reasonable speed; it is however invaluable for the
more complex cases.

> Also, all that is moot if, as I suggested, it is the user's
> responsibility to bind provider components to capability names.
>
I think you need both - not initially, but I think the Puppet ecosystem
will evolve just like other component based software systems I have seen
(I happen to know the Eclipse one fairly well having been part of it
almost from the start).

> I worked on the implementation of such a system for Eclipse (Eclipse
> p2)
> which has been in use for a couple of years now as *the* software
> update/configuration mechanism for the Ecipse ecosystem. (If you are a
> Puppet Labs Geppetto user you have already used it, as it is what
> updates Geppetto with new releases).
>
> While such a system (as p2) is very flexible and powerful, the main
> problem is to explain why something is not installable (complete /
> updateable) - i.e. when there are parts missing, or when there are
> ambiguities. Even though p2 has such capabilities (thanks to the sat
> solver in use) it is often still a bit of a puzzle when facing non
> regular configurations (or tracking down the metadata that has bad
> consequences).
>
> If we do not attempt to solve the resolution, and simply validate the
> constraints, the problem is much much simpler, but you also do not get
> any help configuring a solution (except being slapped when the
> configuration is wrong/incomplete).
>
>
>
> At least as a first go, I think it would be fine to stop at validating.
> That's a pretty natural extension of the current system, and it seems
> like it would present a fairly low barrier to entry.
>
I think so too. And it has to be done at runtime even if a SAT solver (or
similar) was used to calculate the configuration.

> It would be very interesting to conduct an experiment using p2 to
> describe configurations in the puppet domain. The p2 system can be used
> for other things than Java/OSGi/Eclipe and it is supported in the Nexus
> repository manager.
>
>
>
> It's nice to have a variety of interesting problems from which to
> choose. :-)
>
Yes for sure :-)

Basically, I want to separate the two problems: describing "who can
do/provide what" and resolving that, and "what does the name x mean, what
is it bound to".

I will come back with some ideas for the binding of names, which I think
(for classes and resource types) is basically an operation for the new
Type system (something we discussed in the form of creating a name as an
alias for a type). It then follows naturally that this could be done for
resource types and classes as well, i.e. something like:

type Resource[our::x] = type Resource[your::x]
type Class[our::y] = type Class[awesome::y]

Which is a general mechanism - e.g.

type MyStruct = Struct[
  { name => String[1,3], shape => Enum[big, small, green] }
]

(+ more to define types from scratch)

After that point, the aliased type is simply used instead of the
original type.

Resource[our::x] { title:
  bar => 10
}

or indeed

our::x { title:
  bar => 10
}

which means the same thing, since our::x can only be a resource type at
that point and would resolve to Resource[our::x], which is an alias for
whatever was defined.

Regards

John Bollinger

Jun 25, 2014, 10:26:41 AM
to puppe...@googlegroups.com


On Monday, June 23, 2014 9:32:36 PM UTC-5, henrik lindberg wrote:
On 2014-23-06 21:48, John Bollinger wrote:
>
>
> On Monday, June 23, 2014 7:45:05 AM UTC-5, henrik lindberg wrote:
>
>
>     I see this as an essential feature to support maintainable modules.
>     And by "this", I mean a mechanism that decouples the wanted Element and
>     its Container (where the Element is the thing we want/need/require, and
>     the Container is the named thing it is in, typically a Module).
>
>
>
> Agreed.
>
>
>
>     This can be achieved by modules describing their provided and required
>     capabilities. These are described as name spaced names and they are
>     versioned.
>
>
>
> That's so easy to say, yet so fraught with difficulties.  In particular,
> the biggest issues I see revolve around the meanings of capability
> names.  Capability names defined and maintained by the modules that
> require them present a potential problem because capability provider
> modules can't be expected to know and declare all the capabilities that
> they in fact provide.
>
Well, they do provide everything in the module. The publishing tool
would extract those as puppet capabilities i.e. class x, resource type
y, function f, etc. (each in the module's namespace).


Sure, but relying on capabilities declared and discovered that way does nothing to achieve the decoupling of element from container.  In fact, it's what we have now.

 

A module that accepts "plugins" declares a required capability
(typically an optional requirement) - i.e. the module supports
additional things, special things if the optional requirement is
fulfilled. It may also have hard requirements.



I think plugin-style capabilities are mostly a mid-term future consideration, if that.  The OP was looking more at a way to express hard requirements abstractly.  He couched it in different terms, but that would be the result if it were possible to cause one capability name to be an alias for a different one, as specifically proposed.

 
This works well in practice in the Eclipse ecosystem with many thousands
of components and requirements spread across a large number of projects
(several hundred at Eclipse itself, and thousands elsewhere).



Yes, but Eclipse has some significant differences from Puppet.  Among the more relevant are
  1. The overall architecture of the product is basically a tree, organized around plugin-style extension points, whereas a Puppet catalog is more of a loose mesh, with validity/consistency constraints instead of extension points.
  2. Eclipse can and does rely on Java interfaces to express the semantics demanded of extensions / plugins, but Puppet has nothing comparable.

Really, I think the infrastructure needed to support plugin-like module extension points is pretty much orthogonal to that needed to support abstract capabilities.  It does make sense for a module that supports plugins to control the associated identifiers and specifications, but that's not where this thread started, so I'd like to avoid going too far in that direction.  For abstraction of capabilities, on the other hand, requiring the component that provides the implementation to declare the abstract capability(-ies) it provides does not cover all the interesting use cases.


John
