on the future of stdlib...


Henrik Lindberg

Jan 22, 2014, 1:15:31 PM
to puppe...@googlegroups.com
Recent Pull Requests to puppetlabs-stdlib have raised issues w.r.t. how we
should handle stdlib going forward. We have also had a discussion about
types and providers in core and if they should be moved out to modules
in a "tier2". That discussion also raises issues about
backwards compatibility, and configuration of modules.

I am posting this to trigger a discussion and to collect requirements.

* Is there content in stdlib that really belongs in core?
* Is there content in stdlib that should never have been included (it
  should really be in a different module)?
* How do we keep stdlib in sync with the Puppet version?
* How should module authors deal with their dependency on stdlib?
* How should we handle future, breaking changes?
  * A function is corrected/modified and is thus not backwards
    compatible (it is not deprecated, just changed behavior).
  * A function changes signature (it is called with a different set
    of arguments).
  * Should we have "backwards compatibility" as a separate module for
    those that want to use older modules on a newer Puppet? (e.g.
    stdlib-puppet3compat).
  * Do we have to namespace functions (to allow multiple versions of the
    same function for different modules)?
  * Do we have to use a scheme such as adding a number after a
    function/type?
  * Do we need to implement module specific loading (allow multiple
    versions of a function)?
  * What about types and providers? Since they all have to co-exist in
    the same catalog it is not really possible to support multiple
    versions. How should we deal with types and providers?

Ideas, concerns, requirements, questions ...

Regards
- henrik


Ashley Penney

Jan 22, 2014, 3:01:22 PM
to puppe...@googlegroups.com
On Wed, Jan 22, 2014 at 1:15 PM, Henrik Lindberg <henrik....@cloudsmith.com> wrote:
> Recent Pull Requests to puppetlabs-stdlib have raised issues w.r.t. how we should handle stdlib going forward. We have also had a discussion about types and providers in core and if they should be moved out to modules in a "tier2". That discussion also raises issues about backwards compatibility, and configuration of modules.
>
> I am posting this to trigger a discussion and to collect requirements.

I figured I'd have a stab at some answers to get the ball rolling. 
 
> * Is there content in stdlib that really belongs in core?

Well, the validation stuff does and is covered by your type system, moving forward.

The things I would argue belong in core (almost certainly redesigned from what's in stdlib) are:

1) Validation, which comes in with the type system.  We have all the validate_*() and is_*() functions in stdlib.
2) Better ways to get information from other parts of Puppet: things like getparam() to reach into a define for a value, getvar(), and get_module_path().
3) Better ways to handle duplicate resources: we have ensure_packages, ensure_resources, all hacks to deal with duplication.
4) Iteration over certain data.  The future parser has iteration, which solves most of this; the only other use cases are the has_*() functions, which will mostly be replaced by structured data and iteration.

These were the things where we've got multiple functions and so they feel like language issues to me.  The rest is mostly genuine stdlib stuff. 
 
> * Is there content in stdlib that should never have been included (it
>   should really be in a different module)?

There's definitely been a tendency towards merge all the things.  Maybe we need a more sophisticated breakup in terms of "broadly applicable" and "weird specific code", stdlib and stdlib-extras.  We could maybe do a run through forge for usages of each thing to determine how frequently used they are to help that split.
 
> * How do we keep stdlib in sync with the Puppet version?

My idea would have been major version stays in sync with major Puppet release but we kind of ruined that for ourselves.   :D
 
> * How should module authors deal with their dependency on stdlib?
> * How should we handle future, breaking changes?
>   * A function is corrected/modified and is thus not backwards
>     compatible (it is not deprecated, just changed behavior).
>   * A function changes signature (it is called with a different set
>     of arguments).
>   * Should we have "backwards compatibility" as a separate module for
>     those that want to use older modules on a newer Puppet? (e.g.
>     stdlib-puppet3compat).
>   * Do we have to namespace functions (to allow multiple versions of the
>     same function for different modules?)
>   * Do we have to use a scheme such as adding a number after a
>     function/type
>   * Do we need to implement module specific loading (allow multiple
>     versions of a function).

There are a lot of questions here, and I'm struggling to find good answers to them.
While reading this I had a gut instinct of:

"Part of the problem is that stdlib is once again just a big ol' blob of stuff.  Maybe stdlib
should be a Modulefile that requires a whole bunch of smaller puppetlabs-stdlib-function
modules of various versions.  We roll them up into stdlib releases and test them against
new versions of Puppet.  If you need to correct a function then you can always just upgrade
puppetlabs-stdlib-validate_bool or whatever."

 
>   * What about types and providers? Since they all have to co-exist in
>     the same catalog it is not really possible to support multiple
>     versions. How should we deal with types and providers?

This needs to be deferred until we have isolated environments in Puppet, which really needs to be a huge priority.  We've run into it with PE: we want to make pe_X versions of modules so that users can install a newer concat without messing with pe_concat, but then newer types and providers crash into older ones.
 

If anything I guess the heart of my replies is "Maybe stdlib needs to be a container for smaller modules."  It's modules all the way down, people!

Erik Dalén

Jan 23, 2014, 3:16:49 AM
to Puppet Developers
On 22 January 2014 21:01, Ashley Penney <ape...@gmail.com> wrote:
> These were the things where we've got multiple functions and so they feel like language issues to me.  The rest is mostly genuine stdlib stuff.

The to_bytes() function (which I added) is mostly there because of a Facter issue: it prints human-readable memory and disk sizes instead of machine-readable ones. Perhaps this can be solved with Facter 2.0, so that this function could eventually be removed, or moved into its own module if people really want it.
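
To make the conversion being discussed concrete, here is a minimal Ruby sketch of turning a human-readable size into a byte count. It is not the stdlib to_bytes() implementation; the method name human_size_to_bytes, the accepted input shape ("3.86 GB" style) and the choice of 1024-based multiples are all assumptions made for illustration.

```ruby
# Hypothetical helper, not the stdlib to_bytes() implementation.
# Assumes binary multiples (1 kB = 1024 bytes) and input such as "3.86 GB".
def human_size_to_bytes(text)
  exponents = { 'B' => 0, 'KB' => 1, 'MB' => 2, 'GB' => 3, 'TB' => 4 }

  match = /\A\s*(\d+(?:\.\d+)?)\s*([kmgt]?b)\s*\z/i.match(text.to_s)
  raise ArgumentError, "unrecognized size: #{text.inspect}" unless match

  number   = match[1].to_f
  exponent = exponents.fetch(match[2].upcase)
  (number * (1024**exponent)).to_i
end

puts human_size_to_bytes('4 kB')    # 4096
puts human_size_to_bytes('1.5 MB')  # 1572864
```

A fact that already reports a plain number would make this kind of parsing unnecessary, which is the point of the comment above.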

--
Erik Dalén

Andy Parker

Jan 23, 2014, 12:03:41 PM
to puppe...@googlegroups.com
I think that issue was solved inside facter. I see that facter 1.7.4 has memorysize_mb and family.
 

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/CAAAzDLdYHdu4t1dun8jA_2bem-5WMSbPuu4%3DRT4Fjos%2Bq_xG%3Dw%40mail.gmail.com.

For more options, visit https://groups.google.com/groups/opt_out.



--
Andrew Parker
Freenode: zaphod42
Twitter: @aparker42
Software Developer

Join us at PuppetConf 2014, September 23-24 in San Francisco

John Bollinger

Jan 23, 2014, 6:46:43 PM
to puppe...@googlegroups.com
On Wednesday, January 22, 2014 12:15:31 PM UTC-6, henrik lindberg wrote:
> Recent Pull Requests to puppetlabs-stdlib have raised issues w.r.t. how we
> should handle stdlib going forward. We have also had a discussion about
> types and providers in core and if they should be moved out to modules
> in a "tier2". That discussion also raises issues about
> backwards compatibility, and configuration of modules.
>
> I am posting this to trigger a discussion and to collect requirements.
>
> * Is there content in stdlib that really belongs in core?


That raises a more fundamental question of policy on what kinds of things belong in core in general.  There is also a question of whether there's a distinction between "core" and a hypothetical "tier1" that isn't the core product but is developed and maintained by PL and packaged with core.

With that said, there are many functions in stdlib that seem strangely placed in an optional add-on module.  Functions that I would definitely put in that category include

empty()
flatten()
get_module_path()
has_key()
is_array()
is_hash()
is_numeric()
is_string()
join()
keys()
size()
values()

Other functions that I might suggest putting in that category include

abs()
any2array()
concat()
count()
delete()
delete_at()
delete_values()
delete_undef_values()
difference()
dirname()
floor()
fqdn_rotate()
hash()
intersection()
join_keys_to_values()
lstrip()
max()
member()
merge()
min()
pick()
prefix()
range()
reject()
reverse()
rstrip()
sort()
squeeze()
str2saltedsha512()
strip()
suffix()
type()
union()
unique()
values_at()
zip()

There are also some functions whose usefulness I question, but which, if indeed generally useful, probably should not be optional add-ons:

bool2num()
capitalize()
chomp()
chop()
downcase()
getvar()
grep()
is_bool()
is_float()
is_integer()
num2bool()
shuffle()
str2bool()
strftime()
swapcase()
time()
type()
upcase()
uriescape()


> * Is there content in stdlib that should never have been included (it
>    should really be in a different module)?


Anything in my third list above might fall into this category.  In addition, I would definitely put any stdlib function not already listed above into this category.

In particular, although I see the validate_*() functions serving a useful role, I don't see them as core, and moreover they would hang together nicely in their own module.

Likewise, there is a group of functions centered on probing nodes' network characteristics; these also don't seem quite "core", but do hang together as their own group.

There are a few oddball functions that perhaps hang together loosely, but that don't feel "core"-ish to me: base64(), loadyaml(), parsejson(), parseyaml().

And then there is a handful of functions that should not exist.  Those certainly should not be considered "core", and the only reason I see even for keeping them in stdlib is to avoid breaking third-party modules: defined_with_params(), ensure_packages(), ensure_resource(), getparam(), is_function_available().  Really, these would be better in a "dangerous_hacks_you_should_not_use" module, just to make it clear.

 
> * How do we keep stdlib in sync with the Puppet version?


Is it necessary to do so?  Few of its functions seem dependent on Puppet internals, and none of those are among the ones that I think clearly belong as part of the core product.  Most of those dependent on Puppet internals are in the "dangerous_hacks" group.  Yank those out, I say, and make them somebody else's problem.

 
> * How should module authors deal with their dependency on stdlib?


What sort of "dealing" are you asking about?  Modules themselves can't really do much "deal"ing; if they try to call functions that don't exist then they will just break.  If there were a reliable fallback then that's what the module would (should) use in the first place.

The only other kind of dealing I can think of involves module metadata and the puppet module tool, but it's not clear to me what kinds of problems you suppose module authors need to deal with in that area.

 
> * How should we handle future, breaking changes?
>    * A function is corrected/modified and is thus not backwards
>      compatible (it is not deprecated, just changed behavior).


Alternative 1: don't do that; create a new function instead.  This is a tried and true strategy.

Alternative 2: continue applying semantic versioning to stdlib, and refer back to "How should module authors deal with their dependency on stdlib?"

 
>    * A function changes signature (it is called with a different set
>      of arguments).


See above.

 
>    * Should we have "backwards compatibility" as a separate module for
>      those that want to use older modules on a newer Puppet? (e.g.
>      stdlib-puppet3compat).


That would be a reasonable approach for functions removed from stdlib and not otherwise moved anywhere else.  It would be sure to cause a tremendous mess if done for stdlib functions that were modified.

 
>    * Do we have to namespace functions (to allow multiple versions of the
>      same function for different modules?)


"Have to"?  No.  I think it's a good idea, however.

 
>    * Do we have to use a scheme such as adding a number after a
>      function/type


You mean when new behavior or signature is desired for an existing function?  Yes, I think the best approach is my alternative 1, which is typically implemented by giving the new function a name derived from the old one's.

 
>    * Do we need to implement module specific loading (allow multiple
>      versions of a function).


I think namespacing functions is a better strategy.

 
>    * What about types and providers? Since they all have to co-exist in
>      the same catalog it is not really possible to support multiple
>      versions. How should we deal with types and providers?



I think the only significant issue there that is not analogous to the situation with functions is third-party providers for existing types.  That's a thorny issue that I'm not prepared to address at the moment.


John

Henrik Lindberg

Jan 23, 2014, 7:31:33 PM
to puppe...@googlegroups.com
Thanks John for lots of good feedback, and I realize I should clarify
some things... comments inline below...
Mostly agree - guess we have to arrange a "Puppet Stdlib Idol" where
people can phone in and vote or something. For now --- just noting
what everyone feels about what is there.

> * How do we keep stdlib in sync with the Puppet version?
>
>
>
> Is it necessary to do so? Few of its functions seem dependent on Puppet
> internals, and none of those are among the ones that I think clearly
> belong as part of the core product. Most of those dependent on Puppet
> internals are in the "dangerous_hacks" group. Yank those out, I
> say, and make them somebody else's problem.
>

Well, there is one big problem that we would like to solve, and that is
how functions are called. The ambition is to deal with this in the Puppet 4
timeframe. The problem with the current calling convention is (among
other things) that undef gets translated to an empty string - we would
like to stop transforming values that way. The idea is that a function
can opt in to the 4x calling convention.

There may not be any functions in stdlib that will benefit from the
calling convention change though (have not reviewed for that yet).
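
The undef-to-empty-string problem can be illustrated with a small Ruby sketch. Neither method below is Puppet's real function API; call_3x_style and call_4x_style are hypothetical names that only model "transform undef before the call" versus "pass values through untouched".

```ruby
# Illustration only: neither method is Puppet's real function API.
# Models a 3.x-style call that turns undef (nil) into '' before the
# function body runs.
def call_3x_style(args, &function)
  function.call(args.map { |arg| arg.nil? ? '' : arg })
end

# Models an opted-in call that hands the values over unchanged.
def call_4x_style(args, &function)
  function.call(args)
end

# A function that counts the arguments that are actually set.
count_set = ->(args) { args.count { |arg| !arg.nil? } }

values = ['a', nil, 'b']
puts call_3x_style(values, &count_set)  # 3, the undef can no longer be told apart
puts call_4x_style(values, &count_set)  # 2
```

Once the value has been rewritten to an empty string, the function body has no way to recover whether the caller passed undef or a genuinely empty string.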

> * How should module authors deal with their dependency on stdlib?
>
>
>
> What sort of "dealing" are you asking about? Modules themselves can't
> really do much "deal"ing; if they try to call functions that don't exist
> then they will just break. If there were a reliable fallback then
> that's what the module would (should) use in the first place.
>
> The only other kind of dealing I can think of involves module metadata
> and the puppet module tool, but it's not clear to me what kinds of
> problems you suppose module authors need to deal with in that area.
>
> * How should we handle future, breaking changes?
> * A function is corrected/modified and is thus not backwards
> compatible (it is not deprecated, just changed behavior).
>
>
>
> Alternative 1: don't do that; create a new function instead. This is a
> tried and true strategy.
>
> Alternative 2: continue applying semantic versioning to stdlib, and
> refer back to "How should module authors deal with their dependency on
> stdlib?"
>
> * A function changes signature (it is called with a different set
> of arguments).
>
I am thinking about a combination of namespaced functions (their
identity is unique), but that they may have the same last part. A module
could then use unqualified function names. A mechanism of resolving the
module's dependencies would wire the module so it finds
the correct version of the function. This way, the puppet logic would
still contain a call to "foo(3)", and the function may come from any of
the referenced modules (say a "stdlib_compat_3" module).

I am also thinking about dependency ranges, and that I suspect that
module authors probably only use a lower version range, thus exposing
the module to potential future breakage.
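
A short Ruby sketch of why a lower-bound-only range is risky. Gem::Requirement is used here purely as a generic version-range checker, not as the puppet module tool's resolver, and the version numbers and the accepts? helper are hypothetical.

```ruby
require 'rubygems'

# Hypothetical helper: does `version` satisfy every constraint in the list?
# Gem::Requirement stands in for a generic version-range checker.
def accepts?(constraints, version)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(version))
end

# A lower bound alone also accepts a future, possibly breaking, major release.
puts accepts?(['>= 3.2.0'], '4.0.0')             # true

# Adding an upper bound keeps the module on the API it was written against.
puts accepts?(['>= 3.2.0', '< 4.0.0'], '4.0.0')  # false
puts accepts?(['>= 3.2.0', '< 4.0.0'], '3.5.1')  # true
```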

What I really would like to see is for this to work in reverse - i.e. that
modules can *provide* an API as well as implement it, and that modules
that depend on other modules declare their dependency on those API
versions - not on the implementation of the API.

In order to achieve this, it may be required to provide modules that
are only there to specify a configuration of other modules. (This in
turn requires a lot more sophistication in the puppet module ecosystem).

In any case, what I want to avoid is all modules having to be updated
in lockstep. Module authors should be able to adopt new functionality
faster without sacrificing support for the majority of users, who are
several steps behind the bleeding edge.

>
>
> See above.
>
> * Should we have "backwards compatibility" as a separate module for
> those that want to use older modules on a newer Puppet? (e.g.
> stdlib-puppet3compat).
>
> That would be a reasonable approach for functions removed from stdlib
> and not otherwise moved anywhere else. It would be sure to cause a
> tremendous mess if done for stdlib functions that were modified.
>

Yes, a mess, for sure :-) unless there is a way to "route" things to an
implementation that has the "old" / expected behavior.

> * Do we have to namespace functions (to allow multiple versions
> of the
> same function for different modules?)
>
>
>
> "Have to"? No. I think it's a good idea, however.
>
> * Do we have to use a scheme such as adding a number after a
> function/type
>
> You mean when new behavior or signature is desired for an existing
> function? Yes, I think the best approach is my alternative 1, which is
> typically implemented by giving the new function a name derived from the
> old one's.
>
> * Do we need to implement module specific loading (allow multiple
> versions of a function).
>
>
>
> I think namespacing functions is a better strategy.
>
See comment above - I was thinking about a combination of namespace and
resolution of dependencies (to define visibility). Using a fully
qualified name could be allowed (it would be an absolute pointer), if
using only the last part, it would resolve to the visible version of
that function. (If ambiguous, user would have to use fully qualified name).
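
The combination of namespacing and dependency-based visibility described here can be sketched in Ruby as follows. This is only one possible reading of the idea, not an existing Puppet mechanism; resolve_function, the "::" separator and the stdlib::foo entry are assumptions for illustration.

```ruby
# Hypothetical resolver; the module and function names are made up.
# functions:       hash of fully qualified name => callable
# visible_modules: modules the calling module declares as dependencies
def resolve_function(functions, name, visible_modules)
  # A fully qualified name is an absolute pointer.
  return functions.fetch(name) if name.include?('::')

  # An unqualified name resolves against the visible modules only.
  candidates = visible_modules.map { |mod| "#{mod}::#{name}" }
                              .select { |qualified| functions.key?(qualified) }
  raise ArgumentError, "no visible function named #{name}" if candidates.empty?
  if candidates.size > 1
    raise ArgumentError, "#{name} is ambiguous: #{candidates.join(', ')}"
  end

  functions.fetch(candidates.first)
end

functions = {
  'stdlib::foo'          => ->(x) { "new foo(#{x})" },
  'stdlib_compat_3::foo' => ->(x) { "old foo(#{x})" }
}

puts resolve_function(functions, 'foo', ['stdlib_compat_3']).call(3)  # old foo(3)
puts resolve_function(functions, 'stdlib::foo', []).call(3)           # new foo(3)
```

With both modules visible, the unqualified call raises an error, which corresponds to the "user would have to use fully qualified name" case.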

> * What about types and providers? Since they all have to
> co-exist in
> the same catalog it is not really possible to support multiple
> versions. How should we deal with types and providers?
>
>
>
> I think the only significant issue there that is not analogous to the
> situation with functions is third-party providers for existing types.
> That's a thorny issue that I'm not prepared to address at the moment.
>
We can probably leave that out from the discussion around stdlib.

Regards
- henrik

Erik Dalén

Jan 24, 2014, 5:29:49 AM
to Puppet Developers
On 24 January 2014 01:31, Henrik Lindberg <henrik....@cloudsmith.com> wrote:
> Thanks John for lots of good feedback, and I realize I should clarify some things... comments inline below...
>
> On 2014-24-01 24:46, John Bollinger wrote:
>> On Wednesday, January 22, 2014 12:15:31 PM UTC-6, henrik lindberg wrote:
>>> * How do we keep stdlib in sync with the Puppet version?
>>
>> Is it necessary to do so?  Few of its functions seem dependent on Puppet
>> internals, and none of those are among the ones that I think clearly
>> belong as part of the core product. Most of those dependent on Puppet
>> internals are in the "dangerous_hacks" group.  Yank those out, I
>> say, and make them somebody else's problem.
>
> Well, there is one big problem that we would like to solve, and that is how functions are called. The ambition is to deal with this in the Puppet 4 timeframe. The problem with the current calling convention is (among other things) that undef gets translated to an empty string - we would like to stop transforming values that way. The idea is that a function can opt in to the 4x calling convention.
>
> There may not be any functions in stdlib that will benefit from the calling convention change though (have not reviewed for that yet).


count, min & max would definitely benefit from that. They all use some ugly workarounds for it now.

In the case of min & max, it is mostly the conversion of numbers to strings (which only happens sometimes in Puppet 3.x) that they have to work around. Will this also change in the 4.x function API, so that numbers are numbers?
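
A small Ruby sketch of why numbers arriving as strings are a problem for a max() style function. The two helper names are hypothetical and this is not the stdlib implementation; it only contrasts lexical and numeric comparison.

```ruby
# Illustration only, not the stdlib max() implementation.

# Treats every value as a string, so the comparison is lexical.
def lexical_max(values)
  values.map(&:to_s).max
end

# Converts numeric-looking strings to numbers before comparing.
def numeric_max(values)
  values.map { |value| value.is_a?(String) ? Float(value) : value }.max
end

mixed = ['10', 9, '2']
puts lexical_max(mixed)  # 9
puts numeric_max(mixed)  # 10.0
```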

 
--
Erik Dalén

Henrik Lindberg

Jan 24, 2014, 8:12:05 AM
to puppe...@googlegroups.com
On 2014-24-01 11:29, Erik Dalén wrote:

>
> Well, there is one big problem that we would like to solve, and that
> is how functions are called. The ambition is to deal with this in
> Puppet 4 timeframe. The problem with the current calling convention
> is (among other things) that undef gets translated to an empty
> string - we would like to stop transforming values that way. The
> idea is that a function can opt in to the 4x calling convention.
>
> There may not be any functions in stdlib that will benefit from the
> calling convention change though (have not reviewed for that yet).
>
>
> count, min & max would definitely benefit from that. They all make some
> ugly workarounds for it now.
>
> In the case of min & max it is mostly the conversion of numbers to
> strings (which only happens sometimes in puppet 3.x) they have to
> workaround. Will this also change in the 4.x function API, so numbers
> are numbers?
>

Currently (i), in 4x numbers are numbers, but strings are also
automatically coerced to a number when a number is required. The reverse
is not done. The result of a numeric operation (+, -, etc.) is always numeric.

(i) I say currently (3.5 future evaluator), because no final decision
has been made regarding automatic coercion string -> numeric, i.e.
should we drop this and require explicit string to numeric coercion?

It is questionable how far automatic coercion should go; if declared
type is Array[Integer], and there are strings in the given Array, should
a new array be produced, etc.

The intention is to make the 4x Function API include typed signature
support, so that a String given where a Numeric is required is
automatically coerced to numeric. We have some design
left to do on this; for simple functions it is trivial, but for
functions with several overloaded signatures it may get a bit messy.
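
As a rough Ruby sketch of the simple, single-signature case: a declared list of required types drives String -> Numeric coercion, while the reverse direction is rejected. This is not the Puppet 4.x function API, whose design was still open at this point; coerce_argument and call_with_signature are hypothetical names.

```ruby
# Hypothetical sketch of a typed signature; not the Puppet 4.x function API.
def coerce_argument(value, required_type)
  return value if value.is_a?(required_type)

  if required_type == Numeric && value.is_a?(String)
    return Integer(value) if value =~ /\A-?\d+\z/
    return Float(value) # raises ArgumentError for non-numeric text
  end

  # No other automatic conversions, in particular no Numeric -> String.
  raise TypeError, "expected #{required_type}, got #{value.class}"
end

def call_with_signature(signature, args)
  raise ArgumentError, 'wrong number of arguments' unless signature.size == args.size

  coerced = signature.zip(args).map { |type, value| coerce_argument(value, type) }
  yield(*coerced)
end

puts call_with_signature([Numeric, Numeric], ['40', 2]) { |a, b| a + b }  # 42
```

With several overloaded signatures, the dispatcher would first have to decide which signature a given String argument should be coerced toward, which is where the design gets harder.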

- henrik


Daniele Sluijters

Feb 11, 2014, 2:29:35 PM
to puppe...@googlegroups.com
I was browsing through stdlib and I noticed that there are quite a few things in there that make me itch, from a little to a lot. Most of them are workarounds for features missing in the Puppet language, and though I understand why they exist, a few are a bit iffy.

To take an example, file_line is one of those things I consider dangerous. Though useful in the absence of better augeas integration or changes to ParsedFile it's something that can be very easily used to accidentally screw up a system.

I'm also extremely wary of loadyaml, parsejson and parseyaml. Loadyaml imho shouldn't have to exist; just use Hiera for that. I'm also not entirely sure what the use case for parseyaml would be, as YAML is hardly a nice format to pass in as a parameter to a class. Parsejson I can understand in the absence of structured facts and until stringify_facts starts defaulting to false.

Then there's the fact that stdlib has a concat function and that there are at least 3 concat modules out there that do something completely different which can be confusing.

All in all, what I'm getting at is that perhaps a few of these things should be moved out of stdlib into something like stdlib-dangerzone and hopefully in time just entirely go away.

Trevor Vaughan

Feb 11, 2014, 4:59:08 PM
to puppe...@googlegroups.com
I'm not sure if I agree completely here.

File_line is something that could be used to accidentally screw up a system, but so is pretty much everything else if you don't use it properly: file templates, custom types, etc. Code review and deployment test gating are the only ways I can think of to mitigate accidents, and this is extremely useful for the one-off "stick a line in here if it doesn't exist" scenario.
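
The "stick a line in here if it doesn't exist" behaviour can be sketched in plain Ruby as below. This is not the stdlib file_line type; ensure_line is a hypothetical helper, the sample line is made up, and the sketch assumes the file, if non-empty, ends with a newline.

```ruby
require 'tempfile'

# Hypothetical helper, not the stdlib file_line type.
# Appends the line only when no existing line matches it exactly.
# Returns true when the file was changed, false when it was left alone.
def ensure_line(path, line)
  existing = File.exist?(path) ? File.readlines(path, chomp: true) : []
  return false if existing.include?(line)

  File.open(path, 'a') { |file| file.puts(line) }
  true
end

Tempfile.create('hosts') do |file|
  puts ensure_line(file.path, '10.0.0.5 db01')  # true, line added
  puts ensure_line(file.path, '10.0.0.5 db01')  # false, already present
end
```

The second call being a no-op is the property that makes the operation safe to repeat on every run; the risk raised earlier in the thread comes from what else is in the file, not from the append itself.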

I can see the *YAML/JSON files being useful for collecting configuration data on the back end for those that need to import data from elsewhere. That said, I agree that Hiera should be used for this.

The concat function does what I generally expect from stdlib. It exposes base Ruby functionality for use directly within Puppet. I feel that removing this type of function would warrant exposing some of the base ruby functions natively for use on primitive types.

I don't find this confusing because I call the concat type as concat { 'foo': } and the concat function as $foo = concat($bar, [1,2,3,4]) and it would certainly not parse if I tried to shove a hash in there.

Ultimately, I don't feel like we should dictate what might be dangerous to anyone unless it is inherently dangerous (misrepresentative name, shown to be prone to misuse, buggy, etc...). A tool is a tool and it's all in how you use it and what you need in your installation.

Thanks,

Trevor







--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvau...@onyxpoint.com

-- This account not approved for unencrypted proprietary information --