--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/CALF7fHaJdvPrkqRQEMqEgLSUvOy-O4DuL-iNsLrPt74HY7djvw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Hi David,

Most of this looks fine (I still need to digest some of it), but I don't know that I'm a huge fan of having no munge or validate functionality. This seems to fly in the face of a 'fail fast' mentality.
Also, if everything happens on the client, some of the catalog munging that I do on the fly isn't going to work any longer.
I *do* like the idea of types being pure data but will this bloat the catalog even further?
I still think that moving toward a heap/stack model would serve the Puppet clients best and get past the resource count limitations.
Fundamentally, I want my clients to do as little work as possible since they're meant to be running business processes.
I'll definitely be keeping an eye on this conversation though.

Thanks,
I think I'm slowly talking myself into this.

The issue that I've had is that the current Data Types aren't composable. There's a ticket open about it and hopefully that will fix a lot of my issues.
Unfortunately, it looks like my heap/stack discussion was with Henrik on Slack, so that's not super helpful.

Basically, the idea is that, instead of using a pure object model, objects would be created *as necessary*: the linear execution order would be kept in a simple array (the stack), and the data space (the heap) would contain the attributes that are required for each object to execute. This means that the data structure would be extremely efficient, and there would be almost no cost to building and manipulating an object graph.

The benefit of the object graph is to keep the object model. However, the object model is simply not necessary and has proven not to scale to 10k+ resources. The heap/stack model should scale based on the amount of data that you have, and would be *extremely* portable to other languages.
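Purely as an illustration of one reading of that idea (none of these names are real Puppet code), the stack/heap split might look like this, with run-time catalog manipulation reduced to an array insertion plus a heap entry:

```ruby
# Hypothetical sketch of the heap/stack catalog model described above.
# The "stack" is a flat array fixing linear execution order; the "heap"
# maps resource references to the attribute data needed to execute them.
stack = ['File[/etc/motd]', 'Service[sshd]']
heap = {
  'File[/etc/motd]' => { ensure: 'present', content: 'hello' },
  'Service[sshd]'   => { ensure: 'running', enable: true },
}

# Inserting a resource at run time is just an array insertion
# plus one heap entry -- no object graph to rewire.
stack.insert(1, 'Package[openssh]')
heap['Package[openssh]'] = { ensure: 'installed' }

# "Executing" the catalog is a single linear walk over the stack.
stack.each do |ref|
  attrs = heap.fetch(ref)
  puts "#{ref}: #{attrs.inspect}"
end
```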
The biggest down side (for me) is that I manipulate the catalog graph at run time for various reasons. There would probably be an easy way to work this into the API though since it would just be an array insertion.
In terms of the validate/munge question, munging can *absolutely* be done client side. But do you even want to pass a catalog to the client if it contains invalid data that might be used by other resources? Also, I do quite a bit of checking based on the values of other parameters/properties. If that could still be done reasonably, that would be fine.
I do think that having a global data space would be extremely important to making this feasible. There are a couple of tickets open for this as well, but I don't know the numbers off hand. And, of course, if we use the heap/stack model, the data is all in place (but may change based on resource processing).
Hopefully this all makes sense.
Hi David,

Many thanks for sharing your ideas. As you already know: personally I don't say that types and providers are way too difficult; it is more a matter of how to explain them. I will dig through your document over the weekend. We can have a chat on Monday in Ghent.

But besides this: is your idea somehow related to the libral project?
* It maps 1:1 to the capabilities of PCore, and is similar to the libral interface description (see [libral#1](https://github.com/puppetlabs/libral/pull/2)). This ensures future interoperability between the different parts of the ecosystem.
It seems to me as if you are "hiding" some things which are hard to understand - similar to `mk_resource_methods`, which is useful when following some silent assumptions.
Think of something like the following (yes, I know, “type” is already a reserved word):
```puppet
type my_module::my_file (
  Enum['file', 'absent'] $ensure,
  Stdlib::Absolute_path  $path = $title,
  String                 $owner,
) {
  def create {
    exec { "create file ${path}":
      command => "touch ${path}",
      path    => '/usr/bin:/bin',
      creates => $path,
    }
  }

  def destroy {
    file { $path:
      ensure => absent,
    }
  }

  property owner {
    exec { "chown ${owner} on ${path}":
      command => "chown ${owner} ${path}",
      path    => '/bin:/usr/bin',
      unless  => '',
    }
  }
}
```
(I know that exec should not be used in this way.)
People with ruby experience will not find types and providers difficult (in the way they are done today).
But Puppet was originally written for Ops people who don’t like scripting or coding.
One improvement I would like to see with the provider API is around getting the current state. I see many beginners, including myself, adding way more complexity than required in the `exists?` method. One would think that, in order to figure out if the resource exists, you should just query the system and DIY. I'll pick on my own code as an example: https://github.com/logicminds/bmclib/blob/0.1.0/lib/puppet/provider/bmc/ipmitool.rb#L34. However, this is the exact opposite of what you need to do: instead you need to query the API, which is hard to understand and not obvious at first, because instinct tells you to get the state yourself.

Another improvement which I hope gets addressed is `self.prefetch`. While `self.instances` is pretty simple to understand, the prefetch method could probably use a better name and an easier way to implement; it was not obvious at first how to implement it. As a result, many people don't implement these two methods when they should, leaving their providers less useful than they could be.
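For readers who hit the same wall: the `self.prefetch` pattern most providers end up writing is a few lines of boilerplate that match discovered system state to managed resources. It's sketched below against minimal mock classes (`FakeProvider` and `FakeResource` are stand-ins, not the real Puppet API):

```ruby
# Mock stand-ins for Puppet's provider and resource objects.
class FakeProvider
  attr_reader :name
  def initialize(name)
    @name = name
  end
end

class FakeResource
  attr_accessor :provider
end

# Stand-in for self.instances: enumerate everything on the system.
def instances
  [FakeProvider.new('a'), FakeProvider.new('b')]
end

# The usual prefetch boilerplate: for each managed resource, attach
# the discovered provider instance with the matching name.
def prefetch(resources)
  instances.each do |prov|
    if (resource = resources[prov.name])
      resource.provider = prov
    end
  end
end

resources = { 'a' => FakeResource.new }
prefetch(resources)
puts resources['a'].provider.name  # => "a"
```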
The type checking looks pretty amazing and simple to understand and implement in the type code.
Additionally, as Trevor mentioned, I also have some projects (puppet-debugger and puppet-retrospec) that use the current Puppet API to introspect resources and query the catalog, so please make sure this new code doesn't break my projects. I'll be happy to test out any new API for third-party compatibility.
Like Martin Alfke above, I'm very interested in what could be done using the Puppet Language alone. What use cases can be covered entirely that way?
What is not feasible to ever do with PL?
Would an API that made it possible to write types and providers in PL and hand over tiny tasks to another language (Ruby, C, Go, etc.) be of interest? I.e. a clean-cut boundary between the things possible (appropriate) to implement in PL and the things that aren't?
Perhaps this is a first step in that direction?
I do not understand how extending the puppet dsl on the server helps with resource management on the agent.
Perhaps this is a first step in that direction?

In one possible reading, types already are an extension point of the DSL, and this proposal makes that less painful, and easier to reason about.
The type and provider API has been the bane of my existence since I [started writing native resources](https://github.com/DavidS/puppet-mysql-old/commit/d33c7aa10e3a4bd9e97e947c471ee3ed36e9d1e2).
Now, finally, we'll do something about it. I'm currently working on designing a nicer API for types and providers. My primary goals are to provide a smooth and simple ruby developer experience for both scripters and coders. Secondary goals were to eliminate server side code, and make puppet 4 data types available. Currently this is completely aspirational (i.e. no real code has been written), but early private feedback was encouraging.
The `Puppet::SimpleResource.implement()` call receives the `current_state = get()` and `set(current_state, target_state, noop)` methods. `get` returns a list of discovered resources, while `set` takes the target state and enforces those goals on the subject. There is only a single (ruby) object throughout an agent run, which can easily do caching and whatever else is required for the good functioning of the provider. The state descriptions passed around are simple lists of key/value hashes describing resources. This will allow the implementation wide latitude in how to organise itself for simplicity and efficiency.
The `Puppet::SimpleResource.define()` call provides a data-only description of the Type. This is all that is needed on the server side to compile a manifest. Thanks to puppet 4 data type checking, this will already be much more strict (with less effort) than possible with the current APIs, while providing more automatically readable documentation about the meaning of the attributes.
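To make the proposed contract concrete, here is a toy model of the `get`/`set` data flow using plain hashes. No real `Puppet::SimpleResource` code exists yet; these free-standing methods only mirror the description above:

```ruby
# Toy model of the proposed contract: `get` discovers resources as a
# list of key/value hashes; `set` receives current and target state.

def get
  [
    { name: 'a', ensure: 'present' },
    { name: 'b', ensure: 'present' },
  ]
end

# Returns the names of resources that would need changing -- a stand-in
# for actually enforcing them on the system.
def set(current_state, target_state, noop = false)
  current = Hash[current_state.map { |r| [r[:name], r] }]
  target_state.reject { |r| current[r[:name]] == r }
              .map { |r| r[:name] }
end

changes = set(get, [
  { name: 'a', ensure: 'present' },  # unchanged
  { name: 'b', ensure: 'absent' },   # needs updating
  { name: 'c', ensure: 'present' },  # needs creating
])
puts changes.inspect  # => ["b", "c"]
```

Because both sides of the exchange are plain data, the implementation is free to batch, cache, or diff however it likes, which is the "wide latitude" the proposal describes.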
On Tuesday, January 31, 2017 at 10:04:27 AM UTC-6, David Schmitt wrote:
The type and provider API has been the bane of my existence since I [started writing native resources](https://github.com/DavidS/puppet-mysql-old/commit/d33c7aa10e3a4bd9e97e947c471ee3ed36e9d1e2).
Honestly, I've never thought the type / provider system was too bad. Certainly it works pretty well for simple things. Now, I'm more of a Dev who does Ops than an Op who Devs, but from my perspective, the biggest problem with Puppet's types & providers has always been inadequate documentation. The docs in this area are a bit better now than when I started with Puppet around eight years ago, but they have always missed presenting many relevant details about how the host Puppet instance, resource instances, and their associated provider instances interact.
This new effort doesn't seem to be targeting any of that.
```ruby
def set(current_state, target_state, noop = false)
  existing_keys = Hash[current_state.collect { |k| [k[:name], k] }]
  target_state.each do |resource|
    # additional validation for this resource goes here
    current = existing_keys[resource[:name]]
    if current && resource[:ensure].to_s == 'absent'
      # delete the resource
    elsif current && resource[:ensure].to_s == 'present'
      # update the resource
    elsif !current && resource[:ensure].to_s == 'present'
      # create the resource
    end
  end
end
```
Now, finally, we'll do something about it. I'm currently working on designing a nicer API for types and providers. My primary goals are to provide a smooth and simple ruby developer experience for both scripters and coders. Secondary goals were to eliminate server side code, and make puppet 4 data types available. Currently this is completely aspirational (i.e. no real code has been written), but early private feedback was encouraging.
I'm confused and dubious: how is it that the guiding goals for this effort are all about form rather than function? Surely, if you intend for the result to serve any particular function(s) then you need to be guided at least in part by which functions those are. Perhaps for you that's tied up in what you mean by "a type and provider API", but I am confident that some of the details of what that means to you differ from what it means to others. You really need to be explicit with this if you want good feedback. Moreover, inasmuch as you seem to imagine this API eventually supplanting the existing one, it may save a lot of future grief and expense to make sure everyone is on the same page early on.
Moreover, none of the goals given address the needs of the API client, which is Puppet itself. Or do you have a different concept of who the client is? If so, then you and I have even more different ideas of what a "type and provider API" is than I had supposed.
Furthermore, as wonderful as "a smooth and simple ruby developer experience for both scripters and coders" sounds, I think it's utterly unrealistic. People who aren't good at writing native code or who just want to avoid it have much different wants and expectations from those who are comfortable with writing native code. In fact, Puppet already has support for "scripters": this is what defined types are all about. People who don't want to be writing Ruby are unlikely ever to have a smooth or simple Ruby developer experience. The most likely outcome of trying to simplify any more than serves coders well is producing something insufficiently powerful / flexible.
The `Puppet::SimpleResource.implement()` call receives the `current_state = get()` and `set(current_state, target_state, noop)` methods. `get` returns a list of discovered resources, while `set` takes the target state and enforces those goals on the subject. There is only a single (ruby) object throughout an agent run, which can easily do caching and whatever else is required for the good functioning of the provider. The state descriptions passed around are simple lists of key/value hashes describing resources. This will allow the implementation wide latitude in how to organise itself for simplicity and efficiency.
So there is no mechanism for obtaining the current state of an individual resource? That seems a bit of an oversight. There are resource types that cannot feasibly be fully enumerated (e.g. files), and even resource types that cannot be enumerated at all.
The implementation can cache the system state in `get()` and do a more customized enforce motion in the `set()`. This is a deliberate tradeoff now, to build a simple, robust foundation for future evolution. Adding an optional `find_by_name` function later could address this.

Also, though you say elsewhere that the provider's set() method receives a list of resources to update, that is not borne out by your example code.
The `Puppet::SimpleResource.define()` call provides a data-only description of the Type. This is all that is needed on the server side to compile a manifest. Thanks to puppet 4 data type checking, this will already be much more strict (with less effort) than possible with the current APIs, while providing more automatically readable documentation about the meaning of the attributes.
I'm not at all convinced that your concept of a resource type definition indeed covers everything needed -- or at least wanted -- on the server side. For example, I'm all for engaging the type system, but
- The type system does not cover all potential validation needs. In particular, it cannot perform joint validation on multiple parameters.
- Using the type system for any but pretty simple validity checks will run against your objective of providing a simple interface.
Also, if handling types does not involve any server-side code, does that mean that the type system is not engaged to validate parameters until the data reach the client? If so, that in itself is a controversial interpretation of what is and is not needed on the server side.
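To illustrate the first point: a joint check such as "`source` is required unless `ensure` is absent" spans two attributes, so no per-attribute type expression can capture it; it has to live in code along these lines (a hypothetical sketch, not part of the proposal):

```ruby
# Hypothetical joint validation across two attributes. A per-attribute
# data type cannot express this dependency between `ensure` and `source`.
def validate(resource)
  if resource[:ensure] == 'present' && resource[:source].nil?
    raise ArgumentError, 'source is required when ensure => present'
  end
  true
end

puts validate(ensure: 'absent')                    # fine without source
puts validate(ensure: 'present', source: 'a.gpg')  # fine with source
```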
I'm not prepared to offer feedback on the details at this time, because I don't think you've sufficiently established what the effort is trying to accomplish.
A resource is the basic thing that is managed by puppet. Each resource has a set of attributes describing its current state. Some of the attributes can be changed throughout the life-time of the resource, some attributes only report back, but cannot be changed (see `read_only`), others can only be set once during initial creation (see `init_only`). To gather information about those resources, and to enact changes in the real world, puppet requires a piece of code to implement this interaction. The implementation can have parameters that influence its mode of operation (see `operational_parameters`). To describe all these parts to the infrastructure, and the consumers, the resource Definition (f.k.a. 'type') contains definitions for all of them. The Implementation (f.k.a. 'provider') contains code to get and set the system state.
```ruby
Puppet::ResourceDefinition.register(
  name: 'apt_key',
  docs: <<-EOS,
    This type provides Puppet with the capabilities to manage GPG keys needed
    by apt to perform package validation. Apt has its own GPG keyring that can
    be manipulated through the `apt-key` command.

      apt_key { '6F6B15509CF8E59E6E469F327F438280EF8D349F':
        source => 'http://apt.puppetlabs.com/pubkey.gpg'
      }

    **Autorequires**: If Puppet is given the location of a key file which
    looks like an absolute path this type will autorequire that file.
  EOS
  attributes: {
    ensure: {
      type: 'Enum[present, absent]',
      docs: 'Whether this apt key should be present or absent on the target system.',
    },
    id: {
      type: 'Variant[Pattern[/\A(0x)?[0-9a-fA-F]{8}\Z/], Pattern[/\A(0x)?[0-9a-fA-F]{16}\Z/], Pattern[/\A(0x)?[0-9a-fA-F]{40}\Z/]]',
      docs: 'The ID of the key you want to manage.',
      namevar: true,
    },
    # ...
    created: {
      type: 'String',
      docs: 'Date the key was created, in ISO format.',
      read_only: true,
    },
  },
  autorequires: {
    file: '$source', # will evaluate to the value of the `source` attribute
    package: 'apt',
  },
)
```
The `Puppet::ResourceDefinition.register(options)` function takes a Hash with the following top-level keys:

- `name`: the name of the resource. For autoloading to work, the whole function call needs to go into `lib/puppet/type/<name>.rb`.
- `docs`: a doc string that describes the overall working of the type, gives examples, and explains pre-requisites as well as known issues.
- `attributes`: a hash mapping attribute names to their details. Each attribute is described by a hash containing the puppet 4 data type, a `docs` string, and whether the attribute is the `namevar`, `read_only`, and/or `init_only`.
- `operational_parameters`: a hash mapping parameter names to puppet data types and `docs` strings.
- `autorequires`, `autobefore`, `autosubscribe`, and `autonotify`: a Hash mapping resource types to titles. Currently the titles must either be constants, or, if the value starts with a dollar sign, a reference to the value of an attribute.

At runtime the current and intended system states for a single resource instance are always represented as ruby Hashes of the resource's attributes, and applicable operational parameters.
The two fundamental operations to manage resources are reading and writing system state. These operations are implemented in the `ResourceImplementation` as `get` and `set`:
```ruby
Puppet::ResourceImplementation.register('apt_key') do
  def get
    {
      'title' => {
        name: 'title',
        # ...
      },
    }
  end

  def set(current_state, target_state, noop: false)
    target_state.each do |title, resource|
      # ...
    end
  end
end
```
The `get` method returns a Hash of all resources currently available, keyed by their title. If the `get` method raises an exception, the implementation is marked as unavailable during the current run, and all resources of its type will fail. The error message will be reported to the user.

The `set` method updates resources to the state defined in `target_state`. For convenience, `current_state` contains the last available system state from a prior `get` call. When `noop` is set to true, the implementation must not change the system state, but only report what it would change.
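A sketch of how an implementation might honour that `noop` contract (the structure below is assumed for illustration, not prescribed by the proposal):

```ruby
# Minimal set() honouring noop: always report planned changes, but only
# apply them when noop is false. `apply_change` stands in for real work.
def apply_change(title, resource)
  raise 'must not be called in noop mode'
end

def set(current_state, target_state, noop: false)
  log = []
  target_state.each do |title, resource|
    next if current_state[title] == resource  # already in desired state
    log << "would update #{title}"
    apply_change(title, resource) unless noop
  end
  log
end

planned = set({ 'k' => { ensure: 'present' } },
              { 'k' => { ensure: 'absent' } },
              noop: true)
puts planned.inspect  # => ["would update k"]
```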
The primary runtime environment for the implementation is the puppet agent, a long-running daemon process. The implementation can also be used in the puppet apply command, a one-shot version of the agent, or the puppet resource command, a short-lived CLI process for listing or managing a single resource type. Other callers who want to access the implementation will have to emulate those environments.

In any case the registered block will be surfaced in a clean class which will be instantiated once for each transaction. The implementation can define any number of helper methods to support itself. To allow for a transaction to set up the prerequisites for an implementation, and use it immediately, the provider is instantiated as late as possible. A transaction will usually call `get` once, and may call `set` any number of times to effect change. The host object can be used to cache ephemeral state during execution. The implementation should not try to cache state beyond the transaction, to avoid interfering with the agent daemon. In many other cases caching beyond the transaction won't help anyway, as the hosting process will only manage a single transaction.
The runtime environment provides some utilities to make the implementation's life easier, and provide a uniform experience for its users.
To use CLI commands in a safe and comfortable manner, the implementation can use the `commands` method to access shell commands. You can either use a full path, or a bare command name. In the latter case puppet will use the system PATH setting to search for the command. If the commands are not available, an error will be raised and the resources will fail in this run. The commands are aware of whether noop is in effect or not, and will skip the actual execution if necessary.
```ruby
Puppet::ResourceImplementation.register('apt_key') do
  commands apt_key: '/usr/bin/apt-key'
  commands gpg: 'gpg'
  # ...
end
```
This will create methods called `apt_key` and `gpg`, which will take CLI arguments without any escaping, and run them in a safe environment (clean working directory, clean environment). For example, to call `apt-key` to delete a specific key by id:
```ruby
apt_key 'del', key_id
```
By default the stdout of the command is logged to debug, while the stderr is logged to warning. To access the stdout in the implementation, use the command name with `_lines` appended, and process it through the returned Enumerable line-by-line. For example, to process the list of all apt keys:
```ruby
apt_key_lines(%w{adv --list-keys --with-colons --fingerprint --fixed-list-mode}).collect do |line|
  # process each line here, and return a result
end
```
Note: the output of the command is streamed through the Enumerable. If the implementation requires the exit value of the command before processing, or wants to cache the output, use `to_a` to read the complete stream in one go.
If the command returns a non-zero exit code, an error is signalled to puppet. If this happens during `get`, all managed resources of this type will fail. If this happens during a `set`, all resources that have been scheduled for processing in this call, but have not yet been marked as a success, will be marked as failed. To avoid this behaviour, call the `try_`-prefixed variant. In this (hypothetical) example, `apt-key` signals already-deleted keys with an exit code of `1`, which is OK when the implementation is trying to delete the key:
```ruby
try_apt_key 'del', key_id

if [0, 1].include? $?.exitstatus
  # success, or already deleted
else
  # fail
end
```
The exit code is signalled through the ruby standard variable `$?` as a `Process::Status` object.
The implementation needs to signal changes, successes and failures to the runtime environment. The `logger` provides a structured way to do so.
To provide feedback about the overall operation of the implementation, the logger provides the usual set of loglevel methods that take a string, and pass that up to puppet's logging infrastructure:
```ruby
logger.warning("Unexpected state detected, continuing in degraded mode.")
```
will result in the following message:
```
Warning: apt_key: Unexpected state detected, continuing in degraded mode.
```
See wikipedia and RFC 5424 for more details.
Most of an implementation's messages are expected to be relative to a specific resource instance, and a specific operation on that instance. For example, to report the change of an attribute:
```ruby
logger.attribute_changed(title:, attribute:, old_value:, new_value:, message: "Changed #{attribute} from #{old_value.inspect} to #{new_value.inspect}")
```
To enable detailed logging without repeating key arguments, and provide consistent error logging, the logger provides logging context methods that capture the current action and resource instance.
```ruby
logger.updating(title: title) do
  if key_not_found
    logger.warning('Original key not found')
  end

  # Update the key by calling CLI tool
  apt_key(...)

  logger.attribute_changed(
    attribute: 'content',
    old_value: nil,
    new_value: content_hash,
    message:   "Created with content hash #{content_hash}")
end
```
will result in the following messages (of course, with the `#{}` sequences replaced by the true values):
```
Debug: Apt_key[#{title}]: Started updating
Warning: Apt_key[#{title}]: Updating: Original key not found
Debug: Apt_key[#{title}]: Executing 'apt-key ...'
Debug: Apt_key[#{title}]: Successfully executed 'apt-key ...'
Notice: Apt_key[#{title}]: Updating content: Created with content hash #{content_hash}
Notice: Apt_key[#{title}]: Successfully updated
# TODO: update messages to match current log message formats for resource messages
```
In the case of an exception escaping the block, the error is logged appropriately:
```
Debug: Apt_key[#{title}]: Started updating
Warning: Apt_key[#{title}]: Updating: Original key not found
Error: Apt_key[#{title}]: Updating failed: #{exception message}
# TODO: update messages to match current log message formats for resource messages
```
Logging contexts process all exceptions. `StandardError`s are assumed to be regular failures in handling a resource, and they are swallowed after logging. Everything else is assumed to be a fatal application-level issue, and is passed up the stack, ending execution. See the ruby documentation for details on which exceptions are not `StandardError`s.
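The distinction matters in practice: `StandardError` subclasses are exactly what a bare `rescue` catches, while things like `NoMemoryError` or `SignalException` sail past it. A quick self-contained check:

```ruby
# StandardError subclasses are caught by a bare `rescue`; exceptions
# outside that hierarchy (NoMemoryError, SignalException, ...) are not.
caught = []
[ArgumentError, RuntimeError, IOError].each do |klass|
  begin
    raise klass, 'boom'
  rescue => e  # bare rescue == rescue StandardError
    caught << e.class
  end
end

puts caught.inspect
puts NoMemoryError.ancestors.include?(StandardError)  # => false
```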
The equivalent long-hand form with manual error handling:
```ruby
logger.updating(title: title)
begin
  if key_not_found
    logger.warning(title: title, message: 'Original key not found')
  end

  # Update the key by calling CLI tool
  try_apt_key(...)

  if $?.exitstatus != 0
    logger.error(title: title, message: "Failed executing apt-key #{...}")
  else
    logger.attribute_changed(
      title:     title,
      attribute: 'content',
      old_value: nil,
      new_value: content_hash,
      message:   "Created with content hash #{content_hash}")
  end
  logger.changed(title: title)
rescue StandardError => e
  logger.error(title: title, exception: e, message: 'Updating failed')
  raise unless e.is_a? StandardError
end
```
This example is only for demonstration purposes. In the normal course of operations, implementations should always use the utility functions.
The following action/context methods are available:

- `creating(title:, message: 'Creating', &block)`
- `updating(title:, message: 'Updating', &block)`
- `deleting(title:, message: 'Deleting', &block)`
- `attribute_changed(title:, attribute:, old_value:, new_value:, message: nil)`
- `created(title:, message: 'Created')`
- `updated(title:, message: 'Updated')`
- `deleted(title:, message: 'Deleted')`
- `fail(title:, message:)` - abort the current context with an error
--------------------
Regards, David
On Feb 27, 2017, at 9:59 AM, David Schmitt <david....@puppet.com> wrote:

> To use CLI commands in a safe and comfortable manner, the implementation can use the `commands` method to access shell commands. [...] This will create methods called `apt_key` and `gpg`, which will take CLI arguments without any escaping, and run them in a safe environment (clean working directory, clean environment).

If the goal here is safe execution, shouldn't it escape at least shell special characters? My concern is based on some of the providers I work on, which take free-form strings as arguments, and on any type that doesn't fully validate or munge inputs.

There is a risk that any provider* today can suffer from shell command injection. It seems that using shelljoin on the argument array would be useful, or shellescape on each element. However, for a simplified API it seems that this could be done centrally here, instead of pushing it to each implementation to make sure it is done where needed.

* Any provider that can't exclude shell characters and doesn't munge input to escape them.
--
Cheers, David
Hi folks,

I've written down the design in README format now. This should make it easier to ingest from a module developer's perspective. Meanwhile I've also had a shot at a hardcoded implementation of this, and it is technically possible to implement this too.
On 27 February 2017 at 18:57, Shawn Ferry <shawn...@oracle.com> wrote:

> If the goal here is safe execution shouldn't it escape at least shell special characters? [...] It seems that using shelljoin on the argument array would be useful or shellescape on each element. However, for a simplified API it seems that it could be done more centrally here instead of pushing it to each implementation to make sure it is done if needed.

The trick is to not use system(), which calls through the shell, but the array variant of spawn() or popen(), which doesn't go through a shell. Not only will that perform much better due to fewer binaries being loaded, it also eliminates all shell parsing from the call chain. Having this wrapped up in the API as the primary mechanism to call CLI commands is intended to help people with that, instead of having them have to find the right ruby incantation themselves.

To which Shawn replied:

> At what point does Puppet::Util::Execution.execute change behavior? RTFM 4.x is an acceptable answer. I only just got 4.7 integrated into our environment, but glancing at the code it doesn't seem to do anything other than Kernel.exec.
--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/CALF7fHZ-PYgrqboexxDJzzLPgbxidu%2BZs%3DRYHyc85LBOnr4EfA%40mail.gmail.com.
Cheers, David
Overall I think this looks pretty good, but I have some questions and comments.

For implementations that can't enumerate all existing resources (at least not in any fast way), like for example File, should they define a get method that returns an empty hash, or nil, or just not define a get method at all? nil or not defining one would allow special handling of them, so that would probably be preferable.
But shouldn't it really have a way to fetch the state of a single resource for those implementations? How would this otherwise handle something like "puppet resource file /path/to/file"? Traverse the entire file system and stat every file?
It also seems it would be handy to have some helper method that compares current_state and target_state to filter out the changes that actually need to happen. I would imagine that sort of code would otherwise be duplicated in many implementations.
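Such a helper could be tiny; a hedged sketch (the name changes_needed is invented here, not part of the proposal):

```ruby
# Hypothetical helper: returns the subset of target attributes whose
# values differ from the current state.
def changes_needed(current_state, target_state)
  target_state.reject { |attr, value| current_state[attr] == value }
end

changes_needed(
  { ensure: 'present', mode: '0644' },
  { ensure: 'present', mode: '0600' }
)
# only :mode differs, so only it is returned
```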
logger.attribute_changed could really have a default value for message: if the implementation provides attribute, old_value & new_value, they shouldn't all need to supply "Changed #{attribute} from #{old_value} to #{new_value}".
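The defaulting Erik suggests is easy to picture; a hypothetical sketch (the method shape is assumed, and the message is returned rather than logged, purely for illustration):

```ruby
# Hypothetical stand-in for logger.attribute_changed with a defaulted
# message: callers that supply attribute, old_value and new_value get a
# standard message for free, but can still override it.
def attribute_changed(attribute:, old_value:, new_value:, message: nil)
  message || "Changed #{attribute} from #{old_value} to #{new_value}"
end

attribute_changed(attribute: 'ensure', old_value: 'absent', new_value: 'present')
# => "Changed ensure from absent to present"
```

A side benefit of defaulting centrally is that the runtime can keep change messages consistent across implementations.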
Will the attribute type fields be able to use type definitions shipped in the same module or present in the environment?
Will they be able to inspect the catalog or communicate between implementations, like in the concat & concat_fragment case?
Will they be able to generate resources? That could probably be done with something similar to the logger methods, but there is nothing like that in the proposal. I assume they won't have a generate method.
Will there be some requirement that these implementations are thread safe so puppet can start managing multiple resources in parallel? Or will they run in some subprocess even?
On Wed, 1 Mar 2017 at 14:46 Trevor Vaughan <tvau...@onyxpoint.com> wrote:

As always, if I wait long enough, John will more eloquently provide the response that I wanted to! I've added a few responses inline.

Big picture items
Multiple implementations / implementation selection
In splitting resources into "definition" and "implementation", the proposal adheres to a form similar to the current system's, but it seems to have fundamentally different design goals. I've always interpreted the present type / provider system's separation of resource interface from implementation first and foremost as a means to accommodate multiple implementations. The one most appropriate to the target system is chosen from among those available. I think that's a sound design approach; I like it, and it has served Puppet well. As far as I can determine, however, the proposal loses that completely -- I see no means to support multiple implementations at all, much less means to match an implementation to the target system.

I hadn't actually caught this, but if not just an oversight, it is indeed concerning. The ability to choose from various providers has been a strong play in Puppet's favor.

Although in some cases it is a bit annoying, for example not being able to install a gem and a deb with the same name on a host. This specific case was fixed in https://tickets.puppetlabs.com/browse/PUP-1073 but still exists for types other than package. In many cases the providers actually manage different things, not mutually exclusive variants of the same thing. But having deb_package and rpm_package would definitely just be annoying if you wanted a module that works across multiple distros. Putting both providers into the same implementation file would be a bit ugly, but could work I guess.
Inadequate interface documentation
As I said earlier, one of the biggest problems with the current system is inadequate documentation. As it now stands, the proposal's documentation does about as well as the docs of the current system. Missing is information about how the runtime environment is intended to use the defined objects and methods -- for example, under what circumstances it will instantiate / invoke them, how many times per transaction (partially addressed in the doc), at what point(s) in the transaction. What may the environment do with the object returned by get() and its contents (i.e., must implementations assume that the environment may modify them)? Is there anything the environment is required to do with them? What may an implementation do with the object passed to set() and its contents? Is there any expectation about relationships between the objects provided by get() and received by put()?

This is definitely true. Gary's blog has been the seminal resource for figuring out how to write types and providers and, instead of a new interface, it might be good to finish the original docs. That said, I think I remember David saying that part of the intent of this is to provide a language-agnostic interface to managing the system so that users aren't tightly bound to a particular implementation. I think that this is a good goal *but* I would maintain compatibility with existing Type and Provider implementations.

There's a bunch of things that can't be solved in any good way in the old type and provider system that look like they would be pretty easy to solve in this new system. For example this old feature request: https://projects.puppetlabs.com/issues/2198
Specific features
Attribute and parameter validation
Validation ought to be performed during catalog building, thus it needs to be adequately addressed by type "definitions". I am dissatisfied with the proposal's provision for that. The Puppet type system is very expressive, but it can also be very arcane. I maintain that it is a mistake to rely exclusively on the type system for validating "attributes" or "operational parameters". Moreover, I think the proposal is missing an opportunity to provide for multiple-attribute, multiple-parameter covalidation. This has been requested on and off for a long time, and this proposal is a great opportunity to finally put it in place.

This was one of my biggest worries early on and it has been solidified with the example code. The Type system simply isn't suitable for complex validation at this point. I need logic (sometimes stupid amounts of logic) and, until I can do arbitrary validation with the Type system, I can't use it for my parameters.

If all attributes were defined in a single type it could handle a lot more situations (although the type signature could get massive). Consider this simplified example for a file type:

```puppet
Variant[
  Struct[{
    ensure => Enum[link],
    target => String,
  }],
  Struct[{
    ensure => Enum[file],
    content => String,
  }],
  Struct[{
    ensure => Enum[file],
    source => String,
  }],
]
```

That would only allow the target attribute if ensure is "link", and only allow either content or source if ensure is "file". Accounting for all possible attributes on the file type would probably span a few pages though :)
Thanks,
Shawn
Cheers, David
On Monday, February 27, 2017 at 8:59:46 AM UTC-6, David Schmitt wrote:

Hi folks,

I've written down the design in README format now. This should make it easier to ingest from a module developer's perspective. Meanwhile I've also had a shot at a hardcoded implementation of this, and it is technically possible to implement this too.
Thanks for your efforts on this documentation. It helps clarify your view. I think you have some good ideas here, but I also think that what you've presented, as you've presented it, is not a viable replacement for the type & provider system we have now. Moreover, there remain important areas that I would dearly love to see documented, and one or two longtime sore spots that this proposal seems to be missing the opportunity to address.
Big picture items
Multiple implementations / implementation selection
In splitting resources into "definition" and "implementation", the proposal adheres to a form similar to the current system's, but it seems to have fundamentally different design goals. I've always interpreted the present type / provider system's separation of resource interface from implementation first and foremost as a means to accommodate multiple implementations. The one most appropriate to the target system is chosen from among those available. I think that's a sound design approach; I like it, and it has served Puppet well. As far as I can determine, however, the proposal loses that completely -- I see no means to support multiple implementations at all, much less means to match an implementation to the target system.
Inadequate interface documentation
As I said earlier, one of the biggest problems with the current system is inadequate documentation. As it now stands, the proposal's documentation does about as well as the docs of the current system.
Missing is information about how the runtime environment is intended to use the defined objects and methods -- for example, under what circumstances it will instantiate / invoke them, how many times per transaction (partially addressed in the doc), at what point(s) in the transaction. What may the environment do with the object returned by get() and its contents (i.e., must implementations assume that the environment may modify them)? Is there anything the environment is required to do with them? What may an implementation do with the object passed to set() and its contents? Is there any expectation about relationships between the objects provided by get() and received by put()?
Similarly, are the "utilities" provided by the runtime environment supposed to be the only mechanism by which implementations interact with the host Ruby environment?
The only mechanism by which implementations interact with the target system at all?
Pushing new responsibilities onto resource implementations
The minimalistic approach to defining the interface for resource implementations has the inevitable effect of making the host environment incapable of some of the tasks that it presently performs, leaving the onus on resource implementations to fill the gaps. For example, since there is no way for the environment to retrieve the state of a specified individual resource, and because some resource types cannot feasibly be enumerated, the environment cannot, in general, determine whether resources are in sync. Only resource implementations can be relied upon to do that, and then only in the context of their put() methods.
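To make the concern concrete, here is a minimal sketch; the class and method bodies are invented, and only the get/put shape comes from the proposal. An implementation that cannot enumerate has to carry the in-sync check inside put() itself, because the runtime has nothing to compare against.

```ruby
# Sketch: when get cannot enumerate, the runtime cannot tell whether a
# resource is in sync, so the check lives inside put.
class UnenumerableImpl
  attr_reader :changed

  def initialize(system_state)
    @system = system_state  # stand-in for the real target system
    @changed = []
  end

  def get
    nil  # enumerating every instance is not feasible for this type
  end

  def put(target_state)
    target_state.each do |name, target|
      next if @system[name] == target  # the in-sync check lives here
      @system[name] = target
      @changed << name
    end
  end
end
```

For example, given a system where 'a' already matches its target, calling put for both 'a' and 'b' should only touch 'b'.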
Similarly, although I definitely like that a logging utility is made available to resource implementations, it seems that most responsibility for logging now falls on resource implementations as well. Currently, the environment handles a fair amount of that. It seems likely that moving the responsibility to resource implementations will not only make them more complicated to write, but will also reduce the consistency of log messages.
Terminology Changes
The proposal seems to gratuitously change established terminology. What we now call "properties" are to be referred to as "attributes", which itself is currently used in a broader sense. "Types" become "definitions"; "providers" become "implementations". Why? However well or poorly you like the established terminology, it is established. The proposal seems to be offering variations on these existing things, not entirely new things, and changing all the terminology is likely to be more harmful than helpful.
Specific features
Attribute and parameter validation
Validation ought to be performed during catalog building, thus it needs to be adequately addressed by type "definitions". I am dissatisfied with the proposal's provision for that. The Puppet type system is very expressive, but it can also be very arcane. I maintain that it is a mistake to rely exclusively on the type system for validating "attributes" or "operational parameters". Moreover, I think the proposal is missing an opportunity to provide for multiple-attribute, multiple-parameter covalidation. This has been requested on and off for a long time, and this proposal is a great opportunity to finally put it in place.
Transaction progress and state
Resource implementations are afforded only indirect information about transaction context. For example, the docs say that "The implementation should not try to cache state beyond the transaction", but how, exactly, is the implementation supposed to recognize the end of a transaction?
It would be highly desirable for implementations to have means to at least opt in to messages about transaction lifecycle and events.
For some purposes, it would be useful to have a way for type implementations to be able to communicate with each other within the scope of a transaction.
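What such opt-in notifications could look like, purely as an illustration: every callback name below is invented here and none exist in the proposal.

```ruby
# Hypothetical lifecycle hooks an implementation might opt into, giving it
# an explicit signal for transaction start and end.
class LifecycleAwareImpl
  attr_reader :events

  def initialize
    @events = []
  end

  def on_transaction_start(txn_id)
    @events << [:start, txn_id]  # e.g. open a connection, warm a cache
  end

  def on_transaction_end(txn_id)
    @events << [:end, txn_id]    # e.g. flush any state cached during the run
  end
end
```

An explicit end-of-transaction hook would directly answer the question of how an implementation knows when to stop caching.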
Details
"Read-only" vs. "init-only"
I understand -- I think -- the distinction you're drawing here, but it's confusing. Surely if a resource does not yet exist, then all its attributes are logically written when it is created / initialized. I could understand that as a plausible exception to read-onlyness if not for the explicit provision for init-only attributes. If you want to maintain both of these cases among the attribute metadata then please rename one of them. For example, perhaps "derived" or "internal" would describe what you mean by "read-only".
Compound resource names
I suppose that compound namevars are a topic that you hoped to avoid in your push for simplification, though it looks like they might not be hard to support via the interface you have already described. This is a weak area for the present system, and one that I think any replacement should support well and fully.
Attribute / parameter clashes
By specifying attributes and operational parameters via separate hashes, the proposed system affords the opportunity for name collisions between these. That also makes resource definitions less cohesive with Puppet overall, because in the Puppet language, attributes and operational parameters are specified together, in the same syntactic construct. I fully understand the differences between these, but I suggest they not be reflected by a complete separation between their specifications, such as the proposal currently provides.
Resource implementations that become usable during a run
It is only relatively recent that Puppet implemented the long-desired feature of supporting resource providers that become usable as a result of other resources being managed earlier in the same run. It is not clear whether or how that would be supported under the proposal as currently presented, but this is a feature that I do not think we want to lose.
To allow for a transaction to set up the prerequisites for an implementation, and use it immediately, the provider is instantiated as late as possible.
Environment for commands
If resource implementations are expected to use the 'commands' utility for running system commands, then that utility needs to provide a mechanism for specifying environment variables that must be set in the environment of those commands.
To pass additional environment variables through to the command, pass a hash
of them as `env:`:
```ruby
apt_key 'del', key_id, env: { 'LC_ALL': 'C' }
```
It is furthermore highly desirable that it be possible for the values and maybe even the names of those variables to be drawn from the values of a resource's operational parameters.
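Mechanically this seems straightforward in Ruby, since the array exec family accepts an env hash as its first argument; a sketch with Open3 as a stand-in and a hypothetical params hash for the operational parameters:

```ruby
# Sketch: env values drawn from (hypothetical) operational parameters and
# passed to a command invocation via the leading env hash.
require 'open3'

params = { locale: 'C' }  # hypothetical operational parameters
out, _status = Open3.capture2(
  { 'LC_ALL' => params[:locale] },
  'sh', '-c', 'printf %s "$LC_ALL"'  # sh used here only to read the variable back
)
out  # => "C"
```

A 'commands' utility built on this form could accept such a hash per call, as the `env:` example above suggests.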
Host object
The doc refers in a couple of places to a "host object". What is this? What are its methods and properties? How is it to be used?
Overall
I can see that a lot of thought and effort has gone into this proposal. I've focused mostly on criticisms, but I want to acknowledge that there's some good work here.
Nevertheless, I regret to say that I dislike the overall proposal in its current form. It could be improved to some extent by more complete documentation, but documentation notwithstanding, I think it focuses too much on simplification and on the resource itself, and not enough on how resources will be used by and interact with the larger system. I do not doubt that the proposed API can be implemented, and that's indeed a mark in its favor. If it is intended only as a simpler alternative to the present system, then many of my more significant concerns are irrelevant. To the extent that this is being positioned to ultimately replace the current system, however, it doesn't provide everything it needs to do, it misses providing some things it should do, and I'm dissatisfied with the way it provides some of the things it does.
As always, if I wait long enough, John will more eloquently provide the response that I wanted to! I've added a few responses inline.

Big picture items
Multiple implementations / implementation selection
In splitting resources into "definition" and "implementation", the proposal adheres to a form similar to the current system's, but it seems to have fundamentally different design goals. I've always interpreted the present type / provider system's separation of resource interface from implementation first and foremost as a means to accommodate multiple implementations. The one most appropriate to the target system is chosen from among those available. I think that's a sound design approach; I like it, and it has served Puppet well. As far as I can determine, however, the proposal loses that completely -- I see no means to support multiple implementations at all, much less means to match an implementation to the target system.

I hadn't actually caught this, but if not just an oversight, it is indeed concerning. The ability to choose from various providers has been a strong play in Puppet's favor.
Inadequate interface documentation
As I said earlier, one of the biggest problems with the current system is inadequate documentation. As it now stands, the proposal's documentation does about as well as the docs of the current system. Missing is information about how the runtime environment is intended to use the defined objects and methods -- for example, under what circumstances it will instantiate / invoke them, how many times per transaction (partially addressed in the doc), at what point(s) in the transaction. What may the environment do with the object returned by get() and its contents (i.e., must implementations assume that the environment may modify them)? Is there anything the environment is required to do with them? What may an implementation do with the object passed to set() and its contents? Is there any expectation about relationships between the objects provided by get() and received by put()?

This is definitely true. Gary's blog has been the seminal resource for figuring out how to write types and providers and, instead of a new interface, it might be good to finish the original docs. That said, I think I remember David saying that part of the intent of this is to provide a language-agnostic interface to managing the system so that users aren't tightly bound to a particular implementation. I think that this is a good goal *but* I would maintain compatibility with existing Type and Provider implementations.
I've also put in requests for the ability to maintain global state across the catalog compilation. If the brunt of the work is moved to the client, this may be more feasible but I think that it's perfectly possible in the current system with a namespaced caching system.
Also, I would like to see the resource utilization on the clients made better and maybe this can be addressed during the update.
Pushing new responsibilities onto resource implementations
The minimalistic approach to defining the interface for resource implementations has the inevitable effect of making the host environment incapable of some of the tasks that it presently performs, leaving the onus on resource implementations to fill the gaps. For example, since there is no way for the environment to retrieve the state of a specified individual resource, and because some resource types cannot feasibly be enumerated, the environment cannot, in general, determine whether resources are in sync. Only resource implementations can be relied upon to do that, and then only in the context of their put() methods.

I'm assuming (possibly incorrectly) that there would be a standard Ruby back end example provided. I'll also throw down the gauntlet and say that it should be the File resource. Being one of the oldest, and most janky, native types, I feel that it should be the seminal example. If you do 'easy' ones, I'll look at this askance; if you tackle (and improve) the core types, I'll have something solid to reference.
Similarly, although I definitely like that a logging utility is made available to resource implementations, it seems that most responsibility for logging now falls on resource implementations as well. Currently, the environment handles a fair amount of that. It seems likely that moving the responsibility to resource implementations will not only make them more complicated to write, but will also reduce the consistency of log messages.

Seems like a reasonable request for a stable, exposed logging API.
Terminology Changes
The proposal seems to gratuitously change established terminology. What we now call "properties" are to be referred to as "attributes", which itself is currently used in a broader sense. "Types" become "definitions"; "providers" become "implementations". Why? However well or poorly you like the established terminology, it is established. The proposal seems to be offering variations on these existing things, not entirely new things, and changing all the terminology is likely to be more harmful than helpful.

I've got to say that I didn't notice this until you mentioned it but, now that you have, I'm also scratching my head. The current names make sense and are documented. I do, however, think that the new names are more accessible to new users.
Specific features
Attribute and parameter validation
Validation ought to be performed during catalog building, thus it needs to be adequately addressed by type "definitions". I am dissatisfied with the proposal's provision for that. The Puppet type system is very expressive, but it can also be very arcane. I maintain that it is a mistake to rely exclusively on the type system for validating "attributes" or "operational parameters". Moreover, I think the proposal is missing an opportunity to provide for multiple-attribute, multiple-parameter covalidation. This has been requested on and off for a long time, and this proposal is a great opportunity to finally put it in place.

This was one of my biggest worries early on and it has been solidified with the example code. The Type system simply isn't suitable for complex validation at this point. I need logic (sometimes stupid amounts of logic) and, until I can do arbitrary validation with the Type system, I can't use it for my parameters.
Also, this is starting to look suspiciously like the MCollective DDL, which doesn't bother me but I find amusing.
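For illustration, the kind of multiple-attribute covalidation requested above could be as simple as arbitrary Ruby logic run over the full attribute hash. The function name and rules below are invented (they mirror the familiar file-type constraints); this is a sketch of the idea, not anything from the proposal.

```ruby
# Hypothetical sketch of multi-attribute covalidation: plain Ruby logic
# over the whole attribute hash, rather than per-attribute type matching.
# Rules are illustrative, loosely modeled on the file type.
def covalidate(attrs)
  errors = []
  if attrs[:ensure] == 'link' && !attrs[:target]
    errors << 'ensure => link requires a target'
  end
  if attrs[:content] && attrs[:source]
    errors << 'content and source are mutually exclusive'
  end
  errors
end

puts covalidate(ensure: 'link').inspect
puts covalidate(ensure: 'file', content: 'x', source: 'puppet:///x').inspect
```

The point is that this cross-attribute logic is trivial to express in code but awkward (or impossible) to express as a single per-attribute type signature.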
Transaction progress and state
Resource implementations are afforded only indirect information about transaction context. For example, the docs say that "The implementation should not try to cache state beyond the transaction", but how, exactly, is the implementation supposed to recognize the end of a transaction? It would be highly desirable for implementations to have means to at least opt in to messages about transaction lifecycle and events. For some purposes, it would be useful to have a way for type implementations to be able to communicate with each other within the scope of a transaction.

This is absolutely necessary! As much as we would like to keep things in nice little boxes, they have interactions and we *need* those interactions. Resources should know the following:

1) Where they are in the catalog compile stage and what they can access
2) Where they are in the list of similar resource types and what they can access
3) Where they are in the graph and what they can access

As we've grafted this ability into Puppet, we've gained the ability to do things like native 'concat' as well as rolling up resources to be able to fail unless all resource data is properly defined. I very strongly believe that a floating data type that is catalog accessible would be a good fundamental addition to the Puppet language.
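As a sketch of what opt-in transaction lifecycle events plus a transaction-scoped shared space could look like: the event names, registration API, and shared hash below are all invented to illustrate the request, not taken from the proposal.

```ruby
# Hypothetical sketch: implementations opting in to transaction lifecycle
# notifications, plus a scratch space scoped to a single transaction
# through which implementations could communicate (e.g. concat fragments).
class Transaction
  def initialize
    @listeners = Hash.new { |h, k| h[k] = [] }
    @shared = {} # scratch space; discarded with the transaction
  end

  # Opt in to a lifecycle event (:start, :finish, ...).
  def on(event, &block)
    @listeners[event] << block
  end

  attr_reader :shared

  def run
    @listeners[:start].each(&:call)
    # ... resources would be applied here ...
    @listeners[:finish].each(&:call)
  end
end

txn = Transaction.new
txn.on(:finish) { txn.shared[:flushed] = true } # e.g. flush a concat buffer
txn.run
```

With something like this, "don't cache state beyond the transaction" becomes enforceable: the implementation gets a :finish signal telling it exactly when to drop its cache.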
Details
"Read-only" vs. "init-only"
I understand -- I think -- the distinction you're drawing here, but it's confusing. Surely if a resource does not yet exist, then all its attributes are logically written when it is created / initialized. I could accept that as a plausible exception to read-onlyness if not for the explicit provision for init-only attributes. If you want to maintain both of these cases among the attribute metadata then please rename one of them. For example, perhaps "derived" or "internal" would describe what you mean by "read-only".
Compound resource names
I suppose that compound namevars are a topic that you hoped to avoid in your push for simplification, though it looks like they might not be hard to support via the interface you have already described. This is a weak area for the present system, and one that I think any replacement should support well and fully.

Fundamentally, Puppet uses relational database concepts in the basic resource model and, if thought about as such, can use the usual tricks to support uniqueness using multi-key resources.
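The relational framing can be made concrete: treat resource identity as a compound key, the way a database uses a multi-column primary key. The resource shape and field names below are invented for illustration.

```ruby
# Sketch of the relational-style view: uniqueness enforced on a compound
# key (protocol, number), so tcp/80 and udp/80 are distinct resources but
# declaring tcp/80 twice is a duplicate. All names are illustrative.
Port = Struct.new(:protocol, :number, :comment)

index = {}
[Port.new('tcp', 80, 'http'), Port.new('udp', 80, 'dns-ish')].each do |r|
  key = [r.protocol, r.number] # compound namevar
  raise "duplicate resource #{key.inspect}" if index.key?(key)
  index[key] = r
end

puts index.keys.inspect # tcp/80 and udp/80 coexist
```

This is exactly the uniqueness check the catalog already performs on single-valued titles; the only change is that the key is a tuple.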
Attribute / parameter clashes
By specifying attributes and operational parameters via separate hashes, the proposed system affords the opportunity for name collisions between these. That also makes resource definitions less cohesive with Puppet overall, because in the Puppet language, attributes and operational parameters are specified together, in the same syntactic construct. I fully understand the differences between these, but I suggest they not be reflected by a complete separation between their specifications, such as the proposal currently provides.

I think I'm OK with this provided that the validator flags any name collisions as a compile failure similar to what properties and parameters do now.
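The requested compile-time check is tiny to implement. This sketch assumes only the proposal's general shape (two hashes keyed by name); the function name is invented.

```ruby
# Sketch of the requested validator rule: flag any name that appears in
# both the attribute hash and the operational-parameter hash, failing the
# definition the same way clashing properties/parameters fail today.
def check_collisions(attributes, parameters)
  clashes = attributes.keys & parameters.keys
  unless clashes.empty?
    raise ArgumentError, "attribute/parameter name collision: #{clashes.join(', ')}"
  end
  true
end

check_collisions({ ensure: {}, path: {} }, { noop: {} }) # fine
begin
  check_collisions({ ensure: {} }, { ensure: {} })       # clashes
rescue ArgumentError => e
  puts e.message
end
```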
Resource implementations that become usable during a run
It is only relatively recent that Puppet implemented the long-desired feature of supporting resource providers that become usable as a result of other resources being managed earlier in the same run. It is not clear whether or how that would be supported under the proposal as currently presented, but this is a feature that I do not think we want to lose.

Ditto. I really need this capability to remain.
On Wed, 1 Mar 2017 at 14:46 Trevor Vaughan <tvau...@onyxpoint.com> wrote:

As always, if I wait long enough, John will more eloquently provide the response that I wanted to! I've added a few responses inline.

Big picture items
Multiple implementations / implementation selection
In splitting resources into "definition" and "implementation", the proposal adheres to a form similar to the current system's, but it seems to have fundamentally different design goals. I've always interpreted the present type / provider system's separation of resource interface from implementation first and foremost as a means to accommodate multiple implementations. The one most appropriate to the target system is chosen from among those available. I think that's a sound design approach; I like it, and it has served Puppet well. As far as I can determine, however, the proposal loses that completely -- I see no means to support multiple implementations at all, much less means to match an implementation to the target system.

I hadn't actually caught this, but if not just an oversight, it is indeed concerning. The ability to choose from various providers has been a strong play in Puppet's favor.

Although in some cases it is a bit annoying, for example not being able to install a gem and a deb with the same name on a host. This specific case was fixed in https://tickets.puppetlabs.com/browse/PUP-1073 but still exists for types other than package. In many cases the providers actually manage different things, not mutually exclusive variants of the same thing. But having deb_package and rpm_package would definitely just be annoying if you wanted a module that works across multiple distros. Putting both providers into the same implementation file would be a bit ugly, but could work I guess.

Inadequate interface documentation
As I said earlier, one of the biggest problems with the current system is inadequate documentation. As it now stands, the proposal's documentation does about as well as the docs of the current system. Missing is information about how the runtime environment is intended to use the defined objects and methods -- for example, under what circumstances it will instantiate / invoke them, how many times per transaction (partially addressed in the doc), and at what point(s) in the transaction. What may the environment do with the object returned by get() and its contents (i.e., must implementations assume that the environment may modify them)? Is there anything the environment is required to do with them? What may an implementation do with the object passed to set() and its contents? Is there any expectation about relationships between the objects provided by get() and received by set()?

This is definitely true. Gary's blog has been the seminal resource for figuring out how to write types and providers and, instead of a new interface, it might be good to finish the original docs. That said, I think I remember David saying that part of the intent of this is to provide a language-agnostic interface to managing the system so that users aren't tightly bound to a particular implementation. I think that this is a good goal *but* I would maintain compatibility with existing Type and Provider implementations.

There's a bunch of things that can't be solved in any good way in the old type and provider system that look like they would be pretty easy to solve in this new system. For example, this old feature request: https://projects.puppetlabs.com/issues/2198
Specific features
Attribute and parameter validation
Validation ought to be performed during catalog building, thus it needs to be adequately addressed by type "definitions". I am dissatisfied with the proposal's provision for that. The Puppet type system is very expressive, but it can also be very arcane. I maintain that it is a mistake to rely exclusively on the type system for validating "attributes" or "operational parameters". Moreover, I think the proposal is missing an opportunity to provide for multiple-attribute, multiple-parameter covalidation. This has been requested on and off for a long time, and this proposal is a great opportunity to finally put it in place.

This was one of my biggest worries early on and it has been solidified with the example code. The Type system simply isn't suitable for complex validation at this point. I need logic (sometimes stupid amounts of logic) and, until I can do arbitrary validation with the Type system, I can't use it for my parameters.

If all attributes were defined in a single type it could handle a lot more situations (although the type signature could get massive). Consider this simplified example for a file type:

    Variant[
      Struct[{
        ensure  => Enum[link],
        target  => String,
      }],
      Struct[{
        ensure  => Enum[file],
        content => String,
      }],
      Struct[{
        ensure  => Enum[file],
        source  => String,
      }],
    ]

That would only allow the target attribute if ensure is "link", and only allow either content or source if ensure is "file". Accounting for all possible attributes on the file type would probably span a few pages though :)
Overall I think this looks pretty good, but I have some questions and comments.

For implementations that can't enumerate all existing resources (at least not in any fast way), like for example File, should they define a get method that returns an empty hash, return nil, or just not define a get method at all? Returning nil or not defining it would allow special handling of them, so that would probably be preferable.

But shouldn't it really have a way to fetch the state of a single resource for those implementations? How would this otherwise handle something like "puppet resource file /path/to/file"? Traverse the entire file system and stat every file?
It would also be handy to have some helper method that compares current_state & target_state to filter out the changes that actually need to happen. I would imagine that sort of code would otherwise be duplicated in many implementations.
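The suggested helper would be a one-liner that every implementation would otherwise reinvent. This is a sketch of the idea only; the function name and state-hash shape are assumptions, not anything from the proposal.

```ruby
# Sketch of the suggested shared helper: given current and target state
# hashes, return only the attributes that actually need changing.
def changes_needed(current, target)
  target.reject { |attr, value| current[attr] == value }
end

current = { ensure: 'present', mode: '0644', owner: 'root' }
target  = { ensure: 'present', mode: '0600', owner: 'root' }

puts changes_needed(current, target).inspect
```

A shared helper like this also gives the framework a consistent definition of "in sync" (hash equality per attribute) instead of each implementation deciding ad hoc.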
logger.attribute_changed could really have a default value for message: if the implementation provides attribute, old_value & new_value, they shouldn't all need to supply "Changed #{attribute} from #{old_value} to #{new_value}".
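The default being asked for is trivial. This sketch uses an invented standalone method to show the behavior; the real logger.attribute_changed signature is from the proposal, but this defaulting is the suggestion, not its current behavior.

```ruby
# Sketch of the requested default: build the message from the structured
# arguments when the caller doesn't supply one. Method shape is invented.
def attribute_changed(attribute, old_value, new_value, message: nil)
  message || "Changed #{attribute} from #{old_value} to #{new_value}"
end

puts attribute_changed(:mode, '0644', '0600')
```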
Will the attribute type fields be able to use type definitions shipped in the same module or present in the environment?
Will they be able to inspect the catalog or communicate between implementations, like in the concat & concat_fragment case?
Will they be able to generate resources? That could probably be done with something similar to the logger methods, but there is nothing like that in the proposal. I assume they won't have a generate method.
Will there be some requirement that these implementations are thread safe so puppet can start managing multiple resources in parallel? Or will they run in some subprocess even?