Cfgmgmt: ParsedFile vs Augeas discussion

151 views
Skip to first unread message

Daniele Sluijters

unread,
Feb 5, 2014, 7:17:32 AM2/5/14
to puppe...@googlegroups.com
Hi everyone,

For those who made it out to Gent, it was nice getting to know you all and I hope you all
returned safe and sound.

At Cfgmgmt a discussion started about ParsedFile and how just about everyone would prefer
to see that gone and replaced by something more workable. It quite quickly became clear that
there was a strong preference to replace it with (something like) Augeas.

I've raised a ticket, https://tickets.puppetlabs.com/browse/PUP-1590, with what I remember
from that discussion.

I'd like to ask everyone to chime in. Try and keep the discussion here on the mailing lists and
if you agree with the general sentiment of the ticket feel free to vote for it.

-- 
Daniele Sluijters

Trevor Vaughan

unread,
Feb 5, 2014, 9:38:18 AM2/5/14
to puppe...@googlegroups.com
If you can make it as provably performant as ParsedFile and not require a PhD in Computer Science language theory to write complex lenses, then +1.

That said, I've found it generally less error prone and more flexible to just model files in Ruby and be done with it. Particularly for complex files.

What I generally want:

1) Low I/O: Do all of my operations on a file at once, don't keep opening, copying, writing, and closing files.
1a) Don't leave my files in a half configured state.
2) If there is difficulty parsing a file, let me know what's going on, where and why.
3) Handle unknown fields gracefully (ParsedFile does this well, Augeas depends on the lens)
4) Make most file modelling operations as simple as possible.

I like the augeasproviders project but I can see where many shops would find it quite difficult to maintain over time and, of course, it still has my concerns with #1.

Thanks,

Trevor




--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/28235664-253f-42b1-bdc6-e7f65015d8e9%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvau...@onyxpoint.com

-- This account not approved for unencrypted proprietary information --

Ashley Penney

unread,
Feb 5, 2014, 12:59:52 PM2/5/14
to puppe...@googlegroups.com
Having just spent the last day solid reading parsedfile, writing notes, and following it down the rabbithole I can't find myself arguing against this.  I am planning to at least experiment with trying to a/ document this heavily b/ clean it up and make it easier and more usable because everyone hates it currently for a good reason.  It won't be dramatically better because of the requirements of backwards compatibility, but we can at least tidy it up a little.

I just thought I'd mention https://github.com/adrienthebo/puppet-filemapper for people who haven't seen it, Adrien has made a pretty awesome alternative to parsedfile.


On Wed, Feb 5, 2014 at 7:17 AM, Daniele Sluijters <daniele....@gmail.com> wrote:

--

Felix Frank

unread,
Feb 5, 2014, 2:20:58 PM2/5/14
to puppe...@googlegroups.com
On 02/05/2014 06:59 PM, Ashley Penney wrote:
> Having just spent the last day solid reading parsedfile, writing
> notes, and following it down the rabbithole I can't find myself
> arguing against this. I am planning to at least experiment with
> trying to a/ document this heavily b/ clean it up and make it easier
> and more usable because everyone hates it currently for a good reason.
> It won't be dramatically better because of the requirements of
> backwards compatibility, but we can at least tidy it up a little.

This resembles my feeling quite accurately. I've butted heads with
parsedfile quite a bit as well, and it can make you sad.

Can you elaborate on the possible compatibility concerns? I don't have a
clear idea in what way those could limit re-implementations.

Thanks,
Felix

Ashley Penney

unread,
Feb 5, 2014, 2:52:29 PM2/5/14
to puppe...@googlegroups.com
On Wed, Feb 5, 2014 at 2:20 PM, Felix Frank <Felix...@alumni.tu-berlin.de> wrote:

This resembles my feeling quite accurately. I've butted heads with
parsedfile quite a bit as well, and it can make you sad.

Can you elaborate on the possible compatibility concerns? I don't have a
clear idea in what way those could limit re-implementations.

I'd like to but I actually don't know yet, I haven't dug into the providers using it to see exactly what is considered the "public API".  My concern was too many providers are going to override a lot of the internal functionality making it tricky to refactor to a significant deal, but that was just a worry, not actually confirmed by reality.  This is just me playing around with the code and trying to get a feel for how it works in my mind and what scope we have to rework it without breaking things.

I suppose this brings up a good question - does anyone reading have a custom provider based on ParsedFile that they don't have anywhere public?  I'm trying to find a list of things outside of core relying on it so I can see how they are using it, and so far I only really know of inifile. 

Luke Kanies

unread,
Feb 5, 2014, 3:23:04 PM2/5/14
to puppe...@googlegroups.com
On Feb 5, 2014, at 8:52 PM, Ashley Penney <ape...@gmail.com> wrote:

On Wed, Feb 5, 2014 at 2:20 PM, Felix Frank <Felix...@alumni.tu-berlin.de> wrote:

This resembles my feeling quite accurately. I've butted heads with
parsedfile quite a bit as well, and it can make you sad.

Can you elaborate on the possible compatibility concerns? I don't have a
clear idea in what way those could limit re-implementations.

I'd like to but I actually don't know yet, I haven't dug into the providers using it to see exactly what is considered the "public API".  My concern was too many providers are going to override a lot of the internal functionality making it tricky to refactor to a significant deal, but that was just a worry, not actually confirmed by reality.  This is just me playing around with the code and trying to get a feel for how it works in my mind and what scope we have to rework it without breaking things.

I expect if you reduced the feature set to just what things like hosts and passwd need, and skipped cron (which is by far the most complicated file type it supports) and ssh (which also has to support multiple files), then you'd get a much simpler, more maintainable library.

I suppose this brings up a good question - does anyone reading have a custom provider based on ParsedFile that they don't have anywhere public?  I'm trying to find a list of things outside of core relying on it so I can see how they are using it, and so far I only really know of inifile. 

Does compatibility really matter?  Can't you just leave ParsedFile in there, and start throwing deprecation notices once we've successfully ported all core stuff over to another back-end?

Felix Frank

unread,
Feb 5, 2014, 5:59:55 PM2/5/14
to puppe...@googlegroups.com
On 02/05/2014 09:23 PM, Luke Kanies wrote:
> I expect if you reduced the feature set to just what things like hosts
> and passwd need, and skipped cron (which is by far the most
> complicated file type it supports)

Yeah, though after recent refactoring (not my doing, Stefan gave it a
good once-over when fixing #16809), the crontab provider doesn't rely
very heavily on the parsedfile commodities anymore, anyway. I guess the
fileparsing module was actually overtaxed with the shifting patterns
found in today's crontabs.

The good news is (hopefully), moving the crontab provider away from
parsedfile will likely not cost much compared to the simpler files.

But disentangling it from whatever should replace parsedfile is probably
a good idea.

Cheers,
Felix

Raphaël Pinson

unread,
Feb 9, 2014, 6:09:34 PM2/9/14
to puppe...@googlegroups.com

Hi Trevor,

1) there is an issue open at https://github.com/hercules-team/augeasproviders/issues/67. I've been thinking about how to fix this and have come up with a few ideas. Currently, a new Augeas handler is spawned for every call to augopen, which means the file is opened for exists?, create, destroy, and for each property getter or setter. That explains the huge amount of file openings (and possibly writings, too) you can get with these providers. Solutions include:

- using one augeas handler per resource, reusing it for all methods, and possibly using flush to save. This is probably a good idea anyway as it will avoid applying some properties when others may fail.
- using one augeas handler for all resources. While this would drastically reduce the amount of times files are opened, it is most likely a bad idea as this means a single resource breaking the Augeas tree will cause all resources to fail saving.
- "prefetching" all resources from a given target file when a resource first requests it. I'm not really sure how that would be implemented, but it would at least reduce the amount of file reading. Coupling this with a flush method per resource would lead to a nice solution imo.

2) Augeas providers already report when files fail to parse. The Augeas tree clearly indicates which line and character failed to parse, so this can easily be improved.

3) that is a tricky one. As you said, it mostly depends on the lens itself. Some lenses are very strict (with all known parameters explicitly listed) while others are quite loose. It is often up to the lens authors to decide on their policy, but some formats also don't leave much of a choice (because of entry types, typically strings vs lists)

4) I suppose you're referring to mapping properties to the Augeas API?


All this said, while Augeas now has most of the lenses required to replace all parsedfile providers in the Puppet core, I would still be in favor of keeping some kind of parsedfile (or equivalent, such as Adrien's new implementation) mechanism in the core for cases that Augeas might have a hard time addressing (or devs that would rather not use it).


Cheers,

Raphaël Pinson

Trevor Vaughan

unread,
Feb 10, 2014, 9:45:02 AM2/10/14
to puppe...@googlegroups.com
Hi Raphaël,

1) I'm a fan of the first bullet. Just having everything run through flush would be a huge bonus.

2) Good to hear!

3) Yeah, I definitely know that it depends on the lens. Unfortunately, I've just had too many issues over time with lenses missing various features *or* being too loose.

4) Unfortunately, I'm referring both to mapping the properties and to the Augueas language itself. The learning curve is high on creating new custom types and the learning curve is even higher an creating new Augeas lenses. Each time I've wanted to create an Augeas lens I ended up frustrated with the error handling and ease of testing and just wrote it in Ruby because it was faster. I 100% love the idea, I just find it frustrating in reality.

Thanks,

Trevor


--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+...@googlegroups.com.

Raphaël Pinson

unread,
Feb 10, 2014, 9:59:58 AM2/10/14
to puppe...@googlegroups.com
Hi Trevor,


I started working on the flush part. I have a dev branch (which doesn't pass all tests just now) at https://github.com/raphink/augeasproviders/tree/dev/aug_one_handler

Feel free to test it and let me know if that improves your situation with the number of files opened by Augeasproviders.


On Monday, February 10, 2014 3:45:02 PM UTC+1, Trevor Vaughan wrote:
Hi Raphaël,

1) I'm a fan of the first bullet. Just having everything run through flush would be a huge bonus.

2) Good to hear!

3) Yeah, I definitely know that it depends on the lens. Unfortunately, I've just had too many issues over time with lenses missing various features *or* being too loose.


As far as I'm concerned, I consider Augeas to be a parser before anything else. So I care that my lenses parse as much as possible, and produce valid files. Nowadays, I'd only make lenses strict if there is a need to strictly identify parameters and their entry types.

 
4) Unfortunately, I'm referring both to mapping the properties and to the Augueas language itself. The learning curve is high on creating new custom types and the learning curve is even higher an creating new Augeas lenses. Each time I've wanted to create an Augeas lens I ended up frustrated with the error handling and ease of testing and just wrote it in Ruby because it was faster. I 100% love the idea, I just find it frustrating in reality.


I understand your frustration. Augeas comes with many lenses already, so most Augeas providers won't need a new lens. On the positive side, I'm planning on organizing a full-fledge Augeas course around Puppetconf (4 days, including Augeas basics, writing facts, functions and providers with the Augeas ruby lib, as well as writing new lenses).


Cheers,

Raphaël

Trevor Vaughan

unread,
Feb 10, 2014, 10:14:59 AM2/10/14
to puppe...@googlegroups.com

I started working on the flush part. I have a dev branch (which doesn't pass all tests just now) at https://github.com/raphink/augeasproviders/tree/dev/aug_one_handler

Feel free to test it and let me know if that improves your situation with the number of files opened by Augeasproviders.

Awesome, I'll have to check it out!
 

3) Yeah, I definitely know that it depends on the lens. Unfortunately, I've just had too many issues over time with lenses missing various features *or* being too loose.


As far as I'm concerned, I consider Augeas to be a parser before anything else. So I care that my lenses parse as much as possible, and produce valid files. Nowadays, I'd only make lenses strict if there is a need to strictly identify parameters and their entry types.


Fair enough and +1 to this approach.
 
 
4) Unfortunately, I'm referring both to mapping the properties and to the Augueas language itself. The learning curve is high on creating new custom types and the learning curve is even higher an creating new Augeas lenses. Each time I've wanted to create an Augeas lens I ended up frustrated with the error handling and ease of testing and just wrote it in Ruby because it was faster. I 100% love the idea, I just find it frustrating in reality.


I understand your frustration. Augeas comes with many lenses already, so most Augeas providers won't need a new lens. On the positive side, I'm planning on organizing a full-fledge Augeas course around Puppetconf (4 days, including Augeas basics, writing facts, functions and providers with the Augeas ruby lib, as well as writing new lenses).


Excellent! I hope to be there.
 
Thanks,

Trevor

Raphaël Pinson

unread,
Feb 10, 2014, 10:22:58 AM2/10/14
to puppe...@googlegroups.com

On Monday, February 10, 2014 4:14:59 PM UTC+1, Trevor Vaughan wrote:


I started working on the flush part. I have a dev branch (which doesn't pass all tests just now) at https://github.com/raphink/augeasproviders/tree/dev/aug_one_handler

Feel free to test it and let me know if that improves your situation with the number of files opened by Augeasproviders.

Awesome, I'll have to check it out!


For the record, here are my tests using the unit tests on the providers:

# master branch (current HEAD):
$ strace -e trace=open /home/rpinson/.rvm/rubies/ruby-1.9.2-p320/bin/ruby -S rspec spec/unit/augeasproviders/provider_spec.rb spec/unit/puppet/apache_setenv_spec.rb spec/unit/puppet/host_spec.rb spec/unit/puppet/kernel_parameter_grub2_spec.rb spec/unit/puppet/kernel_parameter_grub_spec.rb spec/unit/puppet/mailalias_spec.rb spec/unit/puppet/mounttab_fstab_spec.rb spec/unit/puppet/mounttab_vfstab_spec.rb spec/unit/puppet/nrpe_command_spec.rb spec/unit/puppet/pg_hba_spec.rb spec/unit/puppet/puppet_auth_spec.rb spec/unit/puppet/rsyslog_spec.rb spec/unit/puppet/shellvar_spec.rb spec/unit/puppet/shellvar_type_spec.rb spec/unit/puppet/sshd_config_spec.rb spec/unit/puppet/sshd_config_subsystem_spec.rb spec/unit/puppet/sysctl_spec.rb spec/unit/puppet/syslog_spec.rb 2>&1 | grep -c /tmp/target
2571

# dev/aug_one_handler branch:
$ strace -e trace=open /home/rpinson/.rvm/rubies/ruby-1.9.2-p320/bin/ruby -S rspec spec/unit/augeasproviders/provider_spec.rb spec/unit/puppet/apache_setenv_spec.rb spec/unit/puppet/host_spec.rb spec/unit/puppet/kernel_parameter_grub2_spec.rb spec/unit/puppet/kernel_parameter_grub_spec.rb spec/unit/puppet/mailalias_spec.rb spec/unit/puppet/mounttab_fstab_spec.rb spec/unit/puppet/mounttab_vfstab_spec.rb spec/unit/puppet/nrpe_command_spec.rb spec/unit/puppet/pg_hba_spec.rb spec/unit/puppet/puppet_auth_spec.rb spec/unit/puppet/rsyslog_spec.rb spec/unit/puppet/shellvar_spec.rb spec/unit/puppet/shellvar_type_spec.rb spec/unit/puppet/sshd_config_spec.rb spec/unit/puppet/sshd_config_subsystem_spec.rb spec/unit/puppet/sysctl_spec.rb spec/unit/puppet/syslog_spec.rb 2>&1 | grep -c /tmp/target
1352


It's not amazing, but it already divides the number of open files by two.


Cheers,

Raphaël

Raphaël Pinson

unread,
Feb 27, 2014, 6:15:29 AM2/27/14
to puppe...@googlegroups.com
Hello,


I kept working on this. The current branch (and associated PR at https://github.com/hercules-team/augeasproviders/pull/80) shares one Augeas handler per provider. This was made possible by Puppet 3.4's post_resource_eval class method, which lets me close the handler once all resources are applied. This branch is almost ready to be merged, and I don't see how I could still reduce the amount of file openings (Augeas needs to read the file before each write to merge the tree, and saving needs to be done per resource in flush, otherwise it will affect other resources and dependencies).


Cheers,

Raphaël

Trevor Vaughan

unread,
Feb 28, 2014, 10:28:03 AM2/28/14
to puppe...@googlegroups.com
Thanks for all of your work on this, I'll test it out as soon as I am able!

Trevor


--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+...@googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.

Raphaël Pinson

unread,
Apr 9, 2014, 3:56:38 AM4/9/14
to puppe...@googlegroups.com
Hello,


Just an update on this:

* The flush & post_resource_eval code has been committed in HEAD. It's still unreleased, but you can easily use it in the git repo. It improves perfs for Puppet 3.4+;
* Improvements have been made to error reporting;
* The first step splitting the lib (making a base type and provider) is currently at the PR stage at https://github.com/hercules-team/augeasproviders/pull/91


Again, ideas & feedback are welcome.


Raphaël
Reply all
Reply to author
Forward
0 new messages