Fixing #1372

24 views
Skip to first unread message

llowder

unread,
Nov 26, 2012, 11:28:00 AM11/26/12
to puppe...@googlegroups.com
After doing some digging and poking around, I have found out why it is doing what it is doing, and why include and param syntaxes work differently.

There are a few different ways to fix this, but the implications are somewhat wide reaching and there isn't an easy fix.

The basic problem is that nodes get loaded as classes - "node ubuntu" and "node /ubuntu/" both load a class named "ubuntu" while "node /ubunt./" loads a class named "ubunt.".

Using param syntax ( class { 'ubuntu': } ) the class gets loaded regardless of which type of node def you use, while using include syntax, it only gets loaded for the wild card regex node name ( node /ubunt./ ).

There are two general ways this can be fixed.

1) Make include syntax work like param syntax, where the class always gets loaded.
2) Disallow nodes and classes to have the same name all together

2 isn't that good of an option, and would not be an option in the 2.7 or 3.x lines under SemVer, as it would be backwards incompatible, so 4.x would be the minimum target for it.

For 1.. there are a few ways I have been able to come up with, but none they all have potentially far reaching implications.

a) munge the class names if the class is a node
b) use a new scope level for nodes (this would also allow node level variables to be accessible in classes, which they currently are not)

Both of these options have potential to break things. I somewhat like option a, but I am not familiar with the code to be able to be sure that all access points to that name would properly munge/demunge so as to be transparent / not break existing code, if that is even possible.

With option b, I also cannot think of any way to do it that would be compatible with existing code.

Does anyone have any thoughts / ideas? Is this even something that "needs" to be fixed in the 2.7.x or 3.x code bases?

This bug isn't something that impacts me personally - I just thought it looked interesting. I don't have any uses cases that would need to do this, but I can see how in some cases it would be useful, especially when it comes to bootstrapping.


Details of #1372 are at http://projects.puppetlabs.com/issues/1372

Jeff McCune

unread,
Nov 26, 2012, 12:56:05 PM11/26/12
to puppe...@googlegroups.com
On Mon, Nov 26, 2012 at 8:28 AM, llowder <llow...@gmail.com> wrote:
> After doing some digging and poking around, I have found out why it is doing
> what it is doing, and why include and param syntaxes work differently.

Thanks for taking the time and effort to look into this!

> There are a few different ways to fix this, but the implications are
> somewhat wide reaching and there isn't an easy fix.
>
> The basic problem is that nodes get loaded as classes - "node ubuntu" and
> "node /ubuntu/" both load a class named "ubuntu" while "node /ubunt./" loads
> a class named "ubunt.".

Do we know why nodes get loaded as classes? My first reaction while
reviewing this is that I don't fully understand why a node and a class
are conceptually muddled up together. If we're not sure why, or it
was an accident, perhaps we could simply make a node completely
separate from a class. This separation should prevent the conflict,
no?

-Jeff

Andy Parker

unread,
Nov 26, 2012, 12:58:04 PM11/26/12
to puppe...@googlegroups.com
On Mon, Nov 26, 2012 at 8:28 AM, llowder <llow...@gmail.com> wrote:
After doing some digging and poking around, I have found out why it is doing what it is doing, and why include and param syntaxes work differently.

There are a few different ways to fix this, but the implications are somewhat wide reaching and there isn't an easy fix.

The basic problem is that nodes get loaded as classes - "node ubuntu" and "node /ubuntu/" both load a class named "ubuntu" while "node /ubunt./" loads a class named "ubunt.".

Using param syntax ( class { 'ubuntu': } ) the class gets loaded regardless of which type of node def you use, while using include syntax, it only gets loaded for the wild card regex node name ( node /ubunt./ ).

There are two general ways this can be fixed.

1) Make include syntax work like param syntax, where the class always gets loaded.
2) Disallow nodes and classes to have the same name all together

2 isn't that good of an option, and would not be an option in the 2.7 or 3.x lines under SemVer, as it would be backwards incompatible, so 4.x would be the minimum target for it.


Yeah, option 2 doesn't seem ideal. It also doesn't seem right that we put in a restriction like that simply because the internals can't tell things apart. Bad UX.
 
For 1.. there are a few ways I have been able to come up with, but none they all have potentially far reaching implications.

a) munge the class names if the class is a node
b) use a new scope level for nodes (this would also allow node level variables to be accessible in classes, which they currently are not)

Both of these options have potential to break things. I somewhat like option a, but I am not familiar with the code to be able to be sure that all access points to that name would properly munge/demunge so as to be transparent / not break existing code, if that is even possible.

I remember we talked about this, but why can't we just make "include foo" do the exact same things as "class { foo: }"?
 

With option b, I also cannot think of any way to do it that would be compatible with existing code.

Does anyone have any thoughts / ideas? Is this even something that "needs" to be fixed in the 2.7.x or 3.x code bases?

It is something that needs to be resolved at some point, but I don't think the timing on it is critical.
 

This bug isn't something that impacts me personally - I just thought it looked interesting. I don't have any uses cases that would need to do this, but I can see how in some cases it would be useful, especially when it comes to bootstrapping.


You do like jumping in at the deep end, don't you :)
 

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To view this discussion on the web visit https://groups.google.com/d/msg/puppet-dev/-/FR2eGTDbCjUJ.
To post to this group, send email to puppe...@googlegroups.com.
To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en.

llowder

unread,
Nov 26, 2012, 2:34:56 PM11/26/12
to puppe...@googlegroups.com


On Monday, November 26, 2012 11:58:07 AM UTC-6, Andy Parker wrote:
On Mon, Nov 26, 2012 at 8:28 AM, llowder <llow...@gmail.com> wrote:
After doing some digging and poking around, I have found out why it is doing what it is doing, and why include and param syntaxes work differently.

There are a few different ways to fix this, but the implications are somewhat wide reaching and there isn't an easy fix.

The basic problem is that nodes get loaded as classes - "node ubuntu" and "node /ubuntu/" both load a class named "ubuntu" while "node /ubunt./" loads a class named "ubunt.".

Using param syntax ( class { 'ubuntu': } ) the class gets loaded regardless of which type of node def you use, while using include syntax, it only gets loaded for the wild card regex node name ( node /ubunt./ ).

There are two general ways this can be fixed.

1) Make include syntax work like param syntax, where the class always gets loaded.
2) Disallow nodes and classes to have the same name all together

2 isn't that good of an option, and would not be an option in the 2.7 or 3.x lines under SemVer, as it would be backwards incompatible, so 4.x would be the minimum target for it.


Yeah, option 2 doesn't seem ideal. It also doesn't seem right that we put in a restriction like that simply because the internals can't tell things apart. Bad UX.
 
For 1.. there are a few ways I have been able to come up with, but none they all have potentially far reaching implications.

a) munge the class names if the class is a node
b) use a new scope level for nodes (this would also allow node level variables to be accessible in classes, which they currently are not)

Both of these options have potential to break things. I somewhat like option a, but I am not familiar with the code to be able to be sure that all access points to that name would properly munge/demunge so as to be transparent / not break existing code, if that is even possible.

I remember we talked about this, but why can't we just make "include foo" do the exact same things as "class { foo: }"?

The question is.. how?  include will check to see if the class is already loaded, and if it is not, then it loads the class, if it has been, then it skips it and moves on to the next.

Param syntax does not do this check, and will throw an error if the class is already loaded.

What I don't understand is why having two classes "ubuntu" and "ubuntu" doesn't cause an error, yet does cause the include to misfire.

One of the things that was mentioned in IRC involved checking when the class was loaded to see if it was a node or an actual class, but I am unsure how to do that.

In puppet/lib/puppet/parser/ast/resource.rb it has:

          scope.compiler.add_resource(scope, resource)
          scope.compiler.evaluate_classes([resource_title],scope,false) if fully_qualified_type == 'class'

"fully_qualified_type" is set with the line: fully_qualified_type, resource_titles = scope.resolve_type_and_titles(type, resource_titles)

In the include function, it just has:

klasses = compiler.evaluate_classes(vals, self, false)

I did find in the compiler where I can check the type of a resource being added to a catalog, but not yet figured out how to use that do something useful

Erik Dalén

unread,
Nov 27, 2012, 4:26:53 AM11/27/12
to puppe...@googlegroups.com
On Monday 26 November 2012 at 18:58, Andy Parker wrote:
> On Mon, Nov 26, 2012 at 8:28 AM, llowder <llow...@gmail.com (mailto:llow...@gmail.com)> wrote:
> > After doing some digging and poking around, I have found out why it is doing what it is doing, and why include and param syntaxes work differently.
> >
> > There are a few different ways to fix this, but the implications are somewhat wide reaching and there isn't an easy fix.
> >
> > The basic problem is that nodes get loaded as classes - "node ubuntu" and "node /ubuntu/" both load a class named "ubuntu" while "node /ubunt./" loads a class named "ubunt.".
> >
> > Using param syntax ( class { 'ubuntu': } ) the class gets loaded regardless of which type of node def you use, while using include syntax, it only gets loaded for the wild card regex node name ( node /ubunt./ ).
> >
> > There are two general ways this can be fixed.
> >
> > 1) Make include syntax work like param syntax, where the class always gets loaded.
> > 2) Disallow nodes and classes to have the same name all together
> >
> > 2 isn't that good of an option, and would not be an option in the 2.7 or 3.x lines under SemVer, as it would be backwards incompatible, so 4.x would be the minimum target for it.
>
> Yeah, option 2 doesn't seem ideal. It also doesn't seem right that we put in a restriction like that simply because the internals can't tell things apart. Bad UX.

Well, as you can reference a variable in the node ubuntu using $ubuntu::variable and you can reference a variable in the class ubuntu using $ubuntu::variable I don't see how you could allow both at the same time.

I think we either have to always rename the node scope to something like "node" and not allow classes with that name (would also make it easy to reference variables in the node scope as it has a fixed name).

Or skip having a special node scope and just merge it with the top scope, which would also make variables in it work more like variables in a ENC.

Neither of these options are entirely backwards compatible though, but I'm not sure any solution to this problem can be entirely backwards compatible.

--
Erik Dalén



Andy Parker

unread,
Nov 27, 2012, 2:10:00 PM11/27/12
to puppe...@googlegroups.com
On Tue, Nov 27, 2012 at 1:26 AM, Erik Dalén <erik.gus...@gmail.com> wrote:
On Monday 26 November 2012 at 18:58, Andy Parker wrote:
> On Mon, Nov 26, 2012 at 8:28 AM, llowder <llow...@gmail.com (mailto:llow...@gmail.com)> wrote:
> > After doing some digging and poking around, I have found out why it is doing what it is doing, and why include and param syntaxes work differently.
> >
> > There are a few different ways to fix this, but the implications are somewhat wide reaching and there isn't an easy fix.
> >
> > The basic problem is that nodes get loaded as classes - "node ubuntu" and "node /ubuntu/" both load a class named "ubuntu" while "node /ubunt./" loads a class named "ubunt.".
> >
> > Using param syntax ( class { 'ubuntu': } ) the class gets loaded regardless of which type of node def you use, while using include syntax, it only gets loaded for the wild card regex node name ( node /ubunt./ ).
> >
> > There are two general ways this can be fixed.
> >
> > 1) Make include syntax work like param syntax, where the class always gets loaded.
> > 2) Disallow nodes and classes to have the same name all together
> >
> > 2 isn't that good of an option, and would not be an option in the 2.7 or 3.x lines under SemVer, as it would be backwards incompatible, so 4.x would be the minimum target for it.
>
> Yeah, option 2 doesn't seem ideal. It also doesn't seem right that we put in a restriction like that simply because the internals can't tell things apart. Bad UX.

Well, as you can reference a variable in the node ubuntu using $ubuntu::variable and you can reference a variable in the class ubuntu using $ubuntu::variable I don't see how you could allow both at the same time.

I don't seem to be able to reference a node variable directly:

> bundle exec puppet apply -e 'node aparker {
$a = "node"
include foo
}
class foo { notice($aparker::a) }
'
Warning: Scope(Class[Foo]): Could not look up qualified variable 'aparker::a'; class aparker could not be found
Scope(Class[Foo]):
Finished catalog run in 0.03 seconds
 

I think we either have to always rename the node scope to something like "node" and not allow classes with that name (would also make it easy to reference variables in the node scope as it has a fixed name).

Or skip having a special node scope and just merge it with the top scope, which would also make variables in it work more like variables in a ENC.

Neither of these options are entirely backwards compatible though, but I'm not sure any solution to this problem can be entirely backwards compatible.


There are all sorts of problems with combining top and node scope. A few months ago I proposed that, but it is a large change that breaks things that people depend on.
 
--
Erik Dalén




--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.

llowder

unread,
Nov 27, 2012, 2:17:38 PM11/27/12
to puppe...@googlegroups.com


On Tuesday, November 27, 2012 1:10:02 PM UTC-6, Andy Parker wrote:
On Tue, Nov 27, 2012 at 1:26 AM, Erik Dalén <erik.gus...@gmail.com> wrote:
On Monday 26 November 2012 at 18:58, Andy Parker wrote:
> On Mon, Nov 26, 2012 at 8:28 AM, llowder <llow...@gmail.com (mailto:llow...@gmail.com)> wrote:
> > After doing some digging and poking around, I have found out why it is doing what it is doing, and why include and param syntaxes work differently.
> >
> > There are a few different ways to fix this, but the implications are somewhat wide reaching and there isn't an easy fix.
> >
> > The basic problem is that nodes get loaded as classes - "node ubuntu" and "node /ubuntu/" both load a class named "ubuntu" while "node /ubunt./" loads a class named "ubunt.".
> >
> > Using param syntax ( class { 'ubuntu': } ) the class gets loaded regardless of which type of node def you use, while using include syntax, it only gets loaded for the wild card regex node name ( node /ubunt./ ).
> >
> > There are two general ways this can be fixed.
> >
> > 1) Make include syntax work like param syntax, where the class always gets loaded.
> > 2) Disallow nodes and classes to have the same name all together
> >
> > 2 isn't that good of an option, and would not be an option in the 2.7 or 3.x lines under SemVer, as it would be backwards incompatible, so 4.x would be the minimum target for it.
>
> Yeah, option 2 doesn't seem ideal. It also doesn't seem right that we put in a restriction like that simply because the internals can't tell things apart. Bad UX.

Well, as you can reference a variable in the node ubuntu using $ubuntu::variable and you can reference a variable in the class ubuntu using $ubuntu::variable I don't see how you could allow both at the same time.

I don't seem to be able to reference a node variable directly:

> bundle exec puppet apply -e 'node aparker {
$a = "node"
include foo
}
class foo { notice($aparker::a) }
'
Warning: Scope(Class[Foo]): Could not look up qualified variable 'aparker::a'; class aparker could not be found
Scope(Class[Foo]):
Finished catalog run in 0.03 seconds
 


You can't. A while back there appearantly some discussion as to whether or not top scope or node scope should be used when $::foo was used. Top scope won, and node scope became only accessible in that actual node - outside of classes.

It would be nice if there was a way to access $node::foo though, but I think hiera now provides a much cleaner way of providing that sort of thing, at least for all the use cases I can think of (node level 'global' defaults that differ from node to node enough to not be a pure topscope global)

Erik Dalén

unread,
Nov 27, 2012, 3:17:47 PM11/27/12
to puppe...@googlegroups.com
On Tuesday 27 November 2012 at 20:17, llowder wrote:
>
>
> On Tuesday, November 27, 2012 1:10:02 PM UTC-6, Andy Parker wrote:
> > On Tue, Nov 27, 2012 at 1:26 AM, Erik Dalén <erik.gus...@gmail.com (javascript:)> wrote:
> > > On Monday 26 November 2012 at 18:58, Andy Parker wrote:
> > > > On Mon, Nov 26, 2012 at 8:28 AM, llowder <llow...@gmail.com (javascript:) (mailto:llow...@gmail.com (javascript:))> wrote:
> > > > > After doing some digging and poking around, I have found out why it is doing what it is doing, and why include and param syntaxes work differently.
> > > > >
> > > > > There are a few different ways to fix this, but the implications are somewhat wide reaching and there isn't an easy fix.
> > > > >
> > > > > The basic problem is that nodes get loaded as classes - "node ubuntu" and "node /ubuntu/" both load a class named "ubuntu" while "node /ubunt./" loads a class named "ubunt.".
> > > > >
> > > > > Using param syntax ( class { 'ubuntu': } ) the class gets loaded regardless of which type of node def you use, while using include syntax, it only gets loaded for the wild card regex node name ( node /ubunt./ ).
> > > > >
> > > > > There are two general ways this can be fixed.
> > > > >
> > > > > 1) Make include syntax work like param syntax, where the class always gets loaded.
> > > > > 2) Disallow nodes and classes to have the same name all together
> > > > >
> > > > > 2 isn't that good of an option, and would not be an option in the 2.7 or 3.x lines under SemVer, as it would be backwards incompatible, so 4.x would be the minimum target for it.
> > > >
> > > > Yeah, option 2 doesn't seem ideal. It also doesn't seem right that we put in a restriction like that simply because the internals can't tell things apart. Bad UX.
> > >
> > > Well, as you can reference a variable in the node ubuntu using $ubuntu::variable and you can reference a variable in the class ubuntu using $ubuntu::variable I don't see how you could allow both at the same time.
> >
> > I don't seem to be able to reference a node variable directly:
> >
> > > bundle exec puppet apply -e 'node aparker {
> > $a = "node"
> > include foo
> > }
> > class foo { notice($aparker::a) }
> > '
> > Warning: Scope(Class[Foo]): Could not look up qualified variable 'aparker::a'; class aparker could not be found
> > Scope(Class[Foo]):
> > Finished catalog run in 0.03 seconds
> >
>
>
>
> You can't. A while back there appearantly some discussion as to whether or not top scope or node scope should be used when $::foo was used. Top scope won, and node scope became only accessible in that actual node - outside of classes.
>
> It would be nice if there was a way to access $node::foo though, but I think hiera now provides a much cleaner way of providing that sort of thing, at least for all the use cases I can think of (node level 'global' defaults that differ from node to node enough to not be a pure topscope global)
>

I was sure that worked, been a while since I used nodes at all :)

But it seems you can lookup a node scope variable outside of the node scope. But I think it is a bit confusing design that a variable defined in a node scope doesn't work the same as a variable defined for the node in a ENC. Can also make it a bit trickier to switch between the two.
Example from http://docs.puppetlabs.com/guides/scope_and_puppet.html which seems to work.

$var = "from topscope"
node default {
$var = "from node"
include lookup_example
}
class lookup_example {
notify { $var: } # will be "from node"
notify { $::var: } # will be "from topscope"
}
>
> > >
> > > I think we either have to always rename the node scope to something like "node" and not allow classes with that name (would also make it easy to reference variables in the node scope as it has a fixed name).
> > >
> > > Or skip having a special node scope and just merge it with the top scope, which would also make variables in it work more like variables in a ENC.
> > >
> > > Neither of these options are entirely backwards compatible though, but I'm not sure any solution to this problem can be entirely backwards compatible.
> >
> > There are all sorts of problems with combining top and node scope. A few months ago I proposed that, but it is a large change that breaks things that people depend on.
> >
> > > --
> > > Erik Dalén
> > >
> > >
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
> > > To post to this group, send email to puppe...@googlegroups.com (javascript:).
> > > To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com (javascript:).
> > > For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en.
> >
>
> --
> You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
> To view this discussion on the web visit https://groups.google.com/d/msg/puppet-dev/-/v1HFPFu99acJ.
> To post to this group, send email to puppe...@googlegroups.com (mailto:puppe...@googlegroups.com).
> To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com (mailto:puppet-dev+...@googlegroups.com).

Andy Parker

unread,
Nov 27, 2012, 3:23:53 PM11/27/12
to puppe...@googlegroups.com
Yes, that is the expected behavior. There is a lot of historical reason for this and also the reason that the current scoping algorithm is getting called "2-step scoping" internally :)

The rules for looking up a variable boil down to:

  * if the variable contains '::' then split it into the class name and the variable name. Lookup the class and then read the variable out of that classes scope object (yes, there are some odd things you can do with that)
  * otherwise look up by walking scopes in the following order looking for the first definition found: current scope, inherited class scopes (all the way up inheritance), bound node scope, inherited node scopes, topscope.

All in all, this doesn't pose a problem for the issue being discussed, since you can't directly reference a node scope and so the collision between node and class should be able  to be cleared up without problem (at least from that perspective)
 
>
> > >
> > > I think we either have to always rename the node scope to something like "node" and not allow classes with that name (would also make it easy to reference variables in the node scope as it has a fixed name).
> > >
> > > Or skip having a special node scope and just merge it with the top scope, which would also make variables in it work more like variables in a ENC.
> > >
> > > Neither of these options are entirely backwards compatible though, but I'm not sure any solution to this problem can be entirely backwards compatible.
> >
> > There are all sorts of problems with combining top and node scope. A few months ago I proposed that, but it is a large change that breaks things that people depend on.
> >
> > > --
> > > Erik Dalén
> > >
> > >
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
> > > To post to this group, send email to puppe...@googlegroups.com (javascript:).
> > > To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com (javascript:).
> > > For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en.
> >
>
> --
> You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
> To view this discussion on the web visit https://groups.google.com/d/msg/puppet-dev/-/v1HFPFu99acJ.
> To post to this group, send email to puppe...@googlegroups.com (mailto:puppe...@googlegroups.com).
> To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com (mailto:puppet-dev+...@googlegroups.com).
> For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en.



--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To post to this group, send email to puppe...@googlegroups.com.
To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages