Jira (PUP-10096) Parameters of resources are base64 encoded in puppetdb if created as raw byte strings by ruby functions

25 views
Skip to first unread message

Reinhard Vicinus (JIRA)

unread,
Oct 8, 2019, 2:30:04 PM10/8/19
to puppe...@googlegroups.com
Reinhard Vicinus created an issue
 
Puppet / Bug PUP-10096
Parameters of resources are base64 encoded in puppetdb if created as raw byte strings by ruby functions
Issue Type: Bug Bug
Affects Versions: PUP 6.9.0
Assignee: Unassigned
Created: 2019/10/08 11:29 AM
Priority: Normal Normal
Reporter: Reinhard Vicinus

Puppet Version: 6.10.0
Puppet Server Version: 6.7.0
PuppetDB Version: 6.7.0
OS Name/Version: Ubuntu 18.04.

If a ruby function returns a raw byte string, the puppet server handles it without problems, but if the returned value is used as the value of a parameter of a resource the parameter is stored base64 encoded in the puppetdb even if it only contains utf-8 conform characters.

This issue only appears with puppet 6, puppet 5 is not affected.

To reproduce this issue you need a puppetserver with a puppetdb. Then you need to place the ruby function were the puppetserver can find it:

require 'resolv'
 
module Puppet::Parser::Functions
  newfunction(:rdnslookup, :type => :rvalue) do |args|
    begin
      result = Resolv.new.getname(args[0])
    rescue Exception => e
      warning("No reverse lookup for #{args[0]}: #{e}")
      return nil
    end
    return result
  end
end

and use this short code for the manifest:

$v = rdnslookup('8.8.8.8')
notify { "type: ${type($v)}": }
file { $v:
  path => '/tmp/test.txt',
  content => $v,
}

After running a node against the puppetserver you should be able to find the resource in the puppetdb postgresql database with the postgresql query described in the following section.

Actual Behavior:

select * from resource_params where resource in (select resource from catalog_resources where type = 'File' and title = 'dns.google');
 
                  resource                  |  name   |       value        
--------------------------------------------+---------+--------------------
 \xefad334d873aace70a136edb5a013cf4ccfea364 | alias   | ["/tmp/test.txt"]
 \xefad334d873aace70a136edb5a013cf4ccfea364 | content | "ZG5zLmdvb2dsZQ=="
 \xefad334d873aace70a136edb5a013cf4ccfea364 | path    | "/tmp/test.txt"
(3 rows

I would expect, that the content parameter is not base64 encoded.

Desired Behavior:

select * from resource_params where resource in (select resource from catalog_resources where type = 'File' and title = 'dns.google');
 
                  resource                  |  name   |       value       
--------------------------------------------+---------+-------------------
 \x9f6834bb815051a24387e82bd0845a8b46164400 | alias   | ["/tmp/test.txt"]
 \x9f6834bb815051a24387e82bd0845a8b46164400 | content | "dns.google"
 \x9f6834bb815051a24387e82bd0845a8b46164400 | path    | "/tmp/test.txt"
(3 rows)

If .encode('UTF-8') is append to the result of the ruby function the desired behavior is achieved. But because it worked with puppet 5 and the issue is very annoyingly to debug, I hope that the issue can be fixed in puppet.

Add Comment Add Comment
 
This message was sent by Atlassian JIRA (v7.7.1#77002-sha1:e75ca93)
Atlassian logo

Henrik Lindberg (JIRA)

unread,
Oct 8, 2019, 5:21:03 PM10/8/19
to puppe...@googlegroups.com
Henrik Lindberg commented on Bug PUP-10096
 
Re: Parameters of resources are base64 encoded in puppetdb if created as raw byte strings by ruby functions

Since the function is clearly meant to return "text" it should encode the return string as an UTF-8 string instead of ASCII-8BIT. Then it is totally clear what it is returning.

It could be argued that if ASCII-8BIT encoded text can be changed to UTF-8 without errors then it should be automatically treated as "text" (like in Puppet before version 6 - where we also had difficulty with non compliant UTF-8 strings being used). The issue is that there is a fairly large risk that binary content could represent "garbage text" and be displayed as such.

Auto conversion is generally evil and is usually the cause of unforeseen problems - remember all the funny stuff that happened with automatic string to number conversions?

Reinhard Vicinus (JIRA)

unread,
Oct 8, 2019, 5:59:02 PM10/8/19
to puppe...@googlegroups.com

I don't know if the return value should be auto converted. But in my opinion whatever is done should be done consistently and at the moment, the parameter value gets base64 encoded but the title does not even if it is the same value.

Also I would expect that there is no difference between the catalog submitted to the puppetdb and the catalog stored under /opt/puppetlabs/puppet/cache/client_data/catalog/${fqdn}.json but at the moment there are no base64 encoded values in the node catalog, only in the puppetdb catalog.

Henrik Lindberg (JIRA)

unread,
Oct 8, 2019, 6:14:03 PM10/8/19
to puppe...@googlegroups.com

Thanks Reinhard Vicinus - if you are absolutely sure there is no Binary in the catalog sent to the agent it must be a matter of the ASCII-8BIT getting converted en route to PDB in a different way. That is a problem to look into - it should be done the same way for sure. (Maybe Puppet is already doing what was suggested in terms of auto-conversion on one of the paths; when the catalog is being serialized, but not on the other).

In any case if the function rdnslookup() was changed to return UTF-8 it would prevent the problem. (I think the function should be fixed as that is the prudent thing to do for any function returning what is known to be "text".

Josh Cooper (JIRA)

unread,
Oct 8, 2019, 9:04:03 PM10/8/19
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-10096

The problem stems from ruby's Resolv.getname function which returns a ruby ASCII-8BIT string:

irb(main):001:0> require 'resolv'
=> true
irb(main):002:0> name = Resolv.getname("172.217.14.196")
=> "sea30s01-in-f4.1e100.net"
irb(main):003:0> name.encoding
=> #<Encoding:ASCII-8BIT>

Note it's the same as a ruby binary string:

irb(main):008:0> Encoding::ASCII_8BIT == Encoding::BINARY
=> true

Puppet preserves the binary-ness of the string throughout the catalog compilation process, and converts it to base64 when storing in puppetdb. IIRC postgres can't store null bytes specifically.

Nick Lewis pointed out:

Internally, programs that manipulate domain names should represent them as sequences of labels, where each label is a length octet followed by an octet string.

By convention, domain names can be stored with arbitrary case, but domain name comparisons for all present domain functions are done in a case-insensitive manner, assuming an ASCII character set, and a high order zero bit.

So it is reasonable for the ruby API to return a binary string. However, the function should coerce the string to UTF-8 prior to returning. This can be accomplished via:

result = Resolv.new.getname(args[0])
result.force_encoding(Encoding::UTF_8)
result.scrub!

Ideally it shouldn't be so hard to debug this issue. We can't prevent or warn if any function returns binary data. For example, the binary_file is supposed to return a Binary data type. But perhaps we can leverage the function's return_type. If a function doesn't declare that it returns a Binary string, but returns a String with encoding Encoding::ASCII_8BIT then warn (maybe in strict mode only?), as it should be returning a Binary instead? That way it's clear the function needs to return a String with a non-binary encoding or a Binary value.

Josh Cooper (JIRA)

unread,
Oct 9, 2019, 1:57:04 PM10/9/19
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-10096

From https://en.wikipedia.org/wiki/Punycode:

While the Domain Name System (DNS) technically supports arbitrary sequences of octets in domain name labels, the DNS standards recommend the use of the LDH subset of ASCII conventionally used for host names, and require that string comparisons between DNS domain names should be case-insensitive.

Given that I'd recommend coercing the name as US-ASCII not UTF-8:

result = Resolv.new.getname(args[0])
result.force_encoding(Encoding::US_ASCII)
result.scrub!

Reinhard Vicinus (JIRA)

unread,
Oct 9, 2019, 3:51:05 PM10/9/19
to puppe...@googlegroups.com

Henrik Lindberg Ok, the reason why the value was not binary encoded in the catalog on the node is that there is still a puppet 5 agent running. With a puppet 6 agent the value is binary encoded. This is a case were afterwards I am asking myself why I spend hours debugging the puppetserver/puppetdb part and did not try what happened if I updated the puppet agent on the node...

I'll change the ruby function as suggested by Josh Cooper: Thanks for looking up the correct way to encode DNS domain names.

But I tested some things and I found a further problem:

The puppetdb does not store, that the value was binary encoded. So if you export a resource with a binary string value and import it on another node the value is base64 encoded, because in the puppetdb postgresql database is no information stored, that the value was binary in the catalog and only the base64 value is stored:

$v = rdnslookup('8.8.8.8')
 
case $::fqdn {
  'node1': {
    @@file { $v:
      path => '/tmp/test.txt',
      content => $v,
    }
  }
  'node2': {
    File <<| title == "dns.google" |>>
  }  
}

Results in:

> puppet agent --test --environment=rvlocal
Info: Using configured environment 'rvlocal'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for node2
Info: Applying configuration version 'rvlocal heads/production-0-gee541e2af'
Notice: /Stage[main]/Main/File[dns.google]/content: 
--- /tmp/test.txt       2019-10-09 20:30:15.015623862 +0200
+++ /tmp/puppet-file20191009-22428-jvchtb       2019-10-09 20:53:27.431359605 +0200
@@ -1 +1 @@
-dns.google
\ No newline at end of file
+ZG5zLmdvb2dsZQ==
\ No newline at end of file

But this is more a puppetdb bug as a puppet bug? Should I open a new bug in the puppetdb project area for this issue?

Another behavior I think could be a bug is, that the following code behaves differently:

$converts_string = String.new(rdnslookup('8.8.8.8'), '%s')
$does_not_convert_string = String.new(rdnslookup('8.8.8.8'))

The first line creates a new puppet conform string, the second one returns the same binary string. I personally would expect that both create puppet conform strings.

Coming back to the original problem reported in the bug report my opinion is, that auto conversion is not the solution to this issue. I expect of puppet that if the type returned of a value is String, that the value behaves like a puppet string (that it is internally handled differently by puppet is in my opinion acceptable). As far as I can tell this is the case at the moment except for the puppetdb bug with exported resources. So in my opinion this ticket can be closed.

Henrik Lindberg (JIRA)

unread,
Oct 9, 2019, 4:56:03 PM10/9/19
to puppe...@googlegroups.com

Reinhard Vicinus thanks for all the detective work on this and the detailed reports. It really helps!

It would be great to file another PUP ticket for the String new with / without %s - both of them should result in an UTF-8 encoded String as that is expected of all strings in the puppet language.

The PDB issue is also a separate issue - please file a ticket for that - it basically means it is not possible to roundtrip (export/import) rich data (which is sad).

That leaves some kind of "auto convert" feature - which we all seem to agree is a bad idea and that this should be fixed at the source (in the function in question). So I am ok with closing this - maybe Josh Cooper wants something different.

Josh Cooper (JIRA)

unread,
Oct 9, 2019, 5:45:02 PM10/9/19
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-10096

we all seem to agree is a bad idea and that this should be fixed at the source (in the function in question)

To help with that, maybe we can warn if a function returns a ruby string with an ascii-8bit/binary encoding, so that module authors know to either force a non-binary encoding (when it can be determined based on the use case), or force the function to return a Binary puppet string and have it declare that using return_type?

Josh Cooper (JIRA)

unread,
Oct 10, 2019, 1:58:03 PM10/10/19
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-10096

This seems to handle the usual case where the function doesn't define a return_type but returns a binary ruby string. I'm not sure about the performance impact though. Thoughts Henrik Lindberg?

diff --git a/lib/puppet/pops/functions/dispatch.rb b/lib/puppet/pops/functions/dispatch.rb
index 3be753d534..a940f5123a 100644
--- a/lib/puppet/pops/functions/dispatch.rb
+++ b/lib/puppet/pops/functions/dispatch.rb
@@ -59,7 +59,13 @@ class Dispatch < Evaluator::CallableSignature
   def invoke(instance, calling_scope, args, &block)
     result = instance.send(@method_name, *weave(calling_scope, args), &block)
     return_type = @type.return_type
-    Types::TypeAsserter.assert_instance_of(nil, return_type, result) { "value returned from function '#{@method_name}'" } unless return_type.nil?
+    if return_type.nil?
+      if result.is_a?(String) && result.encoding == Encoding::BINARY
+        Puppet.warn_once('deprecations', "function_binary_return_type", _("Function '#{@method_name}' returned a binary string, but did not declare `return_type 'Binary'`"))
+      end
+    else
+      Types::TypeAsserter.assert_instance_of(nil, return_type, result) { "value returned from function '#{@method_name}'" }
+    end
     result
   end
 

Henrik Lindberg (JIRA)

unread,
Oct 10, 2019, 2:21:04 PM10/10/19
to puppe...@googlegroups.com

Well, what about arrays with BINARY in them, or hashes, or any nested stuff... Means we have to fully check every returned value. (We would already do that when a function defines a return type though). In any case, a complete recursive scan is needed to be accurate.
We may already be doing that for 3.x functions btw since we have to scrub the return from those for :undef crap IIRC.

We rewrite all 3x returns in convert_return() method in runtime3_converter.rb. It gets called from the wrapper that turns a 3x function into a 4x function - like this

Puppet::Pops::Evaluator::Runtime3FunctionArgumentConverter.convert_return(self.send(real_fname, args[0]))

So, performance wise it would be no big deal to add an extra check there for strings. It is however only for 3x functions.

For 4.x I am not sure what the best approach would be. I thought about making a BINARY string not be a Puppet String....
but I think it may be bad to do that. It would solve one problem, but not handle functions with a return type not constraining the result to not include whatever type BINARY string would have (even the Runtime type is an Any type if we were to not recognize it at all.
Also, it would make checking of something is a String slower - which is not good.

Henrik Lindberg (JIRA)

unread,
Oct 10, 2019, 2:29:03 PM10/10/19
to puppe...@googlegroups.com

Maybe add something similar in debug mode for 4.x functions? At least you can find the culprits that way.

Rob Braden (JIRA)

unread,
Oct 14, 2019, 1:04:04 PM10/14/19
to puppe...@googlegroups.com

Rob Braden (JIRA)

unread,
Oct 14, 2019, 1:04:05 PM10/14/19
to puppe...@googlegroups.com

Josh Cooper (Jira)

unread,
Feb 8, 2022, 2:17:02 AM2/8/22
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Team: Froyo
This message was sent by Atlassian Jira (v8.20.2#820002-sha1:829506d)
Atlassian logo
Reply all
Reply to author
Forward
0 new messages