Jira (PUP-10158) Filename with non-UTF8-encoding corrupts state.yaml

Torsten Roloff (JIRA)

Dec 4, 2019, 6:21:04 AM
to puppe...@googlegroups.com
Torsten Roloff created an issue
 
Puppet / Bug PUP-10158
Filename with non-UTF8-encoding corrupts state.yaml
Issue Type: Bug
Affects Versions: PUP 6.10.1
Assignee: Unassigned
Attachments: test_from.zip
Components: Windows
Created: 2019/12/04 3:20 AM
Priority: Minor
Reporter: Torsten Roloff

Puppet Version: Agent 5.5.14
Puppet Server Version: 6.7.2-1stretch
OS Name/Version: Windows 7

I'm neither an encoding nor a Puppet expert; this description is what happens from my point of view:

I use a simple file resource to copy a directory (attached as test_from.zip) to a Windows client like this:

file { 'c:/test_to':
  ensure => directory,
  backup => false,
  force => true,
  purge => true,
  recurse => true,
  source => "c:/test_from",
}

Unfortunately, the source contains some files with non-unicode characters in their filenames (ASCII-encoded?).

These filenames are written to state.yaml and transactionstore.yaml during the agent run.

During the next run, Puppet tries to read the YAML files (in /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/yaml.rb, Puppet::Util::Yaml.safe_load_file) and gives the error shown below. This may be because safe_load_file assumes the content of the state files to be UTF-8 (Puppet::FileSystem.read(filename, :encoding => 'bom|utf-8')).
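
A minimal Ruby sketch of the suspected mechanism, not Puppet's actual code: it writes one key containing a CP1252-encoded "ä" (the single byte 0xE4) to a temporary file and reads it back with the same :encoding option quoted above, using plain File.read in place of Puppet::FileSystem.read. The helper name read_as_utf8 and the sample key are made up for illustration.

```ruby
require "tempfile"

# Hypothetical helper: write raw bytes, then read them back tagged as UTF-8,
# mirroring the 'bom|utf-8' option mentioned in the report.
def read_as_utf8(bytes)
  Tempfile.create("state") do |f|
    f.binmode
    f.write(bytes)
    f.flush
    File.read(f.path, encoding: "bom|utf-8")
  end
end

# "ä" is the single byte 0xE4 in Windows-1252 (CP1252).
cp1252_bytes = "File[c:/test_from/Freih\xE4ndige.doc]:\n".b

as_utf8 = read_as_utf8(cp1252_bytes)
puts as_utf8.encoding          # UTF-8 (only a label, the bytes are unchanged)
puts as_utf8.valid_encoding?   # false: 0xE4 followed by "n" is not valid UTF-8

# Reinterpreting the same bytes as CP1252 and transcoding gives valid UTF-8.
repaired = as_utf8.dup.force_encoding(Encoding::Windows_1252).encode(Encoding::UTF_8)
puts repaired.valid_encoding?  # true
```

The point of the sketch is only that the bytes on disk are fine as CP1252 but cannot be interpreted as UTF-8, which is consistent with the "invalid ... UTF-8 octet" errors in the log.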

I know it is/may be mainly an issue of the wrongly encoded filenames, but maybe it is possible to make puppet a little bit more robust.

Desired Behavior:

Agent run should finish without errors.

Actual Behavior:

First agent run finishes without errors.

From the second run, I get this error:

c:\ProgramData\PuppetLabs\puppet\cache\state>"c:\Program Files\Puppet Labs\Puppet\bin\puppet.bat" agent --test
Error: Checksumfile C:/ProgramData/PuppetLabs/puppet/cache/state/state.yaml is corrupt ((C:/ProgramData/PuppetLabs/puppet/cache/state/state.yaml): invalid leading UTF-8 octet at line 1 column 1); replacing
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for roloffvm.net.adk.de
Error: Transaction store file C:/ProgramData/PuppetLabs/puppet/cache/state/transactionstore.yaml is corrupt ((C:/ProgramData/PuppetLabs/puppet/cache/state/transactionstore.yaml): invalid leading UTF-8 octet at line 1 column 1); replacing
Wrapped exception:
(C:/ProgramData/PuppetLabs/puppet/cache/state/transactionstore.yaml): invalid leading UTF-8 octet at line 1 column 1

Workaround

Find all files with "wrong" filenames in the source directory and rename them. I copied state.yaml to a Linux machine and ran something like:

:~$ grep -axv '.*' state.yaml 
 
File[c:/test_from/Nat. Bek. Freih�ndige Vergabe.doc]:
File[c:/test_from/Nat. Bek. Freih�ndige Vergabe.rtf]:

Then, I renamed these files.
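
For hosts without grep, the same check can be sketched in Ruby. This is an illustrative equivalent of the grep call above, not part of Puppet; the function name invalid_utf8_lines is hypothetical.

```ruby
# Return [line_number, text] pairs for every line that is not valid UTF-8.
# Invalid bytes are shown as "?" in the returned text.
def invalid_utf8_lines(path)
  bad = []
  File.open(path, "rb") do |f|
    f.each_line.with_index(1) do |line, lineno|
      text = line.chomp.force_encoding(Encoding::UTF_8)
      bad << [lineno, text.scrub("?")] unless text.valid_encoding?
    end
  end
  bad
end

# Usage: ruby find_bad_lines.rb state.yaml
ARGV.each do |path|
  invalid_utf8_lines(path).each { |lineno, text| puts "#{path}:#{lineno}: #{text}" }
end
```

The file is opened in binary mode so Ruby does not attempt any decoding before the per-line validity check.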

This message was sent by Atlassian JIRA (v7.7.1#77002-sha1:e75ca93)

Josh Cooper (JIRA)

Dec 4, 2019, 11:30:07 AM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-10158
 
Re: Filename with non-UTF8-encoding corrupts state.yaml

Puppet assumes content written to state.yaml and transactionstore.yaml is UTF-8 encoded. It sounds like the data is being written to those files not as UTF-8, but using Ruby's default encoding, which on German Windows is (I think) Encoding::CP1252. This occurs in:

    Puppet::Util::Yaml.dump(@new_data, Puppet[:transactionstorefile])

and

      Puppet::Util::Yaml.dump(@@state, Puppet[:statefile])

We need to ensure strings in @new_data and @@state hashes are utf-8 encoded during serialization.
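
One way this could look, sketched under assumptions and not the change that was made in Puppet: walk the data structure before dumping and transcode every string to UTF-8. The helper names utf8_string and deep_utf8 are hypothetical, and falling back to Windows-1252 for unlabeled or mislabeled bytes is a guess based on the German Windows case above.

```ruby
require "yaml"

# Hypothetical helper: return a valid UTF-8 copy of str.
# Strings labeled UTF-8 (but invalid) or binary are assumed to hold
# bytes in the fallback encoding; other labels are trusted as-is.
def utf8_string(str, fallback = Encoding::Windows_1252)
  return str if str.encoding == Encoding::UTF_8 && str.valid_encoding?

  untrusted = [Encoding::UTF_8, Encoding::BINARY].include?(str.encoding)
  source = untrusted ? fallback : str.encoding
  str.dup.force_encoding(source)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

# Hypothetical helper: apply utf8_string to every string in nested
# hashes and arrays, leaving other values untouched.
def deep_utf8(value, fallback = Encoding::Windows_1252)
  case value
  when String then utf8_string(value, fallback)
  when Hash
    value.each_with_object({}) do |(k, v), out|
      out[deep_utf8(k, fallback)] = deep_utf8(v, fallback)
    end
  when Array then value.map { |v| deep_utf8(v, fallback) }
  else value
  end
end

# Stand-in for the state hash: one key holding CP1252 bytes.
state = {
  "File[c:/test_from/Freih\xE4ndige.doc]".b.force_encoding(Encoding::Windows_1252) =>
    { "ensure" => "file" }
}

clean = deep_utf8(state)
dumped = YAML.dump(clean)   # dumped is now safe to write and re-read as UTF-8
```

Bytes that have no mapping in the source encoding are replaced rather than raising, which trades exactness of the stored name for a state file that can always be read back.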

Josh Cooper (Jira)

Nov 17, 2020, 10:44:03 AM
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Epic Link: PUP-7548
This message was sent by Atlassian Jira (v8.5.2#805002-sha1:a66f935)
