Jira (PUP-6447) Allow UTF8 files to have a leading BOM to be more tolerant of files produced on Windows


Ethan Brown (JIRA)

Jun 27, 2016, 7:11:05 PM
to puppe...@googlegroups.com
Ethan Brown updated an issue
 
Puppet / Bug PUP-6447
Allow UTF8 files to have a leading BOM to be more tolerant of files produced on Windows
Change By: Ethan Brown
Fix Version/s: PUP 4.6.x
 
This message was sent by Atlassian JIRA (v6.4.13#64028-sha1:b7939e9)

Ethan Brown (JIRA)

Jun 27, 2016, 7:11:10 PM
to puppe...@googlegroups.com
Ethan Brown created an issue
Issue Type: Bug
Assignee: Unassigned
Components: Windows
Created: 2016/06/27 4:10 PM
Labels: windows i18n utf-8
Priority: Normal
Reporter: Ethan Brown

UTF-8 files produced on Windows often contain a BOM. Puppet now reads most files as utf-8, but doesn't allow for them to have a leading BOM.

This can be a problem for end users on Windows, where they might need to go through workarounds to produce content without a BOM. An easy solution is to modify any calls specifying utf-8 when opening files for read, to instead specify bom|utf-8. Ruby will handle the BOM correctly in this case.

Ethan Brown (JIRA)

Jun 27, 2016, 7:12:10 PM
to puppe...@googlegroups.com
Ethan Brown updated an issue
Change By: Ethan Brown
UTF-8 files produced on Windows often contain a BOM. Puppet now reads most files as utf-8, but doesn't allow for them to have a leading BOM.

This can be a problem for end users on Windows, where they might need to go through workarounds to produce content without a BOM. For instance, PowerShell can write files as UTF-8 with the {{Out-File}} cmdlet, specifying {{-Encoding UTF8}}, but the files are written with a BOM, which ends up blowing up Ruby if we try to read the files in Puppet.

An easy solution is to modify any calls specifying {{utf-8}} when opening files for read, to instead specify {{bom|utf-8}}. Ruby will handle the BOM correctly in this case.

Ethan Brown (JIRA)

Jun 27, 2016, 8:40:05 PM
to puppe...@googlegroups.com
Ethan Brown updated an issue
UTF-8 files produced on Windows often contain a BOM. Puppet now reads most files as utf-8, but doesn't allow for them to have a leading BOM.

This can be a problem for end users on Windows, where they might need to go through workarounds to produce content without a BOM. For instance, PowerShell can write files as UTF-8 with the {{Out-File}} cmdlet, specifying {{-Encoding UTF8}}, but the files are written with a BOM, which ends up blowing up Ruby if we try to read the files in Puppet (under certain circumstances).

An easy solution is to modify any calls specifying {{utf-8}} when opening files for read, to instead specify {{bom|utf-8}}. Ruby will handle the BOM correctly in this case.

Note that the Ruby {{YAML.load_file}} API already handles the BOM situation correctly, so {{YAML}} files will generally be excluded from this problem.
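
As a rough illustration of the suggestion above, the following Ruby sketch writes a small file with a leading UTF-8 BOM and reads it back twice, once as {{utf-8}} and once as {{bom|utf-8}}. The helper name {{read_both}} and the temp file prefix are made up for this example and are not taken from the Puppet code base.

```ruby
require 'tempfile'

# The three bytes of a UTF-8 byte order mark.
BOM_BYTES = "\xEF\xBB\xBF".b

# Hypothetical helper: writes BOM + content to a temp file, then reads it
# back with and without BOM handling.
def read_both(content)
  Tempfile.create(['bom_demo', '.pp']) do |f|
    f.binmode
    f.write(BOM_BYTES + content.b)
    f.flush
    plain    = File.read(f.path, encoding: 'utf-8')      # BOM stays in the string
    tolerant = File.read(f.path, encoding: 'bom|utf-8')  # BOM is skipped if present
    [plain, tolerant]
  end
end

plain, tolerant = read_both("notice('hi')\n")

puts plain.start_with?("\uFEFF")     # => true
puts tolerant.start_with?("\uFEFF")  # => false
puts tolerant == "notice('hi')\n"    # => true
```

With plain {{utf-8}} the BOM survives as the first character (U+FEFF) of the string, which is what a parser then trips over. With {{bom|utf-8}} Ruby consumes it before returning the content, and a file without a BOM reads the same either way.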

Henrik Lindberg (JIRA)

Jun 28, 2016, 6:28:05 AM
to puppe...@googlegroups.com

Craig Gomes (JIRA)

Jul 18, 2016, 4:55:05 PM
to puppe...@googlegroups.com

Ethan Brown (JIRA)

Aug 22, 2016, 4:35:04 PM
to puppe...@googlegroups.com

Ethan Brown (JIRA)

Aug 22, 2016, 4:35:04 PM
to puppe...@googlegroups.com
Ethan Brown updated an issue
Change By: Ethan Brown
Fix Version/s: PUP 4.6.z
Fix Version/s: PUP 4.8.0

Ethan Brown (JIRA)

Sep 21, 2016, 6:28:03 PM
to puppe...@googlegroups.com
Ethan Brown updated an issue
Change By: Ethan Brown
Sprint: Windows 2016-09-21

Ethan Brown (JIRA)

Sep 21, 2016, 6:28:03 PM
to puppe...@googlegroups.com

Ethan Brown (JIRA)

Sep 21, 2016, 6:28:03 PM
to puppe...@googlegroups.com

Erick Banks (JIRA)

Dec 2, 2016, 12:12:03 AM
to puppe...@googlegroups.com
Erick Banks commented on Bug PUP-6447
 
Re: Allow UTF8 files to have a leading BOM to be more tolerant of files produced on Windows

Encountered this again in Windows UTF-8 testing. Saved a DSC manifest as "UTF-8 w/ BOM" and the run errored out with:
Error: Could not parse for environment production: Syntax error at '∩' at C:/Users/Administrator/Desktop/blah.pp:1:1 on node x9bshj31wdcfefy.delivery.puppetlabs.net

Saved to "UTF-8" and executed same manifest successfully.
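
One plausible reading of the '∩' in that error, offered as an assumption rather than something confirmed in this thread: the first BOM byte (0xEF) is being displayed through an OEM code page such as 437, where that byte maps to '∩'. A small Ruby sketch of that mapping:

```ruby
# The three bytes of a UTF-8 byte order mark.
bom_bytes = "\xEF\xBB\xBF".b

# Interpreted as UTF-8, the three bytes form the single character U+FEFF.
as_utf8 = bom_bytes.dup.force_encoding(Encoding::UTF_8)

# Interpreted as code page 437 (an assumed console code page), each byte
# becomes its own character; transcode to UTF-8 so it can be printed.
as_cp437 = bom_bytes.dup.force_encoding(Encoding::IBM437).encode(Encoding::UTF_8)
first_char = as_cp437[0]

puts as_utf8 == "\uFEFF"  # => true
puts first_char           # => ∩
```

Either way, the position 1:1 in the error is consistent with the parser seeing the BOM as the first token of the manifest.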

Jeremy Adams (JIRA)

Dec 2, 2016, 12:18:03 AM
to puppe...@googlegroups.com
Jeremy Adams commented on Bug PUP-6447

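Recently, when doing an integration in JavaScript that wrote a Facter fact in JSON, I encountered this issue. Facter would not tolerate UTF-8 with a BOM for the structured JSON fact. The workaround was to have PowerShell write the file through a UTF8Encoding created with BOM output disabled:

// Builds a PowerShell script that writes the facts file as UTF-8 without a BOM.
// factsDir, factsFile and oneLineFacts are defined elsewhere in the integration.
var command = "New-Item -path '" + factsDir + "' -type directory -force \n" +
              "$MyPath = '" + factsFile + "' \n" +
              "$MyFile = " + oneLineFacts + " \n" +
              "$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding($False) \n" +
              "[System.IO.File]::WriteAllLines($MyPath, $MyFile, $Utf8NoBomEncoding)";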

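For comparison, a consumer-side sketch in Ruby: strip a leading BOM from the text before handing it to the JSON parser. This is only an illustration of the idea; {{strip_bom}} is a hypothetical helper and the fact content is made up, and nothing here reflects how Facter actually reads external facts.

```ruby
require 'json'

# Hypothetical helper: removes a single leading U+FEFF if present and
# leaves any other text unchanged.
def strip_bom(text)
  text.sub(/\A\uFEFF/, '')
end

# Simulated content of a JSON fact file that was saved as "UTF-8 with BOM".
raw = "\uFEFF{\"role\":\"web\"}"

facts = JSON.parse(strip_bom(raw))
puts facts['role']  # => web
```

Stripping on read means producers such as {{Out-File -Encoding UTF8}} would not need a workaround like the one above.
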
Ethan Brown (JIRA)

May 16, 2017, 7:08:02 PM
to puppe...@googlegroups.com

Ethan Brown (JIRA)

Dec 12, 2017, 2:04:04 AM
to puppe...@googlegroups.com
Ethan Brown updated an issue
Change By: Ethan Brown
Priority: Major (was: Normal)

Ethan Brown (JIRA)

Dec 12, 2017, 10:22:03 AM
to puppe...@googlegroups.com
Ethan Brown commented on Bug PUP-6447
 
Re: Allow UTF8 files to have a leading BOM to be more tolerant of files produced on Windows

Note that this issue has specifically been raised recently around using ERB templates for PowerShell files in MODULES-1996.

The call to Puppet::FileSystem.read_preserve_line_endings doesn't use the form of read that allows BOMs when reading files - see https://github.com/puppetlabs/puppet/blob/5b8386ce9edf944ceee8328526dbb9f238c1403a/lib/puppet/file_system/windows.rb#L109 and https://github.com/puppetlabs/puppet/blob/5b8386ce9edf944ceee8328526dbb9f238c1403a/lib/puppet/file_system/file_impl.rb#L80

It should be straightforward enough to tactically replace that one problem area (in a separate ticket).
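
A minimal sketch of what such a tactical change could look like, assuming the goal is a binary-mode read (so CRLF line endings are preserved) that also skips a leading BOM. {{read_preserving_line_endings}} is a hypothetical stand-in and not the actual {{Puppet::FileSystem}} implementation; the template content is made up.

```ruby
require 'tempfile'

# Hypothetical stand-in: binary mode avoids newline translation, and the
# bom|utf-8 encoding makes Ruby skip a leading BOM if one is present.
def read_preserving_line_endings(path)
  File.read(path, mode: 'rb', encoding: 'bom|utf-8')
end

# Simulate a PowerShell template saved with a BOM and CRLF line endings.
content = Tempfile.create(['template', '.erb']) do |f|
  f.binmode
  f.write("\xEF\xBB\xBF".b + "Write-Host 'hi'\r\nexit 0\r\n".b)
  f.flush
  read_preserving_line_endings(f.path)
end

puts content.inspect  # => "Write-Host 'hi'\r\nexit 0\r\n"
```

The CRLF sequences come back untouched and the BOM is gone, which is the combination the ERB-for-PowerShell case appears to need.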

Josh Cooper (Jira)

Jun 6, 2020, 7:59:03 PM
to puppe...@googlegroups.com
Josh Cooper updated an issue
 
Change By: Josh Cooper
Team: Night's Watch (was: Coremunity)

Josh Cooper (Jira)

Jun 17, 2021, 11:57:02 AM
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Epic Link: PUP-6719