Jira (PUP-9043) file resource - mtime checksum does not seem to work

Jeff Sparrow (JIRA)

Aug 6, 2018, 3:26:03 PM
to puppe...@googlegroups.com
Jeff Sparrow created an issue
 
Puppet / Bug PUP-9043
file resource - mtime checksum does not seem to work
Issue Type: Bug
Assignee: Unassigned
Created: 2018/08/06 12:25 PM
Labels: file checksum artifactory http
Priority: Normal
Reporter: Jeff Sparrow

Puppet Version:  4.10.9
Puppet Server Version:  2.8.0
OS Name/Version:  Windows 201x

Describe your issue in as much detail as possible…

The mtime checksum for a file resource does not seem to work with Artifactory.  I don't know whether this extends to other HTTP sources, but we are definitely seeing this issue with Artifactory.

I am aware of PUP-6114 for md5 checksum fixes, but this is for mtime.  I didn't see an existing ticket for this, so I am opening this one.

We have 70-some servers that all have this issue.  Here is a file we are obtaining from Artifactory; as you can see, it has a last-modified time of:

 

Last-Modified: Sat, 04 Aug 2018 08:19:42 GMT

 

 

Here are the timestamps of that file on each server.  The size of the file is also drastically different on every server: it's a 1.4GB file, and some servers only have a file size of 100KB.

 

acdc04: 
CreationTime : 8/6/2018 7:23:31 AM
LastWriteTime : 8/6/2018 7:23:31 AM
LastAccessTime : 8/6/2018 7:22:24 AM
acdc22: 
CreationTime : 8/6/2018 7:23:43 AM
LastWriteTime : 8/6/2018 7:23:43 AM
LastAccessTime : 8/6/2018 7:22:23 AM
acdc10:
CreationTime : 8/6/2018 7:24:51 AM
LastWriteTime : 8/6/2018 7:24:51 AM
LastAccessTime : 8/6/2018 7:19:49 AM
acdc01:
CreationTime : 8/6/2018 7:23:33 AM
LastWriteTime : 8/6/2018 7:23:33 AM
LastAccessTime : 8/6/2018 7:22:23 AM
acdc15:
CreationTime : 8/6/2018 7:24:25 AM
LastWriteTime : 8/6/2018 7:24:24 AM
LastAccessTime : 8/6/2018 7:19:47 AM
acdc06:
CreationTime : 8/6/2018 7:24:35 AM
LastWriteTime : 8/6/2018 7:24:35 AM
LastAccessTime : 8/6/2018 7:20:10 AM
acdc07:
CreationTime : 8/6/2018 7:23:35 AM
LastWriteTime : 8/6/2018 7:23:35 AM
LastAccessTime : 8/6/2018 7:20:19 AM
acdc21:
CreationTime : 8/6/2018 7:25:20 AM
LastWriteTime : 8/6/2018 7:25:20 AM
LastAccessTime : 8/6/2018 7:19:59 AM
acdc25:
CreationTime : 8/6/2018 7:24:42 AM
LastWriteTime : 8/6/2018 7:24:41 AM
LastAccessTime : 8/6/2018 7:19:48 AM
acdc23:
CreationTime : 8/6/2018 7:25:04 AM
LastWriteTime : 8/6/2018 7:25:03 AM
LastAccessTime : 8/6/2018 7:22:23 AM

 

Describe steps to reproduce…

Download a file from an Artifactory HTTP URI.

No matter how many times you run Puppet, it thinks the file is correct.

Desired Behavior: 

The file is the correct file and Puppet manages it correctly: it is checked and verified on each Puppet run.

Actual Behavior:

The file doesn't get checked locally, so it's always incorrect.  The checksums don't work, so the file is never corrected.

 

It would appear the md5 checksum portion of Artifactory is handled in PUP-6114 - I am not sure if the mtime portion is tracked elsewhere.  

 

This message was sent by Atlassian JIRA (v7.7.1#77002-sha1:e75ca93)

Jorie Tappa (JIRA)

Aug 6, 2018, 5:02:02 PM
to puppe...@googlegroups.com

Josh Cooper (JIRA)

Aug 6, 2018, 7:39:02 PM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-9043
 
Re: file resource - mtime checksum does not seem to work

I can't reproduce this issue directly. Using puppet 4.10.9 and puppetserver-2.8.1-1.el7.noarch with manifest:

file { '/tmp/pl-build-tools-release-el-4.noarch.rpm':
  ensure => file,
  checksum => mtime,
  source => 'https://artifactory.delivery.puppetlabs.net/artifactory/rpm__local/build-tools/yum/pl-build-tools-release-el-4.noarch.rpm',
}

Puppet downloads the file the first time. The second time it doesn't, because the local file's mtime is greater than or equal to the desired time. Running with --http_debug shows the agent making the HEAD request, but the agent doesn't make any changes because the file is in sync:

<- "HEAD /artifactory/rpm__local/build-tools/yum/pl-build-tools-release-el-4.noarch.rpm HTTP/1.1\r\nAccept: */*\r\nUser-Agent: Ruby\r\nConnection: close\r\nHost: artifactory.delivery.puppetlabs.net\r\n\r\n"
-> "HTTP/1.1 200 OK\r\n"
-> "Server: Artifactory/X.X.X\r\n"
-> "X-Artifactory-Id: afec80514c436b4eadb7d267da0a18a8b6e8377d\r\n"
-> "X-Artifactory-Node-Id: artifactory-app-prod-1\r\n"
-> "Last-Modified: Tue, 06 Feb 2018 18:42:51 GMT\r\n"
-> "ETag: 4b28a8c16ff30c8544f6a60b5b8d06f7f69ae064\r\n"
-> "X-Checksum-Sha1: 4b28a8c16ff30c8544f6a60b5b8d06f7f69ae064\r\n"
-> "X-Checksum-Sha256: d3fb5294021a4ca6beb77f3c1cd4a991ecc901c566687dc682e2195aa19e898d\r\n"
-> "X-Checksum-Md5: f0c3c0df32913a67e5337706df844f4f\r\n"
-> "Accept-Ranges: bytes\r\n"
-> "X-Artifactory-Filename: pl-build-tools-release-el-4.noarch.rpm\r\n"
-> "Content-Disposition: attachment; filename=\"pl-build-tools-release-el-4.noarch.rpm\"; filename*=UTF-8''pl-build-tools-release-el-4.noarch.rpm\r\n"
-> "Content-Type: application/x-rpm\r\n"
-> "Content-Length: 9580\r\n"
-> "Date: Mon, 06 Aug 2018 22:33:04 GMT\r\n"
-> "Connection: close\r\n"
-> "\r\n"
Conn close
Notice: Applied catalog in 0.53 seconds
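
As a rough illustration of the comparison described above (a minimal sketch, not Puppet's actual implementation; the helper name `mtime_insync?` is hypothetical), the check amounts to parsing the Last-Modified header and treating the local file as in sync whenever its mtime is at or after that time:

```ruby
require 'time'

# Hypothetical helper for illustration only. The local file counts as
# in sync when its mtime is greater than or equal to the server's
# Last-Modified time.
def mtime_insync?(local_mtime, last_modified_header)
  remote_mtime = Time.httpdate(last_modified_header)
  local_mtime >= remote_mtime
end

remote = 'Tue, 06 Feb 2018 18:42:51 GMT'

# A file written at download time is newer than the remote mtime.
puts mtime_insync?(Time.utc(2018, 8, 6, 22, 33, 4), remote)  # prints true

# A file older than the remote mtime would be replaced.
puts mtime_insync?(Time.utc(2018, 1, 1), remote)             # prints false
```

Note that nothing in this comparison looks at the file's size or content, so any file written after the remote's Last-Modified time passes.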

However, I've noticed that if the connection is dropped during the download, Ruby will ignore the resulting EOFError, which will cause Puppet to write out a partial file. See https://github.com/ruby/ruby/blob/v2_4_4/lib/net/http/response.rb#L293 and https://github.com/ruby/ruby/blob/v2_4_4/lib/net/protocol.rb#L129.

And since puppet completes the file download, it will rename the temporary truncated file over the real file, and will not update the file again (until the mtime changes on the server).

Since Puppet is downloading a large file, it could very well be affected by transient network failures. Can you verify that the corrupted files are indeed truncated?
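
One way to check for truncation, sketched here under the assumption that the server reports a Content-Length (as in the HEAD response above), is to compare the local file's size with that value. The helper name `truncated?` is hypothetical, and the temporary file only stands in for a downloaded artifact:

```ruby
require 'tempfile'

# Hypothetical check for illustration: the local copy is truncated when
# it holds fewer bytes than the Content-Length the server advertises.
# Header values arrive as strings, so convert before comparing.
def truncated?(path, content_length)
  File.size(path) < Integer(content_length)
end

# Stand-in for a partially downloaded file: 100 bytes on disk.
Tempfile.create('artifact') do |f|
  f.write('x' * 100)
  f.flush
  puts truncated?(f.path, '9580')  # prints true
  puts truncated?(f.path, '100')   # prints false
end
```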

Josh Cooper (JIRA)

Aug 7, 2018, 3:22:01 AM
to puppe...@googlegroups.com

Jeff Sparrow (JIRA)

Aug 7, 2018, 6:54:02 AM
to puppe...@googlegroups.com
Jeff Sparrow commented on Bug PUP-9043

Josh Cooper - I think you hit it right on the head when you said this:

However, i've noticed that if the connection is disconnected during download, ruby will ignore the resulting EOFError, which will cause puppet to write out a partial file. See https://github.com/ruby/ruby/blob/v2_4_4/lib/net/http/response.rb#L293 and https://github.com/ruby/ruby/blob/v2_4_4/lib/net/protocol.rb#L129.

And since puppet completes the file download, it will rename the temporary truncated file over the real file, and will not update the file again (until the mtime changes on the server).

We dug into this a bit further yesterday.  It appears that is exactly what happened on our side.  The Puppet run took place while that file was being transferred to Artifactory.  It is transferred around 6am every day, and this specific group runs Puppet around 8-9am.  However, on this day, the file wasn't transferred to Artifactory until around 8am.  At 1.4GB it took a couple seconds, and it appears only part of the file was downloaded during this team's Puppet runs.  Then, as you alluded to, Puppet will never update the file again, because the local mtime is greater than or equal to that of the remote file.  Since the local mtime is actually greater, but the file was incomplete, it would never be resolved again except through manual intervention.  Thus, between md5 not being available for Artifactory and mtime not being rechecked, the file will always be bad.

Just curious, is there a reason that the "greater than" portion of mtime checking exists?  I understand there would be a slight time difference for the transfer, but it would seem that if it's an extended period of time, say beyond an hour or so, the file has likely been changed (or in our case is incorrect altogether), and Puppet should then view the file as not being the same.

Jeff Sparrow (JIRA)

Mar 22, 2019, 9:38:04 AM
to puppe...@googlegroups.com
Jeff Sparrow commented on Bug PUP-9043

Any idea when, or in which Puppet/Ruby build, this will be fixed? It continues to be an issue with multiple vendors.

Josh Cooper (JIRA)

Aug 8, 2019, 5:08:03 PM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-9043

Specifying custom headers for Artifactory is handled in PUP-6114, so I think the only remaining issue for this ticket is the Ruby silent truncation problem. I'll look at the Ruby PR I submitted a while ago and see if I can move that forward. Worst case, we may need to handle that ourselves in our Connection#get and Connection#request_get methods.
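
For illustration only, handling the truncation on the caller's side could look roughly like the sketch below. It is not the code in Puppet's Connection class; `TruncatedBodyError` and `copy_verified` are hypothetical names, and it assumes the response carries a Content-Length header (which a chunked response may not). The idea is to count the bytes actually received and fail loudly when they fall short, instead of letting a partial body be treated as a completed download:

```ruby
require 'stringio'

# Hypothetical error class for this sketch.
class TruncatedBodyError < StandardError; end

# Copies each chunk to +io+ and raises unless the total byte count matches
# the advertised Content-Length. With Net::HTTP, the chunks could be the
# ones yielded by response.read_body; here any enumerable of strings works.
def copy_verified(chunks, io, content_length)
  expected = Integer(content_length)
  written = 0
  chunks.each do |chunk|
    io.write(chunk)
    written += chunk.bytesize
  end
  return written if written == expected

  raise TruncatedBodyError, "expected #{expected} bytes, got #{written}"
end

# A complete body passes; a short one raises instead of being accepted.
puts copy_verified(%w[ab cd], StringIO.new, '4')  # prints 4
begin
  copy_verified(%w[ab], StringIO.new, '4')
rescue TruncatedBodyError => e
  puts e.message  # prints "expected 4 bytes, got 2"
end
```

With a check like this, the temporary file would never be renamed over the real file when the connection drops mid-transfer, so the next run would retry the download.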
