Jira (PUP-3238) puppet reports "end of file reached" if server closes HTTP connection


Mihkel Ader (JIRA)

Nov 4, 2014, 8:06:24 AM
to puppe...@googlegroups.com
Mihkel Ader commented on Bug PUP-3238

+1 on connection retry feature.

For us, Puppet still occasionally fails with the "end of file reached" message at remote sites with high network latency. The Apache keepalive timeout has been bumped to 15 seconds. We're running Puppet 3.7.2. There are no problems when keepalives are disabled.

This message was sent by Atlassian JIRA (v6.3.7#6337-sha1:2ed701e)

Nick Moriarty (JIRA)

Dec 19, 2014, 6:03:21 AM
to puppe...@googlegroups.com
Nick Moriarty commented on Bug PUP-3238

We're also seeing "end of file" sometimes with KeepAlive, using timeouts of 4 seconds on the client and 10 seconds on the server. Clients are running 3.7.2 and 3.7.3; masters are running 3.7.3.

We're seeing bursts of failed runs on the affected hosts, around seven hours apart but not correlated with each other.

I've added local debugging and I'm running tcpdump on a client and test master.

What I believe I'm seeing on the failed runs (waiting on more data) is:

  • Everything seems normal, and the connection is cached after downloading facts, but before "Loading facts".
  • The client sends an SSL "Encrypted Alert", which I believe contains a request to close the connection (at present I've not decrypted this as we're using DH - turned this off to get more data now).
  • The server acknowledges and sends a FIN/ACK to close the connection, which the client acknowledges immediately.
  • The client picks up the connection (which hasn't reached its keepalive timeout) and tries to use it (it sends successfully).
  • The web server never sees / ignores the request as the connection is already half-closed.
  • An exception is thrown from /usr/lib/ruby/1.8/net/protocol.rb:135:in `sysread' when the client tries to read from the (closed) incoming stream - it does this without timing out.

I'm still trying to find out why this happens reliably in bursts (I can't seem to force it to happen on demand). It looks like the client is deciding to close the connection, but it's definitely not being done via the Puppet connection pool class.

The issue would be alleviated if the connection pool checked that connections were actually still fully open before selecting them. It seems valid for the server to close a connection of its own accord, in which case the client shouldn't try to re-use it. This might be something that can be checked on the HTTP object when determining which connections are active.


Nick Moriarty (JIRA)

Dec 22, 2014, 5:36:27 AM
to puppe...@googlegroups.com
Nick Moriarty commented on Bug PUP-3238

Updates on my previous comment.

We're not seeing the above problem on all of our environments; it appears to be constrained to only some. I added some extra debugging and have discovered the following:

  • The client-issued "Encrypted Alert" described above is indeed a Close Notify (determined by changing our test server to use a non-DH cipher and packet sniffing).
  • This appears to be the first sign of any difference between normal runs and those where the connection is dropped.
  • Having added extra debugging around fact retrieval, I can confirm that the Close Notify is sent after all facts have been retrieved, but before the debug line "Failed to load library 'msgpack' for feature 'msgpack'".

The connection close definitely seems to be client-initiated.

Nick Moriarty (JIRA)

Dec 24, 2014, 6:56:26 AM
to puppe...@googlegroups.com
Nick Moriarty commented on Bug PUP-3238

I've finally tracked down the cause of the client-initiated connection closures causing our issue. These were caused by bad file descriptor usage in a custom fact. In Ruby 1.8, IO objects seem to be very low-level, and the Open3 library can return invalid but valid-looking IO objects from a call to popen3 even if the underlying command fails to run. It looks like these objects can end up referring to other, existing IOs; in our case, this was sometimes the HTTP socket. Closing of the connection appeared to be a side effect of these invalid IOs. I've fixed the associated fact, and the problem seems to have subsided.
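[Editorial note: a rough sketch of the defensive pattern implied above. The helper name is hypothetical. On Ruby 1.8, Open3 could hand back IO objects even when the command failed to exec (the bug described here); modern Rubies raise Errno::ENOENT instead, but reaping the child and checking the exit status is still good hygiene in a custom fact.]

```ruby
require 'open3'

# Run a command defensively: always reap the child via wait_thr and let the
# block form close all three pipes. Returns stdout on success, nil otherwise.
def run_command(*cmd)
  Open3.popen3(*cmd) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    out = stdout.read
    wait_thr.value.success? ? out : nil
  end
rescue Errno::ENOENT
  nil # command not found (raised by popen3 on Ruby >= 1.9)
end
```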

I'd still like to see a retry feature and/or a test of HTTP connections to see if they're actually still open before using them.

Aaron Armstrong (JIRA)

Dec 29, 2014, 6:05:29 PM
to puppe...@googlegroups.com

Josh Cooper (JIRA)

Jan 7, 2015, 8:11:31 PM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-3238

Thanks, Nick Moriarty, for debugging the issue you were seeing. As for retrying HTTP connections, there has been some active debate about that on PUP-2526. Note, however, that POST requests are not idempotent and should not be retried. Since the error you were seeing was due to a POST request:

/usr/lib/ruby/site_ruby/1.8/puppet/network/http/connection.rb:87:in `post'

an HTTP retry would not have helped here.

I looked into testing HTTP connections prior to reusing them. The difficulty I ran into was finding a valid request that didn't require data to be transferred and that worked without requiring a client cert. It may be possible to issue a HEAD request for the certificate authority certificate, e.g. HEAD /production/certificate/ca HTTP/1.1, though that could result in a substantial number of extra requests, especially during pluginsync.
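[Editorial note: a minimal sketch of the HEAD-probe idea, assuming the CA certificate endpoint mentioned above; the helper name is hypothetical and this is not puppet's actual implementation. Note that modern Net::HTTP may itself retry idempotent requests once, which can mask a stale connection.]

```ruby
require 'net/http'

# Probe a cached connection with a cheap HEAD request before reusing it.
# Returns false on any transport-level failure.
def connection_alive?(http)
  response = http.request(Net::HTTP::Head.new('/production/certificate/ca'))
  response.is_a?(Net::HTTPSuccess)
rescue EOFError, Errno::ECONNRESET, Errno::EPIPE, Errno::ECONNREFUSED
  false
end
```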

I think this particular issue most often comes up when facts take a long time to resolve, e.g. about as long as the client http keepalive timeout. Another improvement might be to do something like:

pool do
  pluginsync
end
 
evaluate facts 
 
pool do
  retrieve catalog
  apply catalog
  send report
end

Josh Cooper (JIRA)

Mar 31, 2015, 6:14:12 PM
to puppe...@googlegroups.com
Josh Cooper updated an issue
 
Change By: Josh Cooper
Scrum Team: Client Platform

Jean Bond (JIRA)

Apr 2, 2015, 5:16:20 PM
to puppe...@googlegroups.com
Jean Bond commented on Bug PUP-3238

@josh Is there any Docs work needed on this at this time? It's hanging out on our Kanban board because of the DOCS component, but I'm not seeing what action is required from us just now.

Nick Moriarty (JIRA)

Apr 8, 2015, 6:15:46 AM
to puppe...@googlegroups.com
Nick Moriarty commented on Bug PUP-3238

For info (in case anyone else runs into this), I found the source of another odd issue causing sporadic disconnects (the symptom was sporadic "end of file reached" errors on various file resources). Our masters run Apache 2.4 with KeepAlive (with an appropriate timeout and a good margin), and were configured to use mod_reqtimeout with the default request timeouts. There seems to be an interesting interaction between KeepAlive and reqtimeout which causes some strangeness and can result in connections dropping. I never saw any substantial delays in the logs (certainly nothing near the configured timeouts), but I did find a reference to this on the Apache bug tracker: https://bz.apache.org/bugzilla/show_bug.cgi?id=56729

We've disabled reqtimeout for now to prevent this cropping up.

Testing HTTP connections looks to be moderately difficult, as it appears to require examining the internal state of the HTTP object, and possibly making a non-blocking read of at least one byte from the underlying socket. This results in either Errno::EAGAIN (if there's no data) or EOFError (if the socket was closed by the server).

Jean Bond (JIRA)

Apr 8, 2015, 4:09:56 PM
to puppe...@googlegroups.com

Robert Scheer (JIRA)

Sep 22, 2015, 10:26:36 AM
to puppe...@googlegroups.com
Robert Scheer commented on Bug PUP-3238

I have this issue with a puppet master running Puppet Server (version 1.1), so Apache settings don't apply. How could I mitigate this problem?


Christopher Price (JIRA)

Sep 24, 2015, 10:30:04 AM
to puppe...@googlegroups.com

Josh Cooper does this seem like what you are asking about?:

https://github.com/puppetlabs/trapperkeeper-webserver-jetty9/blob/master/doc/jetty-config.md#idle-timeout-milliseconds

If so, I'm not sure if we've decided how to get those docs into the official PL docs yet.

Josh Cooper (JIRA)

Sep 24, 2015, 1:07:10 PM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-3238

Thanks Christopher Price, that sounds like the one. Robert Scheer can you try setting idle-timeout-milliseconds using the link above?
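[Editorial note: a sketch of what that setting looks like, assuming the HOCON webserver config described in the trapperkeeper-webserver-jetty9 docs linked above. The file path and values are illustrative only and vary by version and packaging.]

```
# conf.d/webserver.conf on the Puppet Server host (exact path varies)
webserver: {
    ssl-host: 0.0.0.0
    ssl-port: 8140
    # Jetty idle timeout in milliseconds; raise it above the slowest
    # expected step of an agent run. 1200000 (20 min) is only an example.
    idle-timeout-milliseconds: 1200000
}
```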

Robert Scheer (JIRA)

Sep 25, 2015, 8:31:03 AM
to puppe...@googlegroups.com
Robert Scheer commented on Bug PUP-3238

Thanks Josh Cooper, I will. Also, upon further examination I found that the "end of file reached" message reported by a puppet client corresponds to a Java exception "java.nio.channels.WritePendingException: null" in puppetserver.log, exactly as described in SERVER-819. This was the case for all of the almost twenty failures I checked, out of some 600 per day (we have 500+ nodes that check in twice per hour).

Robert Scheer (JIRA)

Sep 28, 2015, 6:10:03 AM
to puppe...@googlegroups.com
Robert Scheer commented on Bug PUP-3238

Setting idle-timeout-milliseconds worked: the number of failure reports dropped to 3 in the past 44 hours after doubling its value. What causes our puppet clients to be so slow remains to be seen, but our immediate problem is solved now. Thanks!

Adam Winberg (JIRA)

Oct 16, 2015, 1:35:05 AM
to puppe...@googlegroups.com
Adam Winberg commented on Bug PUP-3238

I have switched from apache/passenger to puppetserver and also get this intermittently. The problem arises at certain times when our NFS server is under heavy load, and since we serve our modules from NFS, puppet runs are generally quite slow during these times. However, apache/passenger coped just fine with this (slower runs but no timeouts), and I hope I can configure puppetserver to do the same. Unfortunately, increasing idle-timeout-milliseconds to 40 minutes had no effect; the "java.nio.channels.WritePendingException: null" and "end of file reached" messages appear at the same times as before. I also set connect-timeout-milliseconds to 5 minutes, with no effect.

Any ideas?

Adam Winberg (JIRA)

Oct 19, 2015, 2:03:03 AM
to puppe...@googlegroups.com
Adam Winberg commented on Bug PUP-3238

I don't see any timeout values at play here. For example I have two nodes which started their agent runs at different times (one minute apart) but both terminated with "end of file reached" at the same second.

Logs from puppet agents:

Oct 18 16:09:09 lxserv363 puppet-agent[62991]: Retrieving pluginfacts
Oct 18 16:09:54 lxserv363 puppet-agent[62991]: Retrieving plugin
Oct 18 16:10:34 lxserv363 puppet-agent[62991]: Loading facts
Oct 18 16:10:34 lxserv363 puppet-agent[62991]: Loading facts
Oct 18 16:12:33 lxserv363 puppet-agent[62991]: Could not retrieve catalog from remote server: end of file reached

 
Oct 18 16:08:15 lxserv1055 puppet-agent[16951]: Retrieving pluginfacts
Oct 18 16:08:27 lxserv1055 puppet-agent[16951]: Retrieving plugin
Oct 18 16:09:46 lxserv1055 puppet-agent[16951]: Loading facts
Oct 18 16:12:33 lxserv1055 puppet-agent[16951]: Could not retrieve catalog from remote server: end of file reached

Logs from puppetserver:

2015-10-18 16:12:33,693 WARN  [o.e.j.s.HttpChannel] /production/catalog/lxserv1055.smhi.se
java.nio.channels.WritePendingException: null
	at org.eclipse.jetty.server.HttpConnection$SendCallback.reset(HttpConnection.java:624) ~[puppet-server-release.jar:na]
 
2015-10-18 16:12:33,693 WARN  [o.e.j.s.HttpChannel] /production/catalog/lxserv363.smhi.se
java.nio.channels.WritePendingException: null
	at org.eclipse.jetty.server.HttpConnection$SendCallback.reset(HttpConnection.java:624) ~[puppet-server-release.jar:na]

This is with

idle-timeout-milliseconds: 2400000
connect-timeout-milliseconds: 300000

on the puppetserver.

Are there any other timeout values I should be looking at?

Christopher Price (JIRA)

Oct 19, 2015, 5:37:04 PM
to puppe...@googlegroups.com

Adam Winberg unfortunately I don't have any immediate guesses as to what timeout settings from our code could be at play here. Are your agents connecting through a load balancer?

Adam Winberg (JIRA)

Oct 20, 2015, 12:51:03 AM
to puppe...@googlegroups.com
Adam Winberg commented on Bug PUP-3238

No, no load balancer. I use Foreman as an ENC and for reporting. I am still trying to pinpoint where the problem lies; I do believe it's inherent to some overall infrastructure load in our environment, but since apache/passenger managed to get the job done I would think puppetserver should too.

Christopher Price (JIRA)

Oct 20, 2015, 10:27:05 AM
to puppe...@googlegroups.com

Adam Winberg yeah. I'd be surprised if there wasn't some Jetty setting, or something related to Jetty, that would get it sorted. If we can come up with a reliable repro case, we would be happy to investigate it further on our end.

The fact that both of your agents timed out at the same time makes it seem like some kind of network hiccup or something.

Adam Winberg (JIRA)

Oct 20, 2015, 2:00:08 PM
to puppe...@googlegroups.com
Adam Winberg commented on Bug PUP-3238

Right now I have reverted back to apache/passenger, just to confirm that it plays nicer than puppetserver in my environment. Gonna let this run for a day or two. I do prefer the puppetserver JVM approach though: fewer moving parts and thus easier to manage, provided it works, of course.

Reproducing is hard. Well, not for me, I just have to wait a couple of hours; but since I can't properly define why my puppetserver gets into trouble in the first place, it's not that easy for you guys to debug. Undeniably there is a lot of congestion on the puppetserver at times, with catalog requests building up far faster than puppetserver can manage to serve them. I'm gonna think about networking some more.

Christopher Price (JIRA)

Oct 20, 2015, 3:32:04 PM
to puppe...@googlegroups.com

OK, thanks. Keep us posted; we're definitely interested in fixing it if we find something we can sink our teeth into.

If you're interested you could skim the Jetty configuration docs and see if anything jumps out at you. We have a mechanism that we could use to help you test out any of the settings listed there if you were inclined to do so:

http://www.eclipse.org/jetty/documentation/current/configuring-connectors.html

Another thing we could try at some point is to help you build a jar that contains a newer version of Jetty than the one we currently ship, and see if that changes anything. We had a few weird network issues pop up under thundering herd situations about a year ago and a Jetty upgrade resolved them, but they didn't look exactly like this so it's hard to say whether it would help or not.

Adam Winberg (JIRA)

Oct 23, 2015, 4:51:03 AM
to puppe...@googlegroups.com
Adam Winberg commented on Bug PUP-3238

Using apache/passenger I have had only one occasion where ~70 agent runs failed. It's still really slow at those times, but it manages to keep things together better than puppetserver did (with my current configuration).

The only parameters in the Jetty config docs you provided that seem interesting to me are idleTimeout and stopTimeout. The former I believe is already set (idle-timeout-milliseconds), but the latter may be interesting; is it already available, and what's its default value?

Adam Winberg (JIRA)

Oct 23, 2015, 6:23:05 AM
to puppe...@googlegroups.com
Adam Winberg commented on Bug PUP-3238

I'm also seeing the occasional

WARN  [o.e.j.s.HttpChannel] Could not send response error 500: java.lang.IllegalStateException: org.eclipse.jetty.util.SharedBlockingCallback$BlockerTimeoutException

message in the puppetserver.log when agent runs are aborted. Then I found this jetty bug report:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=444031

Don't know if that's relevant to the Jetty version puppetserver is using, but it sounds like it may be a similar problem.

Christopher Price (JIRA)

Oct 23, 2015, 10:40:08 AM
to puppe...@googlegroups.com

Reading up on stopTimeout, it sounds like it just affects how long the server waits to forcefully kill threads during a shutdown. So, that sounds like a long shot, unfortunately.

What version of the JDK are you using? And remind me what version of Puppet Server?

There are some newer builds of Jetty out there; we're currently using 9.2.10, and they are up to 9.2.13 in the 9.2.x series, and 9.3.5 on the new 9.3.x series. I could pretty easily put a build together for you that included 9.3.5, but it looks like it has a minimum JDK version of 8.

It also might be useful to get some JMX metrics out of your Jetty setup when it's under high load. At some point I could put a script together to try to collect those for you, but TBH it'll probably be a few weeks before we have bandwidth for that because we have a big release date coming up.

Usually when we've seen those WritePendingException errors, it's been because (from the perspective of the server) the client closed the connection. So if you're seeing that happen on multiple agents at the same time, I'd be suspicious of a load balancer or a router severing the connections, perhaps temporarily, and perhaps in a way that Apache is more patient about recovering from.

If you are interested in testing a build with a newer Jetty, let us know. Or if you are able to narrow down what other things might be going on in your environment so that we can set up a repro case over here.

Adam Winberg (JIRA)

Oct 23, 2015, 11:12:16 AM
to puppe...@googlegroups.com
Adam Winberg commented on Bug PUP-3238

I'm using puppetserver 1.1.1, which (on RHEL) depends on java-1.7.0-openjdk (I'm running the latest patch level), so that's the one I'm using. It's no problem for me to run Java 8 instead, though.

I'm gonna do some more digging regarding networking and such. I'll get back to you about that updated jetty build.

Josh Cooper (JIRA)

Jan 6, 2016, 12:09:02 AM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-3238

For folks watching this ticket, Marc Fournier submitted a PR to puppet to enable TCP keepalive https://github.com/puppetlabs/puppet/pull/4508 which should allow the agent to detect when the remote side closes the idle connection, and not reuse a stale connection.
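[Editorial note: a sketch of what TCP-level keepalive amounts to on a raw Ruby socket; the linked PR applies this inside puppet's HTTP layer, and the probe timings below are illustrative values only. TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT are platform-specific (Linux) constants, hence the guard.]

```ruby
require 'socket'

sock = Socket.new(:INET, :STREAM)
# Enable TCP keepalive so the kernel probes idle connections and surfaces
# an error when the remote side has silently gone away.
sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, true)
if Socket.const_defined?(:TCP_KEEPIDLE)
  sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPIDLE, 20)  # idle secs before first probe
  sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPINTVL, 2)  # secs between probes
  sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPCNT, 4)    # failed probes before error
end
```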


Richard Pijnenburg (JIRA)

Apr 13, 2016, 5:03:05 AM
to puppe...@googlegroups.com

We backported the PR from Marc Fournier to our Puppet 3.8.5 install, and it still shows the same issue with Puppet Server 1.1.3.
We also increased `idle-timeout-milliseconds`, which had no impact either.


Marc Fournier (JIRA)

Apr 13, 2016, 5:47:04 AM
to puppe...@googlegroups.com
Marc Fournier commented on Bug PUP-3238

Richard Pijnenburg NB: this patch was about TCP keepalives, not HTTP KeepAlive, and the goal was to force the agent to fail eventually. So I'm not really sure how it relates to this issue.

Adrien Thebo (JIRA)

May 16, 2017, 6:42:06 PM
to puppe...@googlegroups.com
Adrien Thebo commented on Bug PUP-3238

It looks like there are a handful of related, but separate cases that can trigger this behavior. Unfortunately I don't think we've been able to find a way to consistently reproduce this in order to resolve it, so we're a bit dead in the water on this ticket. Could someone provide steps to reproduce this? A while back I wrote a tool to simulate various adverse network conditions (https://github.com/adrienthebo/netstomp), maybe something like this could be used to provide a reliable reproduction case?


Adrien Thebo (JIRA)

May 16, 2017, 6:43:05 PM
to puppe...@googlegroups.com

Moses Mendoza (JIRA)

May 18, 2017, 1:49:25 PM
to puppe...@googlegroups.com

Jacob Helwig (JIRA)

Dec 7, 2017, 5:25:05 PM
to puppe...@googlegroups.com
Jacob Helwig updated an issue
Change By: Jacob Helwig
Sub-team: Coremunity

Reid Vandewiele (JIRA)

Mar 12, 2018, 12:05:06 PM
to puppe...@googlegroups.com
Reid Vandewiele commented on Bug PUP-3238

Reproduction

This is one way to reproduce an "end of file" error, by running a load balancer to terminate connections after 1 second.

On a test system, standard PE install, apply this puppetlabs/haproxy configuration. I did this by just editing site.pp and making a node block for my master.

https://gist.github.com/reidmv/2fb505344895009e82a8ba7c5dd0e7d9

This will kill/time out all connections lasting over 1 second.

Next, run Puppet against the proxy port (9140).

puppet agent -t --http_debug --masterport 9140 --trace


Reid Vandewiele (JIRA)

Mar 14, 2018, 6:45:04 PM
to puppe...@googlegroups.com

Adding a PR that doesn't change behavior, but significantly improves visibility into what is happening when an exception is raised in this way.

https://github.com/puppetlabs/puppet/pull/6723

Reid Vandewiele (JIRA)

Mar 14, 2018, 7:16:04 PM
to puppe...@googlegroups.com

Austin Boyd (JIRA)

Mar 14, 2018, 7:38:05 PM
to puppe...@googlegroups.com
Austin Boyd updated an issue
Change By: Austin Boyd
Zendesk Ticket IDs: 28991
Zendesk Ticket Count: 1

Austin Boyd (JIRA)

Mar 14, 2018, 7:38:07 PM
to puppe...@googlegroups.com
Austin Boyd updated an issue
Change By: Austin Boyd
Labels: LTS-Triage jira_escalated

Austin Boyd (JIRA)

Mar 15, 2018, 6:06:04 AM
to puppe...@googlegroups.com
Austin Boyd updated an issue
Change By: Austin Boyd
Labels: LTS-Triage jira_escalated
Zendesk Ticket IDs: 28991
Zendesk Ticket Count: 1

Josh Cooper (JIRA)

Mar 19, 2018, 12:37:05 PM
to puppe...@googlegroups.com

Josh Cooper (JIRA)

Mar 19, 2018, 12:37:07 PM
to puppe...@googlegroups.com

Eric Delaney (JIRA)

Mar 19, 2018, 1:42:04 PM
to puppe...@googlegroups.com

Josh Cooper (JIRA)

May 1, 2018, 1:37:06 PM
to puppe...@googlegroups.com

Josh Cooper (JIRA)

May 1, 2018, 2:20:06 PM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-3238

Coming back to this after a while. Puppet may try to reuse a connection that was actively closed by the remote side (e.g. due to a load balancer sending FIN or RST), or a connection that is in a half-open state (where the agent never received notification). Puppet could do an IO.select on the socket to detect an error condition in the first case, but that won't help the second. The only way to handle that is to try to write something to the socket, rescue the exception, and retry. We could probably implement something similar to what httpclient does: wrap requests in a protect_keep_alive_disconnected block, and force a new connection to be created if the request fails. However, I think that should only occur if there is an exception at the TCP layer, not the application layer; otherwise we might retry a non-idempotent REST method, e.g. PUT.
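[Editorial note: a hypothetical sketch of that protect_keep_alive_disconnected idea — not puppet's actual code. The `pool` object with `borrow`/`discard` is an assumption; the key point is retrying only on transport-level errors so a non-idempotent request is never replayed after the application layer has seen it.]

```ruby
# Failures that indicate the cached connection itself is dead.
TRANSPORT_ERRORS = [EOFError, Errno::ECONNRESET, Errno::EPIPE]

# Yield a connection from the pool; on a transport-level error, discard the
# (possibly stale) connection and retry exactly once on a fresh one.
def with_fresh_connection_retry(pool)
  attempts = 0
  begin
    attempts += 1
    yield pool.borrow
  rescue *TRANSPORT_ERRORS
    pool.discard
    retry if attempts < 2
    raise
  end
end
```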

A slightly different issue is that puppet uses an infinite read timeout by default. If puppet sends a request (e.g. for a catalog) and is waiting for a response but ends up in a half-open state, then the agent will hang indefinitely (unless runtimeout is set to a non-zero value; see PUP-7517). In Puppet 6, we should change both of those default timeouts to something less than infinite (see PUP-8683).

Kenn Hussey (JIRA)

Jun 21, 2018, 12:49:04 PM
to puppe...@googlegroups.com

Josh Cooper (JIRA)

Jun 26, 2018, 1:49:08 PM
to puppe...@googlegroups.com

Josh Cooper (JIRA)

Jun 26, 2018, 1:49:08 PM
to puppe...@googlegroups.com

Rob Braden (JIRA)

Aug 13, 2018, 5:10:05 PM
to puppe...@googlegroups.com

Melissa Stone (JIRA)

Feb 21, 2019, 5:14:05 PM
to puppe...@googlegroups.com

Josh Cooper (Jira)

Mar 3, 2020, 1:36:04 PM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-3238

Ruby does the following to detect whether a connection has been closed (for example, when you issue multiple HTTP requests on the same connection): https://github.com/ruby/ruby/blob/14dd377e51408ef07e03c27f95ff6b0e186df022/lib/net/http.rb#L1581

Implementing that check would help the first case above (where the remote side closes the connection and we receive the FIN), but it won't help the second (where we never receive the FIN). That said I think the change would be beneficial.


Josh Cooper (Jira)

Mar 4, 2020, 2:55:03 PM
to puppe...@googlegroups.com

Josh Cooper (Jira)

Mar 13, 2020, 7:39:04 PM
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Sprint: Coremunity Grooming Hopper

Josh Cooper (Jira)

Mar 30, 2020, 7:13:03 PM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-3238

I submitted a PR to improve our test for Ruby's EOFError behavior, since Ruby can automatically retry idempotent requests, which may mask the EOF problem (see PUP-3905).

That said, this ticket was filed against puppet 3.7 and ruby 1.8.7, and as we've seen, ruby's behavior has changed over the years. Puppet 5.5.x ships with ruby 2.3.8 and puppet 6 requires ruby 2.3 or above, and both of those ruby versions contain the ruby fix to reconnect after EOF. So it appears this ticket has been overtaken by recent events, and I think we should close this ticket.

If there are cases where a more recent ruby/puppet incorrectly handles this case (when the remote side closes the connection) please reopen this ticket.
