@deads2k @lavalamp @liggitt @sttts @k8s-mirror-api-machinery-bugs
But it is strange that the kubelet could reconnect, as seen via netstat.
/kind bug
But the kube-proxy and kube-controller-manager will not reconnect to another apiserver.
they should, once the dead/idle TCP connection times out
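For context, a minimal sketch of how a Go HTTP client typically enables TCP-level keepalive on its dials; the 30s values here are illustrative, not client-go's actual settings:

```go
package main

import (
	"net"
	"net/http"
	"time"
)

func main() {
	// net.Dialer.KeepAlive enables kernel TCP keepalive probes; once the
	// peer stops acking them, the kernel eventually errors out the
	// connection and the HTTP client has to dial again.
	transport := &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   30 * time.Second,
			KeepAlive: 30 * time.Second, // illustrative interval
		}).DialContext,
	}
	client := &http.Client{Transport: transport}
	_ = client
}
```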
@liggitt TCP keepalive is configured correctly in this case, but it cannot detect this failure.
I think introducing an application-level keepalive between server and client for long-lived watch connections can solve this.
The apiserver would send a heartbeat periodically, and the client would treat the connection as dead if no heartbeat arrives within a timeout.
How should a keepalive timeout be handled? Maybe the client can close the underlying connection directly, but I am not sure whether that carries any risk for HTTP/2 multiplexing.
I think the cost of keepalives is bearable in order to make Kubernetes more reliable.
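As a rough illustration of this proposal (not actual client-go code; `Event`, the `"HEARTBEAT"` kind, and the decode/handle callbacks are hypothetical stand-ins), the client side could look something like this over a raw connection:

```go
package watchutil

import (
	"fmt"
	"net"
	"time"
)

// Event and the "HEARTBEAT" kind are hypothetical stand-ins for whatever
// wire format the watch stream would use.
type Event struct{ Type string }

// runWithHeartbeat reads events and treats heartbeatTimeout without any
// traffic (heartbeats included) as a dead connection, closing it so the
// next request must dial again.
func runWithHeartbeat(conn net.Conn, decode func(net.Conn) (Event, error),
	handle func(Event), heartbeatTimeout time.Duration) error {
	for {
		if err := conn.SetReadDeadline(time.Now().Add(heartbeatTimeout)); err != nil {
			return err
		}
		ev, err := decode(conn)
		if err != nil {
			conn.Close() // deadline exceeded or I/O error: force a fresh dial
			return fmt.Errorf("watch connection considered dead: %w", err)
		}
		if ev.Type == "HEARTBEAT" {
			continue // liveness signal only; never delivered to the application
		}
		handle(ev)
	}
}
```

Note this drives a read deadline on the raw connection, which is exactly where HTTP/2 multiplexing makes things delicate: several streams share one connection, so closing it tears all of them down.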
This has come up several times, and I still haven't seen evidence that a problem remains when TCP keepalive is configured and working as expected.
And if HTTP/2's caching is an issue, timing out the watch at an app level will just lead to reopening a watch over the same dead but cached connection.
> And if HTTP/2's caching is an issue, timing out the watch at an app level will just lead to reopening a watch over the same dead but cached connection.
Yes, a reconnect can only be forced by explicitly closing the previous TCP connection.
We can close connections on keepalive timeout. Then any subsequent request, whether watch, get, or list, will cause the HTTP client to look for a connection; since none is usable, it has to dial again to establish a new one. The new connection is possibly to a new, healthy apiserver.
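A sketch of that mechanism, assuming a hypothetical `connTracker` helper (this is roughly the shape of the connection-rotating dialer the kubelet fix in #63492 uses):

```go
package main

import (
	"context"
	"net"
	"net/http"
	"sync"
	"time"
)

// connTracker remembers live connections so they can be force-closed when
// an app-level keepalive timeout fires. Hypothetical helper, not client-go API.
type connTracker struct {
	mu    sync.Mutex
	conns map[net.Conn]struct{}
}

func (t *connTracker) dial(ctx context.Context, network, addr string) (net.Conn, error) {
	c, err := (&net.Dialer{Timeout: 30 * time.Second}).DialContext(ctx, network, addr)
	if err != nil {
		return nil, err
	}
	t.mu.Lock()
	t.conns[c] = struct{}{} // production code would also untrack conns on close
	t.mu.Unlock()
	return c, nil
}

// closeAll force-closes every tracked connection; in-flight requests fail
// fast and the next request dials fresh, possibly to a healthy apiserver.
func (t *connTracker) closeAll() {
	t.mu.Lock()
	defer t.mu.Unlock()
	for c := range t.conns {
		c.Close()
		delete(t.conns, c)
	}
}

func main() {
	tracker := &connTracker{conns: map[net.Conn]struct{}{}}
	client := &http.Client{Transport: &http.Transport{DialContext: tracker.dial}}
	_ = client // on keepalive timeout: tracker.closeAll()
}
```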
Opened PR #65087 and did some basic testing of it.
Hoisting the conversation from the kubelet-specific issue #48638 (comment) to here.
Since this issue has been re-opened, would there be any value in me re-opening my PR for this commit? Monzo has been running this patch in production since last July and it has eliminated this problem entirely, for all uses of `client-go`.
I still have several reservations about that fix:

- it relies on behavior that appears to be undefined (it changes properties on the result of `net.TCPConn#File()`, documented as "The returned os.File's file descriptor is different from the connection's. Attempting to change properties of the original using this duplicate may or may not have the desired effect.")
- calling `net.TCPConn#File()` has implications I'm unsure of: "On Unix systems this will cause the SetDeadline methods to stop working."
- it appears to only trigger closing in response to unacknowledged outgoing data... that doesn't seem like it would help clients (like kube-proxy) with long-running receive-only watch stream connections
That said, if @kubernetes/sig-api-machinery-pr-reviews and/or @kubernetes/sig-network-pr-reviews feel strongly that this is the correct direction to pursue, that would be helpful to hear.
A few notes on these very valid concerns:

- https://golang.org/pkg/net/#TCPConn.File returns a dup'ed file descriptor, which AFAIK shares all the underlying kernel structures except the entry in the file descriptor table, so either can be used with the same results. A program should take care not to use them simultaneously, though, for exactly the same reasons.
- Today the returned file descriptor is set to blocking mode. That can probably be mitigated by setting it back to non-blocking mode. In Go 1.11 the returned fd will be in the same blocking/non-blocking mode as it was before the .File() call: golang/go#24942
- Maybe it will not help simple watchers. I am not familiar with Informer internals, but I was under the impression that they not only watch but also periodically resync state; those resyncs would trigger outgoing data transfer, which would then be detected.
Indeed: as far as I understand, the behaviour is not undefined, it's just defined in Linux rather than in Go. I think the Go docs could be clearer on this. Here's the relevant section from `dup(2)`:

> After a successful return from one of these system calls, the old and new file descriptors may be used interchangeably. They refer to the same open file description (see `open(2)`) and thus share file offset and file status flags; for example, if the file offset is modified by using `lseek(2)` on one of the descriptors, the offset is also changed for the other. The two descriptors do not share file descriptor flags (the close-on-exec flag).

My code doesn't modify flags after obtaining the fd; its only use is in a call to `setsockopt(2)`. The docs for that call are fairly clear that it modifies properties of the socket referred to by the descriptor, not the descriptor itself:

> `getsockopt()` and `setsockopt()` manipulate options for the socket referred to by the file descriptor `sockfd`.

I agree that the original descriptor being set to blocking mode is annoying. Go's code is clear that this will not prevent anything from working, just that more OS threads may be required for I/O.
Given that a single kubelet (or other user of client-go) establishes a small number of long-lived connections to the apiservers, and that this will be fixed in Go 1.11, I don't think this is a significant issue.
I am happy for this to be fixed in another way, but given we know that this works and does not require invasive changes to the apiserver to achieve, I think it is a reasonable solution. I have heard from several production users of Kubernetes that this has bitten them in the same way it bit us.
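For readers following along, a minimal sketch of the approach under discussion, assuming the `TCP_USER_TIMEOUT` socket option and the `.File()` dup semantics described above (the helper name is mine, not the PR's):

```go
package tcputil

import (
	"net"
	"time"

	"golang.org/x/sys/unix"
)

// setTCPUserTimeout makes the kernel abort the connection if written data
// stays unacknowledged for longer than d, via setsockopt(2) on a dup'ed fd.
func setTCPUserTimeout(conn *net.TCPConn, d time.Duration) error {
	// File() dup(2)'s the descriptor; both fds refer to the same open
	// file description, so setsockopt on either affects the socket.
	// Caveat from the discussion above: pre-Go-1.11 this also switches
	// the original descriptor to blocking mode.
	f, err := conn.File()
	if err != nil {
		return err
	}
	defer f.Close() // closes only the duplicate fd, not the socket

	return unix.SetsockoptInt(int(f.Fd()), unix.IPPROTO_TCP,
		unix.TCP_USER_TIMEOUT, int(d/time.Millisecond))
}
```

Newer Go also offers `conn.SyscallConn()`, which reaches the raw fd without the blocking-mode side effect.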
See my comment here: #65087 (comment)
Is there a reason why we can't start using the http2 request reliability mechanisms? (https://http2.github.io/http2-spec/#Reliability)
Hi @lavalamp @liggitt what's the status of this?
When doing maintenance on our master nodes and/or removing a master node we ran into numerous issues tracing back to stuck TCP connections.
The first was with the kubelet -> api-server connection, which seems to have been fixed by #63492; however, we also encountered issues with many other components that leverage client-go to watch resources, such as Prometheus and CoreDNS.
Is there a timeline for the broader client-go fix? Many of our cluster services rely on watchers to the api-server so this issue really cascades down affecting the whole cluster and requires rolling everything to re-start a fresh TCP connection.
I'm also willing to lend help with the fix if you need!
@jekohk from an extremely cursory search, it appears that we'll need to switch to using the golang.org/x/net/http2 package and, for every connection, periodically call https://godoc.org/golang.org/x/net/http2#ClientConn.Ping. There doesn't appear to be a way to get the connection pool type there to do this for us automatically; I wonder if they'd consider adding that, or accepting a contribution to add it?
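Sketched out, that per-connection ping loop might look like the following; the apiserver address and the intervals are placeholders, and wiring this into the connection pool is exactly the missing piece:

```go
package main

import (
	"context"
	"crypto/tls"
	"log"
	"time"

	"golang.org/x/net/http2"
)

// pingLoop sends HTTP/2 PING frames on cc and closes the connection when
// one goes unacknowledged, forcing callers to dial fresh.
func pingLoop(cc *http2.ClientConn, interval, timeout time.Duration) {
	for range time.Tick(interval) {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		err := cc.Ping(ctx)
		cancel()
		if err != nil {
			log.Printf("h2 ping failed, closing connection: %v", err)
			cc.Close()
			return
		}
	}
}

func main() {
	// "apiserver.example:443" is a placeholder address.
	conn, err := tls.Dial("tcp", "apiserver.example:443",
		&tls.Config{NextProtos: []string{"h2"}})
	if err != nil {
		log.Fatal(err)
	}
	cc, err := (&http2.Transport{}).NewClientConn(conn)
	if err != nil {
		log.Fatal(err)
	}
	go pingLoop(cc, 30*time.Second, 10*time.Second)
	// ... issue requests via cc.RoundTrip(req) ...
	select {}
}
```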
It would be really appreciated if Go's http2 package allowed us to send Ping frames in our case.
@lavalamp Do you know if there is any plan for this?
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
Bookmarks have now been implemented (ref: #75474).
I have looked into the bookmark implementation carefully; the core mechanism is to send the resource version to clients periodically.
This is similar to #65087, which added an event type `HeartBeat EventType = "HEARTBEAT"`. What I want to say is that we can leverage this event to detect whether a connection is alive (since it is currently impossible to send an HTTP/2 ping frame, as discussed many times). I think this is the simplest way to detect hanging connections right now.
I'd like to hear your comments @deads2k @lavalamp @liggitt @wojtek-t @hormes @jpbetz @xiang90 @resouer
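For illustration, a client-side check along those lines might look like this: a sketch against `k8s.io/apimachinery/pkg/watch`, not existing client-go behavior, with the "10s" figure discussed below as the `timeout`:

```go
package watchutil

import (
	"time"

	"k8s.io/apimachinery/pkg/watch"
)

// runWithLiveness drains w, treating every event (bookmarks included) as
// proof the connection is alive; if nothing arrives within timeout, the
// watch is stopped so the caller can re-list/re-watch over a fresh dial.
func runWithLiveness(w watch.Interface, timeout time.Duration, handle func(watch.Event)) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	for {
		select {
		case ev, ok := <-w.ResultChan():
			if !ok {
				return // watch closed normally
			}
			if !timer.Stop() {
				<-timer.C // drain a timer that fired concurrently
			}
			timer.Reset(timeout)
			if ev.Type == watch.Bookmark {
				continue // liveness signal only; nothing to handle
			}
			handle(ev)
		case <-timer.C:
			w.Stop() // no traffic within timeout: assume a hung connection
			return
		}
	}
}
```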
Depending on how frequently you would like to send it, it may have significant performance implications.
I would much prefer using HTTP/2 Ping for that purpose.
The bookmark event is sent roughly every 1s now, right? That would not introduce any extra event sending; we would just check on the client side and, if no events have been received in, say, 10s, close the connection as the kubelet does.
> The bookmark event is sent roughly every 1s now, right?

No, it's currently sent only ~2-3s before the watch times out. With a 1s period, we wouldn't keep up.
I see that; here it just sends the bookmark event to a buffer.
Yes, and then we efficiently check which watches will terminate soon and send bookmarks only to those.
heyo, I figured I'd bump this because I'm currently running into this issue with our services. It's rough because I essentially cannot detect when the stream fails, so I have to make our applications that use watch events restart the stream constantly to ensure it's live.
We're waiting for Go 1.15. @caesarxuchao made a fix for this in the upstream Go standard library. Unfortunately, it's basically impossible to fix this without that change.
This is already tracked here: kubernetes/client-go#374 (comment)
Apparently we've gone 2 years without de-duping these two issues, oops!
Closed #65012.
This has been resolved with http2 keepalive.
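For reference, the HTTP/2-level keepalive mechanism in golang.org/x/net/http2 looks roughly like this; the 30s/15s values here are illustrative, not necessarily client-go's defaults:

```go
package main

import (
	"crypto/tls"
	"log"
	"net/http"
	"time"

	"golang.org/x/net/http2"
)

func main() {
	// With ReadIdleTimeout set, the transport sends an h2 PING frame on
	// any connection that has seen no frames for that long, and closes
	// the connection if no ack arrives within PingTimeout. Dead
	// connections are therefore detected even on receive-only watches.
	t := &http2.Transport{
		TLSClientConfig: &tls.Config{},
		ReadIdleTimeout: 30 * time.Second, // illustrative values
		PingTimeout:     15 * time.Second,
	}
	client := &http.Client{Transport: t}
	_ = client
	log.Println("HTTP/2 client configured with ping-based health checks")
}
```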
Can you point to where that fix is? I would assume an http2 keep-alive timeout would cause the connection to restart after X seconds of inactivity, but then it would just time out and restart constantly, whereas I'd really love something that sends a ping frame to continually keep the connection alive, or times out if one is not received in return.