[JIRA] (JENKINS-53431) Agent attempts reconnect after termination. Leaves Kube pod in failed state


mckaymatt@gmail.com (JIRA)

Sep 5, 2018, 12:31:03 PM
to jenkinsc...@googlegroups.com
Matt McKay created an issue
 
Jenkins / Bug JENKINS-53431
Agent attempts reconnect after termination. Leaves Kube pod in failed state
Issue Type: Bug
Assignee: Carlos Sanchez
Attachments: active.txt, node_info, pod_describe, pod_log
Components: core, kubernetes-plugin
Created: 2018-09-05 16:30
Environment: Operating system
- BUILD_ID=10452.109.0
- NAME="Container-Optimized OS"
- GOOGLE_CRASH_ID=Lakitu
- VERSION_ID=66
- PRETTY_NAME="Container-Optimized OS from Google"
- VERSION=66
- GOOGLE_METRICS_PRODUCT_ID=26
- HOME_URL="https://cloud.google.com/compute/docs/containers/vm-image/"

master image: jenkins/jenkins:2.121.3
Remoting version: 3.25
Kubernetes 1.10.6-gke.2
JAVA_VERSION=8u181
JENKINS_VERSION=2.121.3

ace-editor:1.1:not-pinned
ant:1.8:not-pinned
antisamy-markup-formatter:1.5:not-pinned
apache-httpcomponents-client-4-api:4.5.5-3.0:not-pinned
audit-trail:2.3:not-pinned
authentication-tokens:1.3:not-pinned
aws-credentials:1.23:not-pinned
aws-java-sdk:1.11.341:not-pinned
bouncycastle-api:2.17:not-pinned
branch-api:2.0.20:not-pinned
build-failure-analyzer:1.20.0:not-pinned
build-name-setter:1.6.9:not-pinned
build-timeout:1.19:not-pinned
cloudbees-folder:6.5.1:not-pinned
command-launcher:1.2:not-pinned
conditional-buildstep:1.3.6:not-pinned
credentials:2.1.18:not-pinned
credentials-binding:1.16:not-pinned
disable-failed-job:1.15:not-pinned
display-url-api:2.2.0:not-pinned
docker-commons:1.13:not-pinned
docker-workflow:1.17:not-pinned
durable-task:1.25:not-pinned
envinject:2.1.6:not-pinned
envinject-api:1.5:not-pinned
external-monitor-job:1.7:not-pinned
failedJobDeactivator:1.2.1:not-pinned
git:3.9.1:not-pinned
git-client:2.7.3:not-pinned
git-parameter:0.9.4:not-pinned
git-server:1.7:not-pinned
github:1.29.2:not-pinned
github-api:1.92:not-pinned
github-branch-source:2.3.6:not-pinned
github-oauth:0.29:not-pinned
google-container-registry-auth:0.3:not-pinned
google-oauth-plugin:0.6:not-pinned
gradle:1.29:not-pinned
handlebars:1.1.1:not-pinned
icon-shim:2.0.3:not-pinned
jackson2-api:2.8.11.3:not-pinned
javadoc:1.4:not-pinned
jdk-tool:1.1:not-pinned
job-dsl:1.70:not-pinned
jobConfigHistory:2.18:not-pinned
jquery:1.12.4-0:not-pinned
jquery-detached:1.2.1:not-pinned
jsch:0.1.54.2:not-pinned
junit:1.24:not-pinned
kubernetes:1.12.4:not-pinned
kubernetes-credentials:0.3.1:not-pinned
ldap:1.20:not-pinned
mailer:1.21:not-pinned
mapdb-api:1.0.9.0:not-pinned
matrix-auth:2.3:not-pinned
matrix-project:1.13:not-pinned
maven-plugin:3.1.2:not-pinned
metrics:4.0.2.2:not-pinned
momentjs:1.1.1:not-pinned
nodelabelparameter:1.7.2:not-pinned
oauth-credentials:0.3:not-pinned
pam-auth:1.4:not-pinned
parameterized-trigger:2.35.2:not-pinned
pipeline-build-step:2.7:not-pinned
pipeline-github-lib:1.0:not-pinned
pipeline-graph-analysis:1.7:not-pinned
pipeline-input-step:2.8:not-pinned
pipeline-milestone-step:1.3.1:not-pinned
pipeline-model-api:1.3.2:not-pinned
pipeline-model-declarative-agent:1.1.1:not-pinned
pipeline-model-definition:1.3.2:not-pinned
pipeline-model-extensions:1.3.2:not-pinned
pipeline-rest-api:2.10:not-pinned
pipeline-stage-step:2.3:not-pinned
pipeline-stage-tags-metadata:1.3.2:not-pinned
pipeline-stage-view:2.10:not-pinned
plain-credentials:1.4:not-pinned
postbuild-task:1.8:not-pinned
rebuild:1.28:not-pinned
resource-disposer:0.12:not-pinned
run-condition:1.2:not-pinned
scm-api:2.2.7:not-pinned
script-security:1.46:not-pinned
slack:2.3:not-pinned
ssh-agent:1.16:not-pinned
ssh-credentials:1.14:not-pinned
structs:1.14:not-pinned
subversion:2.11.1:not-pinned
support-core:2.49:not-pinned
testng-plugin:1.15:not-pinned
timestamper:1.8.10:not-pinned
token-macro:2.5:not-pinned
variant:1.1:not-pinned
windows-slaves:1.3.1:not-pinned
workflow-aggregator:2.5:not-pinned
workflow-api:2.29:not-pinned
workflow-basic-steps:2.10:not-pinned
workflow-cps:2.54:not-pinned
workflow-cps-global-lib:2.10:not-pinned
workflow-durable-task-step:2.21:not-pinned
workflow-job:2.24:not-pinned
workflow-multibranch:2.20:not-pinned
workflow-scm-step:2.6:not-pinned
workflow-step-api:2.16:not-pinned
workflow-support:2.20:not-pinned
ws-cleanup:0.34:not-pinned
Priority: Minor
Reporter: Matt McKay

About a month ago I noticed that I had many undeleted pods in Kubernetes. Jenkins did not report the jobs as failed, but the pods exited non-zero in Kubernetes. I'm not sure which version this started with, but it still happens on the latest.

 

kubectl --namespace=jenkins get po -a
NAME READY STATUS RESTARTS AGE
jenkins-75fd5fdbb-jkrmf 1/1 Running 0 53d
jenkins-slave-jxk94-729sl 0/1 Error 0 3h
jenkins-slave-jxk94-bq6kp 0/1 Error 0 25m
jenkins-slave-jxk94-jrdgb 0/1 Error 0 35m
jenkins-slave-jxk94-w0tdh 0/1 Error 0 2m
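
For triage, the leftover agent pods can be picked out of a listing like this mechanically. What follows is a minimal Python sketch rather than part of the original report; it assumes the whitespace-separated column layout shown above, and `errored_pods` and `SAMPLE` are hypothetical names introduced for illustration.

```python
from typing import List


def errored_pods(kubectl_output: str) -> List[str]:
    """Return the names of pods whose STATUS column reads 'Error'.

    Assumes a header row containing NAME and STATUS, followed by one
    whitespace-separated row per pod, as in the listing above.
    """
    lines = kubectl_output.strip().splitlines()
    header = lines[0].split()
    name_col = header.index("NAME")
    status_col = header.index("STATUS")

    names = []
    for line in lines[1:]:
        fields = line.split()
        # Skip short or malformed rows instead of raising.
        if len(fields) > status_col and fields[status_col] == "Error":
            names.append(fields[name_col])
    return names


# The listing from this report, copied verbatim.
SAMPLE = """\
NAME READY STATUS RESTARTS AGE
jenkins-75fd5fdbb-jkrmf 1/1 Running 0 53d
jenkins-slave-jxk94-729sl 0/1 Error 0 3h
jenkins-slave-jxk94-bq6kp 0/1 Error 0 25m
jenkins-slave-jxk94-jrdgb 0/1 Error 0 35m
jenkins-slave-jxk94-w0tdh 0/1 Error 0 2m
"""

if __name__ == "__main__":
    for pod_name in errored_pods(SAMPLE):
        print(pod_name)
```

The resulting names could then be reviewed and, if appropriate, removed by hand with `kubectl --namespace=jenkins delete pod <name>`. That only cleans up the symptom and does not address why the pods end up in the Error state.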

This bug isn't critical since it's not failing my builds, but something is going wrong.

 

This is the log from the agent:

 

Warning: JnlpProtocol3 is disabled by default, use JNLP_PROTOCOL_OPTS to alter the behavior
Warning: SECRET is defined twice in command-line arguments and the environment variable
Warning: AGENT_NAME is defined twice in command-line arguments and the environment variable
Sep 05, 2018 3:57:44 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: jenkins-slave-jxk94-6c6sd
Sep 05, 2018 3:57:44 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Sep 05, 2018 3:57:44 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3.25
Sep 05, 2018 3:57:44 PM hudson.remoting.Engine startEngine
WARNING: No Working Directory. Using the legacy JAR Cache location: /home/jenkins/.jenkins/cache/jars
Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins-discovery/]
Sep 05, 2018 3:57:45 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
  Agent address: jenkins-discovery
  Agent port:    50000
  Identity:      96:81:a1:68:84:20:aa:12:1f:2b:97:b0:c5:2f:de:25
Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins-discovery:50000
Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Sep 05, 2018 3:57:45 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: 96:81:a1:68:84:20:aa:12:1f:2b:97:b0:c5:2f:de:25
Sep 05, 2018 3:57:46 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Sep 05, 2018 3:57:48 PM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
WARNING: Attempt to (de-)serialize anonymous class org.jenkinsci.plugins.envinject.EnvInjectComputerListener$2; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
Sep 05, 2018 3:57:49 PM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
WARNING: Attempt to (de-)serialize anonymous class org.jenkinsci.plugins.gitclient.Git$1; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
Sep 05, 2018 3:57:51 PM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
WARNING: Attempt to (de-)serialize anonymous class org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
Sep 05, 2018 4:07:36 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated

^^ This terminates because I abort the build. The log then continues:

 

 

Sep 05, 2018 4:07:46 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1 onReconnect
INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@581415ae
Sep 05, 2018 4:07:49 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: jenkins-slave-jxk94-6c6sd
Sep 05, 2018 4:07:49 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Sep 05, 2018 4:07:49 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3.25
Sep 05, 2018 4:07:49 PM hudson.remoting.Engine startEngine
WARNING: No Working Directory. Using the legacy JAR Cache location: /home/jenkins/.jenkins/cache/jars
Sep 05, 2018 4:07:49 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins-discovery/]
Sep 05, 2018 4:07:49 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
  Agent address: jenkins-discovery
  Agent port:    50000
  Identity:      96:81:a1:68:84:20:aa:12:1f:2b:97:b0:c5:2f:de:25
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins-discovery:50000
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: 96:81:a1:68:84:20:aa:12:1f:2b:97:b0:c5:2f:de:25
Sep 05, 2018 4:07:50 PM org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer onRecv
INFO: [JNLP4-connect connection to jenkins-discovery/10.55.243.214:50000] Local headers refused by remote: Unknown client name: jenkins-slave-jxk94-6c6sd
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Protocol JNLP4-connect encountered an unexpected exception
java.util.concurrent.ExecutionException: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: jenkins-slave-jxk94-6c6sd
	at org.jenkinsci.remoting.util.SettableFuture.get(SettableFuture.java:223)
	at hudson.remoting.Engine.innerRun(Engine.java:614)
	at hudson.remoting.Engine.run(Engine.java:474)
Caused by: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: jenkins-slave-jxk94-6c6sd
	at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.newAbortCause(ConnectionHeadersFilterLayer.java:378)
	at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.onRecvClosed(ConnectionHeadersFilterLayer.java:433)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:172)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
	at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
	at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1500(BIONetworkLayer.java:48)
	at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:247)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
	at java.lang.Thread.run(Thread.java:748)
	Suppressed: java.nio.channels.ClosedChannelException
		... 7 more
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins-discovery:50000
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP4-plaintext not supported, skipping
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Protocol JNLP3-connect is not enabled, skipping
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP2-connect not supported, skipping
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP-connect not supported, skipping
Sep 05, 2018 4:07:50 PM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: The server rejected the connection: None of the protocols were accepted
java.lang.Exception: The server rejected the connection: None of the protocols were accepted
	at hudson.remoting.Engine.onConnectionRejected(Engine.java:675)
	at hudson.remoting.Engine.innerRun(Engine.java:639)
	at hudson.remoting.Engine.run(Engine.java:474)

It attempts to reconnect but is rejected with "Unknown client name". Is there some reason the agent attempts to reconnect and then exits with an error when it cannot?
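
The delay between termination and the restart attempt can be read directly off the agent log. The sketch below is a hedged illustration, not a statement about how Remoting schedules reconnects: it assumes the two-line `java.util.logging` layout seen in the log above (a timestamp header line followed by the message line), and `parse_events`, `reconnect_gap_seconds`, and `SAMPLE_LOG` are hypothetical names.

```python
import re
from datetime import datetime
from typing import List, Optional, Tuple

# Header line of a java.util.logging record, for example:
# "Sep 05, 2018 4:07:36 PM hudson.remoting.jnlp.Main$CuiListener status"
HEADER = re.compile(r"^(\w{3} \d{2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) ")


def parse_events(log: str) -> List[Tuple[datetime, str]]:
    """Pair each message line with the timestamp of the preceding header."""
    events = []
    stamp = None
    for line in log.splitlines():
        match = HEADER.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%b %d, %Y %I:%M:%S %p")
        elif stamp is not None and line.strip():
            events.append((stamp, line.strip()))
    return events


def reconnect_gap_seconds(log: str) -> Optional[float]:
    """Seconds between 'Terminated' and the next restart attempt, if any."""
    terminated = None
    for stamp, message in parse_events(log):
        if message == "INFO: Terminated":
            terminated = stamp
        elif terminated is not None and message.startswith(
            "INFO: Restarting agent via"
        ):
            return (stamp - terminated).total_seconds()
    return None


# Four lines copied from the agent log in this report.
SAMPLE_LOG = """\
Sep 05, 2018 4:07:36 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Sep 05, 2018 4:07:46 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1 onReconnect
INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@581415ae
"""

if __name__ == "__main__":
    print(reconnect_gap_seconds(SAMPLE_LOG))
```

For the log in this report the gap is 10 seconds (4:07:36 PM to 4:07:46 PM). That figure is only what this one log shows. Whether it reflects a fixed delay is not something the log can confirm.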

If I launch a pipeline with four parallel jobs and abort it, I end up with five pods in the Error state: one for the pipeline job itself and four for the sub-jobs. This can also happen when builds succeed.

Is it normal for the agent to keep trying to connect after being terminated?

 

 

 

 

This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

mckaymatt@gmail.com (JIRA)

Sep 10, 2018, 4:53:02 PM
to jenkinsc...@googlegroups.com
Matt McKay commented on Bug JENKINS-53431
 
Re: Agent attempts reconnect after termination. Leaves Kube pod in failed state

After updating everything again, this issue seems to have resolved itself.

mckaymatt@gmail.com (JIRA)

Sep 10, 2018, 4:53:02 PM
to jenkinsc...@googlegroups.com
Matt McKay resolved as Fixed
 

This seems to have gone away on its own.

Change By: Matt McKay
Status: Open -> Resolved
Resolution: Fixed