[JIRA] (JENKINS-58513) Flake in RestartPipelineTest#terminatedPodAfterRestart


vincent@latombe.net (JIRA)

16 Jul 2019, 07:42:03
to jenkinsc...@googlegroups.com
Vincent Latombe created an issue
 
Jenkins / Bug JENKINS-58513
Flake in RestartPipelineTest#terminatedPodAfterRestart
Issue Type: Bug
Assignee: Carlos Sanchez
Components: kubernetes-plugin
Created: 2019-07-16 11:41
Priority: Minor
Reporter: Vincent Latombe

I'm getting test failures from time to time with the following error.

Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 227.45 s <<< FAILURE! - in org.csanchez.jenkins.plugins.kubernetes.pipeline.RestartPipelineTest
 [ERROR] terminatedPodAfterRestart(org.csanchez.jenkins.plugins.kubernetes.pipeline.RestartPipelineTest)  Time elapsed: 52.479 s  <<< FAILURE!
 java.lang.AssertionError: 
 
 Expected: a string containing " was deleted, but do not have a node body to cancel"
      but: was "Started
 Running in Durability level: MAX_SURVIVABILITY
 [Pipeline] Start of Pipeline
 [Pipeline] podTemplate
 [Pipeline] {
 [Pipeline] node
 Agent terminatedpodafterrestart-kjsfm-7krt9 is provisioned from template Kubernetes Pod Template
 Agent specification [Kubernetes Pod Template] (terminatedPodAfterRestart): 
 ---
 apiVersion: "v1"
 kind: "Pod"
 metadata:
   annotations:
     buildUrl: "http://100.96.12.163:44000/jenkins/job/terminated%20Pod%20After%20Restart/1/"
   labels:
     jenkins: "slave"
     BUILD_NUMBER: "2"
     test: "terminatedPodAfterRestart"
     jenkins/terminatedPodAfterRestart: "true"
     BRANCH_NAME: "PR-546"
     class: "RestartPipelineTest"
   name: "terminatedpodafterrestart-kjsfm-7krt9"
 spec:
   containers:
   - command:
     - "/bin/cat"
     env:
     - name: "JENKINS_SECRET"
       value: "984e0b6a03b49c6ad8c2c8e24cef61e2dd279e885c46cb7278e0f56d8b920880"
     - name: "JENKINS_AGENT_NAME"
       value: "terminatedpodafterrestart-kjsfm-7krt9"
     - name: "JENKINS_NAME"
       value: "terminatedpodafterrestart-kjsfm-7krt9"
     - name: "JENKINS_URL"
       value: "http://100.96.12.163:44000/jenkins/"
     image: "busybox"
     imagePullPolicy: "IfNotPresent"
     name: "busybox"
     resources:
       limits: {}
       requests: {}
     securityContext:
       privileged: false
     tty: true
     volumeMounts:
     - mountPath: "/home/jenkins"
       name: "workspace-volume"
       readOnly: false
     workingDir: "/home/jenkins"
   - env:
     - name: "JENKINS_SECRET"
       value: "984e0b6a03b49c6ad8c2c8e24cef61e2dd279e885c46cb7278e0f56d8b920880"
     - name: "JENKINS_AGENT_NAME"
       value: "terminatedpodafterrestart-kjsfm-7krt9"
     - name: "JENKINS_NAME"
       value: "terminatedpodafterrestart-kjsfm-7krt9"
     - name: "JENKINS_URL"
       value: "http://100.96.12.163:44000/jenkins/"
     image: "jenkins/jnlp-slave:alpine"
     name: "jnlp"
     volumeMounts:
     - mountPath: "/home/jenkins"
       name: "workspace-volume"
       readOnly: false
   nodeSelector: {}
   restartPolicy: "Never"
   volumes:
   - emptyDir: {}
     name: "workspace-volume"
 
 Running on terminatedpodafterrestart-kjsfm-7krt9 in /home/jenkins/workspace/terminated Pod After Restart
 [Pipeline] {
 [Pipeline] container
 [Pipeline] {
 [Pipeline] sh
 + sleep 9999999
 Resuming build at Tue Jul 16 11:34:37 UTC 2019 after Jenkins restart
 Waiting to resume part of terminated Pod After Restart #1: In the quiet period. Expires in 0 ms
 Still trying to load Looking for path named ‘/home/jenkins/workspace/terminated Pod After Restart’ on computer named ‘terminatedpodafterrestart-kjsfm-7krt9’
 Waiting to resume part of terminated Pod After Restart #1: ‘terminatedpodafterrestart-kjsfm-7krt9’ is offline
 Still trying to load Looking for path named ‘/home/jenkins/workspace/terminated Pod After Restart’ on computer named ‘terminatedpodafterrestart-kjsfm-7krt9’
 Waiting to resume part of terminated Pod After Restart #1: ‘terminatedpodafterrestart-kjsfm-7krt9’ is offline
 Still trying to load Looking for path named ‘/home/jenkins/workspace/terminated Pod After Restart’ on computer named ‘terminatedpodafterrestart-kjsfm-7krt9’
 Waiting to resume part of terminated Pod After Restart #1: ‘terminatedpodafterrestart-kjsfm-7krt9’ is offline
 Still trying to load Looking for path named ‘/home/jenkins/workspace/terminated Pod After Restart’ on computer named ‘terminatedpodafterrestart-kjsfm-7krt9’
 Ready to run at Tue Jul 16 11:34:49 UTC 2019
 Agent terminatedpodafterrestart-kjsfm-7krt9 is provisioned from template Kubernetes Pod Template
 Agent specification [Kubernetes Pod Template] (terminatedPodAfterRestart): 
 ---
 apiVersion: "v1"
 kind: "Pod"
 metadata:
   annotations:
     buildUrl: "http://100.96.12.163:44000/jenkins/job/terminated%20Pod%20After%20Restart/1/"
   labels:
     jenkins: "slave"
     BUILD_NUMBER: "2"
     test: "terminatedPodAfterRestart"
     jenkins/terminatedPodAfterRestart: "true"
     BRANCH_NAME: "PR-546"
     class: "RestartPipelineTest"
   name: "terminatedpodafterrestart-kjsfm-7krt9"
 spec:
   containers:
   - command:
     - "/bin/cat"
     env:
     - name: "JENKINS_SECRET"
       value: "984e0b6a03b49c6ad8c2c8e24cef61e2dd279e885c46cb7278e0f56d8b920880"
     - name: "JENKINS_AGENT_NAME"
       value: "terminatedpodafterrestart-kjsfm-7krt9"
     - name: "JENKINS_NAME"
       value: "terminatedpodafterrestart-kjsfm-7krt9"
     - name: "JENKINS_URL"
       value: "http://localhost:44000/jenkins/"
     image: "busybox"
     imagePullPolicy: "IfNotPresent"
     name: "busybox"
     resources:
       limits: {}
       requests: {}
     securityContext:
       privileged: false
     tty: true
     volumeMounts:
     - mountPath: "/home/jenkins"
       name: "workspace-volume"
       readOnly: false
     workingDir: "/home/jenkins"
   - env:
     - name: "JENKINS_SECRET"
       value: "984e0b6a03b49c6ad8c2c8e24cef61e2dd279e885c46cb7278e0f56d8b920880"
     - name: "JENKINS_AGENT_NAME"
       value: "terminatedpodafterrestart-kjsfm-7krt9"
     - name: "JENKINS_NAME"
       value: "terminatedpodafterrestart-kjsfm-7krt9"
     - name: "JENKINS_URL"
       value: "http://localhost:44000/jenkins/"
     image: "jenkins/jnlp-slave:alpine"
     name: "jnlp"
     volumeMounts:
     - mountPath: "/home/jenkins"
       name: "workspace-volume"
       readOnly: false
   nodeSelector: {}
   restartPolicy: "Never"
   volumes:
   - emptyDir: {}
     name: "workspace-volume"
 
 terminatedpodafterrestart-kjsfm-7krt9 was marked offline: Connection was broken: java.nio.channels.ClosedChannelException
 	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:209)
 	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
 	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:816)
 	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
 	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
 	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:784)
 	at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
 	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:314)
 	at hudson.remoting.Channel.close(Channel.java:1452)
 	at hudson.remoting.Channel.close(Channel.java:1405)
 	at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:844)
 	at hudson.slaves.SlaveComputer.kill(SlaveComputer.java:811)
 	at hudson.model.AbstractCIBase.killComputer(AbstractCIBase.java:89)
 	at hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:233)
 	at jenkins.model.Jenkins.updateComputerList(Jenkins.java:1576)
 	at jenkins.model.Nodes$6.run(Nodes.java:271)
 	at hudson.model.Queue._withLock(Queue.java:1379)
 	at hudson.model.Queue.withLock(Queue.java:1256)
 	at jenkins.model.Nodes.removeNode(Nodes.java:262)
 	at jenkins.model.Jenkins.removeNode(Jenkins.java:2092)
 	at org.csanchez.jenkins.plugins.kubernetes.pod.retention.Reaper.eventReceived(Reaper.java:122)
 	at org.csanchez.jenkins.plugins.kubernetes.pod.retention.Reaper.eventReceived(Reaper.java:48)
 	at io.fabric8.kubernetes.client.utils.WatcherToggle.eventReceived(WatcherToggle.java:49)
 	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:232)
 	at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323)
 	at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
 	at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
 	at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
 	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
 	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
 	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 	at java.lang.Thread.run(Thread.java:748)
 
 Cannot contact terminatedpodafterrestart-kjsfm-7krt9: hudson.remoting.RequestAbortedException: java.nio.channels.ClosedChannelException
 [Pipeline] }
 [Pipeline] // container
 [Pipeline] }
 [Pipeline] // node
 [Pipeline] // node
 [Pipeline] }
 [Pipeline] // podTemplate
 [Pipeline] End of Pipeline
 Agent was removed
 Finished: ABORTED
 "
 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
 	at org.junit.Assert.assertThat(Assert.java:956)
 	at org.junit.Assert.assertThat(Assert.java:923)
 	at org.jvnet.hudson.test.JenkinsRule.assertLogContains(JenkinsRule.java:1387)
 	at org.csanchez.jenkins.plugins.kubernetes.pipeline.RestartPipelineTest.lambda$terminatedPodAfterRestart$5(RestartPipelineTest.java:218)
 	at org.jvnet.hudson.test.RestartableJenkinsRule$3.evaluate(RestartableJenkinsRule.java:232)
 	at org.jvnet.hudson.test.RestartableJenkinsRule$6.evaluate(RestartableJenkinsRule.java:272)
 	at org.jvnet.hudson.test.JenkinsRule$1.evaluate(JenkinsRule.java:599)
 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 	at java.lang.Thread.run(Thread.java:748) 
 
This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

jglick@cloudbees.com (JIRA)

16 Jul 2019, 15:01:02
to jenkinsc...@googlegroups.com

jglick@cloudbees.com (JIRA)

16 Jul 2019, 15:01:03
to jenkinsc...@googlegroups.com
Jesse Glick assigned an issue to Jesse Glick
Change By: Jesse Glick
Assignee: Carlos Sanchez → Jesse Glick

jglick@cloudbees.com (JIRA)

30 Jul 2019, 08:31:02
to jenkinsc...@googlegroups.com
Jesse Glick commented on Bug JENKINS-58513
 
Re: Flake in RestartPipelineTest#terminatedPodAfterRestart

kubernetes #561 removed the failing assertion. We still need to track down the root cause of the flake.

vincent@latombe.net (JIRA)

30 Jul 2019, 08:56:02
to jenkinsc...@googlegroups.com
Vincent Latombe commented on Bug JENKINS-58513
 
Re: Flake in RestartPipelineTest#terminatedPodAfterRestart

Jesse Glick, while attempting to track down the root cause, I saw that in the failure cases it was running this line.

vincent@latombe.net (JIRA)

30 Jul 2019, 08:57:02
to jenkinsc...@googlegroups.com
Vincent Latombe edited a comment on Bug JENKINS-58513
 
Re: Flake in RestartPipelineTest#terminatedPodAfterRestart
[~jglick] while attempting to track down the root cause, I saw that in the failure cases it was running [this|https://github.com/jenkinsci/workflow-durable-task-step-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/support/steps/ExecutorStepExecution.java#L265] line. Since it runs asynchronously, this is probably a race condition with the thread handling node deletion.

vincent@latombe.net (JIRA)

30 Jul 2019, 08:57:03
to jenkinsc...@googlegroups.com
Vincent Latombe edited a comment on Bug JENKINS-58513
 
Re: Flake in RestartPipelineTest#terminatedPodAfterRestart
[~jglick] while attempting to track down the root cause, I saw that in the failure cases it was running [this|#L265] line. Since it runs asynchronously, this is probably a race condition with the thread handling node deletion.

vincent@latombe.net (JIRA)

30 Jul 2019, 09:32:02
to jenkinsc...@googlegroups.com
Vincent Latombe edited a comment on Bug JENKINS-58513
 
Re: Flake in RestartPipelineTest#terminatedPodAfterRestart
[~jglick] while attempting to track down the root cause, I saw that in the failure cases it was running [this|https://github.com/jenkinsci/workflow-durable-task-step-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/support/steps/ExecutorStepExecution.java#L265] line. Since it runs asynchronously, this is probably a race condition with the thread handling node deletion.
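
To illustrate the kind of race suspected here, below is a minimal, self-contained sketch. It is not the plugin's code: the class, method, and field names are hypothetical, and the idea that the deletion handler checks whether a "node body" has been set is an assumption made only for illustration. The two log strings are taken from the expected and actual output in the failure report above. Latches force each ordering so that both outcomes can be reproduced deterministically.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of a race between an async resume and a node-deletion handler. */
public class NodeDeletionRaceSketch {

    /**
     * Runs both threads once, forcing the given ordering, and returns the
     * line the (hypothetical) deletion handler would write to the build log.
     */
    static String simulate(boolean resumeWinsRace) {
        // Assumption: the node body is unset until the asynchronous resume runs.
        AtomicReference<String> body = new AtomicReference<>();
        AtomicReference<String> logLine = new AtomicReference<>();
        CountDownLatch resumeDone = new CountDownLatch(1);
        CountDownLatch deletionDone = new CountDownLatch(1);

        // Stand-in for the asynchronous code path that sets the body after restart.
        Thread resume = new Thread(() -> {
            try {
                if (!resumeWinsRace) {
                    deletionDone.await(); // let the deletion handler go first
                }
                body.set("node body");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                resumeDone.countDown();
            }
        });

        // Stand-in for the thread that handles the node deletion event.
        Thread deletion = new Thread(() -> {
            try {
                if (resumeWinsRace) {
                    resumeDone.await(); // let the resume go first
                }
                logLine.set(body.get() == null
                        ? "Agent example-agent was deleted, but do not have a node body to cancel"
                        : "Agent was removed");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                deletionDone.countDown();
            }
        });

        resume.start();
        deletion.start();
        try {
            resume.join();
            deletion.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return logLine.get();
    }

    public static void main(String[] args) {
        System.out.println("deletion first: " + simulate(false));
        System.out.println("resume first:   " + simulate(true));
    }
}
```

Without the latches, which of the two lines appears would depend on thread scheduling, which is consistent with an assertion on one specific message failing only from time to time.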

jglick@cloudbees.com (JIRA)

30 Jul 2019, 09:34:02
to jenkinsc...@googlegroups.com