[JIRA] (JENKINS-57086) Stuck, hanging, unkillable jobs in Jenkins


nagy.attila@gmail.com (JIRA)

Apr 17, 2019, 10:20:03 AM
to jenkinsc...@googlegroups.com
Attila Nagy created an issue
 
Jenkins / Bug JENKINS-57086
Stuck, hanging, unkillable jobs in Jenkins
Issue Type: Bug
Assignee: Unassigned
Components: core
Created: 2019-04-17 14:19
Environment: Jenkins 2.164 running on:
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

Installed plugins:
PAM Authentication plugin (pam-auth)=1.4
Pipeline: Shared Groovy Libraries (workflow-cps-global-lib)=2.13
Pipeline: Stage Step (pipeline-stage-step)=2.3
Structs Plugin (structs)=1.17
Lockable Resources plugin (lockable-resources)=2.4
JUnit Plugin (junit)=1.27
LDAP Plugin (ldap)=1.20
AnsiColor (ansicolor)=0.6.2
JavaScript GUI Lib: ACE Editor bundle plugin (ace-editor)=1.1
Pipeline: Job (workflow-job)=2.31
Gradle Plugin (gradle)=1.30
Pipeline: SCM Step (workflow-scm-step)=2.7
Build Blocker Plugin (build-blocker-plugin)=1.7.3
Pipeline: GitHub Groovy Libraries (pipeline-github-lib)=1.0
Token Macro Plugin (token-macro)=2.6
PegDown Formatter Plugin (pegdown-formatter)=1.3
OWASP Markup Formatter Plugin (antisamy-markup-formatter)=1.5
Apache HttpComponents Client 4.x API Plugin (apache-httpcomponents-client-4-api)=4.5.5-3.0
Matrix Project Plugin (matrix-project)=1.13
SSH Credentials Plugin (ssh-credentials)=1.14
JavaScript GUI Lib: jQuery bundles (jQuery and jQuery UI) plugin (jquery-detached)=1.2.1
Credentials Binding Plugin (credentials-binding)=1.17
EnvInject API Plugin (envinject-api)=1.5
Groovy (groovy)=2.1
View Job Filters (view-job-filters)=2.1.1
Amazon Web Services SDK (aws-java-sdk)=1.11.457
JavaScript GUI Lib: Moment.js bundle plugin (momentjs)=1.1.1
Simple Theme Plugin (simple-theme-plugin)=0.5.1
GitHub Pull Request Builder (ghprb)=1.42.0
Git client plugin (git-client)=2.7.6
Pipeline: Multibranch (workflow-multibranch)=2.20
Pipeline: API (workflow-api)=2.33
Pipeline: Stage View Plugin (pipeline-stage-view)=2.10
Pipeline: REST API Plugin (pipeline-rest-api)=2.10
Pipeline: Milestone Step (pipeline-milestone-step)=1.3.1
Workspace Cleanup Plugin (ws-cleanup)=0.37
GitHub Authentication plugin (github-oauth)=0.31
Parameterized Trigger plugin (parameterized-trigger)=2.35.2
Display URL API (display-url-api)=2.3.0
Priority Sorter Plugin (PrioritySorter)=3.6.0
Timestamper (timestamper)=1.9
JDK Tool Plugin (jdk-tool)=1.2
CloudBees Docker Build and Publish plugin (docker-build-publish)=1.3.2
Slack Notification Plugin (slack)=2.17
Build Timeout (build-timeout)=1.19
SCM API Plugin (scm-api)=2.3.0
next-executions (next-executions)=1.0.12
Pipeline: Groovy (workflow-cps)=2.63
Docker Commons Plugin (docker-commons)=1.13
Pipeline: Input Step (pipeline-input-step)=2.9
Subversion Plug-in (subversion)=2.12.1
GIT server Plugin (git-server)=1.7
Mailer Plugin (mailer)=1.23
Badge (badge)=1.7
Pipeline: Basic Steps (workflow-basic-steps)=2.14
SSH Slaves plugin (ssh-slaves)=1.29.4
Jackson 2 API Plugin (jackson2-api)=2.9.8
Pipeline: Declarative Extension Points API (pipeline-model-extensions)=1.3.4.1
GitHub Branch Source Plugin (github-branch-source)=2.4.2
MapDB API Plugin (mapdb-api)=1.0.9.0
Multijob plugin (jenkins-multijob-plugin)=1.32
Pipeline: Model API (pipeline-model-api)=1.3.4.1
Environment Injector Plugin (envinject)=2.1.6
GitHub Integration Plugin (github-pullrequest)=0.2.4
Script Security Plugin (script-security)=1.52
Command Agent Launcher Plugin (command-launcher)=1.3
JavaScript GUI Lib: Handlebars bundle plugin (handlebars)=1.1.1
Run Condition Plugin (run-condition)=1.2
Javadoc Plugin (javadoc)=1.4
Ant Plugin (ant)=1.9
Pipeline (workflow-aggregator)=2.6
Branch API Plugin (branch-api)=2.1.2
Pipeline: Step API (workflow-step-api)=2.19
Folders Plugin (cloudbees-folder)=6.7
Credentials Plugin (credentials)=2.1.18
JSch dependency plugin (jsch)=0.1.55
Maven Integration plugin (maven-plugin)=3.2
Pipeline: Supporting APIs (workflow-support)=3.2
Build Authorization Token Root Plugin (build-token-root)=1.4
Git plugin (git)=3.9.3
Pipeline: Declarative Agent API (pipeline-model-declarative-agent)=1.1.1
Docker Pipeline (docker-workflow)=1.17
Job DSL (job-dsl)=1.71
Authentication Tokens API Plugin (authentication-tokens)=1.3
SSH Agent Plugin (ssh-agent)=1.17
Naginator (naginator)=1.17.2
Matrix Authorization Strategy Plugin (matrix-auth)=2.3
Groovy Postbuild (groovy-postbuild)=2.4.3
GitHub API Plugin (github-api)=1.95
Conditional BuildStep (conditional-buildstep)=1.3.6
Pipeline Graph Analysis Plugin (pipeline-graph-analysis)=1.9
bouncycastle API Plugin (bouncycastle-api)=2.17
External Monitor Job Type Plugin (external-monitor-job)=1.7
Pipeline: Build Step (pipeline-build-step)=2.7
Pipeline: Nodes and Processes (workflow-durable-task-step)=2.29
ThinBackup (thinBackup)=1.9
built-on-column (built-on-column)=1.1
Resource Disposer Plugin (resource-disposer)=0.12
Durable Task Plugin (durable-task)=1.29
Hudson Post build task (postbuild-task)=1.8
GitHub plugin (github)=1.29.3
Pipeline: Declarative (pipeline-model-definition)=1.3.4.1
WMI Windows Agents Plugin (windows-slaves)=1.4
Pipeline: Stage Tags Metadata (pipeline-stage-tags-metadata)=1.3.4.1
Plain Credentials Plugin (plain-credentials)=1.5
Icon Shim Plugin (icon-shim)=2.0.3
Email Extension Plugin (email-ext)=2.63
Job Configuration History Plugin (jobConfigHistory)=2.19
Queue cleanup Plugin (queue-cleanup)=1.3
Block Queued Job Plugin (block-queued-job)=0.2.0
Pipeline: GitHub (pipeline-github)=2.5
XML Job to Job DSL Plugin (xml-job-to-job-dsl)=0.1.10
Priority: Major
Reporter: Attila Nagy

I have a job that has been stuck in the queue for 21 days now. According to the log, the build failed:

Mar 26, 2019 7:38:04 PM hudson.model.Run execute
INFO: 0 Update R package docs #11580 main build action completed: FAILURE
Mar 26, 2019 7:38:04 PM jenkins.plugins.slack.SlackNotifier perform
INFO: Performing complete notifications

But the job is still shown in the queue as running.

I've tried everything I could find on https://stackoverflow.com/questions/14456592/how-to-stop-an-unstoppable-zombie-job-on-jenkins-without-restarting-the-server, but it's still there.

I will restart the server next Monday, but I am opening this issue in the hope that something can be done.

I couldn't find any other lines related to this job in the logs. The script itself terminated with a nonzero exit code (so it exited and is not hanging around).

I've attached a gist with the /threadDump output. Searching for the job's name gives:

Executor #10 for master : executing 0 Update R package docs #11580

"Executor #10 for master : executing 0 Update R package docs #11580" Id=1961561 Group=main RUNNABLE (in native)
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
	at sun.security.ssl.InputRecord.read(InputRecord.java:503)
	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
	-  locked java.lang.Object@2644728a
	at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
	-  locked java.lang.Object@18cea778
	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
	at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:275)
	at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:254)
	at org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:118)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:314)
	at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:363)
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:219)
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:85)
	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
	at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:163)
	at jenkins.plugins.slack.StandardSlackService.publish(StandardSlackService.java:104)
	at jenkins.plugins.slack.ActiveNotifier.completed(ActiveNotifier.java:150)
	at jenkins.plugins.slack.SlackNotifier.perform(SlackNotifier.java:444)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
	at hudson.model.Build$BuildExecution.cleanUp(Build.java:196)
	at hudson.model.Run.execute(Run.java:1863)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:97)
	at hudson.model.Executor.run(Executor.java:429)
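
The trace above shows the executor thread parked in a blocking socket read during the TLS handshake. As an illustration only (this is not the Slack plugin's code, and the class and method names below are made up for the example), here is a minimal JDK-only sketch of why such a read can hang indefinitely and how a read timeout bounds it. It assumes a local peer that accepts the TCP connection but never sends a byte, which roughly mimics a remote end that goes silent mid-handshake.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of Jenkins or the Slack plugin.
public class SilentPeerDemo {

    /**
     * Connects to a local peer that completes the TCP handshake but never
     * sends data, then attempts a read bounded by timeoutMillis (must be > 0).
     * Returns true if the read gave up with SocketTimeoutException.
     */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket silentPeer =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentPeer.getInetAddress(),
                                        silentPeer.getLocalPort())) {
            // The default SO_TIMEOUT is 0, meaning read() blocks with no limit.
            // Setting a positive value makes the read fail after that many ms.
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read(); // the silent peer never answers
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

Without the `setSoTimeout` call the same `read()` would wait until the connection is closed or reset, which is consistent with a thread sitting in `socketRead0` for days.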
This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

nagy.attila@gmail.com (JIRA)

Apr 17, 2019, 10:21:02 AM
to jenkinsc...@googlegroups.com
Attila Nagy updated an issue
Change By: Attila Nagy
I have a job in the queue for 21 days now. According to the log, the build has failed:
{noformat}

Mar 26, 2019 7:38:04 PM hudson.model.Run execute
INFO: 0 Update R package docs #11580 main build action completed: FAILURE
Mar 26, 2019 7:38:04 PM jenkins.plugins.slack.SlackNotifier perform
INFO: Performing complete notifications
{noformat}

But the job is still on the queue as running.

I've tried everything I could find on [https://stackoverflow.com/questions/14456592/how-to-stop-an-unstoppable-zombie-job-on-jenkins-without-restarting-the-server], but it's still there.

I will restart the server next Monday, but I am opening this issue in the hope that something can be done.

I couldn't find any other lines related to this job in the logs. The script itself terminated with a nonzero exit code (so it exited and is not hanging around).

I've attached a gist with the /threadDump output. Searching for the job's name gives:
{noformat}
         at hudson.model.Executor.run(Executor.java:429){noformat}

nagy.attila@gmail.com (JIRA)

Apr 17, 2019, 10:21:05 AM
to jenkinsc...@googlegroups.com
Attila Nagy updated an issue
I have a job which is stuck in the queue for 21 days now. According to the log, the build was failed:

nagy.attila@gmail.com (JIRA)

Apr 17, 2019, 10:23:01 AM
to jenkinsc...@googlegroups.com
Attila Nagy updated an issue
I have a job in the queue for 21 days now. According to the log, the build has failed:

nagy.attila@gmail.com (JIRA)

Apr 17, 2019, 10:24:02 AM
to jenkinsc...@googlegroups.com
Attila Nagy updated an issue
Change By: Attila Nagy
Component/s: slack-plugin

nagy.attila@gmail.com (JIRA)

Apr 17, 2019, 10:41:02 AM
to jenkinsc...@googlegroups.com
Attila Nagy updated an issue
Looking at the thread dump, the Slack Notification plugin becomes the suspect.

Running netstat on the machine shows a lingering connection to 99.84.75.163:443. After killing it with the following command:
{noformat}
ss -K dst 99.84.75.163 dport = 443
{noformat}
the job (and the associated thread in the thread dump) immediately disappeared.
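
Before dropping a connection like this, it may help to confirm which threads are actually parked in a socket read. The following is a rough sketch using only the JDK's `Thread.getAllStackTraces()`; the class and method names are hypothetical, it has to run inside the JVM being inspected, and it matches the `java.net.SocketInputStream.socketRead0` frame seen in the Java 8 trace in this report (newer JDKs may use a different socket implementation, so the frame name is an assumption tied to this environment).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of Jenkins.
public class StuckSocketReads {

    /**
     * Returns the names of threads whose innermost stack frame is
     * java.net.SocketInputStream.socketRead0, i.e. threads currently
     * blocked in a native socket read (as in the Java 8 trace above).
     */
    static List<String> blockedInSocketRead(Map<Thread, StackTraceElement[]> dump) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : dump.entrySet()) {
            StackTraceElement[] frames = entry.getValue();
            if (frames.length == 0) {
                continue; // thread has no Java frames to inspect
            }
            StackTraceElement top = frames[0];
            if ("java.net.SocketInputStream".equals(top.getClassName())
                    && "socketRead0".equals(top.getMethodName())) {
                names.add(entry.getKey().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Inspect the current JVM's threads and print any that match.
        for (String name : blockedInSocketRead(Thread.getAllStackTraces())) {
            System.out.println(name);
        }
    }
}
```

A thread that shows up here across several samples taken minutes apart is a reasonable candidate for a hung connection; a single sample only shows that a read was in progress at that moment.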

timjacomb1@gmail.com (JIRA)

Sep 29, 2019, 7:02:03 PM
to jenkinsc...@googlegroups.com
Change By: Tim Jacomb
Status: Open → Fixed but Unreleased
Resolution: Fixed
This message was sent by Atlassian Jira (v7.13.6#713006-sha1:cc4451f)

timjacomb1@gmail.com (JIRA)

Sep 29, 2019, 7:02:03 PM
to jenkinsc...@googlegroups.com
Change By: Tim Jacomb
Status: Fixed but Unreleased → Closed