[JIRA] (JENKINS-60536) Jenkins master hangs forever at git rev-list command

emmz.l.phillips+jenkins@gmail.com (JIRA)

Dec 19, 2019, 3:55:03 AM
to jenkinsc...@googlegroups.com
emma Phillps created an issue
 
Jenkins / Bug JENKINS-60536
Jenkins master hangs forever at git rev-list command
Issue Type: Bug
Assignee: Mark Waite
Attachments: About Jenkins 2.208 [Jenkins].pdf
Components: core, git-plugin
Created: 2019-12-19 08:54
Environment: OS: Amazon Linux2 CPE: cpe:2.3:o:amazon:amazon_linux:2
Java Version: OpenJDK Runtime Environment Corretto-11.0.5.10.1 (build 11.0.5+10-LTS)

Jenkins & plugin versions: See attached PDF

About Jenkins installation:
Jenkins (master) is installed as a service on an AWS EC2 instance with an EBS backed volume for Jenkins data.

Jenkins master then sits behind a load balancer

We use AWS EC2 Cloud plugin for build nodes

Things tried:
* Updated plugins that had available updates
Labels: pipeline git
Priority: Critical
Reporter: emma Phillps

Recently, when a job is triggered on the Jenkins master, it hangs around checkout.

The command that it seems to get stuck on:

git rev-list --no-walk a4539ebbe8fe6da843fb1fe4c7b0ce5cf79a0647 # timeout=10 

This issue started happening sporadically on 2019-12-17 and is impacting our CI/CD pipeline. We rely on Jenkins to deploy to our environments.

If you try to kill the job, it is unresponsive.

If you restart the Jenkins master, the job that got stuck no longer exists.

 

Job Build Logs:

Started by upstream project "parent-job" build number 5
originally caused by:
 Branch indexing
Checking out git <REPO> into /var/lib/jenkins/workspace/acceptance-tests@script to read acceptance-tests/Jenkinsfile-local
using credential jenkins-git
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url g...@bitbucket.org:obillex/fundit.git # timeout=10
Fetching upstream changes from g...@bitbucket.org:obillex/fundit.git
 > git --version # timeout=10
using GIT_SSH to set credentials Jenkins access to git
 > git fetch --tags --progress -- <REPO>+refs/heads/*:refs/remotes/origin/* # timeout=10
 > git rev-parse b6a1b7a159b86dd5a52b8385670397d568345c53^{commit} # timeout=10
Checking out Revision b6a1b7a159b86dd5a52b8385670397d568345c53 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f b6a1b7a159b86dd5a52b8385670397d568345c53 # timeout=10
Commit message: "Commit message"
 > git rev-list --no-walk a4539ebbe8fe6da843fb1fe4c7b0ce5cf79a0647 # timeout=10

This issue seems to be a regression of the resolved issue JENKINS-43106.

This message was sent by Atlassian Jira (v7.13.6#713006-sha1:cc4451f)

emmz.l.phillips+jenkins@gmail.com (JIRA)

Dec 19, 2019, 4:14:04 AM
to jenkinsc...@googlegroups.com
emma Phillps commented on Bug JENKINS-60536
 
Re: Jenkins master hangs forever at git rev-list command

Upgraded to the latest version of Jenkins (2.209), and within 5 minutes of running the new version the issue occurred again.

You cannot stop/kill the job once it gets into this state. After a restart, the job ceases to exist.

mark.earl.waite@gmail.com (JIRA)

Dec 19, 2019, 4:32:02 AM
to jenkinsc...@googlegroups.com
Mark Waite assigned an issue to Unassigned
 
Change By: Mark Waite
Assignee: Mark Waite

mark.earl.waite@gmail.com (JIRA)

Dec 19, 2019, 4:41:03 AM
to jenkinsc...@googlegroups.com
Mark Waite commented on Bug JENKINS-60536
 
Re: Jenkins master hangs forever at git rev-list command

If this is an instance of JENKINS-43106, then the Jira 2.5.0 plugin would be installed on the system. Libraries included in the Jira 2.5.0 plugin were causing the issue. Since you state in the environment section of the bug report that you updated all the plugins on the system, I doubt that you are running that old version of the Jira plugin. Could you provide the list of installed plugins (Manage Jenkins - System Information)?

Since you're running on Amazon EC2 with an EBS backed volume, is there any indication from the AWS reports or environment which might hint that you are exhausting some I/O limit?

When the git rev-list command hangs, are other jobs able to continue, or do they also hang?

When the git rev-list command hangs, can a user on the Jenkins master still perform command-line git operations in the JENKINS_HOME directory structure? For example, can that user perform a git status or a git rev-list command in any other workspace?

If you create a similar environment on a different EC2 machine, can you see the same failure?

If you create a similar environment on a local machine, can you see the same failure?
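
The command-line check asked about above could be scripted roughly as follows. This is only a sketch and not part of the original report: the helper names run_with_timeout and check_workspace are hypothetical, and the 10-second limit simply mirrors the timeout=10 shown in the build log. It runs git status and git rev-list --no-walk in a workspace and reports whether each command finished, timed out, or could not be started.

```python
import subprocess


def run_with_timeout(cmd, cwd=None, timeout_s=10):
    """Run cmd and report ('ok', returncode, stdout), ('timeout', None, '')
    or ('error', None, message) if the process could not be started."""
    try:
        proc = subprocess.run(
            cmd, cwd=cwd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        return ("timeout", None, "")
    except OSError as exc:
        # e.g. git is not installed or the workspace directory is missing
        return ("error", None, str(exc))
    return ("ok", proc.returncode, proc.stdout)


def check_workspace(workspace, sha, timeout_s=10):
    """Hypothetical helper: try the two git commands mentioned in the thread."""
    commands = [
        ["git", "status", "--short"],
        ["git", "rev-list", "--no-walk", sha],
    ]
    return {
        " ".join(cmd): run_with_timeout(cmd, cwd=workspace, timeout_s=timeout_s)[0]
        for cmd in commands
    }
```

For example, check_workspace("/var/lib/jenkins/workspace/acceptance-tests@script", "a4539ebbe8fe6da843fb1fe4c7b0ce5cf79a0647") uses the workspace path and commit from the build log. If both commands come back "ok" while a Jenkins job is hung, that suggests git itself is not what is blocking.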

kieran@kieranshaw.co.uk (JIRA)

Dec 19, 2019, 5:30:02 AM
to jenkinsc...@googlegroups.com
Kieran Shaw updated an issue
 
Change By: Kieran Shaw
Attachment: Screenshot from 2019-12-19 10-27-27.png

kieran@kieranshaw.co.uk (JIRA)

Dec 19, 2019, 5:32:05 AM
to jenkinsc...@googlegroups.com
Kieran Shaw commented on Bug JENKINS-60536
 
Re: Jenkins master hangs forever at git rev-list command

Hi Mark,

 

Thanks for picking this up so quickly. I'm working with Emma on this; below is our plugin list.

It seems quite intermittent: some jobs are able to start successfully and complete once started, but others just can't seem to get started and then can't be killed.

Whilst there is a hung job, other jobs can run ok. We fire up a new EC2 slave for every job.

We are running on EC2 and EBS backed storage, but it shows no signs of unusual behaviour and our burst balance is fine.

Using the Monitoring plugin, I can see the thread for one of the hung jobs, this is what it is showing:

 

Name Version Enabled
ace-editor 1.1 true
ansicolor 0.6.2 true
ant 1.10 true
antisamy-markup-formatter 1.6 true
apache-httpcomponents-client-4-api 4.5.10-2.0 true
authentication-tokens 1.3 true
aws-credentials 1.28 true
aws-java-sdk 1.11.687 true
blueocean 1.21.0 true
blueocean-autofavorite 1.2.4 true
blueocean-bitbucket-pipeline 1.21.0 true
blueocean-commons 1.21.0 true
blueocean-config 1.21.0 true
blueocean-core-js 1.21.0 true
blueocean-dashboard 1.21.0 true
blueocean-display-url 2.3.0 true
blueocean-events 1.21.0 true
blueocean-git-pipeline 1.21.0 true
blueocean-github-pipeline 1.21.0 true
blueocean-i18n 1.21.0 true
blueocean-jira 1.21.0 true
blueocean-jwt 1.21.0 true
blueocean-personalization 1.21.0 true
blueocean-pipeline-api-impl 1.21.0 true
blueocean-pipeline-editor 1.21.0 true
blueocean-pipeline-scm-api 1.21.0 true
blueocean-rest 1.21.0 true
blueocean-rest-impl 1.21.0 true
blueocean-web 1.21.0 true
bouncycastle-api 2.17 true
branch-api 2.5.5 true
build-pipeline-plugin 1.5.8 true
build-timeout 1.19 true
cloudbees-bitbucket-branch-source 2.6.0 true
cloudbees-folder 6.10.1 true
command-launcher 1.4 true
conditional-buildstep 1.3.6 true
config-file-provider 3.6.2 true
credentials 2.3.0 true
credentials-binding 1.20 true
display-url-api 2.3.2 true
docker-commons 1.15 true
docker-workflow 1.21 true
durable-task 1.33 true
ec2 1.47 true
email-ext 2.68 true
favorite 2.3.2 true
git 4.0.0 true
git-client 3.0.0 true
git-server 1.9 true
github 1.29.5 true
github-api 1.95 true
github-branch-source 2.5.8 true
gradle 1.35 true
h2-api 1.4.199 true
handlebars 1.1.1 true
handy-uri-templates-2-api 2.1.8-1.0 true
htmlpublisher 1.21 true
jackson2-api 2.10.1 true
javadoc 1.5 true
jaxb 2.3.0.1 true
jdk-tool 1.4 true
jenkins-design-language 1.21.0 true
jira 3.0.11 true
jira-steps 1.5.1 true
jquery 1.12.4-1 true
jquery-detached 1.2.1 true
jsch 0.1.55.1 true
junit 1.28 true
ldap 1.21 true
lockable-resources 2.7 true
mailer 1.29 true
matrix-auth 2.5 true
matrix-project 1.14 true
maven-plugin 3.4 true
mercurial 2.8 true
momentjs 1.1.1 true
monitoring 1.80.0 true
node-iterator-api 1.5.0 true
nodejs 1.3.4 true
parameterized-trigger 2.36 true
performance 3.17 true
pipeline-aws 1.39 true
pipeline-build-step 2.10 true
pipeline-github-lib 1.0 true
pipeline-graph-analysis 1.10 true
pipeline-input-step 2.11 true
pipeline-maven 3.8.2 true
pipeline-milestone-step 1.3.1 true
pipeline-model-api 1.5.0 true
pipeline-model-declarative-agent 1.1.1 true
pipeline-model-definition 1.5.0 true
pipeline-model-extensions 1.5.0 true
pipeline-rest-api 2.12 true
pipeline-stage-step 2.3 true
pipeline-stage-tags-metadata 1.5.0 true
pipeline-stage-view 2.12 true
plain-credentials 1.5 true
pubsub-light 1.13 true
resource-disposer 0.14 true
run-condition 1.2 true
saml 1.1.4 true
scm-api 2.6.3 true
script-security 1.68 true
slack 2.35 true
sonar 2.10 true
sse-gateway 1.20 true
ssh-agent 1.17 true
ssh-credentials 1.18 true
ssh-slaves 1.31.0 true
structs 1.20 true
test-results-analyzer 0.3.5 true
timestamper 1.10 true
token-macro 2.10 true
trilead-api 1.0.5 true
variant 1.3 true
workflow-aggregator 2.6 true
workflow-api 2.38 true
workflow-basic-steps 2.18 true
workflow-cps 2.78 true
workflow-cps-global-lib 2.15 true
workflow-durable-task-step 2.35 true
workflow-job 2.36 true
workflow-multibranch 2.21 true
workflow-scm-step 2.9 true
workflow-step-api 2.21 true
workflow-support 3.3 true
ws-cleanup 0.38 true

kieran@kieranshaw.co.uk (JIRA)

Dec 19, 2019, 5:34:02 AM
to jenkinsc...@googlegroups.com

We can also run git commands just fine in the workspace directory of the job that is trying to be run.

kieran@kieranshaw.co.uk (JIRA)

Dec 19, 2019, 5:48:03 AM
to jenkinsc...@googlegroups.com

We also just had a case where a job got past the master checkout, was assigned a slave, and then hung in the same place doing the rev-list on the slave.

mark.earl.waite@gmail.com (JIRA)

Dec 19, 2019, 6:14:05 AM
to jenkinsc...@googlegroups.com

Based on a comparison of your installed plugins and my installed plugins, you might consider the following as ways to explore the problem further. I don't have any reason to believe that any of these are the root of the problem; I'm just guessing in an attempt to further explore what might be causing the issue.

Wild Guesses

  • The jira-steps plugin is installed in your environment and not in mine. Since a different jira plugin was deeply involved in JENKINS-43106, you might attempt to remove the jira-steps plugin to see if that helps
  • The build-pipeline-plugin is installed in your environment and not in mine. It has a known security issue that should be unrelated to this issue. However, if you can remove that plugin without harming your users, it may be worth the attempt
  • The test-results-analyzer plugin is installed in your environment and not in mine. I don't see any reason that would affect this case, since it seems to be a presentation and reporting plugin
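
A plugin comparison like the one described above can be sketched in a few lines. This is an illustration under stated assumptions, not the method actually used in the thread: the function names are hypothetical, REPORTED is a short excerpt of the plugin list posted earlier, and REFERENCE is a made-up stand-in for a second installation whose real contents are not given here.

```python
def parse_plugin_list(text):
    """Parse 'name version enabled' lines into a {name: version} dict.
    Lines that do not have exactly three fields ending in true/false are skipped."""
    plugins = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2] in ("true", "false"):
            plugins[parts[0]] = parts[1]
    return plugins


def only_in_first(first, second):
    """Names of plugins present in `first` but missing from `second`."""
    return sorted(set(first) - set(second))


# Excerpt of the plugin list posted in this thread.
REPORTED = """\
Name Version Enabled
build-pipeline-plugin 1.5.8 true
git 4.0.0 true
git-client 3.0.0 true
jira-steps 1.5.1 true
test-results-analyzer 0.3.5 true
"""

# ASSUMPTION: invented reference installation, for illustration only.
REFERENCE = """\
git 4.0.0 true
git-client 3.0.0 true
"""

extra = only_in_first(parse_plugin_list(REPORTED), parse_plugin_list(REFERENCE))
print(extra)
```

Plugins that appear only on the affected installation are candidates to disable one at a time, as suggested in the list above.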

kieran@kieranshaw.co.uk (JIRA)

Dec 19, 2019, 6:23:02 AM
to jenkinsc...@googlegroups.com

Tracked down another locked up thread and it does indeed look like Jira:

 

Executor #-1 for master : executing feature/acceptance-local #541
org.apache.http.nio.reactor.ssl.SSLIOSession.close(SSLIOSession.java:605)
org.apache.http.impl.nio.NHttpConnectionBase.close(NHttpConnectionBase.java:511)
org.apache.http.impl.nio.conn.CPoolEntry.closeConnection(CPoolEntry.java:75)
org.apache.http.impl.nio.conn.CPoolEntry.close(CPoolEntry.java:101)
org.apache.http.nio.pool.AbstractNIOConnPool.processPendingRequest(AbstractNIOConnPool.java:423)
org.apache.http.nio.pool.AbstractNIOConnPool.lease(AbstractNIOConnPool.java:280)
org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.requestConnection(PoolingNHttpClientConnectionManager.java:295)
org.apache.http.impl.nio.client.AbstractClientExchangeHandler.requestConnection(AbstractClientExchangeHandler.java:377)
org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.start(DefaultClientExchangeHandlerImpl.java:129)
org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:141)
org.apache.http.impl.nio.client.CloseableHttpAsyncClient.execute(CloseableHttpAsyncClient.java:75)
org.apache.http.impl.nio.client.CloseableHttpAsyncClient.execute(CloseableHttpAsyncClient.java:108)
com.atlassian.httpclient.apache.httpcomponents.SettableFuturePromiseHttpPromiseAsyncClient.execute(SettableFuturePromiseHttpPromiseAsyncClient.java:36)
com.atlassian.httpclient.apache.httpcomponents.ApacheAsyncHttpClient.doExecute(ApacheAsyncHttpClient.java:417)
com.atlassian.httpclient.apache.httpcomponents.ApacheAsyncHttpClient.execute(ApacheAsyncHttpClient.java:364)
com.atlassian.httpclient.apache.httpcomponents.DefaultRequest$DefaultRequestBuilder.execute(DefaultRequest.java:297)
com.atlassian.jira.rest.client.internal.async.AtlassianHttpClientDecorator$AuthenticatedRequestBuilder.execute(AtlassianHttpClientDecorator.java:83)
com.atlassian.httpclient.apache.httpcomponents.DefaultRequest$DefaultRequestBuilder.get(DefaultRequest.java:253)
com.atlassian.jira.rest.client.internal.async.AbstractAsynchronousRestClient.getAndParse(AbstractAsynchronousRestClient.java:69)
com.atlassian.jira.rest.client.internal.async.AsynchronousSearchRestClient.searchJqlImplGet(AsynchronousSearchRestClient.java:109)
com.atlassian.jira.rest.client.internal.async.AsynchronousSearchRestClient.searchJql(AsynchronousSearchRestClient.java:94)
hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:196)
hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:136)
io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:52)
org.jenkinsci.plugins.workflow.job.WorkflowRun.onCheckout(WorkflowRun.java:853)
org.jenkinsci.plugins.workflow.job.WorkflowRun.access$1000(WorkflowRun.java:133)
org.jenkinsci.plugins.workflow.job.WorkflowRun$SCMListenerImpl.onCheckout(WorkflowRun.java:1116)
org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:140)
org.jenkinsci.plugins.workflow.cps.CpsScmFlowDefinition.create(CpsScmFlowDefinition.java:155)
org.jenkinsci.plugins.workflow.cps.CpsScmFlowDefinition.create(CpsScmFlowDefinition.java:69)
org.jenkinsci.plugins.workflow.job.WorkflowRun.run(WorkflowRun.java:299)
hudson.model.ResourceController.execute(ResourceController.java:97)
hudson.model.Executor.run(Executor.java:428)

I think it's the bit that parses the changelog and tries to link to Jira issues.
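
The reading of the trace above can be checked mechanically. The sketch below is not from the original report and its function names are hypothetical. It parses stack-frame lines in the format shown in the dump and picks out the frames whose class name mentions "jira". SAMPLE holds a handful of lines copied verbatim from the trace. Because the dump lists the innermost call first, the last matching frame is where the Jira-related code was entered.

```python
import re

# Matches lines such as "pkg.Class.method(File.java:123)", optionally prefixed by "at ".
FRAME_RE = re.compile(r"^\s*(?:at\s+)?([\w.$]+)\.([\w$<>]+)\(([^)]*)\)\s*$")


def parse_frames(dump_text):
    """Return a list of (class_name, method, location) tuples, innermost first."""
    frames = []
    for line in dump_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.groups())
    return frames


def frames_containing(frames, needle):
    """Frames whose fully qualified class name contains `needle` (case-insensitive)."""
    needle = needle.lower()
    return [frame for frame in frames if needle in frame[0].lower()]


SAMPLE = """\
Executor #-1 for master : executing feature/acceptance-local #541
org.apache.http.nio.reactor.ssl.SSLIOSession.close(SSLIOSession.java:605)
com.atlassian.jira.rest.client.internal.async.AsynchronousSearchRestClient.searchJql(AsynchronousSearchRestClient.java:94)
hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:196)
hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:136)
io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:52)
org.jenkinsci.plugins.workflow.job.WorkflowRun.onCheckout(WorkflowRun.java:853)
hudson.model.Executor.run(Executor.java:428)
"""

jira_frames = frames_containing(parse_frames(SAMPLE), "jira")
entry_class, entry_method, entry_location = jira_frames[-1]
print(entry_class, entry_method, entry_location)
```

On these sample lines the entry point is JiraSCMListener.onChangeLogParsed, called from WorkflowRun.onCheckout, which is consistent with the hang occurring in changelog processing after the git checkout itself has returned.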

mark.earl.waite@gmail.com (JIRA)

Dec 19, 2019, 6:26:01 AM
to jenkinsc...@googlegroups.com

Interesting and very encouraging. If that is a Jenkins pipeline job, you might insert an echo statement after the checkout scm step so that you have evidence that the checkout completed and the hang is in a different location that is unrelated to the git plugin.
