Jenkins Upgrade

71 views
Skip to first unread message

M.Madhusudana Reddy

unread,
Mar 11, 2020, 2:56:21 PM3/11/20
to Jenkins Users
Hi Team,

We upgraded our jenkins instance from 2.107.3 to 2.204.3 and sometimes we notice that jenkins master is taking lot of time to come up and jenkins UI/application is unresponsive.

I took a thread dump while we ran into this issue and noticed that application is suffering from deadlock. 

Thread jenkins.util.Timer [#1] is in deadlock with thread CpsStepContext.isReady [#6]  

jenkins.util.Timer [#1]

priority:0x5 - threadId:0x00007F9C4C16EBE0 - nativeId:0x5AF5 - nativeId (decimal):23285 - state:BLOCKED
stackTrace:
at jenkins/model/lazy/AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:370)
at jenkins/model/lazy/LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:228)
at org/jenkinsci/plugins/workflow/job/WorkflowJob.getBuildByNumber(WorkflowJob.java:232)
at org/jenkinsci/plugins/workflow/job/WorkflowJob.getBuildByNumber(WorkflowJob.java:103)
at hudson/model/Run.fromExternalizableId(Run.java:2446)
at org/jenkinsci/plugins/workflow/support/steps/ExecutorStepExecution$PlaceholderTask.runForDisplay(ExecutorStepExecution.java:445)
at org/jenkinsci/plugins/workflow/support/steps/ExecutorStepExecution$PlaceholderTask.getDisplayName(ExecutorStepExecution.java:461)
at org/jenkinsci/plugins/jqsmonitoring/buildqueue/QueueJob.<init>(QueueJob.java:50)
at org/jenkinsci/plugins/jqsmonitoring/buildqueue/BuildQueue.retrieveQueueJobs(BuildQueue.java:87)
at org/jenkinsci/plugins/jqsmonitoring/buildqueue/BuildQueue.<init>(BuildQueue.java:36)
at org/jenkinsci/plugins/jqsmonitoring/jqscore/JQSMonitoring.<init>(JQSMonitoring.java:43)
at org/jenkinsci/plugins/jqsmonitoring/jqscore/JQSMonitoring$$FastClassByGuice$$a25d5b0d.newInstance(<generated>)
at com/google/inject/internal/cglib/reflect/$FastConstructor.newInstance(FastConstructor.java:40)
at com/google/inject/internal/DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:61)
at com/google/inject/internal/ConstructorInjector.provision(ConstructorInjector.java:105)
at com/google/inject/internal/ConstructorInjector.access$000(ConstructorInjector.java:32)
at com/google/inject/internal/ConstructorInjector$1.call(ConstructorInjector.java:89)
at com/google/inject/internal/ProvisionListenerStackCallback$Provision.provision(ProvisionListenerStackCallback.java:115)
at hudson/ExtensionFinder$GuiceFinder$SezpozModule.onProvision(ExtensionFinder.java:567(Compiled Code))
at com/google/inject/internal/ProvisionListenerStackCallback$Provision.provision(ProvisionListenerStackCallback.java:126)
at com/google/inject/internal/ProvisionListenerStackCallback.provision(Provision 

#######################

 

CpsStepContext.isReady [#6]

priority:0x5 - threadId:0x00007F9C6802DE40 - nativeId:0x5B72 - nativeId (decimal):23410 - state:BLOCKED
stackTrace:
at org/jenkinsci/plugins/workflow/cps/CpsGroovyShell.doParse(CpsGroovyShell.java:133)
at org/jenkinsci/plugins/workflow/cps/CpsGroovyShell.reparse(CpsGroovyShell.java:127)
at org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.parseScript(CpsFlowExecution.java:561)
at org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.loadProgramAsync(CpsFlowExecution.java:770)
at org/jenkinsci/plugins/workflow/cps/CpsFlowExecution.onLoad(CpsFlowExecution.java:740)
at org/jenkinsci/plugins/workflow/job/WorkflowRun.getExecution(WorkflowRun.java:672)
at org/jenkinsci/plugins/workflow/job/WorkflowRun.onLoad(WorkflowRun.java:533)
at hudson/model/RunMap.retrieve(RunMap.java:225)
at hudson/model/RunMap.retrieve(RunMap.java:57)
at jenkins/model/lazy/AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:501)
at jenkins/model/lazy/AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:483)
at jenkins/model/lazy/AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:381)
at hudson/model/RunMap.getById(RunMap.java:205)
at org/jenkinsci/plugins/workflow/job/WorkflowRun$Owner.run(WorkflowRun.java:921)
at org/jenkinsci/plugins/workflow/job/WorkflowRun$Owner.get(WorkflowRun.java:932)
at org/jenkinsci/plugins/workflow/cps/CpsStepContext.getExecution(CpsStepContext.java:217)
at org/jenkinsci/plugins/workflow/cps/CpsStepContext.getThreadGroupSynchronously(CpsStepContext.java:242)
at org/jenkinsci/plugins/workflow/cps/CpsStepContext.access$000(CpsStepContext.java:97)
at org/jenkinsci/plugins/workflow/cps/CpsStepContext$1.call(CpsStepContext.java:263)
at org/jenkinsci/plugins/workflow/cps/CpsStepContext$1.call(CpsStepContext.java:261)
at jenkins/util/ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at java/util/concurrent/FutureTask.run(FutureTask.java:277)
at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153(Compiled Code))
at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java/lang/Thread.run(Thread.java:785)

#############################

Any one ran into the same issue and any help on this is highly appreciated please.

Devin Nusbaum

unread,
Mar 11, 2020, 3:05:47 PM3/11/20
to jenkins...@googlegroups.com
Do you have the full thread dump from a tool like jstack? If so, please attach it to this thread, or paste it somewhere like gist.github.com or pastebin.com and then paste a link to it here.

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/9d1375bf-e7a0-41d5-ae07-cc67acf210e5%40googlegroups.com.

M.Madhusudana Reddy

unread,
Mar 11, 2020, 5:03:40 PM3/11/20
to jenkins...@googlegroups.com, dnus...@cloudbees.com
Hi Devin, please find the thread dump attached.

jfyi, we are using IBM J9. 

Thanks
M.Madhu


javacore.20200311.124924.23218.0001.txt
javacore.20200311.102828.634.0003.txt

Mark Waite

unread,
Mar 12, 2020, 10:14:59 AM3/12/20
to Jenkins Users
FYI IBM openJ9 is not supported by the Jenkins project.  Initial testing has found no showstoppers, but IBM OpenJ9 introduces a very significant difference between what you're running and what is used and tested by people in the Jenkins project.

The Jenkins Platform SIG has started the process of evaluating OpenJ9 with the help of Jim Crowley of IBM.  There are at least 4 open issues related to OpenJ9 and another 15 issues related to J9.  Probably best to confirm that the same problem occurs with AdoptOpenJDK or with OpenJDK.

Mark Waite


On Wednesday, March 11, 2020 at 3:03:40 PM UTC-6, Madhu Muchukota wrote:
Hi Devin, please find the thread dump attached.

jfyi, we are using IBM J9. 

Thanks
M.Madhu


On Wed, Mar 11, 2020 at 2:05 PM Devin Nusbaum wrote:
Do you have the full thread dump from a tool like jstack? If so, please attach it to this thread, or paste it somewhere like gist.github.com or pastebin.com and then paste a link to it here.

M.Madhusudana Reddy

unread,
Mar 12, 2020, 1:05:59 PM3/12/20
to jenkins...@googlegroups.com, mark.ea...@gmail.com
Thanks for the update Mark.

I believe we tried with AdoptOpenJDK and ran into same issue but I will confirm that and let you know. 

Thanks
M.Madhu


--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-use...@googlegroups.com.

Martijn Verburg

unread,
Mar 16, 2020, 4:49:31 AM3/16/20
to jenkins...@googlegroups.com, mark.ea...@gmail.com
Please note that AdoptOpenJDK produces both OpenJDK builds (i.e. HotSpot JVM) and OpenJ9 + OpenJDK class library builds.

Cheers,
Martijn


M.Madhusudana Reddy

unread,
Mar 30, 2020, 1:46:37 PM3/30/20
to jenkins...@googlegroups.com, dnus...@cloudbees.com, mark.ea...@gmail.com
Ok. I am able to reproduce the issue even with HotSpot JAVA of versino 1.8 update 121

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

######################

Thread CpsStepContext.isReady [#4] is in deadlock with thread Finalizing set up

CpsStepContext.isReady [#4]

priority:5 - threadId:0x00007f843c075000 - nativeId:0x4d0f - nativeId (decimal):19727 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at hudson.ExtensionList.ensureLoaded(ExtensionList.java:315)
- waiting to lock <0x00000006c191ac60> (a hudson.ExtensionList$Lock)
at hudson.ExtensionList.iterator(ExtensionList.java:172)
at org.jenkinsci.plugins.workflow.log.LogStorage.of(LogStorage.java:163)
at org.jenkinsci.plugins.workflow.job.WorkflowRun.getListener(WorkflowRun.java:229)
at org.jenkinsci.plugins.workflow.job.WorkflowRun.access$300(WorkflowRun.java:137)
at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.getListener(WorkflowRun.java:974)
at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$2.onSuccess(CpsFlowExecution.java:789)
at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$2.onSuccess(CpsFlowExecution.java:776)
at org.jenkinsci.plugins.workflow.support.concurrent.Futures$1.run(Futures.java:150)
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:253)
at com.google.common.util.concurrent.ExecutionList$RunnableExecutorPair.execute(ExecutionList.java:149)
at com.google.common.util.concurrent.ExecutionList.add(ExecutionList.java:105)
at com.google.common.util.concurrent.AbstractFuture.addListener(AbstractFuture.java:155)
at org.jenkinsci.plugins.workflow.support.concurrent.Futures.addCallback(Futures.java:160)
at org.jenkinsci.plugins.workflow.support.concurrent.Futures.addCallback(Futures.java:90)
at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.loadProgramAsync(CpsFlowExecution.java:773)
at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.onLoad(CpsFlowExecution.java:740)
at org.jenkinsci.plugins.workflow.job.WorkflowRun.getExecution(WorkflowRun.java:672)
- locked <0x00000006c73dcd98> (a org.jenkinsci.plugins.workflow.job.WorkflowRun)
at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:533)
- locked <0x00000006c73dd740> (a java.lang.Object)
at hudson.model.RunMap.retrieve(RunMap.java:225)
at hudson.model.RunMap.retrieve(RunMap.java:57)
at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:501)
at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:483)
at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:381)
- locked <0x00000006c639f178> (a hudson.model.RunMap)
at hudson.model.RunMap.getById(RunMap.java:205)
at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.run(WorkflowRun.java:921)
at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.get(WorkflowRun.java:932)
at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getExecution(CpsStepContext.java:217)
at org.jenkinsci.plugins.workflow.cps.CpsStepContext.getThreadGroupSynchronously(CpsStepContext.java:242)
at org.jenkinsci.plugins.workflow.cps.CpsStepContext.access$000(CpsStepContext.java:97)
at org.jenkinsci.plugins.workflow.cps.CpsStepContext$1.call(CpsStepContext.java:263)
at org.jenkinsci.plugins.workflow.cps.CpsStepContext$1.call(CpsStepContext.java:261)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
<0x00000006c73dbb00> (a java.util.concurrent.ThreadPoolExecutor$Worker)

Finalizing set up

priority:5 - threadId:0x00007f845c00d800 - nativeId:0x4ca8 - nativeId (decimal):19624 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:371)
- waiting to lock <0x00000006c639f178> (a hudson.model.RunMap)
at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:228)
at org.jenkinsci.plugins.workflow.job.WorkflowJob.getBuildByNumber(WorkflowJob.java:232)
at org.jenkinsci.plugins.workflow.job.WorkflowJob.getBuildByNumber(WorkflowJob.java:103)
at hudson.model.Run.fromExternalizableId(Run.java:2446)
at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.runForDisplay(ExecutorStepExecution.java:445)
at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.getDisplayName(ExecutorStepExecution.java:461)
at org.jenkinsci.plugins.jqsmonitoring.buildqueue.QueueJob.<init>(QueueJob.java:50)
at org.jenkinsci.plugins.jqsmonitoring.buildqueue.BuildQueue.retrieveQueueJobs(BuildQueue.java:87)
at org.jenkinsci.plugins.jqsmonitoring.buildqueue.BuildQueue.<init>(BuildQueue.java:36)
at org.jenkinsci.plugins.jqsmonitoring.jqscore.JQSMonitoring.<init>(JQSMonitoring.java:43)
at org.jenkinsci.plugins.jqsmonitoring.jqscore.JQSMonitoring$$FastClassByGuice$$a25d5b0d.newInstance(<generated>)
at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:61)
at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:105)
at com.google.inject.internal.ConstructorInjector.access$000(ConstructorInjector.java:32)
at com.google.inject.internal.ConstructorInjector$1.call(ConstructorInjector.java:89)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision(ProvisionListenerStackCallback.java:115)
at hudson.ExtensionFinder$GuiceFinder$SezpozModule.onProvision(ExtensionFinder.java:567)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision(ProvisionListenerStackCallback.java:126)
at com.google.inject.internal.ProvisionListenerStackCallback.provision(ProvisionListenerStackCallback.java:68)
at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:87)
at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:267)
at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1103)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:145)
at hudson.ExtensionFinder$GuiceFinder$FaultTolerantScope$1.get(ExtensionFinder.java:439)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
at com.google.inject.internal.InjectorImpl$2$1.call(InjectorImpl.java:1016)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1012)
at hudson.ExtensionFinder$GuiceFinder._find(ExtensionFinder.java:401)
at hudson.ExtensionFinder$GuiceFinder.find(ExtensionFinder.java:392)
at hudson.ClassicPluginStrategy.findComponents(ClassicPluginStrategy.java:344)
at hudson.ExtensionList.load(ExtensionList.java:381)
at hudson.ExtensionList.ensureLoaded(ExtensionList.java:317)
- locked <0x00000006c191ac60> (a hudson.ExtensionList$Lock)
at hudson.ExtensionList.iterator(ExtensionList.java:172)
at jenkins.model.Jenkins$16.run(Jenkins.java:3285)
at org.jvnet.hudson.reactor.TaskGraphBuilder$TaskImpl.run(TaskGraphBuilder.java:169)
at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296)
at jenkins.model.Jenkins$5.runTask(Jenkins.java:1122)
at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214)
at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
<0x00000006c1256288> (a java.util.concurrent.ThreadPoolExecutor$Worker)
<0x00000006c26f4b30> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
######################################

I am also attaching the thread dumps that are captured.

Thanks inadvance for your help.

Thanks
M.Madhu


19561.td
19561.td3
19561.td4
19561.td2

M.Madhusudana Reddy

unread,
Mar 30, 2020, 2:15:17 PM3/30/20
to jenkins...@googlegroups.com, dnus...@cloudbees.com, mark.ea...@gmail.com
I dumped the thread dump into one of the thread dump analysis tool and the tool is reporting on Dead Lock.

Thread CpsStepContext.isReady [#4] is in deadlock with thread Finalizing set up  

Thanks
M.Madhu

Mark Waite

unread,
Mar 30, 2020, 2:37:52 PM3/30/20
to M.Madhusudana Reddy, Jenkins Users, dnus...@cloudbees.com
Any chance you can duplicate the same problem on Java 8 update 242 (the current release)?

Java 8 u121 was released in Jan 2017.  Significant changes were made to Java 8 in update 161 (WIkipedia says " 21 Security fixes, 3 new features, 9 changes and 1 bug fix from 8u152.").  Update 231 (Oct 2019) included 18 security fixes.

I don't have any hints to offer on the cause of the thread deadlock or possible resolution.

M.Madhusudana Reddy

unread,
Mar 30, 2020, 2:43:18 PM3/30/20
to Mark Waite, Jenkins Users, dnus...@cloudbees.com
Thanks Mark. I can definitely try that.

But again as I mentioned early, If I kill the process and start the jenkins JVM(with the same JAVA version), I may not see this issue. This is intermittent and I need to keep trying with the restarts to reproduce the issue. 

Thanks
M.Madhu

Mark Waite

unread,
Mar 30, 2020, 3:21:37 PM3/30/20
to M.Madhusudana Reddy, Jenkins Users, dnus...@cloudbees.com
I'm afraid that it is much more difficult for anyone to assist with a problem that can't be duplicated.  

Devin Nusbaum

unread,
Mar 30, 2020, 3:24:21 PM3/30/20
to M.Madhusudana Reddy, Mark Waite, Jenkins Users
From a quick look at the thread dumps, I think that the main problem is that JQSMonitoring (from Job/Queue/Slaves Monitoring Plugin) ends up loading builds in its constructor, which is invoked during Jenkins startup because it is a RootAction, and the because extension loading is gated behind a instance-wide lock, this means that no other extensions can be loaded until JQSMonitoring is done loading builds.

I would try disabling jqs-monitoring to see if the issue goes away. I think the easiest way to fix the issue would be for jqs-monitoring to use an `@Initializer` method to do the work that is currently done inside of JQSMonitoring’s constructor so that it does not need to load builds while the extension-loading lock is held.

Some changes to the internals of Pipeline logging in the last two years or so might have made it so that Pipelines need to load extensions during startup where they did not do that previously, which could be why you are only seeing the problem after the upgrade.

One other thing I noticed from the thread dumps is that you are using Tomcat rather than the version of Jetty bundled in Jenkins, which is know to cause some weird behavior. I would consider using the bundled version of Jetty to run Jenkins instead.

M.Madhusudana Reddy

unread,
Mar 30, 2020, 3:31:09 PM3/30/20
to Devin Nusbaum, Mark Waite, Jenkins Users
Thanks Devin/Mark. 

I will try disabling jqs-monitoring first to see if the issue goes away. 

yes, we are using Tomcat and I will try with Jetty too. 

Thanks
M.Madhu

Reply all
Reply to author
Forward
0 new messages