[JIRA] (JENKINS-52966) Two sequential stages in a parallel stage in a declarative pipeline making use of the same agent can cause a StackOverflowError


luke.ross@clearswift.com (JIRA)

Aug 9, 2018, 4:31:02 PM
to jenkinsc...@googlegroups.com
Luke Ross updated an issue
 
Jenkins / Bug JENKINS-52966
Two sequential stages in a parallel stage in a declarative pipeline making use of the same agent can cause a StackOverflowError
Change By: Luke Ross
Summary: Two sequential stages in a parallel stage in a declarative pipeline making use of the same agent can cause a StackOverflowError
This message was sent by Atlassian JIRA (v7.10.1#710002-sha1:6efc396)

luke.ross@clearswift.com (JIRA)

Aug 9, 2018, 4:39:01 PM
to jenkinsc...@googlegroups.com
Luke Ross updated an issue
When running the pipeline below, everything completes as expected if two agents with the RH7 label are available. However, if only one agent with the RH7 label is available, either because the other agents are busy or because they are offline, the job fails with the error message below. Every machine with the RH7 label has a single executor.

Additionally whilst the job is marked as failed, the agent that tried to run the job still shows it as running. I don't know if it would eventually time out but after a few minutes it still shows the job "running" in the executor window. Cancelling the job in the executor window returns the agent to a usable state.

Pipeline:
{code:java}
#!groovy

pipeline {
    agent none

    stages {
        stage ("p") {
            parallel {
                stage ("p1") {
                    agent { label "RH7" }

                    stages {
                        stage ("p1s1") {
                            steps {
                                echo "Hello in p1s1"
                            }
                        }

                        stage ("p1s2") {
                            steps {
                                echo "Hello in p1s2"
                            }
                        }
                    }
                }

                stage ("p2") {
                    agent { label "RH7" }

                    stages {
                        stage ("p2s1") {
                            steps {
                                echo "Hello in p2s1"
                            }
                        }
                    }
                }
            }
        }
    }
}
{code}

Error Message:
{quote}Running in Durability level: MAX_SURVIVABILITY
[Pipeline] stage
[Pipeline] { (p)
[Pipeline] parallel
[Pipeline] [p1] { (Branch: p1)
[Pipeline] [p2] { (Branch: p2)
[Pipeline] [p1] stage
[Pipeline] [p1] { (p1)
[Pipeline] [p2] stage
[Pipeline] [p2] { (p2)
[Pipeline] [p1] node
[p1] Running on Red Hat 7 - 2 in /jenkins/workspace/Problem@2
[Pipeline] [p2] node
[Pipeline] [p1] {
[Pipeline] [p1] stage
[Pipeline] [p1] { (p1s1)
[Pipeline] [p1] echo
[p1] Hello in p1s1
[Pipeline] [p1] }
[Pipeline] [p1] // stage
[Pipeline] [p1] stage
[Pipeline] [p1] { (p1s2)
[Pipeline] End of Pipeline
java.lang.StackOverflowError
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:111)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:988)
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:854)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1032)
TRUNCATED SEE ATTACHED LOG

{quote}

Also within the system log there is the following additional error:
{quote}Aug 09, 2018 9:11:10 PM WARNING org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService reportProblem
Unexpected exception in CPS VM thread: CpsFlowExecution[Owner[Problem/39:Problem #39]]
java.lang.IllegalStateException: JENKINS-50407: no loaded shell in CpsFlowExecution[Owner[Problem/39:Problem #39]]
at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:52)
at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:174)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:332)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:83)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:244)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:232)
at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
at java.util.concurrent.FutureTask.run(Unknown Source)
at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
{quote}

If there is any additional information you require please let me know.

andrew.bayer@gmail.com (JIRA)

Aug 10, 2018, 10:01:02 AM
to jenkinsc...@googlegroups.com
Andrew Bayer commented on Bug JENKINS-52966
 
Re: Two sequential stages in a parallel stage in a declarative pipeline making use of the same agent can cause a StackOverflowError

Huh - does this happen consistently? I can't get it to reproduce so far. The underlying issue is, obviously, something in the serialization that gets self-referential. My rough guess is that the second error is a side effect of the stack overflow, but I can't be sure.

andrew.bayer@gmail.com (JIRA)

Aug 10, 2018, 10:03:02 AM
to jenkinsc...@googlegroups.com

Oh, and does it always fail at the same point, i.e., right after [Pipeline] [p1] { (p1s2)?

luke.ross@clearswift.com (JIRA)

Aug 13, 2018, 3:43:02 AM
to jenkinsc...@googlegroups.com

Yeah it fails every time and fails at the same point every time.

I just ran 10 more instances of the pipeline to confirm this.

l.heche@hotmail.fr (JIRA)

Jan 15, 2019, 8:13:01 AM
to jenkinsc...@googlegroups.com

I had the same problem, and we fixed it by giving each JVM thread a larger stack. You can do that by editing the jenkins.xml file and adding the -Xss parameter with the stack size you want to allow for each thread.

stuartjsmith@gmail.com (JIRA)

Jan 16, 2019, 9:16:02 AM
to jenkinsc...@googlegroups.com

Louis - I don't suppose you had just updated any plugins before running into this so recently, did you? We did quite a major plugin upgrade over the weekend (12th Jan), including updating many pipeline plugins to the latest version, and this has been happening since then. We are trying to track down the plugin that caused this. The pipeline plugins updated to the latest version were:

  • pipeline model definition
  • pipeline stage tags metadata
  • pipeline model extensions
  • pipeline model api

There were around 25 other updates that we deployed but these are the ones we are initially thinking may have introduced something

l.heche@hotmail.fr (JIRA)

Jan 17, 2019, 2:50:05 AM
to jenkinsc...@googlegroups.com

Exactly, we hadn't done any updates; this problem appeared when we added some steps to our pipeline.

We do have the following plugins installed:

  • pipeline model api 1.3.4
  • pipeline stage tags metadata 1.3.4

However, we don't have the pipeline model definition or pipeline model extensions plugins installed.

Thomas.Hutchins@microfocus.com (JIRA)

Feb 5, 2019, 10:36:03 AM
to jenkinsc...@googlegroups.com

I've hit the same issue with the latest version of Jenkins and all plugins. It looks like a serialization issue when running parallel pipelines. Has this been investigated further?

Thomas.Hutchins@microfocus.com (JIRA)

Feb 5, 2019, 10:42:05 AM
to jenkinsc...@googlegroups.com
Thomas Hutchins started work on Bug JENKINS-52966
 
Change By: Thomas Hutchins
Status: Open → In Progress

Thomas.Hutchins@microfocus.com (JIRA)

Feb 5, 2019, 10:42:06 AM
to jenkinsc...@googlegroups.com
Thomas Hutchins stopped work on Bug JENKINS-52966
 
Change By: Thomas Hutchins
Status: In Progress → Open

Thomas.Hutchins@microfocus.com (JIRA)

Feb 7, 2019, 3:41:03 AM
to jenkinsc...@googlegroups.com

I've found a temporary workaround: set Manage Jenkins -> Configure System -> Pipeline Speed/Durability Level to "Performance-optimized". This setting "Avoids writing data with every step, avoids atomic writes of data. Pipelines can resume if Jenkins shuts down cleanly, but running pipelines lose step information and cannot resume."

You can also set this option on an individual pipeline, which keeps the default durability for any non-parallel pipelines you have running.

This has gotten things working for me so far. Hopefully this is fixed soon.
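
For the per-pipeline variant of this workaround, a minimal sketch of a declarative Jenkinsfile is shown below. It assumes the installed Declarative Pipeline version supports the `durabilityHint` option; the stage names and the RH7 label are only borrowed from the reproduction pipeline in the issue description for illustration.

```groovy
pipeline {
    agent none

    options {
        // Per-job override of the global Pipeline Speed/Durability Level.
        // Assumption: this Declarative Pipeline version provides durabilityHint.
        durabilityHint('PERFORMANCE_OPTIMIZED')
    }

    stages {
        stage('p') {
            parallel {
                stage('p1') {
                    agent { label 'RH7' }
                    steps {
                        echo 'Hello in p1'
                    }
                }
                stage('p2') {
                    agent { label 'RH7' }
                    steps {
                        echo 'Hello in p2'
                    }
                }
            }
        }
    }
}
```

Only jobs carrying this option trade durability for speed; other jobs keep whatever level is configured globally.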

dnusbaum@cloudbees.com (JIRA)

Feb 7, 2019, 12:08:02 PM
to jenkinsc...@googlegroups.com

Does anyone have a minimal and self-contained reproduction case? It is possible that this is a problem in the 2.x version of the JBoss marshalling library used by Pipeline (hence why everyone is seeing it after upgrading to workflow-support 3.x), or perhaps a subtle change in some Pipeline-related plugin has caused it to create cyclic data structures that are not being handled correctly during serialization?

dnusbaum@cloudbees.com (JIRA)

Feb 7, 2019, 12:09:05 PM
to jenkinsc...@googlegroups.com
Devin Nusbaum edited a comment on Bug JENKINS-52966
Does anyone have a minimal and self-contained reproduction case? It is possible that this is a problem in the 2.x version of the JBoss marshalling library used by Pipeline (hence why more users are seeing it after upgrading to workflow-support 3.x), or perhaps a subtle change in some Pipeline-related plugin has caused it to create cyclic data structures that are not being handled correctly during serialization?

dnusbaum@cloudbees.com (JIRA)

Feb 7, 2019, 12:24:09 PM
to jenkinsc...@googlegroups.com

See also https://github.com/jenkinsci/ansicolor-plugin/issues/148, where a user reported that removing `ansicolor` caused the problem to go away. I don't think ansicolor is really related to the issue, but if subtle changes to the Pipeline make a difference in whether a StackOverflowError occurs, then perhaps this is not infinite recursion, and JBoss Marshalling 2.x has just increased the number of recursive calls that are made under normal circumstances. Hard to say for sure without a good reproduction.
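
To illustrate the "deep but not infinite recursion" hypothesis, here is a rough Groovy sketch that can be run outside Jenkins. It uses `java.io.ObjectOutputStream` as a stand-in for JBoss Marshalling's `RiverMarshaller` (an assumption made purely for illustration; the two are different serializers), and `ChainLink`, `buildChain`, and `serializeWithStack` are hypothetical names. It serializes an acyclic chain of objects on threads with different requested stack sizes. On JVMs that honor the requested size, the small-stack run would be expected to overflow while the large-stack run completes, although the exact thresholds depend on the JVM and platform.

```groovy
// Hypothetical demo class: one link in a singly linked chain with no cycles.
class ChainLink implements Serializable {
    ChainLink next
}

// Build an acyclic chain of the given length iteratively (no recursion here).
ChainLink buildChain(int length) {
    ChainLink head = null
    for (int i = 0; i < length; i++) {
        head = new ChainLink(next: head)
    }
    return head
}

// Serialize the chain on a dedicated thread with a requested stack size.
// Note: the JVM treats the stack size as a hint and may round or ignore it.
String serializeWithStack(ChainLink head, long stackBytes) {
    String outcome = 'not run'
    Thread worker = new Thread(null, {
        try {
            // Default Java serialization recurses once per link in the chain.
            new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(head)
            outcome = 'ok'
        } catch (StackOverflowError e) {
            outcome = 'StackOverflowError'
        }
    } as Runnable, 'serializer', stackBytes)
    worker.start()
    worker.join()
    return outcome
}

def chain = buildChain(100_000)
println "small stack: ${serializeWithStack(chain, 256L * 1024)}"
println "large stack: ${serializeWithStack(chain, 256L * 1024 * 1024)}"
```

If the data really were cyclic and mishandled, no stack size would be enough; if it is merely deep, a larger stack lets serialization finish, which would be consistent with the -Xss workaround reported earlier in this thread.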

taitanwl@qq.com (JIRA)

Feb 8, 2019, 3:17:02 AM
to jenkinsc...@googlegroups.com

Devin Nusbaum, here is the code for my loop section; you can try it yourself:

{code:groovy}
def call(path) {
    def changeLogSets = currentBuild.changeSets
    def canNext = false
    // No change sets recorded for this build: report a match.
    if (changeLogSets.size() <= 0) {
        return true
    }
    // Walk every change set, every commit in it, and every file the commit touched.
    for (int i = 0; i < changeLogSets.size(); i++) {
        def entries = changeLogSets[i].items
        for (int j = 0; j < entries.length; j++) {
            def entry = entries[j]
            echo "${entry.commitId} by ${entry.author} on ${new Date(entry.timestamp)}: ${entry.msg}"
            def files = new ArrayList(entry.affectedFiles)
            for (int k = 0; k < files.size(); k++) {
                def file = files[k]
                if (file.path.contains(path)) {
                    echo "File matched ===> ${path}"
                    canNext = true
                    break
                }
            }
            // Stop scanning this change set once a matching file was found.
            if (canNext) {
                break
            }
        }
    }

    return canNext
}
{code}

taitanwl@qq.com (JIRA)

Feb 8, 2019, 3:22:02 AM
to jenkinsc...@googlegroups.com

I didn't expect the upgrade to have any effect, but in fact the StackOverflowError occurs as soon as I upgrade the Pipeline-related plugins.

h.kayser@pilz.de (JIRA)

May 6, 2019, 11:54:04 AM
to jenkinsc...@googlegroups.com

We face the same issue. It causes builds to fail, the executor is stuck afterwards, but the stage seems to be null (displays only "part" instead of the current stage name).

 

Pipeline console output is
java.lang.StackOverflowError
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:114)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1082)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:1040)
[...]
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:1019)
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:920)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1082)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:1040)

 

The Jenkins error log contains the following:

Mai 03, 2019 7:05:54 PM org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService reportProblem
WARNUNG: Unexpected exception in CPS VM thread: CpsFlowExecution[Owner[<JOB_NAME>/<BUILD>]]
java.lang.StackOverflowError

at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:114)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1082)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:1040)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:1019)
[...]

I suspected the HTTP Request Plugin of being responsible for the failure and removed it from the Pipeline code. The removal reduced the occurrence, but did not eliminate it completely.

Any idea how to proceed with this issue?

taitanwl@qq.com (JIRA)

May 15, 2019, 9:45:02 PM
to jenkinsc...@googlegroups.com

Maybe it has something to do with host memory. I'm using a Raspberry Pi with 1 GB of RAM; after switching to a PC with 4 GB, the problem no longer occurs.

cfrolik@gmail.com (JIRA)

Jun 14, 2019, 11:27:03 AM
to jenkinsc...@googlegroups.com

I'm not sure why this issue is marked as Minor. It has a pretty dramatic impact on us, and a fix would be much appreciated.

bastianbrodbeck@googlemail.com (JIRA)

Jul 5, 2019, 9:11:03 AM
to jenkinsc...@googlegroups.com
Bastian Brodbeck commented on Bug JENKINS-52966
 
Re: Two sequential stages in a parallel stage in a declarative pipeline making use of the same agent can cause a StackOverflowError

I also have this problem with our declarative pipeline, but we do not use parallel steps.

We run Jenkins 2.176.1, but I had it with 2.150.1 as well.

I think it started from one day to the next a few days back. I cannot remember having made any change to Jenkins or the pipeline that caused this to happen.

I thought updating Jenkins and all installed plugins might fix the issue, but it did not.

The things I noticed:

  • it happens in 9 out of 10 builds
  • when it happens, it is consistently at the same step
  • when changing the pipeline to mitigate the issue, it happens at another step (and stays there); I initially thought it might be related to credentials (withCredentials, withAWS)
  • when I reduce the size of the pipeline it does not happen, but then of course we cannot use the result
  • so far it has only happened on my Windows machine, never during a Mac build
  • System Log: 
      • Unexpected exception in CPS VM thread: CpsFlowExecution
      • java.lang.IllegalStateException: JENKINS-50407: no loaded shell in CpsFlowExecution

fayf86+jenkins@gmail.com (JIRA)

Jul 16, 2019, 2:17:03 AM
to jenkinsc...@googlegroups.com

I encountered this issue too. I am using Jenkins on Windows, which was installed using the installer. After some digging, I realized that this distribution of Jenkins comes packaged with a 32-bit version of the JRE, which is used by the service definition (see jenkins.xml). This severely limits the amount of heap memory the JVM can allocate. If you're facing this issue in the same situation, modify jenkins.xml to use a different, 64-bit version of the JRE and also increase the max heap allocation (e.g. -Xmx1024m).

fayf86+jenkins@gmail.com (JIRA)

Jul 16, 2019, 2:21:03 AM
to jenkinsc...@googlegroups.com
Andrew Ching edited a comment on Bug JENKINS-52966
I encountered this issue too. I am using Jenkins on Windows, which was installed using the installer. After some digging, I realized that this distribution of Jenkins comes packaged with a 32-bit version of the JRE, and it is used by the Windows Service (which uses the jenkins.xml file). This severely limits the amount of heap memory the JVM can allocate. If you're facing this issue in the same situation, modify jenkins.xml to use a different, 64-bit version of the JRE and also increase the max heap allocation (e.g. -Xmx1024m).

arnaud.richard@st.com (JIRA)

Sep 11, 2019, 8:57:04 AM
to jenkinsc...@googlegroups.com

+1 for Andrew Ching's status and solution.

I initially used the Windows installer to avoid Java version and configuration issues, but they came back anyway!

lsnodak@gmail.com (JIRA)

Dec 23, 2019, 8:49:03 AM
to jenkinsc...@googlegroups.com
Lance Swoboda commented on Bug JENKINS-52966

Re: Two sequential stages in a parallel stage in a declarative pipeline making use of the same agent can cause a StackOverflowError

I was finally able to overcome this problem by taking these steps:

1. Update the jenkins.xml file to use a 64-bit JRE.
2. Increase the max heap size to 1024m for the JRE (in the jenkins.xml file).
3. Increase the stack size to 4m in jenkins.xml as well (-Xss4m).

Hope this is helpful for someone else.
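
After changing jenkins.xml along these lines, it can be worth confirming that the running JVM actually picked up the settings. The following is a small sketch for the Jenkins Script Console (it also runs as plain Groovy); it relies only on standard JVM APIs, and the `sun.arch.data.model` property is an assumption that holds for HotSpot-based JVMs and may be absent on others.

```groovy
import java.lang.management.ManagementFactory

// Arguments the running JVM was actually started with (e.g. -Xss4m, -Xmx1024m).
def jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments()

// "32" or "64" on HotSpot-based JVMs; may be null on other JVMs.
def dataModel = System.getProperty('sun.arch.data.model')

// Maximum heap the JVM will attempt to use, in MiB.
def maxHeapMiB = Runtime.getRuntime().maxMemory().intdiv(1024 * 1024)

println "Data model: ${dataModel ?: 'unknown'}-bit"
println "Max heap:   ${maxHeapMiB} MiB"
println "Stack flag: ${jvmArgs.find { it.startsWith('-Xss') } ?: 'not set (JVM default)'}"
println "Heap flag:  ${jvmArgs.find { it.startsWith('-Xmx') } ?: 'not set (JVM default)'}"
```

If the output still reports a 32-bit data model or no -Xss flag, the service is probably still starting with the old JRE or arguments.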
