Getting InterruptedException in pipeline build

Christian McHugh

May 9, 2017, 2:54:32 AM
to Jenkins Developers
Hey all,

As recently reported, users of the saltapi plugin (myself included) are seeing pipeline jobs throw an InterruptedException after about 5 minutes.

The plugin calls an HTTP REST endpoint and polls for the response. The main activity is sending the request, sleeping for a few seconds, then polling again. The InterruptedException is thrown during this sleep. The same code is called from the classic freestyle job type without any issues, so there seems to be something special about pipeline builds, both declarative and scripted.

Does anyone have any thoughts or suggestions?



For anyone attempting to test, install the saltapi plugin and use something like the following pipeline:
import groovy.json.*

stage("run salt") {
    node("agent1") {
        saltresult = salt authtype: 'pam', clientInterface: local(arguments: '"sleep 300; tail -1 /etc/hosts"', blockbuild: true,
            minionTimeout: 32, function: 'cmd.run', jobPollTime: 7, target: 'master', targetType: 'glob'),
            credentialsId: 'b5f40401-01b9-4b27-a5e8-8ae94bc90250', servername: 'http://localhost:8000'

        def prettyJson = JsonOutput.prettyPrint(saltresult)
        println(prettyJson)
    }
}



Christian McHugh

May 10, 2017, 1:54:48 AM
to Jenkins Developers
It appears the workflow-cps-plugin contains a 5 minute timeout: https://issues.jenkins-ci.org/browse/JENKINS-42561

Does this mean that any plugin that supports pipeline and could run a job taking longer than 5 minutes needs to run as a durable task?

Christian McHugh

May 15, 2017, 3:00:52 AM
to Jenkins Developers
So as mentioned, it seems that the standard AbstractSynchronousStepExecution or AbstractStepImpl is interrupted after 5 minutes. What is the recommended way of supporting a long-running pipeline job that is also resumable? Are there any notes on how to set up a durable step or AbstractSynchronousNonBlockingStepExecution, or are there other recommendations?

Jesse Glick

May 15, 2017, 9:47:21 AM
to Jenkins Dev
On Tue, May 9, 2017 at 2:54 AM, Christian McHugh
<christia...@gmail.com> wrote:
> This plugin calls an http rest endpoint and polls for the response. The main
> activity is sending the request, sleeping for a few seconds, then polling
> again. It is in this sleep where we are seeing the InterruptedException.

You must not do this from the CPS VM thread. A correctly written
`StepExecution` will do only near-instantaneous work in that thread
(say, taking one second at the most). All else must be done in a
background thread. See `SleepStep` for a simple example.
(`SynchronousNonBlockingStepExecution` is not appropriate here
either.)
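
For readers following along, here is a minimal sketch of the shape described above: the entry point hands the slow poll loop to a background thread and returns immediately. It uses only the JDK so that it runs on its own; `NonBlockingStartSketch`, `start`, and `pollUntilDone` are hypothetical names for illustration, not Jenkins APIs, and the `CompletableFuture` merely stands in for whatever completion callback the real step would use.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class NonBlockingStartSketch {

    /** Hypothetical stand-in for a long-running poll loop (e.g. HTTP polling). */
    public static String pollUntilDone(int polls, long sleepMillis) throws InterruptedException {
        for (int i = 0; i < polls; i++) {
            Thread.sleep(sleepMillis); // the blocking wait lives on the background thread
        }
        return "done after " + polls + " polls";
    }

    /**
     * Mimics an asynchronous start(): submit the slow work to a background
     * executor and return at once. The future is completed later, from the
     * background thread, with either the result or the error.
     */
    public static CompletableFuture<String> start(ExecutorService background, int polls, long sleepMillis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        background.submit(() -> {
            try {
                result.complete(pollUntilDone(polls, sleepMillis));
            } catch (Throwable t) {
                result.completeExceptionally(t);
            }
        });
        return result; // the caller's thread is not held while polling runs
    }

    public static void main(String[] args) throws Exception {
        ExecutorService background = Executors.newSingleThreadExecutor();
        try {
            long before = System.nanoTime();
            CompletableFuture<String> pending = start(background, 3, 50);
            long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - before);
            System.out.println("start() returned after ~" + elapsed + " ms");
            System.out.println(pending.get(5, TimeUnit.SECONDS));
        } finally {
            background.shutdownNow();
        }
    }
}
```

In a real step the same split would apply: `start()` does only quick setup, and everything that sleeps or waits on the network runs elsewhere.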

Christian McHugh

May 26, 2017, 4:40:13 PM
to Jenkins Developers
Hey Jesse and all,

First off, sorry to keep bugging you, but I'm not sure how to go about getting this plugin going and I'm hoping you have ideas or could point me in the proper direction. 

At this point I'm a bit stuck and not sure how to proceed. I think I must use something like:

getContext().newBodyInvoker().withContext(<run api call here>).withCallback(BodyExecutionCallback.wrap(getContext())).start();

But I am now stuck on how to set up the finished(), start(), stop(), or onResume() methods. Basically, I think I understand how to start my communication with the saltapi, but I don't understand how the callback process works, that is, how I notify Jenkins when the long-running job polling returns something.

The original code, which works but gets killed after 5 minutes, is available on the master branch:

While the non-functional code I've been messing with is in the async-pipeline branch. 

If you can provide any help, or direct me to someone who can, it would be greatly appreciated.

Jesse Glick

May 30, 2017, 9:40:36 AM
to Jenkins Dev
You may not use `newBodyInvoker` on a step that does not take a body
(~ closure argument).

Did you look at `SleepStep` for an example? Or `InputStep`?

Christian McHugh

Jun 3, 2017, 9:06:42 AM
to Jenkins Developers
On Tuesday, May 30, 2017 at 2:40:36 PM UTC+1, Jesse Glick wrote:
> You may not use `newBodyInvoker` on a step that does not take a body
> (~ closure argument).

Understood.

> Did you look at `SleepStep` for an example? Or `InputStep`?

Yes. The difficulty I'm running into is that there are multiple possible http calls, only one of which is long running and requires subsequent polling for status. Thus, my long running process is sort of monolithic and different from something like SleepStep that can initiate the command in the start() and later check in at onResume(). In my case I'd really like to kick off the http call in start(), but just don't want it killed off/hogging the start() thread.

In looking around, I came across a way of doing just that by running the http process in its own thread, and worked up the following version:

This seems to work as intended, but now I have a problem where previously I was using AbstractSynchronousStepExecution and the run() method returned a string. Now with the new thread process, I don't think I can return a String. 

Any ideas for being able to return a String from AbstractStepExecutionImpl?

Jesse Glick

Jun 3, 2017, 2:42:51 PM
to Jenkins Dev
`StepContext` has methods to let you return a value or throw an error.

Jesse Glick

Jun 3, 2017, 2:43:47 PM
to Jenkins Dev
BTW update your `workflow-step-api` dependency and avoid deprecated APIs.

Christian McHugh

Jun 3, 2017, 3:08:02 PM
to Jenkins Developers
On Saturday, June 3, 2017 at 7:42:51 PM UTC+1, Jesse Glick wrote:
> `StepContext` has methods to let you return a value or throw an error.

Do you mean the StepContextParameter to get to the run, workspace, listener, and launcher? I did notice it wasn't working with the 2.3 version that I upped to. Do you know what it should be, or is it best practice to depend on the latest since it's a plugin?

Christian McHugh

Jun 4, 2017, 11:33:34 AM
to Jenkins Developers
On Saturday, June 3, 2017 at 7:42:51 PM UTC+1, Jesse Glick wrote:
> `StepContext` has methods to let you return a value or throw an error.

If this is about returning a String, do you have any more details? I've looked through the jenkinsci code without finding an example, and the StepContext doc doesn't seem to show it.

Jesse Glick

Jun 5, 2017, 8:23:45 AM
to Jenkins Dev
On Sat, Jun 3, 2017 at 3:08 PM, Christian McHugh
<christia...@gmail.com> wrote:
> Do you mean the StepContextParameter

No, what I said:
http://javadoc.jenkins.io/plugin/workflow-step-api/org/jenkinsci/plugins/workflow/steps/StepContext.html#onSuccess-java.lang.Object-

> I did notice it wasn't working with the 2.3 version that I
> upped to.

I am not aware of any such bug.

> is it best practice to depend on
> the latest since it's a plugin?

Yes depend on the latest (compatible with your preferred Jenkins core
version at least).
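
To make the `onSuccess` pointer above concrete, here is a minimal sketch of a background thread reporting its outcome through a callback. The `Completion` interface is a hypothetical stand-in that only mirrors the `onSuccess(Object)` and `onFailure(Throwable)` methods of `StepContext` so the example runs without Jenkins on the classpath; `callRun`, `startInBackground`, and `Recording` are likewise illustrative names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class CallbackResultSketch {

    /** Stand-in for the two completion methods of StepContext (not the real type). */
    public interface Completion {
        void onSuccess(Object result);
        void onFailure(Throwable error);
    }

    /** Hypothetical long-running call; throws when asked to fail. */
    static String callRun(boolean fail) throws Exception {
        Thread.sleep(20);
        if (fail) {
            throw new IllegalStateException("poll failed");
        }
        return "{\"return\": \"ok\"}";
    }

    /** Runs the work on its own thread and reports the outcome via the callback. */
    public static Thread startInBackground(Completion context, boolean fail) {
        Thread worker = new Thread(() -> {
            String value;
            try {
                value = callRun(fail);
            } catch (Exception e) {
                context.onFailure(e); // surfaces the error to the caller
                return;
            }
            context.onSuccess(value); // this is how the String gets "returned"
        }, "saltAPI");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    /** Demo callback that records what it was told and lets the caller wait for it. */
    public static final class Recording implements Completion {
        public final AtomicReference<Object> result = new AtomicReference<>();
        public final AtomicReference<Throwable> error = new AtomicReference<>();
        private final CountDownLatch done = new CountDownLatch(1);

        @Override public void onSuccess(Object r) { result.set(r); done.countDown(); }
        @Override public void onFailure(Throwable t) { error.set(t); done.countDown(); }

        public boolean await() throws InterruptedException {
            return done.await(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        Recording ok = new Recording();
        startInBackground(ok, false);
        ok.await();
        System.out.println("success: " + ok.result.get());

        Recording bad = new Recording();
        startInBackground(bad, true);
        bad.await();
        System.out.println("failure: " + bad.error.get().getMessage());
    }
}
```

The point of the sketch is that the value travels through the callback rather than through a method return, so the method that launched the thread can finish long before the result exists.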

Christian McHugh

Jun 7, 2017, 2:08:18 AM
to Jenkins Developers
Works great!
 
>> I did notice it wasn't working with the 2.3 version that I
>> upped to.
>
> I am not aware of any such bug.

This was from my confusion of StepContextParameter with StepContext. In reading about Step operations, it looks like you should be able to add something like the following to the Step class:

    @StepContextParameter
    private transient Run<?, ?> run;

instead of the old approach of adding the following to each method:

    Run<?, ?> run = getContext().get(Run.class);

When I do this, the @StepContextParameter fields seem to be null. So I was wondering if there was a minimum version of the workflow-step-api needed to make it work.


This all seems to be going well, but there's one last issue where I could really use advice. Currently, in start() I'm creating a new thread and running the http poll process inside it. This works fine: start() can finish with return false, and the thread can later finish with getContext().onSuccess(results);

new Thread("saltAPI") {
    @Override
    public void run() {
        try {
            callRun(token, saltFunc, netapi);

The problem comes in how to handle the onResume(). Since the reference to the thread has been lost, I'm not able to reattach to it if the Jenkins master restarts. Is there a Jenkins'y way of somehow launching a subprocess or something in start() that can be resumed in the onResume()? 
Otherwise, I can write a file to the workspace containing the job ID being polled, and the onResume() can kick off a new poll process for that ID. But that could leave multiple polling threads running, which seems suboptimal.


Thank you very much!

Jesse Glick

Jun 8, 2017, 11:43:19 AM
to Jenkins Dev
On Wed, Jun 7, 2017 at 2:08 AM, Christian McHugh
<christia...@gmail.com> wrote:
> In
> reading about Step operations, it looks like you should be able to add
> something like the following to the Step class.
>
> @StepContextParameter

This is deprecated. Use `StepContext.get` for new code.

> The problem comes in how to handle the onResume(). Since the reference to
> the thread has been lost, I'm not able to reattach to it if the Jenkins
> master restarts.

Reattach? That makes no sense—there will be a new JVM (generally).
Rather you need to start a new background thread (or use `Timer`), as
in `SleepStep` for example.

> I can write a file to the workspace containing the job ID being
> polled

`StepExecution` may have non-`transient` fields which will be restored.
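
As a rough illustration of the last two points, the sketch below keeps the polled job ID in a non-`transient` field and the thread handle in a `transient` one, then starts a fresh background task on resume instead of trying to reattach. It is an assumption-laden stand-in: plain Java serialization is used only to imitate state being saved and restored, and `ResumableExecutionSketch`, `poll`, and `saveAndRestore` are hypothetical names rather than Jenkins APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ResumableExecutionSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Persisted: a restored instance still knows which job to poll. */
    private final String jobId;

    /** Not persisted: a thread handle cannot outlive the JVM, so it is null after restore. */
    private transient Future<String> task;

    public ResumableExecutionSketch(String jobId) {
        this.jobId = jobId;
    }

    /** Hypothetical poll of a remote job, reduced to a short sleep. */
    static String poll(String jobId) throws InterruptedException {
        Thread.sleep(20);
        return "result for job " + jobId;
    }

    /** Launches the polling task in the background. */
    public void start(ExecutorService pool) {
        final String id = jobId;
        Callable<String> work = () -> poll(id);
        task = pool.submit(work);
    }

    /** After a restart there is nothing to reattach to; start a new task for the saved job ID. */
    public void onResume(ExecutorService pool) {
        start(pool);
    }

    public boolean hasRunningTask() {
        return task != null;
    }

    public String await() throws Exception {
        return task.get(5, TimeUnit.SECONDS);
    }

    /** Round-trips the object through serialization to imitate save and restore. */
    public static ResumableExecutionSketch saveAndRestore(ResumableExecutionSketch e)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(e);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ResumableExecutionSketch) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            ResumableExecutionSketch original = new ResumableExecutionSketch("example-job-id");
            original.start(pool);
            ResumableExecutionSketch restored = saveAndRestore(original);
            System.out.println("task survived restore? " + restored.hasRunningTask());
            restored.onResume(pool);
            System.out.println(restored.await());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Because the job ID travels with the object, there is no need to write it to a file in the workspace, and only the restored instance starts a poller in the new JVM.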