workflow plugin: canceling running jobs


Craig Silverstein

Dec 11, 2014, 2:04:20 PM
to jenkin...@googlegroups.com
(I apologize in advance if I get terminology wrong; I'm not fully versed in the jenkins world yet.)

At Khan Academy, we're very excited by the potential of the workflow plugin to improve our deploy process.  But I have a couple of questions that I didn't see addressed in the (excellent) tutorial:

1) Is it possible to run an existing jenkins job from a workflow groovy script?  I saw mention elsewhere of a 'build' step, but don't see it mentioned in the tutorial.

2) Is it possible to cancel a running task (either a jenkins job or an ad-hoc job created by a workflow node step) from within a workflow groovy script?

Here is what we are trying to do: our deploy process currently builds deploy artifacts and runs tests at the same time.  I see how we could implement that using workflow via the 'parallel' step.  However, we also want the behavior that if and when build-artifacts fails, it immediately cancels the running tests and fails the deploy job; and likewise if the tests fail, it immediately cancels the building of artifacts and fails the deploy job.  If both succeed, then the deploy job continues.

Pseudocode in a unix-y environment would be something like this:
```
p1 = spawnJobAsync("build-artifacts")
p2 = spawnJobAsync("run-tests")
// poll() returns True when a process has finished.
while (!p1.poll() || !p2.poll()) {
    if (p1.poll() && p1.wait() != SUCCESS) {
        p2.kill()
        return FAIL
    }
    if (p2.poll() && p2.wait() != SUCCESS) {
        p1.kill()
        return FAIL
    }
    sleep(1)         // check once a second
}
// Both have finished; make sure neither failed during the final interval.
if (p1.wait() != SUCCESS || p2.wait() != SUCCESS) {
    return FAIL
}

// ... continue to deploy ...
```

Is such a thing possible using the workflow plugin?  If not, is something like this contemplated? -- in particular, the ability to start a job in the background and poll it periodically to see how it's doing.
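[Editor's note: the `parallel` step later gained a `failFast` option that provides exactly this cancel-the-siblings behavior, so the polling loop above is not needed. A sketch, assuming that option is available in your plugin version and that `build-artifacts` and `run-tests` are existing job names:]

```groovy
// 'failFast: true' aborts the surviving branch as soon as one branch fails,
// and the whole 'parallel' step then fails the flow.
parallel(
    'build-artifacts': { build 'build-artifacts' },
    'run-tests':       { build 'run-tests' },
    failFast: true
)
// Reaching this point means both branches succeeded; continue to deploy.
```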

Thanks,
craig

Jesse Glick

Dec 11, 2014, 4:29:06 PM
to Jenkins Dev
On Thu, Dec 11, 2014 at 2:04 PM, Craig Silverstein
<csil...@khanacademy.org> wrote:
> Is it possible to run an existing jenkins job from a workflow groovy script? I saw mention elsewhere of a 'build' step

That is it.

> don't see it mentioned in the tutorial.

Indeed not all features are mentioned in the tutorial, which is why it
advises you to use the Snippet Generator to explore further.

(Whether this step is important for a broad enough class of users to
merit direct mention is another question. Newly written flows can
probably do without it, but it can be useful when migrating existing
setups.)
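[Editor's note: for readers unfamiliar with the step, a minimal use from a flow script looks roughly like this, where the job name is a placeholder for an existing freestyle job:]

```groovy
// Run an existing Jenkins job by name and wait for it to complete;
// the flow fails if the downstream job fails.
build 'build-artifacts'
```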

> we also want the
> behavior that if and when build-artifacts fails, it immediately cancels the
> running tests and fails the deploy job; and likewise if the tests fail, it
> immediately cancels the building of artifacts and fails the deploy job. If
> both succeed, then the deploy job continues.

I do not think flows should be writing such code directly. Probably
this should just be an option for the parallel step (or even the
standard behavior). Feel free to file it as an RFE.

> the ability to start a job in the background and poll it periodically to see how it's doing.

Do you have other use cases for such a feature, besides just making
parallel fail early?

Craig Silverstein

Dec 11, 2014, 4:45:29 PM
to jenkin...@googlegroups.com
On Thursday, December 11, 2014 1:29:06 PM UTC-8, Jesse Glick wrote:
> (Whether this step is important for a broad enough class of users to
> merit direct mention is another question. Newly written flows can

Right -- I guess I was just surprised not to see it mentioned, since it seemed like a pretty important feature.  But maybe that's because I was thinking in terms of migration.

I don't understand what is exposed to flows enough to know if newly written flows would never need to use 'old-style' jenkins jobs.  For instance, our build job uses the ec2 plugin to dynamically manage ec2 machines running the build slaves; I wouldn't know how to get the same functionality via groovy, if we were trying to re-implement things from scratch.


> I do not think flows should be writing such code directly. Probably
> this should just be an option for the parallel step (or even the
> standard behavior). Feel free to file it as an RFE.

Just to be clear: by "this" do you mean: canceling sibling jobs when a job inside 'parallel' fails?

I'm glad to file an RFE.  Where is the right place to file it?

Another thing I was unclear on: does 'parallel' have a return value?  (Do steps have a return value in general?)  If I had 'parallel [ nodeA: ..., nodeB: ...]' and nodeA failed, would I have some way of telling that things failed?


> Do you have other use cases for such a feature, besides just making
> parallel fail early?

I can think of others -- for instance, implementing a more sophisticated timeout functionality.  Say I wanted to run jobA and jobB in parallel, but if one job took more than 10 minutes longer than the other job, I'd cancel the whole thing.  This is pretty contrived though.

My instinct is that having access to a 'job' object that lets you query what's going on with a running job would be a useful thing to have.  But for now -- and I say this not having tried to re-implement our deploy process yet -- the parallel-fail case is the only one I think we really need.

Thanks for the quick reply!

craig

Jesse Glick

Dec 11, 2014, 6:14:00 PM
to Jenkins Dev
On Thu, Dec 11, 2014 at 4:45 PM, Craig Silverstein
<csil...@khanacademy.org> wrote:
> I guess I was just surprised not to see [the ‘build’ step] mentioned, since it seemed like a pretty important feature.

Could be. Will consider adding a section on it. Anyway there are
several important RFEs to implement on this step.

> our build job uses the ec2 plugin to dynamically manage ec2 machines running the
> build slaves; I wouldn't know how to get the same functionality via groovy

You mean a build step that does something to EC2? This plugin would
just need to implement the newish Jenkins core API
SimpleBuildStep—typically, though not always, a simple refactoring—and
then the generic ‘step’ step could run it.

If you mean you just want to run on an EC2 slave, that is covered by ‘node’.

> by "this" do you mean: canceling sibling [branches] when a [branch] inside 'parallel' fails?

Yes.

> I'm glad to file an RFE. Where is the right place to file it?

JIRA, component ‘workflow-plugin’.

> does 'parallel' have a return value?

It does not seem to be documented (and perhaps not tested) but it does
seem to return a map of the return values of its branches. File an
issue to make sure this is documented and tested (and works).

> Do steps have a return value in general?

Sure. Some like ‘readFile’ have no other purpose. Others like ‘tool’
do something but really need the return value to be useful. Others
return a value which is not commonly used.
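[Editor's note: two examples of steps used for their return values; the file path and tool installation name are placeholders for whatever your job configures:]

```groovy
// 'readFile' exists purely for its return value: the file's contents.
def version = readFile 'VERSION'
echo "Building version ${version}"

// 'tool' does something (installs the configured tool if needed), but its
// return value, the installation directory, is what makes it useful.
def mvnHome = tool 'Maven-3'
sh "${mvnHome}/bin/mvn -B verify"
```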

> If I had 'parallel [ nodeA: ..., nodeB: ...]' and nodeA failed, would I have some way of telling that things failed?

The whole ‘parallel’ step would fail in this case.

> This is pretty contrived though.

Yeah, I think the plain ‘timeout’ step (with a predetermined timeout
for each branch) would be what you would use in practice.
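[Editor's note: a per-branch timeout might be sketched like this; the durations and job names are illustrative:]

```groovy
parallel(
    jobA: {
        // Abort this branch (and thereby fail the whole 'parallel' step)
        // if it runs longer than 30 minutes.
        timeout(time: 30, unit: 'MINUTES') {
            build 'jobA'
        }
    },
    jobB: {
        timeout(time: 30, unit: 'MINUTES') {
            build 'jobB'
        }
    }
)
```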

Craig Silverstein

Dec 11, 2014, 7:58:10 PM
to jenkin...@googlegroups.com


On Thursday, December 11, 2014 3:14:00 PM UTC-8, Jesse Glick wrote:
> If you mean you just want to run on an EC2 slave, that is covered by ‘node’.

I do mean running on an ec2 slave, but the ec2 plugin dynamically brings up ec2 instances for the slaves to run on (and then shuts them down again after a timeout).

I presume the 'workflow' way of handling this is to have the plugin expose the necessary functionality via a SimpleBuildStep, and then I could call it.  I'm not exactly sure what functionality that would be in this case; I don't really understand where the ec2 plugin hooks into Jenkins, though it seems rather complicated.


> > If I had 'parallel [ nodeA: ..., nodeB: ...]' and nodeA failed, would I have some way of telling that things failed?
> The whole ‘parallel’ step would fail in this case.

So if we wanted to log the fact that it was nodeA that failed, it sounds like what we'd do is wrap the 'parallel' step in a let-it-fail construct, and then examine the return value of parallel to see what failed, if anything.  If we saw something failed, we'd log it and then manually fail the deploy job, otherwise we'd just go on to the next deploy step.

Thanks for this project, btw!  I've been checking in from time to time and have seen how actively it's being developed.  I am very excited to play around with it.

craig

Jesse Glick

Dec 12, 2014, 11:58:55 AM
to Jenkins Dev
On Thu, Dec 11, 2014 at 7:58 PM, Craig Silverstein
<csil...@khanacademy.org> wrote:
> I do mean running on an ec2 slave, but the ec2 plugin dynamically brings up
> ec2 instances for the slaves to run on (and then shuts them down again after
> a timeout).

That should be an automatic aspect of the EC2 plugin’s Cloud
implementation. (I have not personally tried it with this plugin yet.)
In other words, unless there is some bug I do not know about yet, you
should not need to do anything special at all: just configure an EC2
cloud with some label, then pass that label to ‘node’.
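[Editor's note: so the flow side would just be the following, where 'ec2' stands for whatever label the EC2 cloud configuration uses:]

```groovy
// Jenkins provisions an EC2 instance on demand to satisfy the label, and
// the cloud's idle timeout tears it down afterward; the flow itself does
// nothing special.
node('ec2') {
    sh 'make deploy-artifacts'
}
```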

> if we wanted to log the fact that it was nodeA that failed, it sounds
> like what we'd do is wrap the 'parallel' step in a let-it-fail construct,
> and then examine the return value of parallel to see what failed

As noted in JENKINS-26033, it is clearer and easier (IMO) to use a
try/catch block inside the parallel branch. You can then decide how to
proceed in the catch block: rethrow the exception as is, throw up a
polite error (JENKINS-25924), record some information in a variable
accessible to the closure, etc.
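[Editor's note: the per-branch try/catch pattern described above might look like this; the branch bodies and variable name are illustrative:]

```groovy
// Record which branch failed without swallowing the overall failure.
def failedBranch = null
parallel(
    'build-artifacts': {
        try {
            build 'build-artifacts'
        } catch (e) {
            failedBranch = 'build-artifacts'
            throw e  // rethrow so the flow still fails
        }
    },
    'run-tests': {
        try {
            build 'run-tests'
        } catch (e) {
            failedBranch = 'run-tests'
            throw e
        }
    }
)
```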