[JIRA] [workflow-plugin] (JENKINS-28822) Can't distinguish between manual abort and failure in workflow plugin


jmif96@gmail.com (JIRA)

Jun 9, 2015, 8:45:01 PM
to jenkinsc...@googlegroups.com
Joe Mifsud created an issue
 
Jenkins / Bug JENKINS-28822
Can't distinguish between manual abort and failure in workflow plugin
Issue Type: Bug
Assignee: Jesse Glick
Components: workflow-plugin
Created: 10/Jun/15 12:44 AM
Priority: Minor
Reporter: Joe Mifsud

When a workflow is manually aborted in Jenkins it fires a hudson.AbortException. This is the same thing that happens when a step fails. Thus, it's impossible to properly set the build status to ABORTED on a manual abort and to FAILURE on a failed step because you can't programmatically tell the difference.

This message was sent by Atlassian JIRA (v6.4.2#64017-sha1:e244265)

jglick@cloudbees.com (JIRA)

Jun 12, 2015, 5:56:01 PM
to jenkinsc...@googlegroups.com
Jesse Glick resolved as Cannot Reproduce
 

Manually stopping a build should set the result to ABORTED. This uses FlowInterruptedException, not AbortException.

Change By: Jesse Glick
Status: Open -> Resolved
Resolution: Cannot Reproduce

jmif96@gmail.com (JIRA)

Jun 19, 2015, 3:06:02 PM
to jenkinsc...@googlegroups.com
Joe Mifsud edited a comment on Bug JENKINS-28822
 
Re: Can't distinguish between manual abort and failure in workflow plugin
What do you mean by manually stopping?

If you abort the build by clicking "Abort" when paused for input, a FlowInterruptedException is fired. However, if you click the Jenkins abort button, an AbortException with no cause is fired.



{code}

Started by user Joe Mifsud
Running: Build Stage
Entering stage Build Stage
Proceeding
Running: Allocate node : Start
Running on master in /var/lib/jenkins/jobs/REDACTED/workspace
Running: Allocate node : Body : Start
Running: Git
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url git@REDACTED
Fetching upstream changes from git@REDACTED
 > git --version # timeout=10
using GIT_SSH to set credentials Credentials for git - created by Chef
 > git -c core.askpass=true fetch --tags --progress REDACTED +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
Checking out Revision REDACTED (refs/remotes/origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f REDACTED
 > git rev-list REDACTED # timeout=10
Running: Archive Artifacts
Running: Allocate node : Body : End
Running: Allocate node : End
Running: Allocate node : Start
Running on master in /var/lib/jenkins/jobs/REDACTED/workspace
Running: Allocate node : Body : Start
Aborted by Joe Mifsud
Running: Write file to workspace
Running: Shell Script
[workspace] Running shell script
+ flock -w 600 /var/lock/sp_js_build_npm_install.lock -c npm install
Terminated
Running: Allocate node : Body : End
Running: Allocate node : End
Running: Print Message
Received hudson.AbortException with message script returned exit code 143      <--- e.getClass().name
Running: Print Message
Received org.codehaus.groovy.runtime.NullObject with message script returned exit code 143      <--- e.getCause().getClass().name
Running: Print Message
Build failed due to class hudson.AbortException script returned exit code 143
{code}

jmif96@gmail.com (JIRA)

Jun 19, 2015, 3:08:01 PM
to jenkinsc...@googlegroups.com
Joe Mifsud reopened an issue
 

Reopening to ensure that our definition of "manual abort" is the same. I was just able to reproduce this by clicking the Jenkins abort button (red X).

Change By: Joe Mifsud
Resolution: Cannot Reproduce
Status: Resolved -> Reopened

jmif96@gmail.com (JIRA)

Jun 19, 2015, 3:09:01 PM
to jenkinsc...@googlegroups.com
 
Re: Can't distinguish between manual abort and failure in workflow plugin

An AbortException is also fired when the job is paused for input and the Jenkins abort button is clicked.

jglick@cloudbees.com (JIRA)

Aug 12, 2015, 8:08:01 AM
to jenkinsc...@googlegroups.com
Jesse Glick updated an issue
 
Can't distinguish between durable task abort and failure in workflow plugin
Change By: Jesse Glick
Summary: Can't distinguish between manual abort and failure in workflow plugin -> Can't distinguish between durable task abort and failure in workflow plugin

jglick@cloudbees.com (JIRA)

Aug 12, 2015, 8:14:01 AM
to jenkinsc...@googlegroups.com
Jesse Glick commented on Bug JENKINS-28822
 
Re: Can't distinguish between durable task abort and failure in workflow plugin

If you abort inside a sh step, Jenkins sends SIGTERM to the process. That will normally cause it to exit with code 143 (128 + the SIGTERM signal code IIRC), which is treated as a failed script by sh, thus throwing an error (and causing the build to fail, if you do not catch it). The process could however be trapping the signal and behaving differently.

If the process does not respond to SIGTERM, pressing the abort button a second time forcibly aborts the build (FlowInterruptedException).

Canceling an input step uses FlowInterruptedException and sets the build status to aborted. This is unrelated to handling of durable tasks.
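
The exit code arithmetic above can be checked outside Jenkins. The following is a minimal sketch in plain Groovy, not a Pipeline step; it assumes a Unix-like system where `sh` is available and SIGTERM is signal 15. The shell sends SIGTERM to itself to imitate what happens to the process when a build is aborted.

```groovy
// Standalone Groovy script for Linux or macOS; it does not involve Jenkins.

// Start a shell that sends SIGTERM to itself, imitating an abort during a sh step.
// Single quotes keep Groovy from interpolating `$$` (the shell's own PID).
def process = ['sh', '-c', 'kill -TERM $$'].execute()
def exitCode = process.waitFor()

// A process terminated by signal N is conventionally reported as 128 + N.
int sigterm = 15
println "exit code: ${exitCode}"
assert exitCode == 128 + sigterm   // 143, the value seen in the build log above
```

A script that traps SIGTERM and exits on its own terms would report whatever code it chooses, which is why the exit code alone cannot reliably identify an abort.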

jglick@cloudbees.com (JIRA)

Aug 29, 2016, 2:04:01 PM
to jenkinsc...@googlegroups.com
Jesse Glick updated an issue
Change By: Jesse Glick
Component/s: workflow-durable-task-step-plugin
Component/s: pipeline

kai@hoewelmeyer.eu (JIRA)

Mar 24, 2017, 10:05:02 AM
to jenkinsc...@googlegroups.com
Kai Howelmeyer commented on Bug JENKINS-28822
 
Re: Can't distinguish between durable task abort and failure in workflow plugin

I am also confused by this handling:

In https://github.com/jenkinsci/workflow-durable-task-step-plugin/blob/master/src/main/java/org/jenkinsci/plugins/workflow/steps/durable_task/DurableTaskStep.java#L319 we send AbortException. If I surround a "sh" step with try/catch, I cannot tell if Abort was clicked or if the shell script failed. Is there an easy way to distinguish this?


kutzi@gmx.de (JIRA)

Mar 12, 2018, 12:40:02 PM
to jenkinsc...@googlegroups.com
kutzi edited a comment on Bug JENKINS-28822
I guess this is also the reason for JENKINS-43339, isn't it?

This is IMO a big issue as it makes the _post { aborted {} }_ section in declarative pipelines largely unusable. In fact, I've not seen a single occasion when _aborted_ worked as expected when manually aborting a build. Then the _failure_ section is always triggered instead.
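
For context, a declarative pipeline that uses both conditions might look like the sketch below. The stage name and the `sleep 600` command are placeholders chosen for illustration and are not taken from this issue. With the behavior reported here, aborting the build while the `sh` step is running would be expected to reach `failure` rather than `aborted`.

```groovy
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                // Long-running placeholder command, so there is time to press abort.
                sh 'sleep 600'
            }
        }
    }
    post {
        aborted {
            // Intended to run when the build result is ABORTED.
            echo 'Build was aborted'
        }
        failure {
            // Runs when the build result is FAILURE.
            echo 'Build failed'
        }
    }
}
```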

crussell52@gmail.com (JIRA)

Aug 17, 2018, 12:52:04 PM
to jenkinsc...@googlegroups.com
Chris Russell edited a comment on Bug JENKINS-28822
Piecing together clues from comments here and in similar issues (such as JENKINS-41604), I've added this to our shared lib with pretty good results, so far.

This version has some debug output and logic which isn't needed for our purposes (such as distinguishing between user and system rejections for inputs), but I left that logic in for the purposes of this comment in case it is useful for others.

Pipeline
{code:java}
try {
  // work
} catch(err) {
  (new ExecutionHelper()).fixBuildResult(err)
  
  // Don't hide the error from jenkins
  throw err
} finally {
  // React to currentBuild.result
}
{code}
ExecutionHelper.groovy
{code:java}

import jenkins.model.CauseOfInterruption
import jenkins.model.InterruptedBuildAction
import org.jenkinsci.plugins.workflow.steps.FlowInterruptedException
import org.jenkinsci.plugins.workflow.support.steps.input.Rejection


def fixBuildResult(error) {


    def iba = currentBuild.rawBuild.getAction(InterruptedBuildAction.class)
    if (iba && iba.causes.size() && iba.causes.any{ it instanceof CauseOfInterruption.UserInterruption }) {
        // probably aborted the execution from outside pipeline
        echo "run cancelled"
        currentBuild.result = "ABORTED"
    } else if (error instanceof FlowInterruptedException && error.getCauses().any { it instanceof Rejection }) {

        // Dig out the rejection and see if it is associated to a user.
        def user = null
        error.getCauses().each {
            if (it instanceof Rejection) {
                user = it.user
            }
        }

        // System user probably means rejected because of timeout
        echo (user && "${user}" != "SYSTEM" ? "rejected by user: ${user}" : "rejected by... not user")

        currentBuild.result = "ABORTED"
    } else {
        currentBuild.result = "FAILURE"
    }

    throw error
}
{code}

jglick@cloudbees.com (JIRA)

Aug 27, 2018, 11:30:04 AM
to jenkinsc...@googlegroups.com
Jesse Glick updated an issue
Change By: Jesse Glick
When a workflow is manually aborted in Jenkins it fires a hudson.AbortException. This is the same thing that happens when a step fails. Thus, it's impossible to properly set the build status to ABORTED on a manual abort and to FAILURE on a failed step because you can't programmatically tell the difference.


[~jglick]’s recommended fix: see comment as of 2017-06-28

joerg.schwaerzler@infineon.com (JIRA)

Sep 3, 2018, 5:01:02 AM
to jenkinsc...@googlegroups.com
Joerg Schwaerzler edited a comment on Bug JENKINS-28822
I wonder whether my issue is related to this one:

I get a {{hudson.AbortException()}} sometimes if some parallel steps are waiting for some free node when a timeout aborts the build. I would expect to get a {{FlowInterruptedException}}, though.

I get {{Encountered exception:  hudson.AbortException: Queue task was cancelled.}}

jglick@cloudbees.com (JIRA)

Sep 4, 2018, 5:51:03 PM
to jenkinsc...@googlegroups.com

Joerg Schwaerzler that is indeed related, but should have been fixed by this in 2.21 AFAIK.

joerg.schwaerzler@infineon.com (JIRA)

Sep 5, 2018, 7:52:02 AM
to jenkinsc...@googlegroups.com

Thanks for the info. After updating to 2.21 I can confirm that the issue we faced is fixed.

dnusbaum@cloudbees.com (JIRA)

Sep 20, 2018, 12:25:22 PM
to jenkinsc...@googlegroups.com
Devin Nusbaum started work on Bug JENKINS-28822
 
Change By: Devin Nusbaum
Status: Open -> In Progress

dnusbaum@cloudbees.com (JIRA)

Sep 21, 2018, 1:19:05 PM
to jenkinsc...@googlegroups.com
Devin Nusbaum commented on Bug JENKINS-28822
 
Re: Can't distinguish between durable task abort and failure in workflow plugin

I am working on this in https://github.com/jenkinsci/workflow-durable-task-step-plugin/pull/75. My main concern is whether fixing it now will break the workarounds that people are currently using, so I want to run a few tests with some of the workarounds posted here to get an idea of the impact.

dnusbaum@cloudbees.com (JIRA)

Sep 25, 2018, 5:03:03 PM
to jenkinsc...@googlegroups.com

There are 2 main differences that my change would cause:

  1. If a sh step is manually aborted, the exception thrown will be a FlowInterruptedException rather than an AbortException (and FlowInterruptedException will be thrown even if returnStatus is true). The script will exit with Result.ABORTED both before and after my change. The behavior of Chris Russell's workaround is unaffected by my change in this case.
  2. If a sh step is automatically aborted by a timeout step, the exception thrown will be a FlowInterruptedException rather than an AbortException (and FlowInterruptedException will be thrown even if returnStatus is true), and as a result the script will exit with Result.ABORTED instead of Result.FAILURE. Chris Russell's workaround does not handle this case, so the behavior would change in the same way that it would for people not using a workaround. This change also makes timeouts of sh steps consistent with other steps such as sleep (ABORTED instead of FAILURE).

It looks like the workaround from Kai Howelmeyer does not work. FlowInterruptedException is thrown in the sleeping parallel branch in all three cases: when the pipeline is manually aborted, when the sh step is aborted by a timeout step, and when the sh step fails because of an issue in the script itself. As a result, wasAborted is set to true in every case, so we always rethrow an AbortException and never execute the block that handles a failure in the sh step's script.

Based on that info, I think it is OK to move forward with the change, but if anyone knows of other common workarounds that may be broken by this change, feel free to leave a comment.
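
As an illustration of the two cases above, a scripted pipeline could tell them apart with typed catch clauses, roughly as in the following sketch. It assumes a plugin version that includes the change described in this comment; the `npm install` command and the 10-minute timeout are placeholders, and depending on the sandbox configuration some calls may need script approval.

```groovy
import hudson.AbortException
import org.jenkinsci.plugins.workflow.steps.FlowInterruptedException

node {
    try {
        // Placeholder timeout and command, for illustration only.
        timeout(time: 10, unit: 'MINUTES') {
            sh 'npm install'
        }
    } catch (FlowInterruptedException e) {
        // Expected after the change for a manual abort or an expired timeout.
        echo 'sh step was interrupted from outside the script'
        throw e   // rethrow so the abort is not hidden from Jenkins
    } catch (AbortException e) {
        // Expected when the script itself exits with a nonzero status.
        echo "sh step failed: ${e.message}"
        throw e   // rethrow so the failure is not hidden from Jenkins
    }
}
```

Rethrowing in both branches keeps the handlers purely observational, so the build result is still decided by the original exception.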

haridara@gmail.com (JIRA)

Sep 26, 2018, 2:03:02 PM
to jenkinsc...@googlegroups.com

Chris Russell: I see that your workaround uses instanceof checks, but these are not permitted in the pipeline sandbox. Using instanceof generates an error like the following:

org.jenkinsci.plugins.scriptsecurity.sandbox.RejectedAccessException: Scripts not permitted to use method java.lang.Class isInstance java.lang.Object

I just submitted a PR that whitelists the isInstance check: https://github.com/jenkinsci/script-security-plugin/pull/226

dnusbaum@cloudbees.com (JIRA)

Oct 22, 2018, 5:51:11 PM
to jenkinsc...@googlegroups.com
 

Fixed in Pipeline Nodes and Processes 2.24. See my comment here for an overview of the changes.

Jenkins / Bug JENKINS-28822
Change By: Devin Nusbaum
Status: In Review -> Resolved
Resolution: Fixed
Released As: workflow-durable-task-step 2.24

dnusbaum@cloudbees.com (JIRA)

Oct 22, 2018, 5:53:03 PM
to jenkinsc...@googlegroups.com
Devin Nusbaum edited a comment on Bug JENKINS-28822
Fixed in Pipeline Nodes and Processes 2.24. See [my comment here|https://issues.jenkins-ci.org/browse/JENKINS-28822?focusedCommentId=350058&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-350058] for an overview of the changes, but in short, externally aborted tasks should throw FlowInterruptedException, while tasks that fail in the script itself should throw AbortException.