[JIRA] (JENKINS-57866) ShellStepTest.abort flake on Windows


jglick@cloudbees.com (JIRA)

Jun 5, 2019, 1:20:02 PM
to jenkinsc...@googlegroups.com
Jesse Glick created an issue
 
Jenkins / Bug JENKINS-57866
ShellStepTest.abort flake on Windows
Issue Type: Bug
Assignee: Unassigned
Components: workflow-durable-task-step-plugin
Created: 2019-06-05 17:19
Labels: flake
Priority: Minor
Reporter: Jesse Glick

On CI I see a lot of flakes on Windows:

java.lang.AssertionError: org.jenkinsci.plugins.workflow.steps.durable_task.ShellStepTest$1@52070d1b
...	at org.jenkinsci.plugins.workflow.steps.durable_task.ShellStepTest.ensureForWhile(ShellStepTest.java:682)
...	at org.jenkinsci.plugins.workflow.steps.durable_task.ShellStepTest.abort(ShellStepTest.java:192)

The test is not written terribly well, but basically this means that the batch script, which runs ping every second in a loop, was sent a termination signal yet kept running for at least five seconds after the interrupt. Did the signal get lost? Was it sent to the wrong subprocess without breaking the loop? Was it going to be handled, but the system was just too heavily loaded? Could probably improve the test to:

  • Use a single process for the batch script, like ping -n 99999 127.0.0.1 >tmp.
  • Wait indefinitely (up to global test timeout) for the file to not have been touched in the last few seconds.
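
A minimal sketch of the second suggestion, assuming the script redirects its output to a file the test can poll. The class name QuietFileWaiter and the method awaitQuiet are hypothetical and not part of ShellStepTest. The sketch compares file size as well as modification time, which is my own assumption in case timestamp updates lag for a file that is still held open. It has no deadline of its own, so the global test timeout remains the only upper bound.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

// Hypothetical helper; not part of the existing test.
public class QuietFileWaiter {

    /**
     * Blocks until neither the size nor the modification time of {@code file}
     * has changed for at least {@code quietPeriod}. There is no internal
     * deadline: the caller's (global) test timeout is the only limit.
     */
    public static void awaitQuiet(Path file, Duration quietPeriod, Duration pollInterval)
            throws IOException, InterruptedException {
        long lastSize = Files.size(file);
        long lastModified = Files.getLastModifiedTime(file).toMillis();
        long lastChangeNanos = System.nanoTime();
        while (System.nanoTime() - lastChangeNanos < quietPeriod.toNanos()) {
            Thread.sleep(pollInterval.toMillis());
            long size = Files.size(file);
            long modified = Files.getLastModifiedTime(file).toMillis();
            if (size != lastSize || modified != lastModified) {
                // The script is still writing, so restart the quiet period.
                lastSize = size;
                lastModified = modified;
                lastChangeNanos = System.nanoTime();
            }
        }
    }
}
```

The test would interrupt the build and then call something like awaitQuiet(tmp, Duration.ofSeconds(5), Duration.ofMillis(250)), where tmp is the redirected output file. A lost signal would then surface as a test timeout, and a slow but successful kill on a loaded machine would no longer fail the assertion.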
 
This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

jglick@cloudbees.com (JIRA)

Jun 5, 2019, 1:21:02 PM
to jenkinsc...@googlegroups.com
Jesse Glick updated an issue
Change By: Jesse Glick
On CI I see a lot of flakes on Windows:

{code:none}
java.lang.AssertionError: org.jenkinsci.plugins.workflow.steps.durable_task.ShellStepTest$1@52070d1b
...	at org.jenkinsci.plugins.workflow.steps.durable_task.ShellStepTest.ensureForWhile(ShellStepTest.java:682)
...	at org.jenkinsci.plugins.workflow.steps.durable_task.ShellStepTest.abort(ShellStepTest.java:192)
{code}


The test is not written terribly well, but basically this means that the batch script, which runs {{ping}} every second in a loop, was sent a termination signal yet kept running for at least five seconds after the interrupt. Did the signal get lost? Was it sent to the wrong subprocess without breaking the loop? Was it going to be handled, but the system was just too heavily loaded? Could probably improve the test to:

* Use a single process for the batch script, like {{ping -n 99999 127.0.0.1 >tmp}}.
* Wait indefinitely (up to global test timeout) for the file to _not_ have been touched in the last few seconds.