Hi,
I cannot get the list of failed node(s) from within an error handler.
My job is:
===================================================
- defaultTab: output
  description: ''
  executionEnabled: true
  group: Elementary Tasks
  id: 79cb7340-2aea-405b-aa0d-11061167a170
  loglevel: DEBUG
  name: 01.Step 1
  nodeFilterEditable: false
  nodefilters:
    dispatch:
      excludePrecedence: true
      keepgoing: true
      rankOrder: ascending
      successOnEmptyNodeFilter: false
      threadcount: '1'
    filter: tags:remote
  nodesSelectedByDefault: true
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
    - args: ${option.MAKE_FAIL}
      script: |
        #!/bin/bash
        if [ "$1" == 'Yes' ]; then
          echo "Step 1 has failed on $(hostname)"
          exit 1
        else
          echo "Step 1 succeeded on $(hostname)"
        fi
    keepgoing: false
    strategy: node-first
  uuid: 79cb7340-2aea-405b-aa0d-11061167a170
===================================================
And the error handler, which is called correctly:
===================================================
- defaultTab: output
  description: ''
  executionEnabled: true
  group: Elementary Tasks
  id: 65be3205-4bda-4d13-b3fb-b2b15af881b7
  loglevel: DEBUG
  name: 02.Error Handler
  nodeFilterEditable: false
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
    - args: ${result.failedNodes}
      script: |-
        #!/bin/bash
        echo "ARGS = $@"
        echo "Failed Nodes = @result.failedNodes@"
        echo "-----------------------------------------------"
        echo "if Step 1 failed, I am in error handler"
        echo " Trying to fix condition for Step1"
    keepgoing: false
    strategy: node-first
  uuid: 65be3205-4bda-4d13-b3fb-b2b15af881b7
===================================================
The output is:
===================================================
Step 1 succeeded on sup-dev-tgt0041.tld
Step 1 has failed on sup-dev-tgt0042.tld
Failed: NonZeroResultCode: [ssh-exec] Result code: 1
Step 1 succeeded on sup-dev-tgt0043.tld
ARGS =
${result.failedNodes}
Failed Nodes =
-----------------------------------------------
if Step 1 failed, I am in error handler
Trying to fix condition for Step1
===================================================
Where is my error?
Thanks,
Regards
Xavier
Hi Xavier,
I made an example with a slightly different approach. It works as follows:
1- The “Step 1” job: it contains a script step dispatched to all nodes. The step has an error handler that calls another job, passing ${result.failedNodes} as an argument.
- defaultTab: output
  description: ''
  executionEnabled: true
  group: Elementary Tasks
  id: 79cb7340-2aea-405b-aa0d-11061167a170
  loglevel: INFO
  name: 01.Step 1
  nodeFilterEditable: false
  nodefilters:
    dispatch:
      excludePrecedence: true
      keepgoing: true
      rankOrder: ascending
      successOnEmptyNodeFilter: false
      threadcount: '1'
    filter: 'name: node.*'
  nodesSelectedByDefault: true
  options:
  - name: MAKE_FAIL
    value: 'Yes'
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
    - args: ${option.MAKE_FAIL}
      errorhandler:
        jobref:
          args: -node_failed ${result.failedNodes}
          group: Elementary Tasks
          name: 02.Error Handler
          nodeStep: 'true'
          uuid: 65be3205-4bda-4d13-b3fb-b2b15af881b7
      fileExtension: .sh
      interpreterArgsQuoted: false
      script: |
        #!/bin/bash
        if [ "$1" == 'Yes' ]; then
          echo "Step 1 has failed on $(hostname)"
          exit 1
        else
          echo "Step 1 succeeded on $(hostname)"
          exit 0
        fi
      scriptInterpreter: /bin/bash
    keepgoing: false
    strategy: node-first
  uuid: 79cb7340-2aea-405b-aa0d-11061167a170
2- The “Error Handler” job: it receives the variable as an option (from the error handler of the “Step 1” job) when a node fails in the first job. It appends the failed node names to a work file so that the list can be printed later.
- defaultTab: output
  description: ''
  executionEnabled: true
  group: Elementary Tasks
  id: 65be3205-4bda-4d13-b3fb-b2b15af881b7
  loglevel: INFO
  name: 02.Error Handler
  nodeFilterEditable: false
  options:
  - name: node_failed
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
    # The value arrives as a job option, so it is referenced as option.node_failed
    - args: ${option.node_failed}
      interpreterArgsQuoted: false
      script: |
        #!/bin/bash
        echo "@option.node_failed@" >> failed_nodes_list.txt
      scriptInterpreter: /bin/bash
    keepgoing: false
    strategy: node-first
  uuid: 65be3205-4bda-4d13-b3fb-b2b15af881b7
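If you prefer one node name per line in the work file, the recording step could be sketched as below. This is only an illustration under assumptions: the function name record_failed_nodes is hypothetical, and it assumes the value of ${result.failedNodes} reaches the script as a comma-separated list of node names. The default file name is the one used in the job above.

```shell
#!/bin/bash
# Hypothetical helper (not part of Rundeck): append failed node names to a
# work file, one per line. Assumes the list is comma-separated.
record_failed_nodes() {
  local list="$1"                          # e.g. "node01,node02"
  local out="${2:-failed_nodes_list.txt}"  # work file from the thread
  # Split on commas, strip stray whitespace, drop empty entries.
  printf '%s\n' "$list" \
    | tr ',' '\n' \
    | sed -e 's/[[:space:]]//g' -e '/^$/d' >> "$out"
}
```

Inside the inline script of the job, the call would then look like record_failed_nodes "@option.node_failed@". An empty option value simply adds nothing to the file.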
3- The parent job: it calls the first job, then the second job, and finally reads the work file and prints its content (the list of failed nodes).
- defaultTab: nodes
  description: ''
  executionEnabled: true
  group: Elementary Tasks
  id: c6a5573b-11b2-4871-bddb-f1100fbbeaa6
  loglevel: INFO
  name: 00. ParentJob
  nodeFilterEditable: false
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
    - jobref:
        group: Elementary Tasks
        name: 01.Step 1
        nodeStep: 'true'
        uuid: 79cb7340-2aea-405b-aa0d-11061167a170
    - jobref:
        group: Elementary Tasks
        name: 02.Error Handler
        nodeStep: 'true'
        uuid: 65be3205-4bda-4d13-b3fb-b2b15af881b7
    - fileExtension: .sh
      interpreterArgsQuoted: false
      script: |-
        echo "failed nodes:"
        cat failed_nodes_list.txt
        echo "" > failed_nodes_list.txt
      scriptInterpreter: /bin/bash
    keepgoing: true
    strategy: sequential
  uuid: c6a5573b-11b2-4871-bddb-f1100fbbeaa6
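The last step of the parent job could also be written a bit more defensively. The sketch below is an assumption-laden variant, not the script from the job above: the function name report_failed_nodes is hypothetical, it assumes the work file holds one node name per line, and it removes duplicates before printing.

```shell
#!/bin/bash
# Hypothetical helper (not part of Rundeck): print the recorded failed nodes
# and reset the work file for the next run.
report_failed_nodes() {
  local file="${1:-failed_nodes_list.txt}"  # work file from the thread
  echo "failed nodes:"
  if [ -s "$file" ]; then
    sort -u "$file"     # each node name once, even if recorded twice
  else
    echo "(none)"       # file is missing or empty
  fi
  : > "$file"           # truncate to zero bytes
}
```

Truncating with `: >` leaves the file really empty, whereas echo "" > file leaves a single blank line in it.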
Running the parent job then prints the list of failed nodes.
Hope it helps!