Hi David,
I think there are already a couple of ways to do this:
- in the "rocket_launch" line of the queue script, don't call "rlaunch" directly but rather some other command that wraps rlaunch and runs your cleanup if it catches an error.
- modify the FireTasks themselves to catch the error. If this results in repeated code (e.g., there are 10 different FireTasks that should clean things up in the same way), one could put a generic function decorator on the run() function to avoid this. The decorator tries the original run() function and, if it catches an error, does something else instead. Then you can just add that decorator to each of the FireTasks.
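
To make the decorator idea concrete, here is a minimal sketch. Everything in it is illustrative: cleanup_on_error and remove_scratch are made-up names, the two classes are plain Python stand-ins rather than real FireTasks, and I've assumed the run method takes the spec as its only argument and is called run_task (adjust the name to whatever your FireWorks version uses).

```python
import functools


def cleanup_on_error(cleanup):
    """Build a decorator that calls cleanup(task, fw_spec, exc) if the wrapped method raises."""
    def decorator(run_method):
        @functools.wraps(run_method)
        def wrapper(self, fw_spec):
            try:
                return run_method(self, fw_spec)
            except Exception as exc:
                cleanup(self, fw_spec, exc)
                raise  # re-raise so the failure is still reported upstream
        return wrapper
    return decorator


cleaned = []  # records cleanup calls, for demonstration only


def remove_scratch(task, fw_spec, exc):
    # Hypothetical cleanup: a real one might delete or copy files here.
    cleaned.append((type(task).__name__, fw_spec.get("scratch_dir"), str(exc)))


class FailingTask:  # stand-in for a FireTask, not the real base class
    @cleanup_on_error(remove_scratch)
    def run_task(self, fw_spec):
        raise RuntimeError("calculation crashed")


class PassingTask:  # stand-in for a FireTask that succeeds
    @cleanup_on_error(remove_scratch)
    def run_task(self, fw_spec):
        return "done"
```

The wrapper re-raises after cleaning up, so the job is still seen as failed; drop the "raise" and return something else if you would rather swallow the error.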
If you think an additional feature would be useful beyond those options, perhaps you can write a bit about how it would be better (i.e., what problem it solves) and how it would be implemented. For example, one option is to have a special keyword in the spec like "_error_tasks" that links to a list of FireTasks (the same as any other FireTask, so we don't need a new object and you can reuse existing CopyTasks if needed). In the Rocket code, if we catch an error and the _error_tasks key exists, the Rocket can execute the _error_tasks (which can also return a FWAction). There is not much harm in implementing this, but it would be nice to know a little more about the use case.
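
Roughly, the "_error_tasks" logic could look like the sketch below. This is not the actual Rocket code: run_with_error_tasks is a hypothetical helper, tasks are plain callables that take the spec, and a dict stands in for whatever FWAction the error tasks would return.

```python
def run_with_error_tasks(tasks, spec):
    """Run tasks in order; on failure, run the tasks listed under spec['_error_tasks']."""
    try:
        for task in tasks:
            task(spec)
    except Exception as exc:
        error_tasks = spec.get("_error_tasks")
        if not error_tasks:
            raise  # no error tasks configured: behave as before
        # Run each error task with the same spec and collect what it returns.
        results = [error_task(spec) for error_task in error_tasks]
        return {"failed": True, "error": str(exc), "error_task_results": results}
    return {"failed": False}


def crash(spec):
    # Hypothetical task that always fails.
    raise ValueError("bad input")


def copy_outputs(spec):
    # Hypothetical error task; a real one might be an existing copy task.
    return "copied " + spec["out_dir"]


outcome = run_with_error_tasks(
    [crash], {"out_dir": "run1", "_error_tasks": [copy_outputs]}
)
```

Without the "_error_tasks" key the exception propagates unchanged, so existing workflows would not be affected.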