Background
Normally, if you change the inputs, parameters or outputs of a single component in Anduril, the whole downstream pipeline and all loop-parallel instances are triggered. However, each component instance also stores how it was launched, more or less exactly, in a file called "_launch", although replicating the running environment is not trivial in principle.
NOTE: "_commandline" has been renamed to "_launch"!
Simply running "_launch", which is a bash script, reruns the component instance with an identical configuration. The file is now executable by default, and running it invokes the script described in this post, so all of the script's parameters are valid for it as well.
News
- Added an optional flag to remove e.g. the slurm prefix, to avoid enqueuing (the new flag is -s or --strip-prefix; the old -s for sleep has been renamed to -S).
- Return the component's success code on exit (and report it).
- Print where the component sits on the hard drive, to make it easier to edit the associated files.
Motivation
Often, especially when developing a component or tracking down the source of an error, it would be desirable to be able to:
- alter erroneous inputs in an existing component or gauge problematic parameter values
- avoid running dozens of replicas of the same component inside a loop just to fix one issue
- skip invoking Anduril itself, which saves considerable time during iterative development
- see or redirect standard output without having to clean Anduril's annotations out of it
- develop a component in the real context of a pipeline without manually specifying its configuration in detached form
- test whether a new program version you just installed would radically alter the results at a certain step
- run the component under a debugger or other utility, which is not possible in Anduril
- have full control and be able to inspect the execution environment by outputting the launcher string
If you're using someone else's component, you can ask them to run the step, so they can focus on it in the context of your actual pipeline. Simply give them read access to the inputs and the component instance directory, and send them the path to it.
Solution from command line
With the anduril-run-instance utility, which resides under the "utils" directory of the Anduril source tree, the above is as easy as pointing to the output directory of the instance and then specifying the inputs, outputs and parameters you wish to alter; all others remain intact. If you're familiar with the Anduril subcommand "run-component", which runs a component through Anduril, invoking this utility is similar.
Tutorial
Notice that, in order not to perturb the pipeline state, an alternative output directory needs to be specified with the -d flag, as in the examples below. In all the examples below, you may skip "anduril-run-instance" and just use the _launch file as a script directly.
Simply invoking anduril-run-instance shows a few usage examples. I'll copy them here for now; a more elaborate explanation follows.
Examples:
* Simple launching of an instance:
anduril-run-instance execute/_array_key1-instance
* The previous may work like this as well
./execute/_array_key1-instance/_launch
* Produce a modified command file so you can verify it is correct and output how it would be run:
anduril-run-instance -c -d my_outputs -IinputFile=myFile.bam -PmyParam=1234 -OpictureFolder=/home/pictures execute/_array_key1-instance
* Run a modified command file, redirecting all outputs by default to my_outputs and one output to a separate location:
anduril-run-instance -d my_outputs -IinputFile=myFile.bam -PmyParam=1234 -OpictureFolder=/home/pictures execute/_array_key1-instance
Let's say you have a pipeline that writes its output to the default execute directory, and it contains an instance called instance, inside an array called array, under the key key1. Running the pipeline ended in an error, and Anduril has placed the component's output under "execute/_array_key1-instance". Your component has a parameter called myParam, and you want to set its value to 1234. You also want to replace the input called inputFile with myFile.bam. The component writes a set of pictures to the output port pictureFolder, but you want to build a gallery at a predefined location under your home folder. The rest of the output should end up in the accessible location "my_outputs" in the current directory instead of Anduril's default output directory, which is an automatically generated, not very legible path name; writing there would also mean interfering with the pipeline.
anduril-run-instance may either launch the component itself or output a modified command line that would launch it. The -c flag prevents it from launching the component.
anduril-run-instance -c -d my_outputs -IinputFile=myFile.bam -PmyParam=1234 -OpictureFolder=$HOME/pictures execute/_array_key1-instance
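The -I, -P and -O options in the command above each take a name=value pair and can be repeated. Below is a minimal Python sketch of how a command line of this shape could be parsed. It is not the utility's actual code: the function name parse_overrides and the keys of the returned dictionary are hypothetical, and only the flags shown in the examples (-c, -d, -I, -P, -O) are assumed. Each pair is split at the first "=" so that values may themselves contain "=".

```python
import argparse


def parse_overrides(argv):
    """Parse arguments shaped like the anduril-run-instance examples.

    Hypothetical helper for illustration only; the real utility may
    accept more options and behave differently.
    """
    parser = argparse.ArgumentParser(prog="run-instance-sketch")
    parser.add_argument("-c", action="store_true", dest="command_only")
    parser.add_argument("-d", dest="output_dir", default=None)
    # Repeatable name=value overrides for inputs, parameters and outputs.
    parser.add_argument("-I", action="append", default=[], dest="inputs")
    parser.add_argument("-P", action="append", default=[], dest="params")
    parser.add_argument("-O", action="append", default=[], dest="outputs")
    parser.add_argument("instance_dir")
    args = parser.parse_args(argv)

    def to_dict(pairs):
        result = {}
        for pair in pairs:
            # partition() splits at the first "=" only.
            name, sep, value = pair.partition("=")
            if not sep or not name:
                parser.error("expected name=value, got %r" % pair)
            result[name] = value
        return result

    return {
        "command_only": args.command_only,
        "output_dir": args.output_dir,
        "inputs": to_dict(args.inputs),
        "params": to_dict(args.params),
        "outputs": to_dict(args.outputs),
        "instance_dir": args.instance_dir,
    }


example = parse_overrides([
    "-c", "-d", "my_outputs",
    "-IinputFile=myFile.bam",
    "-PmyParam=1234",
    "-OpictureFolder=/home/pictures",
    "execute/_array_key1-instance",
])
```

In this sketch all values stay strings, so example["params"] is {"myParam": "1234"}; any type conversion would have to happen against the component's own parameter definitions.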
Experimental tutorial
You may store the configuration with the -o flag, which writes a modified _command file. Later you can read it back with -i. The _launch file is still needed in this case, so you still need to specify the component instance directory. If this feature makes any sense, it should be made optional... This would allow you to catalogue different cases for your component and, in essence, use any component configuration as "just another command line utility".
anduril-run-instance -c -o mySavedCommand -d my_outputs -IinputFile=myFile.bam -PmyParam=1234 -OpictureFolder=$HOME/pictures execute/_array_key1-instance
# The next day, when you have e.g. edited the script in question
anduril-run-instance -i mySavedCommand execute/_array_key1-instance
Issues
Since this is a lightweight solution, I think it would be enough to document any issues you run into here, in one place.
- PROBLEM Component return value was ignored. Thanks Amjad for reporting.
- FIX Switch from os.system to subprocess, because os.system apparently uses non-standard error codes.
- PROBLEM Under slurm or another prefix mode, the prefix is part of the command file. This caused the temporary file to be unreadable, because /tmp is different on different machines. One solution could be to remove the "srun" part from the command line, so that the script always runs locally. However, there may be flags to "srun" which would need to be removed as well, so it would perhaps be easier to write the file to a known-to-work location. Thanks Julia for testing.
- FIX The temporary files are now written to $HOME/.tmp by default. This is configurable.
- PROBLEM the key=value syntax failed for parameter values such as parameter=this=one.
- FIX only split at the first =. Thanks Katherine for reporting.
- PROBLEM BashEvaluate doesn't output anything on the screen.
- SOLUTION This is because the component itself captures the standard output. You need to set the parameter -PechoStdOut=true (or whatever it's called). Thanks Amjad.
- PROBLEM When starting a debugger or something else interactive, such as pdb, there is no echo. Normally this works, so what's the cause? Thanks again, Julia.
- SOLUTION Likely you are using slurm or another Anduril prefix-mode without knowing, so use the -s flag to strip that portion of the command line.
- PROBLEM Determine how multiple line strings behave as parameters, because they should be encoded on a single line in the _command file, but they probably aren't.
- PROBLEM Could avoid breaking a pipeline accidentally by introducing a default value for -d. However, what would this default value be? And the directory had better be empty in any case...
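The first fix in the list above, passing on the component's exit code, can be illustrated with a short Python sketch. This shows the general approach only and is not the utility's actual code; the function name run_and_report is hypothetical. subprocess.call returns the child's exit code directly, whereas on POSIX systems os.system returns an encoded wait status rather than the plain exit code.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command given as a list of arguments and return its exit code.

    Hypothetical helper for illustration; the real utility may differ.
    """
    return_code = subprocess.call(command)
    status = "succeeded" if return_code == 0 else "failed"
    # Report on stderr so the component's own stdout stays clean.
    print("Component %s with exit code %d" % (status, return_code),
          file=sys.stderr)
    return return_code


# Stand-in for a component launch: a child Python process that exits with 3.
demo_code = run_and_report([sys.executable, "-c", "import sys; sys.exit(3)"])
```

A wrapper script would then finish with sys.exit(return_code), so that the shell or a calling script sees the same code as the component produced.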
Feature ideas
- Record component name, date or other metadata in the produced _command file names. Or add a metadata field...
- Allow providing the inputs, outputs and parameters in a separate file, encoded e.g. in the properties (_command) format, JSON, YAML or something else that handles multi-line string parameters more naturally.
- Or just single parameters from a file - probably a more dynamic approach.
- Replacement of environment variables, e.g. in cases where the original environment has problems.
- Insert a debugger string or other prefix, like gdb, valgrind, some other profiler, ... preferably by replacing the usual prefix. Could e.g. be an optional value for the -p prefix stripping parameter.
- Generate an Anduril snippet with the new parameters, e.g. using anduril run-component -L
- Validate parameter types against component.xml
- Defaults file for options (e.g. whether to strip prefix)
- Write the produced _command file as the file _command under the specified directory (same for _launch). However this is not a good idea if running in the pipeline directory.
Features unlikely to be supported
- Saving stdout/stderr to files: You can use shell redirection for this. With the -q flag for quiet, there shouldn't be anything else on the screen.
- Update _state file to reflect success. Would conflict with Anduril itself - hard to ensure consistency.