PBS/Torque and iteration

Marcin Cieślik

Nov 27, 2013, 9:55:59 AM

to andur...@googlegroups.com

Hi,

I am evaluating Anduril for use in our next-gen sequencing lab. I am pleased to have been able to run basic "local" workflows and implement custom nodes. However, I have trouble finding a good way to execute components on our PBS cluster.

- In prefix mode an arbitrary prefix is added to each component _commandline. Since different components have different computational requirements (CPU, memory, etc.), it makes sense to be able to modify these resources on a per-component basis; a single prefix will not cut it. Is this possible for prefix jobs?

- PBS has two ways to execute jobs, "qsub" and "qsub -I", which seem equivalent to "sbatch" and "srun" in Slurm ("qrun" does something different, and this is probably a mistake in the Anduril docs). Unfortunately "qsub -I" cannot be used instead of "srun" because it ignores the script (all input is interactive) and never returns. Plain "qsub", on the other hand, returns immediately. Do I assume correctly that Anduril expects prefix jobs to "wait", i.e. not return until done? Along the lines of:

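qwait

#!/usr/bin/env bash
# Submit the job; keep the numeric part of the job ID ("1234" from "1234.server").
JOBID=$(qsub "$@" | cut -f 1 -d '.')

# Poll until qstat reports the job as completed (C) or no longer lists it.
STATUS="Q"
while [ -n "$STATUS" ] && [ "$STATUS" != "C" ]; do
    sleep 1
    # The job state is the 5th column of the job's line in the qstat output.
    STATUS=$(qstat "$JOBID" 2>/dev/null | tail -n +3 | sed 's/\s\+/ /g' | cut -f 5 -d " ")
done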

- How difficult would it be to add a native "pbs" mode? Could you point me to the implementation of Slurm in Anduril?

- In prefix mode it appears that paths passed to "qsub" are "local" and real (i.e. never include symlinks). So if the path is "/home/user/anduril/components" on the local machine and "/user_homes/user/anduril/components" remotely, the submitted job will fail. Since it is impossible to fake a directory structure using symlinks, it appears that the prefix command will only work if the Anduril workflow is started from one of the nodes with a shared filesystem. Is this correct?

- Let's assume I have a directory with hundreds of files and I write a pipeline which processes them one by one. I would want to be able to run the pipeline easily for all files, but I would also like to be able to re-run it when new files appear in the directory, and then only for those new files. How should I accomplish this?

Thanks,

Marcin

Ville Rantanen

Nov 28, 2013, 3:08:28 AM

to andur...@googlegroups.com

Hi, and great to hear you trying out Anduril.

On Wednesday, November 27, 2013 4:55:59 PM UTC+2, Marcin Cieślik wrote:

- In prefix mode an arbitrary prefix is added to each component _commandline. Since different components have different computational requirements (CPU, memory, etc.), it makes sense to be able to modify these resources on a per-component basis; a single prefix will not cut it. Is this possible for prefix jobs?

For the Slurm exec mode it is possible to request resources per component, for example Component(inputs, @memory=40, @cpus=4, @host="hostname1"), but unfortunately this annotation method is not implemented in the custom prefix mode. It is possible, however, to create a prefix script that detects which component is being run and makes parameter choices based on that. (The _command file is always referenced in the component call, and that file contains the name of the component to be run.)
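A minimal sketch of such a component-aware prefix helper might look like the following. It assumes, without confirmation from the Anduril documentation, that one of the arguments passed to the prefix script is a path ending in "_command" and that the component name appears in that file as plain text. The function names, the component name "BigAligner", and the resource values are all hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for a prefix script: pick qsub resources per component.
# Assumption: one argument is a path to a file named "_command" whose text
# mentions the component name. The exact file format is not specified here.

# Print the first argument that looks like a path to a _command file.
find_command_file() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            */_command|_command) printf '%s\n' "$arg"; return 0 ;;
        esac
    done
    return 1
}

# Print Torque resource options for the component named in the given file.
# "BigAligner" is a made-up component name; the values are illustrative only.
resources_for() {
    local command_file=$1
    if grep -qw 'BigAligner' "$command_file"; then
        echo "-l nodes=1:ppn=8,mem=32gb"
    elif grep -qw 'CSVJoin' "$command_file"; then
        echo "-l nodes=1:ppn=1,mem=2gb"
    else
        # Default for components without a specific rule.
        echo "-l nodes=1:ppn=1,mem=4gb"
    fi
}
```

A prefix script could then pass the printed options on to qsub, for instance through a waiting wrapper like the qwait script discussed in this thread.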

- PBS has two ways to execute jobs, "qsub" and "qsub -I", which seem equivalent to "sbatch" and "srun" in Slurm ("qrun" does something different, and this is probably a mistake in the Anduril docs). Unfortunately "qsub -I" cannot be used instead of "srun" because it ignores the script (all input is interactive) and never returns. Plain "qsub", on the other hand, returns immediately. Do I assume correctly that Anduril expects prefix jobs to "wait", i.e. not return until done? Along the lines of:

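qwait

#!/usr/bin/env bash
# Submit the job; keep the numeric part of the job ID ("1234" from "1234.server").
JOBID=$(qsub "$@" | cut -f 1 -d '.')

# Poll until qstat reports the job as completed (C) or no longer lists it.
STATUS="Q"
while [ -n "$STATUS" ] && [ "$STATUS" != "C" ]; do
    sleep 1
    # The job state is the 5th column of the job's line in the qstat output.
    STATUS=$(qstat "$JOBID" 2>/dev/null | tail -n +3 | sed 's/\s\+/ /g' | cut -f 5 -d " ")
done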

This is correct. Sometimes I have created a temporary '_done' file somewhere in the filesystem that the qsub job writes after finishing the run. The prefix script then polls for this file, and once it exists, the prefix script may exit. It is sometimes (especially on NFS) necessary to wait for all the files to be accessible across nodes before the script can exit. Good that you pointed out the qrun reference: it is my own replacement for srun, which we used in an Oracle Grid Engine environment.
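The '_done' marker approach described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the script actually used: the function names, the marker path handling, and the timeout values are hypothetical, and the interval must be a whole number of seconds.

```bash
#!/usr/bin/env bash
# Sketch of polling for a "_done" marker file written by the submitted job.
# All names here are hypothetical.

# wait_for_file PATH [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 once PATH exists, 1 if the timeout is reached first.
wait_for_file() {
    local path=$1 timeout=${2:-3600} interval=${3:-1}
    local waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# build_job_script DONE_FILE COMMAND [ARGS...]
# Prints a job script that runs the command, then records its exit status
# in DONE_FILE so the waiting side can see that the job has finished.
build_job_script() {
    local done_file=$1
    shift
    printf '#!/usr/bin/env bash\n'
    printf '%q ' "$@"
    printf '\n'
    printf 'echo "$?" > %q\n' "$done_file"
}

# Possible use inside a prefix script (assumes qsub reads the script on stdin):
#   build_job_script "$DONE" "$@" | qsub
#   wait_for_file "$DONE" 86400 5
```

Writing the exit status into the marker file lets the prefix script report a failed job instead of only detecting that it ended. On NFS an additional wait for the component's output files may still be needed, as noted above.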

- How difficult would it be to add a native "pbs" mode? Could you point me to the implementation of Slurm in Anduril?

It could be added fairly easily in the Java engine, but I fear you'd still have to tackle the exiting problem. The nice thing about Slurm is that the command exits only after the job is done.

- In prefix mode it appears that paths passed to "qsub" are "local" and real (i.e. never include symlinks). So if the path is "/home/user/anduril/components" on the local machine and "/user_homes/user/anduril/components" remotely, the submitted job will fail. Since it is impossible to fake a directory structure using symlinks, it appears that the prefix command will only work if the Anduril workflow is started from one of the nodes with a shared filesystem. Is this correct?

Currently we assume a shared filesystem. Anduril has an older remote execution method, which must be assigned separately to each component that should run on another node. That execution method can handle the path mapping. Alternatively, the prefix script could detect paths in the command call and change them as necessary.
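Path rewriting in a prefix script could be sketched as below, using the two directory roots from the question as defaults. The function name and the LOCAL_ROOT and REMOTE_ROOT variables are hypothetical, and the sketch only handles arguments that start with the local root, not paths embedded inside other strings.

```bash
#!/usr/bin/env bash
# Sketch: rewrite a local path prefix to the remote one before submitting.
# LOCAL_ROOT and REMOTE_ROOT are hypothetical configuration variables.

# map_path ARG
# Prints ARG with a leading LOCAL_ROOT replaced by REMOTE_ROOT; other
# arguments (flags, relative paths) are printed unchanged.
map_path() {
    local arg=$1
    local local_root=${LOCAL_ROOT:-/home}
    local remote_root=${REMOTE_ROOT:-/user_homes}
    case "$arg" in
        "$local_root"/*) printf '%s\n' "$remote_root${arg#"$local_root"}" ;;
        *) printf '%s\n' "$arg" ;;
    esac
}

# Possible use inside a prefix script:
#   mapped=()
#   for arg in "$@"; do mapped+=("$(map_path "$arg")"); done
```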

- Let's assume I have a directory with hundreds of files and I write a pipeline which processes them one by one. I would want to be able to run the pipeline easily for all files, but I would also like to be able to re-run it when new files appear in the directory, and then only for those new files. How should I accomplish this?

You would use instance names based on the file names, so that on a re-run only the instances created for new files need to be executed. For example:

source=INPUT(path="in")
processing={}
files={}
for f:std.iterdir(source) {
    filename=std.quote(f.name, type="Anduril")
    files[filename]=INPUT(path=f.path)
    processing[filename]=CSVJoin(files[filename])
}


Ville Rantanen

Nov 28, 2013, 5:32:05 AM

to andur...@googlegroups.com

Also, I can share the prefix scripts I've created for the grid engine (an interface for srun, with polling of output files, but no path mapping).

Marcin Cieślik

Nov 28, 2013, 11:48:42 AM

to andur...@googlegroups.com

Thank you for the clear answers. The prefix scripts would be greatly appreciated (I think they could also be bundled with Anduril, since PBS is still very common).

A recent (large-scale) evaluation of Anduril (http://prezi.com/evqldiemdwpe/anduril/) listed low CPU utilization (overall and per component) as a limitation of Anduril. There are many conflated issues there, most importantly that IO is a limited resource which is not allocated by either the cluster scheduler or the Anduril execution engine.

In your experience, does Anduril increase the execution time relative to what can be achieved using an "equivalent" Makefile (assuming that all CPU-intensive processing within components is delegated to binaries)?

From my initial assessment it should not, but possible culprits of such a slow-down would include:

- non-negligible overhead of components executed thousands of times?

- unnecessary copies of files?

- executing multiple instances of the same component at the same time (i.e. leading to contention for CPU while IO is free, or the other way round)?

Optimizing the wall-clock time of a workflow seems to require the difficult allocation of IO and CPU to specific components. Does Anduril in any way make this simpler or more difficult?

Thanks so much.

Yours,

Marcin
