Hello Anurag,
Thank you for this software and for your previous post on integrating
SequenceServer with HPC. I have a question about the implementation.
I replaced the temp locations in blast.rb and replaced the BLAST binaries
with SGE wrappers, and jobs now get submitted to the cluster. However, the
job result, which is in BLAST Archive format, is written under a different
filename, so SequenceServer is not able to generate the XML. Can you help
us resolve this issue?
On Wednesday, 18 March 2015 12:39:36 UTC+5:30, Anurag Priyam wrote:
Thanks!
Based on user input, SequenceServer constructs a command (just like you would
create a command without SequenceServer e.g. blastp -query foo.fa -db
"bar.fa baz.fa") which is then executed in the shell with due security
considerations. Output, in BLAST Archive format (-outfmt 11), is
redirected to a file. We then obtain XML output from the archive file
using blast_formatter (again, output is redirected to a file). We parse
the XML and generate HTML ourselves. The same archive file is used to
generate XML and tabular report for download.
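The two-step flow described above can be sketched roughly as follows. This is only an illustration, not SequenceServer's actual code: the function names and file names are made up, and -outfmt 6 is an assumption about which tabular format is wanted. The -outfmt 11 and -outfmt 5 values and the blast_formatter -archive option are standard BLAST+.

```shell
#!/usr/bin/env sh
# Hypothetical helpers illustrating the archive-then-format flow.

# search_to_archive QUERY_FILE DB ARCHIVE_FILE
# Runs the search once and redirects BLAST Archive output (-outfmt 11) to a file.
search_to_archive() {
    blastp -query "$1" -db "$2" -outfmt 11 > "$3"
}

# format_archive ARCHIVE_FILE OUTFMT OUT_FILE
# Re-formats an existing archive without re-running the search;
# 5 is XML, 6 is tabular.
format_archive() {
    blast_formatter -archive "$1" -outfmt "$2" > "$3"
}
```

Usage would be something like `search_to_archive foo.fa "bar.fa baz.fa" job.asn`, followed by `format_archive job.asn 5 job.xml` and `format_archive job.asn 6 job.tsv`. Whatever wrapper replaces the BLAST+ binaries has to preserve this contract: the archive must end up on the wrapper's standard output.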
We used pipes in the very early days of SequenceServer (when we were just
starting out) but soon felt that pipes were unreliable, so we no longer use
them. Query sequences are written to a file and passed to BLAST using the
-query option instead of being piped in via stdin. Output is written to a
file which is subsequently read, instead of being read from a pipe.
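As a small illustration of the file-based approach, the query can be written to a temporary file whose path is then handed to -query. The helper name below is hypothetical and this is not how SequenceServer itself is implemented (it is written in Ruby); it only shows the idea.

```shell
#!/usr/bin/env sh
# write_query SEQUENCE_TEXT
# Writes the given FASTA text to a fresh temporary file and prints its path.
write_query() {
    query_file=$(mktemp "${TMPDIR:-/tmp}/query.XXXXXX") || return 1
    printf '%s\n' "$1" > "$query_file"
    printf '%s\n' "$query_file"
}
```

The printed path can then be used as `blastn -query "$path" ...` rather than piping the sequence into BLAST.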
For antgenomes.org, which is hosted on a thin server but runs BLAST on a
48-core fat machine (a designated node on QMUL's HPC cluster), we simply replace
BLAST+ binaries with a shim that executes BLAST on the fat machine via
ssh:
#!/usr/bin/env sh
ssh <host name> /path/to/blastn "$@"
The same scheme can be used to queue jobs _if_ the queuing system allows
waiting on a job id. I guess the corresponding shim would look something
like:
#!/usr/bin/env sh
job_id=<auto generate somehow>
qsub -N $job_id /path/to/blastn "$@"
qsub -hold_jid $job_id
(or use -sync option maybe)
If waiting on a job id is not allowed in the conventional UNIX sense, it
will not work because SequenceServer processes requests synchronously.
That bit is due to change soon though.
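One thing to watch for with a queuing shim: the scheduler may write the job's standard output to its own output file rather than to the shim's stdout, in which case SequenceServer's redirect captures nothing useful and blast_formatter has no archive to read. A minimal sketch of a shim that waits for the job and relays its output is below. It assumes a Grid Engine qsub that supports -sync y and -b y, a directory shared between the web server and the compute nodes, and that /path/to/blastn is valid on the nodes. The function name and the SHIM_TMPDIR variable are hypothetical.

```shell
#!/usr/bin/env sh
# Hypothetical SGE shim body; not part of SequenceServer.

# run_on_cluster COMMAND [ARGS...]
run_on_cluster() {
    # Scratch files must live on a filesystem the compute nodes can see.
    out=$(mktemp "${SHIM_TMPDIR:-$HOME}/blast_out.XXXXXX") || return 1
    err=$(mktemp "${SHIM_TMPDIR:-$HOME}/blast_err.XXXXXX") || return 1

    # -sync y : block until the job finishes and return its exit status
    # -b y    : treat the command as a binary rather than a job script
    # -o / -e : send the job's stdout / stderr to files we choose
    # qsub's own status messages are discarded so they do not end up
    # in the archive file.
    qsub -sync y -b y -N "blast_$$" -o "$out" -e "$err" "$@" > /dev/null
    status=$?

    # Relay the job's output as if BLAST had run locally.
    cat "$out"
    cat "$err" >&2
    rm -f "$out" "$err"
    return "$status"
}

# In the installed shim the last line would be:
#   run_on_cluster /path/to/blastn "$@"
```

With this, SequenceServer's own redirect of the shim's stdout receives the BLAST Archive output, so the filename chosen by the scheduler no longer matters.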
I hope this helps. Please let us know if you use the above suggestions to
integrate SequenceServer into an HPC system. We will be happy to help
along the way.
-- Priyam