Hi Alex,
I'd like a few clarifications, if that's OK (apologies if they're trivial). They concern both the shared memory option and piping SAM to standard output so it can be converted to BAM.
What I'm trying to do is align multiple samples to multiple indexed genomes (each sample is aligned to its own reference genome) efficiently, in terms of both space and time.
Since I'm using a university cluster, my plan is to dedicate one node to each reference genome and align all of that genome's samples on it, sequentially.
So, for example, my LSF job script will be:
#!/bin/sh
.
.
.
STAR <args.sample1> --genomeLoad LoadAndKeep --outStd SAM | samtools view -buS > <sample1.bam>
STAR <args.sample2> --genomeLoad LoadAndKeep --outStd SAM | samtools view -buS > <sample2.bam>
.
.
Is this the correct way to achieve my goals?
Also, I'm not sure whether --outFileNamePrefix and --outStd are mutually exclusive. I'm asking because I'm also interested in the logs and splice junction outputs, and I want them written to a specified location. If these options are not mutually exclusive, am I right in guessing that specifying --outFileNamePrefix <prefix> --outStd SAM | samtools view -buS > <sample1.bam> would save the logs and splice junction output under <prefix> while the SAM output is piped to standard output? If not, how do I achieve this?
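To make the question concrete, here is a rough sketch of the per-genome job I have in mind. It assumes that --outFileNamePrefix and --outStd SAM can be combined, which is exactly what I'm unsure about. The directory paths, sample names, and the .fastq naming are hypothetical placeholders, and the final --genomeLoad Remove call is my guess at how to release the shared memory once the last sample is done. I pass an explicit "-" to samtools so that it reads from standard input. By default the script only prints the commands; setting RUN=eval would actually run them.

```shell
#!/bin/sh
# Sketch only: all paths and sample names below are hypothetical placeholders.
GENOME_DIR=${GENOME_DIR:-/path/to/genome_index}
READS_DIR=${READS_DIR:-/path/to/reads}
OUT_DIR=${OUT_DIR:-/path/to/output}
RUN=${RUN:-echo}   # echo = dry run (print commands); eval = execute them

# Build the command line for one sample ($1 = sample name).
# Assumption: logs and splice junctions go to the prefix, SAM goes to stdout.
align_cmd() {
    sample=$1
    printf '%s' "STAR --genomeDir $GENOME_DIR --readFilesIn $READS_DIR/$sample.fastq --genomeLoad LoadAndKeep --outFileNamePrefix $OUT_DIR/$sample. --outStd SAM | samtools view -buS - > $OUT_DIR/$sample.bam"
}

# Align all samples of this genome sequentially on the same node,
# so the genome loaded into shared memory is reused between runs.
for sample in sample1 sample2; do
    $RUN "$(align_cmd "$sample")"
done

# Assumption: unload the genome from shared memory after the last sample.
$RUN "STAR --genomeDir $GENOME_DIR --genomeLoad Remove"
```

If this is the wrong way to combine the two options, I'd be glad to know what to change.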
Thanks a lot,
rubi