Vladimir
time cat case.part-02.fastq | parallel -j15 --block 24M --recstart '>' --pipe perl deconseq.pl -dbs btref -f case.part-02.fastq > ~/results
I picked the number of cores and the block size based on the total input file size and the number of reads deconseq appears to process per pass (~44K). The data should contain ~660K reads in total.
I started the command about 90 minutes ago on our CentOS cluster and it is still running. But deconseq took only 6 minutes to screen 21K reads on 1 core, so I'm not getting any speedup (and am probably slowing things down). Sorry if this is too off topic. How do you parallelize?
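For reference, the numbers quoted above can be turned into a rough runtime estimate. This is only back-of-the-envelope arithmetic: it assumes deconseq's throughput stays at the benchmarked 21K reads per 6 minutes and that 15 jobs scale perfectly, neither of which is guaranteed. The variable names are made up for illustration.

```shell
# Figures taken from the post above.
total_reads=660000   # ~660K reads in the input
jobs=15              # value passed to parallel -j
bench_reads=21000    # reads screened in the single-core test
bench_min=6          # minutes that single-core test took

# Reads each job would handle if the input were split evenly.
per_job=$(( total_reads / jobs ))

# Estimated single-core runtime at the benchmarked rate (integer minutes).
single_core_min=$(( total_reads * bench_min / bench_reads ))

# Best case with perfect scaling across all jobs.
ideal_min=$(( single_core_min / jobs ))

echo "reads per job: $per_job"
echo "single core: ~${single_core_min} min, ideal with $jobs jobs: ~${ideal_min} min"
```

Under these assumptions a run that is still going after 90 minutes is well past the ideal parallel time, which is consistent with the jobs not actually sharing the work.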
Aaron
time cat case.part-02.fastq | parallel -j15 --block 24M --recstart '@' --pipe perl deconseq.pl -dbs btref -f case.part-02.fastq > ~/results
and got the same outcome. Is stating the file name in both the parallel and deconseq parts of the command causing confusion?
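On the file-name question: with --pipe, parallel hands each block to the command on standard input. If deconseq reads only the file named by -f (an assumption about deconseq, not something confirmed in this thread), then each of the 15 jobs would ignore its block and screen the whole file. The effect can be imitated with cat, which also ignores stdin once it is given a file name. The files below are throwaway stand-ins, not real data.

```shell
# Stand-in for the full input: a 4-line file.
input=$(mktemp)
printf '%s\n' line1 line2 line3 line4 > "$input"

# Stand-in for one block that parallel --pipe would send on stdin: 2 lines.
block=$(mktemp)
head -n 2 "$input" > "$block"

# The command names the file, so the block on stdin is ignored
# and all 4 lines are processed.
named_file=$(cat "$input" < "$block" | wc -l | tr -d ' ')

# Without the file name, only the 2-line block on stdin is processed.
stdin_only=$(cat < "$block" | wc -l | tr -d ' ')

echo "with file name: $named_file lines, stdin only: $stdin_only lines"
rm -f "$input" "$block"
```

For programs that need a file name, GNU parallel has a --cat option that writes each block to a temporary file and substitutes its name for {}. Whether that works cleanly with deconseq, in particular how its output files would be kept apart per job, is untested here.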
time cat case.part-02.fastq | parallel -j15 --block 24M --recstart '^@' --pipe perl deconseq.pl -dbs btref -f case.part-02.fastq > ~/results
Then I get
parallel: Warning: A record was longer than 25165824. Increasing to --blocksize 32715573.
parallel: Warning: A record was longer than 32715573. Increasing to --blocksize 42530246.
parallel: Warning: A record was longer than 42530246. Increasing to --blocksize 55289321.
parallel: Warning: A record was longer than 55289321. Increasing to --blocksize 71876119.
parallel: Warning: A record was longer than 71876119. Increasing to --blocksize 93438956.
parallel: Warning: A record was longer than 93438956. Increasing to --blocksize 121470644.
parallel: Warning: A record was longer than 121470644. Increasing to --blocksize 157911839.
parallel: Warning: A record was longer than 157911839. Increasing to --blocksize 205285392.
parallel: Warning: A record was longer than 205285392. Increasing to --blocksize 266871011.
parallel: Warning: A record was longer than 266871011. Increasing to --blocksize 346932316.
Then parallel ended up using only a single core. That doesn't make any sense: my reads shouldn't be that big (250 nt at most).
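A possible explanation for the warnings, assuming I read the GNU parallel manual correctly: --recstart is taken as a literal string unless --regexp is also given, so '^@' looks for a caret followed by an at sign, which never occurs in the file. No record boundary is found, the block size keeps growing until the whole input is one record, and a single job gets everything. A plain '@' is not a safe boundary either, because '@' is also a legal quality character. The sketch below shows both points with grep on a made-up three-read FASTQ (the reads are invented for illustration).

```shell
# Tiny made-up FASTQ: 3 reads, and the quality line of read2 starts with '@'.
fq=$(mktemp)
printf '%s\n' \
  '@read1' 'ACGT' '+' 'IIII' \
  '@read2' 'TTGA' '+' '@III' \
  '@read3' 'GGCC' '+' 'IIII' > "$fq"

# Literal search for the two characters '^@' (how --recstart '^@' is
# treated without --regexp): nothing matches, so no boundary is found.
literal_hits=$(grep -cF '^@' "$fq" || true)

# Regex search for lines starting with '@': this also counts the
# quality line '@III', so it overcounts the reads.
regex_hits=$(grep -c '^@' "$fq" || true)

# Counting 4 lines per record gives the true number of reads.
reads=$(( $(wc -l < "$fq") / 4 ))

echo "literal '^@': $literal_hits, lines starting with '@': $regex_hits, reads: $reads"
rm -f "$fq"
```

Since the records here are four lines each, splitting by line count (parallel's -L 4 together with --pipe) avoids relying on a marker character, provided no read is wrapped across several lines.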