Hi Wouter,
I'm afraid we have never seen that behaviour from SortMeRNA. We ran a small test to check, and with
"htop" (or any task manager) we could see that it used as many threads as selected in "settings.txt".
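If you prefer a number over watching htop, here is a minimal sketch for counting the threads of a running process. It assumes Linux with a /proc filesystem; the function name count_threads is only illustrative and is not part of the pipeline.

```shell
# Print the number of threads of a running process.
# Assumption: Linux, where /proc/<pid>/status contains a "Threads:" line.
count_threads() {
    local pid="$1"
    awk '/^Threads:/ { print $2 }' "/proc/${pid}/status"
}
```

While sortmerna is running you could call it as count_threads "$(pgrep -n sortmerna)" and compare the result with the thread count you selected (as far as I know, that is the value passed to sortmerna with -a).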
If you look at the file "m2-time.tsv", you will find the command that the pipeline launches, so you can experiment with
it independently of the pipeline. This is the command that our pipeline ran for the test:
sortmerna -n 5 \
--db <fullPath>/0-Databases/1-SILVA-23S-28S-LSURef_115_tax_silva.fasta.trimmedwhitespaces.inDNA.fasta \
<fullPath>/0-Databases/1-SILVA-16S-18S-SSURef_115_NR99_tax_silva.trimmedwhitespaces.inDNA.fasta \
<fullPath>/0-Databases/rfam-5s-database-id98.fasta \
<fullPath>/0-Databases/trna_db.fasta \
<fullPath>/0-Databases/phix_db.fasta \
--I 1-PROCESSED_SAMPLES/<sampleName>/2-rRNA-removal/m2-temp/<sampleName>_m2_interleaved.fastq \
--accept 1-PROCESSED_SAMPLES/<sampleName>/2-rRNA-removal/m2-output/interleaved_rrna/irrna \
--other 1-PROCESSED_SAMPLES/<sampleName>/2-rRNA-removal/m2-output/interleaved_mrna/imrna \
--log 1-PROCESSED_SAMPLES/<sampleName>/2-rRNA-removal/m2-log/log_sortmerna \
--bydbs \
-m 3086031 \
--paired-out \
-a 30 \
-v
Another thing you could try is running the same command on another OS (we used Ubuntu 14.04), for example in
a virtual machine or Docker. You could also take a subsample of your fastq files (many ways are proposed here:
https://www.biostars.org/p/6544/;
remember that the paired order must be maintained), interleave them with:
merge-paired-reads.sh \
<fullPath>/1-QC/m1-output/<sampleName>_1_m1.fastq \
<fullPath>/1-QC/m1-output/<sampleName>_2_m1.fastq \
<fullPath>/1-PROCESSED_SAMPLES/<sampleName>/2-rRNA-removal/m2-temp/<sampleName>_m2_interleaved.fastq
and use the output as input for sortmerna to see whether it works with the subsample.
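For a quick test, the simplest subsample that keeps the pairs in order is just the first N read pairs of each file. This is a minimal sketch, not one of the methods from the Biostars thread: it is not a random sample, and it assumes standard 4-line fastq records with the reads in the same order in both files. The function name and the output file names are only illustrative.

```shell
# Write the first N read pairs of two paired fastq files to two new files.
# Assumptions: 4 lines per record, identical read order in both inputs.
subsample_pairs() {
    local r1="$1" r2="$2" n_pairs="$3" out1="$4" out2="$5"
    local n_lines=$(( n_pairs * 4 ))   # 4 lines per fastq record
    head -n "$n_lines" "$r1" > "$out1"
    head -n "$n_lines" "$r2" > "$out2"
}
```

For example, subsample_pairs <sampleName>_1_m1.fastq <sampleName>_2_m1.fastq 100000 sub_1.fastq sub_2.fastq would give two files of 100000 reads each, which you can then pass to merge-paired-reads.sh as above.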
Hope it helps