SortMeRNA database for fungi

Dasiel Obregon

Nov 15, 2022, 11:47:21 AM
to SAMSA bioinformatics group
Dear all,
This is my first time using SAMSA2. I'm processing fungal samples sequenced on Illumina using a ribosome depletion method.

SAMSA2 (master_script_preserving_unmerged) was running fine until the end of step 2, and the error message suggests a problem with the SortMeRNA database:

root@DESKTOP:/mnt/d/Metatranscriptomics/samsa2/bash_scripts# ./master_script_preserving_unmerged.sh
NOTE: Before running this script, please run package_installation.bash and full_database_download.bash located at https://github.com/transcript/samsa2/tree/master/setup in order to set up SAMSA2 dependencies.

Using SAMSA at /mnt/d/Metatranscriptomics/samsa2

Step 0.1: Checking for the presence of the checkpoint file.
        The file 'checkpoints' already exists in the input directory.
        The variable TRIMMO is in the checkpoint file. STEP 1 will be skipped.
        The variable MERGING is in the checkpoint file. STEP 2 will be skipped.
        The variable RAW is in the checkpoint file. STEP 2.9 will be skipped.

  ERROR: the file /mnt/d/Metatranscriptomics/samsa2/step_2_output/*.assembled2.fastq could not be opened: No such file or directory.

'/mnt/d/Metatranscriptomics/samsa2/programs/sortmerna-2.1/sortmerna -a 24 --ref /mnt/d/Metatranscriptomics/samsa2/programs/sortmerna-2.1/rRNA_databases/silva-euk-18s-id95.fasta,/mnt/d/Metatranscriptomics/samsa2/programs/sortmerna-2.1/index/silva-euk-18s-db --reads /mnt/d/Metatranscriptomics/samsa2/step_2_output/*.assembled2.fastq --aligned /mnt/d/Metatranscriptomics/samsa2/step_2_output/*.assembled2.fastq.ribosomes --other /mnt/d/Metatranscriptomics/samsa2/step_2_output/*.ribodepleted --fastx --log -v' exited with non-zero status 1


At this point, I tried to index the silva-euk-18s-id95 database in SortMeRNA, and in theory it worked:

./indexdb_rna --ref ./rRNA_databases/silva-euk-18s-id95.fasta,./index/silva-euk-18s-db -v

  Total number of databases to index: 1

  Begin indexing file ./rRNA_databases/silva-euk-18s-id95.fasta under index name ./index/silva-euk-18s-db:
  Collecting sequence distribution statistics ..  done  [0.119842 sec]

  start index part # 0:
    (1/3) building burst tries .. done  [7.208371 sec]
    (2/3) building CMPH hash .. done  [5.508265 sec]
    (3/3) building position lookup tables .. done [19.056150 sec]
    total number of sequences in this part = 7348
      temporary file was here: /tmp/sortmerna_keys_10985.txt
      writing kmer data to ./index/silva-euk-18s-db.kmer_0.dat
      writing burst tries to ./index/silva-euk-18s-db.bursttrie_0.dat
      writing position lookup table to ./index/silva-euk-18s-db.pos_0.dat
      writing nucleotide distribution statistics to ./index/silva-euk-18s-db.stats
    done.


However, master_script_preserving_unmerged still stops at the same point.
I would appreciate any observations or suggestions for fixing the problem.
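
In case it helps with diagnosis, here is a minimal sketch of a check that could be run alongside the script. The error above shows a literal `*` in the file name, and bash leaves a glob pattern unexpanded when nothing matches it, so this may mean that step_2_output contains no `*.assembled2.fastq` files. The function name `count_matches` is hypothetical and not part of SAMSA2; the paths are the ones from the log above.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helper; not part of SAMSA2 or SortMeRNA.
# count_matches DIR PATTERN prints how many existing files in DIR match PATTERN.
count_matches() {
    local dir=$1 pattern=$2 n=0 f
    for f in "$dir"/$pattern; do
        # An unmatched glob stays literal, so confirm the path really exists
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Paths copied from the log; adjust if SAMSA2 is installed elsewhere.
SAMSA=/mnt/d/Metatranscriptomics/samsa2

# Input files that the SortMeRNA step expects
count_matches "$SAMSA/step_2_output" '*.assembled2.fastq'

# Index files written by indexdb_rna
count_matches "$SAMSA/programs/sortmerna-2.1/index" 'silva-euk-18s-db*'
```

A count of 0 on the first line would point to missing input files rather than to the database; a non-zero count on the second would be consistent with the indexing output shown above.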

Best wishes, 

Dasiel Obregon

Nov 16, 2022, 1:48:43 PM
to SAMSA bioinformatics group
Any update?