I also get segmentation fault on bacterial genome

Luciano Brocchieri

Sep 26, 2017, 10:25:15 AM
to rna-star
Alex,

I also get a segmentation fault (see the bacterial genome alignment thread two posts ago). I tried your suggestions, but unfortunately they did not work for me. I am trying to align 6.7M ribosome-footprint sequences of average length 30 nt to the Pseudomonas aeruginosa PAO1 genome. In case it helps to figure this out: STAR goes through 1.2M reads successfully, until it reaches this one:

@HWUSI-EAS1654:58:FC:7:6:18376:321580 length=23
CTGGCAGGTCTCTCAAAAAAAAA
+HWUSI-EAS1654:58:FC:7:6:18376:321580 length=23
DEBDDEGFFACGDDGGEGDDDDD

The entropy of the sequence does not seem to matter (I excluded all reads with H<1.5 and it still crashes). The length of the read does not seem to matter either: STAR successfully aligns more than 20K reads of length <= 23 nt (down to 16 nt) before crashing. STAR crashes whenever it encounters this sequence, irrespective of where I put this read in the fastq file. I tried to force no introns as you suggested, to no avail. I indexed the genome with --genomeSAindexNbases min(14, log2(GenomeLength) / 2 - 1) and no annotation file. I am running STAR with:
STAR --runThreadN 4 --genomeDir /IndexedGenomePath --readFilesIn ./Input.fastq --outFilterMultimapNmax 100 --outFilterMismatchNmax 5
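
For reference, the indexing formula above can be evaluated with a few lines of Python. This is only a sketch: the function name is made up for illustration, truncating the result to an integer is an assumption about how the formula is meant to be rounded, and the genome length used (6,264,404 bp for PAO1) is an assumed value that should be replaced with the length of the actual FASTA.

```python
import math


def sa_index_nbases(genome_length: int) -> int:
    """Evaluate min(14, log2(GenomeLength) / 2 - 1) for --genomeSAindexNbases.

    Hypothetical helper, not part of STAR. The result is truncated to an
    integer (an assumption) and kept at 1 or above for very short sequences.
    """
    if genome_length < 1:
        raise ValueError("genome_length must be a positive number of bases")
    value = int(math.log2(genome_length) / 2 - 1)
    return max(1, min(14, value))


# Assumed PAO1 genome length in bases; substitute the real value.
pao1_length = 6_264_404
print(sa_index_nbases(pao1_length))  # prints 10
```

The value returned here is only the starting point given by the formula; a smaller value can be passed to --genomeSAindexNbases when re-indexing.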

Thanks!
Luciano
 

Alexander Dobin

Sep 27, 2017, 5:52:53 PM
to rna-star
Hi Luciano,

Please try reducing --genomeSAindexNbases even further than the formula suggests.
The problem has to do with the presence/absence of certain k-mers in the genome.

Cheers
Alex

Luciano Brocchieri

Sep 29, 2017, 12:55:31 PM
to rna-star
Hi Alex,

It worked! The formula gave a value of 10. I re-indexed the genome with --genomeSAindexNbases 8, and the alignment job finished successfully.

Thanks!
Luciano