I also get segmentation fault on bacterial genome

Luciano Brocchieri

Sep 26, 2017, 10:25:15 AM

to rna-star

Alex,

I am also getting a segmentation fault (see the post on bacterial genome alignment two posts ago). I tried your suggestions, but unfortunately they did not work for me. I am trying to align 6.7M ribosome-footprint reads, with an average length of 30 nt, to the Pseudomonas aeruginosa PAO1 genome. In case it helps to figure this out: STAR processes 1.2M reads successfully, until it reaches this one:

@HWUSI-EAS1654:58:FC:7:6:18376:321580 length=23

CTGGCAGGTCTCTCAAAAAAAAA

+HWUSI-EAS1654:58:FC:7:6:18376:321580 length=23

DEBDDEGFFACGDDGGEGDDDDD

The entropy of the sequence does not seem to matter: I excluded all reads with H < 1.5 and it still crashes. The length of the read does not seem to matter either: STAR successfully aligns more than 20K reads of length <= 23 nt (down to 16 nt) before crashing. STAR crashes whenever it encounters this sequence, irrespective of where I put this read in the fastq file. I tried forcing no introns, as you suggested, to no avail. I indexed the genome with --genomeSAindexNbases set to min(14, log2(GenomeLength) / 2 - 1) and no annotation file. I am running STAR with:

STAR --runThreadN 4 --genomeDir /IndexedGenomePath --readFilesIn ./Input.fastq --outFilterMultimapNmax 100 --outFilterMismatchNmax 5

Thanks!

Luciano

Alexander Dobin

Sep 27, 2017, 5:52:53 PM

to rna-star

Hi Luciano,

Please try reducing --genomeSAindexNbases even further than the formula suggests.

The problem has to do with the presence or absence of certain k-mers in the genome.

Cheers

Alex

Luciano Brocchieri

Sep 29, 2017, 12:55:31 PM

to rna-star

Hi Alex,

It worked! The formula gave a value of 10; I re-indexed the genome with --genomeSAindexNbases 8, and the alignment job finished successfully.
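
For anyone landing here with the same problem, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of STAR: the helper names, the genome length of about 6.26 Mb for PAO1, the choice to round the formula down, and the paths ./PAO1_index and ./PAO1.fasta are all assumptions. The STAR options themselves are the standard genome generation ones.

```python
import math
import shlex


def sa_index_nbases(genome_length: int) -> int:
    """Formula quoted in this thread: min(14, log2(GenomeLength) / 2 - 1).

    Rounding down to an integer is an assumption made for this sketch.
    """
    return min(14, int(math.log2(genome_length) / 2 - 1))


def genome_generate_cmd(genome_dir: str, fasta: str, nbases: int, threads: int = 4) -> list[str]:
    """Build (but do not run) a STAR genome generation command."""
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--genomeSAindexNbases", str(nbases),
    ]


# Assumed PAO1 genome length in bases (about 6.26 Mb).
PAO1_LENGTH = 6_264_404

suggested = sa_index_nbases(PAO1_LENGTH)  # value given by the formula
reduced = suggested - 2                   # go below the formula, as suggested above

# Hypothetical paths; replace with your own index directory and FASTA file.
cmd = genome_generate_cmd("./PAO1_index", "./PAO1.fasta", reduced)

print(f"formula: {suggested}, trying: {reduced}")
print(shlex.join(cmd))
```

If the reduced value still crashes, lowering it by another step and re-indexing is the obvious next thing to try; how far to go is a judgment call, since the thread only reports that 8 worked for this genome.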