Alex,
I also get a segmentation fault (see two posts ago on bacterial genome alignment). I tried your suggestions, but unfortunately they did not work for me. I am trying to align 6.7M ribosome-footprint sequences (average length 30 nt) to the Pseudomonas aeruginosa PAO1 genome. In case it helps to diagnose the problem: STAR processes the first 1.2M reads successfully, until it reaches this one:
@HWUSI-EAS1654:58:FC:7:6:18376:321580 length=23
CTGGCAGGTCTCTCAAAAAAAAA
+HWUSI-EAS1654:58:FC:7:6:18376:321580 length=23
DEBDDEGFFACGDDGGEGDDDDD
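To check that this read is not simply low-complexity, here is a minimal sketch of the entropy calculation. It assumes H means the per-base Shannon entropy in bits over the nucleotide frequencies of the read; the function name `shannon_entropy` is only illustrative and the exact definition used for filtering may differ.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Per-base Shannon entropy (bits) of the nucleotide composition of seq.

    Assumption: H = -sum(p_b * log2(p_b)) over the bases present in the read.
    """
    counts = Counter(seq)
    total = len(seq)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# The read that triggers the crash.
read = "CTGGCAGGTCTCTCAAAAAAAAA"
h = shannon_entropy(read)
print(f"H = {h:.2f} bits")
```

Under this definition the read comes out at roughly 1.88 bits, which is above a cutoff of 1.5, so a filter of this kind would not remove it.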
Entropy of the sequence does not seem to matter (I excluded all reads with H<1.5 and it still crashes). Length of the read does not seem to matter either: STAR successfully aligns more than 20K reads of length <= 23 nt (down to 16 nt) before crashing. STAR crashes whenever it encounters this sequence, irrespective of where I put this read in the fastq file. I tried forcing no introns as you suggested, to no avail. I indexed the genome with --genomeSAindexNbases set to min(14, log2(GenomeLength) / 2 - 1) and no annotation file. I am running STAR with:
STAR --runThreadN 4 --genomeDir /IndexedGenomePath --readFilesIn ./Input.fastq --outFilterMultimapNmax 100 --outFilterMismatchNmax 5
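For reference, the --genomeSAindexNbases value from the formula above can be worked out as in the sketch below. The genome length of 6,264,404 bp for PAO1 is an assumption (it is not taken from my index log), and rounding down to an integer is also an assumption about how the formula is meant to be applied; `sa_index_nbases` is just an illustrative helper name.

```python
import math


def sa_index_nbases(genome_length: int) -> int:
    """min(14, log2(GenomeLength) / 2 - 1), rounded down to an integer.

    Assumption: the fractional result is truncated, since the option
    takes an integer.
    """
    return min(14, int(math.log2(genome_length) / 2 - 1))


# Assumed PAO1 genome length in bp (approximate, for illustration only).
pao1_length = 6_264_404
print(sa_index_nbases(pao1_length))
```

With that assumed length the formula gives 10, well below the cap of 14 that applies to large genomes.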
Thanks!
Luciano