Mapping small RNA reads to mature miRNA index

Matthew Jones

Jun 11, 2015, 2:49:53 PM
to rna-...@googlegroups.com
Hello,

I have been trying to map sequences from a small RNA-seq experiment to a FASTA file containing the updated list of mature human miRNAs. I started by generating the genome index as follows:

star --runMode genomeGenerate --genomeDir miRNA --genomeFastaFiles Mature.homo_sapiens.fa --sjdbGTFtagExonParentTranscript hsa.gff3 --sjdbOverhang 100

This finished successfully. Mature.homo_sapiens.fa is the FASTA file containing all mature miRNA sequences, and hsa.gff3 comes directly from miRBase.org.

I then attempted to map as follows:

star --runThreadN 12 --genomeDir miRNA --readFilesIn fastq/SRR1759212.fastq --genomeSAindexNbases 6 --outFilterMultimapNmax 20 --outFilterMismatchNoverLmax 0.05 --outFilterScoreMinOverLread 0 --outFilterMatchNminOverLread 0 --outFilterMatchNmin 16 --alignIntronMax 1 --seedSearchStartLmax 20

Jun 11 12:25:59 ..... Started STAR run

Jun 11 12:26:00 ..... Started mapping

Segmentation fault


I also tried the above without --genomeSAindexNbases 6 and ended up with the same segmentation fault.


I would greatly appreciate any help and/or advice on how best to address this issue.


Thank you in advance.


Matt


Log 1.out

Alexander Dobin

Jun 11, 2015, 7:05:26 PM
to rna-...@googlegroups.com, mjone...@gmail.com
Hi Matt,

The --genomeSAindexNbases 6 parameter has to be used at the genome generation step, not at the mapping step. Please re-generate the genome using this parameter and try mapping again.
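
For reference, the STAR manual suggests scaling --genomeSAindexNbases down for small references to roughly min(14, log2(GenomeLength)/2 - 1). Below is a minimal sketch of that calculation for a FASTA file. The function name suggest_sa_index_nbases is made up for illustration, and clamping the result to at least 1 is an assumption rather than something the manual states. Note also that the executable is usually installed as STAR, while the commands in this thread call it star.

```shell
# Suggest a --genomeSAindexNbases value for a small FASTA reference,
# using the rule of thumb min(14, log2(GenomeLength)/2 - 1).
# suggest_sa_index_nbases is a hypothetical helper, not part of STAR.
suggest_sa_index_nbases() {
  awk '
    # Sum the length of every sequence line; header lines start with ">".
    !/^>/ { gsub(/[ \t\r]/, ""); total += length($0) }
    END {
      if (total == 0) { print "no sequence found" > "/dev/stderr"; exit 1 }
      n = int(log(total) / log(2) / 2 - 1)   # int() truncates toward zero
      if (n > 14) n = 14
      if (n < 1) n = 1                       # assumption: never go below 1
      print n
    }
  ' "$1"
}

# Example use at the genome generation step (not executed here):
# STAR --runMode genomeGenerate --genomeDir miRNA \
#      --genomeFastaFiles Mature.homo_sapiens.fa \
#      --genomeSAindexNbases "$(suggest_sa_index_nbases Mature.homo_sapiens.fa)"
```

A reference of a few tens of thousands of bases, which is roughly the size of a mature miRNA set, comes out at about 6 with this rule, in line with the value used in this thread.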

Cheers
Alex

Matthew Jones

Jun 12, 2015, 9:37:22 PM
to rna-...@googlegroups.com
Great. It worked. Thanks for the help.
Matt



Roy Francis

Apr 14, 2017, 7:13:52 AM
to rna-...@googlegroups.com
I am also trying to map my small RNA reads to miRBase hairpin.fa and mature.fa. I noticed in this thread that the OP is mapping small RNA reads to 'Mature.homo_sapiens.fa' using 'hsa.gff3' as annotations. If I understand correctly, hsa.gff3 gives the coordinates of the miRNAs on the whole human genome, not on 'Mature.homo_sapiens.fa'. Most of the extra parameters used (--sjdbOverhang, --alignIntronMax, etc.) also apply only when mapping the small RNAs to the whole genome. Mapping small RNAs to hairpin.fa or mature.fa should be a lot simpler (a job for a DNA-seq mapper rather than an RNA-seq mapper). Please correct me if I am wrong.
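
One rough way to check the coordinate mismatch described above is to list the sequence names in column 1 of the GFF3 that do not occur as record names in the FASTA. This is only a sketch: the function name gff_seqids_missing_from_fasta is hypothetical, and it assumes each FASTA record name is the first whitespace-delimited word after ">".

```shell
# Print each GFF3 seqid (column 1) that is not a record name in the FASTA.
# gff_seqids_missing_from_fasta is a hypothetical helper for illustration.
gff_seqids_missing_from_fasta() {
  awk '
    # First file (FASTA): remember the first word of every header line.
    NR == FNR { if (/^>/) seen[substr($1, 2)] = 1; next }
    # Second file (GFF3): skip comments, directives and blank lines.
    /^#/ || NF == 0 { next }
    # Report every unknown seqid once.
    !($1 in seen) && !reported[$1]++ { print $1 }
  ' "$1" "$2"
}

# Example (file names as used in this thread):
# gff_seqids_missing_from_fasta Mature.homo_sapiens.fa hsa.gff3
```

If the annotation really is in genome coordinates, the output would be expected to consist of chromosome-style names rather than miRNA names, which would mean the annotation cannot be applied to the mature-sequence index.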

Also, does anyone know what the optimal STAR parameters would be for mapping small RNAs to mature.fa and hairpin.fa? Does something simple like the commands below suffice?

star --runMode genomeGenerate --genomeFastaFiles "path/hairpin.fa" --genomeDir "path/hairpin-index"
star --genomeDir "path/hairpin-index" --readFilesIn read.fq

Thanks,
Roy


Alexander Dobin

Apr 17, 2017, 4:18:17 PM
to rna-star
Hi Roy,

If your RNAs are not spliced, you can use either a DNA or an RNA mapper, since the only advantage of the latter is the ability to map spliced reads.
Your index generation command looks fine.
At the mapping stage, if you want to prohibit splicing in STAR, you need to use --alignIntronMax 1 . If you have paired-end reads, you would also need to specify --alignMatesGapMax <MaxInsertSize> .
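
As a hedged sketch of how such a mapping command could be assembled: splicing is disabled by capping the intron length with --alignIntronMax 1 (the same setting used in the first post of this thread), and --alignMatesGapMax is added only for paired-end input. The function name star_unspliced_cmd and its argument order are invented for illustration. It only prints the command and does not run STAR.

```shell
# Print a STAR mapping command that disallows spliced alignments.
# Usage: star_unspliced_cmd GENOME_DIR READ1 [READ2 MAX_INSERT]
# star_unspliced_cmd is a hypothetical helper; it prints, it does not run STAR.
star_unspliced_cmd() {
  genome_dir=$1
  read1=$2
  read2=${3:-}
  max_insert=${4:-}
  if [ -n "$read2" ] && [ -z "$max_insert" ]; then
    echo "paired-end input needs MAX_INSERT" >&2
    return 1
  fi
  # --alignIntronMax 1 caps the intron size so that no splice can be called.
  set -- STAR --genomeDir "$genome_dir" --alignIntronMax 1
  if [ -n "$read2" ]; then
    # Paired-end: also limit the gap allowed between the two mates.
    set -- "$@" --alignMatesGapMax "$max_insert" --readFilesIn "$read1" "$read2"
  else
    set -- "$@" --readFilesIn "$read1"
  fi
  printf '%s\n' "$*"
}

# Example: star_unspliced_cmd path/hairpin-index read.fq
```

Printing the command first makes it easy to inspect the flags before running them by hand.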

Cheers
Alex