Hello,
I am trying to align reads to a rhesus macaque .fna reference with a .gff annotation. The genome index builds without errors; this is the command I used:
/hpcdata/lmm/lmm_data/muddjc/STAR-2.5.2b/bin/Linux_x86_64/STAR
--runMode genomeGenerate
--runThreadN 24
--genomeDir <Mmul_8.0.1_genomic.fna>
--sjdbGTFfile <GCF_000772875.2_Mmul_8.0.1_genomic.gff>
--sjdbGTFtagExonParentTranscript Parent
--sjdbOverhang 100
--genomeChrBinNbits min
--genomeSAindexNbases 13
I am attaching the log.out for this.
I'm encountering a problem when I proceed to map:
Nov 09 15:29:40 ..... started STAR run
Nov 09 15:29:41 ..... loading genome
Nov 09 15:30:07 ..... processing annotations GTF
Nov 09 15:30:23 ..... inserting junctions into the genome indices
Nov 09 15:31:43 ..... started mapping
For these runs STAR gets stuck at the mapping step. It writes the header to Log.progress.out, but never starts reporting mapping progress. Could this be caused by the particular .gff format? I had previously run these reads against a different rhesus .fna with a .gtf file, and they aligned successfully. Below are a few lines of the .gff file:
NC_027893.1 RefSeq region 1 225584828 . + . ID=id0;Dbxref=taxon:9544;Name=1;chromosome=1;country=USA: Southwest National Primate Research Center at the Southwest Foundation for Biomedical Research%2C San Antonio%2C TX;gbkey=Src;genome=chromosome;isolate=17573;mol_type=genomic DNA;note=derived from Indian origin rhesus;sex=female
NC_027893.1 Gnomon gene 15791 22125 . - . ID=gene0;Dbxref=GeneID:106999150;Name=LOC106999150;gbkey=Gene;gene=LOC106999150;gene_biotype=lncRNA
NC_027893.1 Gnomon ncRNA 15791 22125 . - . ID=rna0;Parent=gene0;Dbxref=GeneID:106999150,Genbank:XR_001445959.1;Name=XR_001445959.1;gbkey=ncRNA;gene=LOC106999150;model_evidence=Supporting evidence includes similarity to: 1 mRNA%2C 27 ESTs%2C and 99%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 7 samples with support for all annotated introns;ncrna_class=lncRNA;product=uncharacterized LOC106999150%2C transcript variant X2;transcript_id=XR_001445959.1
NC_027893.1 Gnomon exon 22088 22125 . - . ID=id1;Parent=rna0;Dbxref=GeneID:106999150,Genbank:XR_001445959.1;gbkey=ncRNA;gene=LOC106999150;ncrna_class=lncRNA;product=uncharacterized LOC106999150%2C transcript variant X2;transcript_id=XR_001445959.1
NC_027893.1 Gnomon exon 17186 21651 . - . ID=id2;Parent=rna0;Dbxref=GeneID:106999150,Genbank:XR_001445959.1;gbkey=ncRNA;gene=LOC106999150;ncrna_class=lncRNA;product=uncharacterized LOC106999150%2C transcript variant X2;transcript_id=XR_001445959.1
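In case it helps anyone look at the .gff format question, here is a rough Python sketch of a sanity check on the annotation. It counts the exon records, reports how many lack the Parent attribute that --sjdbGTFtagExonParentTranscript points at, and lists the sequence names the exons sit on so they can be compared by eye with the .fna headers. The function names (parse_attributes, summarize_gff) are made up for this example and are not part of STAR. It assumes the real file is tab-delimited with nine columns, as GFF3 requires; the sample records are abbreviated from the ones above. It only inspects the annotation and does not by itself explain why mapping stalls.

```python
from collections import Counter


def parse_attributes(column9):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def summarize_gff(lines, feature="exon", parent_tag="Parent"):
    """Count records of one feature type, those lacking parent_tag, and their sequence names."""
    seqids = Counter()
    total = 0
    missing = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and directives/comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed nine-column record
        if cols[2] != feature:
            continue
        total += 1
        seqids[cols[0]] += 1
        if parent_tag not in parse_attributes(cols[8]):
            missing += 1
    return {"features": total, "missing_parent": missing, "seqids": dict(seqids)}


# Abbreviated, tab-delimited versions of the records quoted in this post.
sample = [
    "NC_027893.1\tGnomon\tgene\t15791\t22125\t.\t-\t.\tID=gene0;Name=LOC106999150\n",
    "NC_027893.1\tGnomon\tncRNA\t15791\t22125\t.\t-\t.\tID=rna0;Parent=gene0\n",
    "NC_027893.1\tGnomon\texon\t22088\t22125\t.\t-\t.\tID=id1;Parent=rna0\n",
    "NC_027893.1\tGnomon\texon\t17186\t21651\t.\t-\t.\tID=id2;Parent=rna0\n",
]
print(summarize_gff(sample))
```

To run it on the full annotation, pass an open file handle instead of the sample list, for example summarize_gff(open("GCF_000772875.2_Mmul_8.0.1_genomic.gff")).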