RSEM can not recognize reference sequence name

David Balli

Oct 13, 2014, 8:24:41 AM
to rsem-...@googlegroups.com
Hi,

I am trying to use RSEM with SAM/BAM files from a STAR alignment, following the directions here. When I run rsem-calculate-expression on Aligned.out.sam/Aligned.out.bam from STAR, I receive this error: "RSEM can not recognize reference sequence name chr10!". I also ran the native bowtie alignment through RSEM on the same RNA-seq reads and received the same error. rsem-sam-validator reported the input as valid, and I have re-run rsem-prepare-reference, to no avail.

Any suggestions?

thank you! 

David 

Colin Dewey

Oct 13, 2014, 2:07:31 PM
to rsem-...@googlegroups.com
Hi David,

That error message suggests that you are still getting alignments with respect to the genome, rather than to the transcriptome.  The thread that you referenced has directions that are outdated.  Since the time of that thread, STAR has gained the "--quantMode TranscriptomeSAM" option, which tells STAR to convert its genome-based alignments to transcript-based alignments that are compatible with RSEM.  So the steps are:

(1) Create a normal genome-based reference for STAR using the genome sequence and a GTF annotation
(2) Create an RSEM reference with the same genome and GTF you gave to STAR
(3) Run STAR with the "--quantMode TranscriptomeSAM" option
(4) Feed the resulting transcriptome-based SAM (or BAM) file to RSEM
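
The four steps above can be sketched as a small shell function. This is only an illustration under stated assumptions: the directory name star_index, the reference name rsem_ref, the function names, and the DRY_RUN switch are all hypothetical, and the reads are assumed to be single-end. The --bam flag is the one used elsewhere in this thread; newer RSEM releases may expect a different flag for alignment input, so check rsem-calculate-expression --help for your version.

```shell
#!/bin/sh
# Sketch of steps (1)-(4). All paths and names are placeholders.
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

star_rsem_pipeline() {
    genome_fa=$1; gtf=$2; reads_fq=$3; sample=$4

    # (1) Genome-based STAR index from the genome FASTA and the GTF.
    run mkdir -p star_index
    run STAR --runMode genomeGenerate --genomeDir star_index \
        --genomeFastaFiles "$genome_fa" --sjdbGTFfile "$gtf"

    # (2) RSEM reference built from the same genome and GTF.
    run rsem-prepare-reference --gtf "$gtf" "$genome_fa" rsem_ref

    # (3) STAR alignment that also writes transcript-based alignments.
    run STAR --genomeDir star_index --readFilesIn "$reads_fq" \
        --quantMode TranscriptomeSAM --outFileNamePrefix "${sample}_"

    # (4) Quantify from the transcriptome BAM, not the genomic one.
    #     Paired-end data would additionally need --paired-end.
    run rsem-calculate-expression --bam \
        "${sample}_Aligned.toTranscriptome.out.bam" rsem_ref "$sample"
}
```

Calling `star_rsem_pipeline genome.fa annotation.gtf reads.fq sample1` with the default DRY_RUN=1 prints the five commands so they can be reviewed before anything is run.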

Best,
Colin

--
RSEM website: http://deweylab.biostat.wisc.edu/rsem/
---
You received this message because you are subscribed to the Google Groups "RSEM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rsem-users+...@googlegroups.com.
To post to this group, send email to rsem-...@googlegroups.com.
Visit this group at http://groups.google.com/group/rsem-users.

marco trizzino

Oct 29, 2015, 6:05:16 PM
to RSEM Users


Hi Colin,
I am having a similar problem with RSEM. I am trying to quantify expression from RNA-seq data for 7 different primate species, including humans.
I used STAR for alignment with the following conditions:

STAR INDEX GENERATION:
STAR --limitGenomeGenerateRAM 240000000000 --runThreadN 8 --runMode genomeGenerate --genomeDir DIRECTORY --genomeFastaFiles genome.fa --sjdbGTFfile ENSEMBLE ANNOTATIONS

STAR ALIGNMENT:
STAR --limitGenomeGenerateRAM 240000000000 --genomeDir $3 --readFilesIn $1"_trimmed.fq" --outSAMtype SAM --outFilterMultimapNmax 10 --outFilterMismatchNmax 10 --outFilterMismatchNoverLmax 0.3 --alignIntronMin 21 --alignIntronMax 0 --alignMatesGapMax 0 --alignSJoverhangMin 5 --runThreadN 12 --twopassMode Basic --twopass1readsN 60000000 --sjdbOverhang 100 --quantMode TranscriptomeSAM --outFileNamePrefix $output_STAR

RSEM INDEX GENERATION:
rsem-prepare-reference genome.fa --gtf ENSEMBLE ANNOTATIONS output_rsem

RSEM QUANT:
rsem-calculate-expression --bam $output_STAR.bam rsem_reference quant_output_name

But I keep getting this error: RSEM can not recognize reference sequence name chr1!

Thanks for your help!!

Marco





Colin Dewey

Oct 30, 2015, 3:37:28 PM
to RSEM Users
Hi Marco (and others feeding STAR alignments to RSEM),

You will need to pass RSEM the STAR output file whose name ends with “Aligned.toTranscriptome.out.bam”, not the genomic alignment BAM file that is STAR’s primary output.
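
One way to check which kind of BAM file you have is to look at the reference names in its header. The sketch below is a rough heuristic, not part of RSEM or STAR: the function names are made up for illustration, and the pattern only recognizes common chromosome-style names (chr1, chrM, 1, X, MT), so a transcript set with purely numeric IDs would fool it.

```shell
# Read SAM header text on stdin and print the reference name (SN:) of each @SQ line.
sq_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# Succeed (exit status 0) if any reference name looks like a chromosome.
# Heuristic only: matches names such as chr1, chr10, chrM, 1, X, MT.
looks_like_genome() {
    sq_names | grep -Eq '^(chr)?([0-9]+|[XYM]|MT)$'
}
```

For example, `samtools view -H Aligned.out.bam | looks_like_genome && echo "genome-based alignments"` prints the message when the header lists chromosomes, which is the situation that produces the "can not recognize reference sequence name" error.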

Best,
Colin

Alexander Predeus

Dec 5, 2015, 2:38:27 PM
to RSEM Users
Dear Colin,

Actually, the problem also seems to arise when using STAR's transcriptome BAM (I haven't used RSEM for a while, so I am not sure when this started).

If you use a GTF to prepare the RSEM reference, RSEM now appends _<gene_name>-<transcript_number> to each transcript ID, and hence can't match the transcript names in the BAM file to the ones in the reference.

There should probably be an option to avoid this, e.g. not adding the gene name and transcript number to the transcript ID when building the reference.
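
A quick way to confirm this kind of mismatch is to list the BAM reference names that do not occur among the record names of the reference FASTA. The sketch below assumes the transcript sequences written by rsem-prepare-reference are in a file named <reference_name>.transcripts.fa; the function name and the example file names are hypothetical.

```shell
# Print every name from a name list that is not a record name in a FASTA file.
#   $1: file with one BAM reference name per line
#   $2: FASTA file, e.g. <reference_name>.transcripts.fa (assumed file name)
# A FASTA record name is taken to be the text after '>' up to the first blank.
names_missing_from_reference() {
    awk '
        FILENAME == ARGV[1] {
            if (/^>/) { sub(/^>/, ""); sub(/[ \t].*/, ""); ref[$0] = 1 }
            next
        }
        !($0 in ref)
    ' "$2" "$1"
}

# Example of producing the name list (assumes SN: is the first tag on @SQ lines):
#   samtools view -H sample_Aligned.toTranscriptome.out.bam \
#       | awk -F'\t' '$1 == "@SQ" { print substr($2, 4) }' > bam_names.txt
#   names_missing_from_reference bam_names.txt rsem_ref.transcripts.fa | head
```

If the output shows plain transcript IDs while the FASTA records carry the extra gene-name suffix, the two files were built with incompatible naming.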

All the best,

-- Alex

Colin Dewey

Dec 5, 2015, 5:41:03 PM
to RSEM Users
Dear Alex,

This issue should be fixed in RSEM v1.2.25.  Please let us know if you are still having problems.

Best,
Colin

Alexander Predeus

Feb 19, 2016, 8:40:36 AM
to RSEM Users
Colin,

thank you, it does work now. For others reading: note that it's also necessary to rebuild the reference with the new version of RSEM.

All the best,

-- Alex

Mousheng Xu

Oct 4, 2016, 9:41:18 AM
to RSEM Users

I have a set of BAM files from elsewhere, and I don't know how they were generated. I have RSEM reference files generated with RSEM (bowtie, GTF, genomic FASTA). When running rsem-calculate-expression, I got the following:

rsem-calculate-expression --bam mysample1.bam ~/softspace/reference/RSEM_bowtie_hg19_genes_reference test
rsem-parse-alignments /ark/home/mx010/softspace/reference/RSEM_bowtie_hg19_genes_reference test.temp/test test.stat/test KP9-1_Ctl2.bam 1 -tag XM
Warning: The SAM/BAM file declares less reference sequences (35) than RSEM knows (41970)!
RSEM can not recognize reference sequence name chrM!

The same reference files work well for BAM files generated by HISAT/Cufflinks.

Thanks in advance for your help!

Bo Li

Oct 4, 2016, 3:33:45 PM
to rsem-...@googlegroups.com
Hi Mousheng,

You should align your reads to a set of transcripts instead of the
genome.
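
If only genome-aligned BAM files are available, one possible route is to extract the reads and let RSEM align them to the transcript reference itself. The following is a rough sketch under several assumptions: the reads are single-end, samtools is recent enough to provide the fastq subcommand, and the RSEM reference was built with bowtie indices (RSEM's default aligner). The function name and the DRY_RUN switch are hypothetical.

```shell
# Sketch: recover reads from a genome-aligned BAM and re-run RSEM on them.
#   $1: input BAM   $2: RSEM reference name   $3: sample name
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them.
# Assumes single-end reads; paired-end data would need name-collated input
# and the --paired-end option instead.
bam_to_rsem() {
    bam=$1; ref=$2; sample=$3
    fq="${sample}.fq"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'samtools fastq %s > %s\n' "$bam" "$fq"
        printf 'rsem-calculate-expression %s %s %s\n' "$fq" "$ref" "$sample"
    else
        # Extract the reads, then let RSEM align them to the transcripts.
        samtools fastq "$bam" > "$fq" &&
        rsem-calculate-expression "$fq" "$ref" "$sample"
    fi
}
```

Running `bam_to_rsem mysample1.bam RSEM_bowtie_hg19_genes_reference test` in dry-run mode prints the two commands for review.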

Hope it helps,
Bo