rsem-prepare-reference failure


Brian Burger

Feb 19, 2015, 1:53:20 PM
to rsem-...@googlegroups.com
Hi all, 

I'm having some trouble with rsem-prepare-reference. 

The following command:

rsem-prepare-reference --gtf Rhodospeudomonas_palustris_chrOnly_cleaned.gtf --bowtie --bowtie-path /opt/bifxapps/bowtie/ --bowtie2 --bowtie2-path /opt/bifxapps/bowtie2/ NC_005296.fa palustris_reference

produces a lot of activity, but ultimately results in the following:

Error Message: Cannot separate the identifier from the value for attribute ID=gene0!
"rsem-extract-reference-transcripts palustris_reference 0 Rhodospeudomonas_palustris_chrOnly_cleaned.gtf 0 NC_005296.fa" failed! Plase check if you provide correct parameters/options for the pipeline!

Any help would be much appreciated!

Thanks,
Brian

Colin Dewey

Feb 19, 2015, 2:48:22 PM
to rsem-...@googlegroups.com
Hi Brian,

This error suggests that your GTF file is not in the format required by RSEM. RSEM requires GTF v2.2 format, as described here:


If you think your GTF file is in the correct format, maybe provide a few lines of it (particularly for “gene0”) to see if we can help debug the situation.

Best,
Colin

--
RSEM website: http://deweylab.biostat.wisc.edu/rsem/
---
You received this message because you are subscribed to the Google Groups "RSEM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rsem-users+...@googlegroups.com.
To post to this group, send email to rsem-...@googlegroups.com.
Visit this group at http://groups.google.com/group/rsem-users.

Brian Burger

Feb 20, 2015, 9:55:14 AM
to rsem-...@googlegroups.com
Hi Colin, 

Thanks for the quick reply. Indeed, I think the [attributes] field is the problem. It's chock full of information that's not needed by RSEM, and in the wrong format to boot. It seems I'll have to configure this field manually. 

Many thanks,
Brian

ngsbioin...@gmail.com

Feb 23, 2015, 1:01:26 PM
to rsem-...@googlegroups.com
I see this as well. The NCBI references include a GFF3 file (GFF spec version 1.2), and that file needs to be reformatted for RSEM. It would be nice if RSEM understood the NCBI GFF format as well.
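
For anyone hitting the same "Cannot separate the identifier from the value" error: GFF3 writes attributes as `ID=gene0;Name=...`, while GTF expects `gene_id "..."; transcript_id "...";`. Below is a minimal sketch of rewriting the attribute column of one exon line. It is not Bo's script and not part of RSEM. The function names are hypothetical, the sketch assumes the exon's `Parent` attribute names its transcript, and the caller has to supply the gene ID. Real NCBI files usually need more handling than this.

```python
def parse_gff3_attributes(attr_field):
    """Parse a GFF3 attribute column such as 'ID=id1;Parent=rna0' into a dict."""
    attrs = {}
    for item in attr_field.strip().split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, value = item.partition("=")
        attrs[key] = value
    return attrs


def format_gtf_attributes(gene_id, transcript_id):
    """Format the two IDs GTF-style: key, space, quoted value, semicolon."""
    return f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'


def gff3_exon_to_gtf(line, gene_id):
    """Rewrite column 9 of a tab-separated GFF3 exon line in GTF style.

    Assumption: the exon's Parent attribute is the transcript ID.
    """
    cols = line.rstrip("\n").split("\t")
    attrs = parse_gff3_attributes(cols[8])
    cols[8] = format_gtf_attributes(gene_id, attrs["Parent"])
    return "\t".join(cols)


# Made-up example line, for illustration only.
example = "NC_005296\tRefSeq\texon\t100\t200\t.\t+\t.\tID=id1;Parent=rna0"
print(gff3_exon_to_gtf(example, gene_id="gene0"))
```

In a full converter the gene ID would be looked up by following the transcript's own `Parent` attribute rather than being passed in by hand.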

Brian Burger

Feb 26, 2015, 3:37:36 PM
to rsem-...@googlegroups.com
Hi Colin and others, 

After spending some time getting my GTF file in order, I've run into a new problem. 

$ rsem-prepare-reference --gtf R_palustris_gtf.gtf --bowtie --bowtie-path /opt/bifxapps/bowtie/ --bowtie2 --bowtie2-path /opt/bifxapps/bowtie2/ palustris.fa palustris_reference
rsem-extract-reference-transcripts palustris_reference 0 R_palustris_gtf.gtf 0 palustris.fa
According to the GTF file given, a transcript has exons from different orientations!
"rsem-extract-reference-transcripts palustris_reference 0 R_palustris_gtf.gtf 0 palustris.fa" failed! Plase check if you provide correct parameters/options for the pipeline!

Any help?

Thanks!
Brian

Bo Li

Mar 1, 2015, 3:03:11 AM
to rsem-...@googlegroups.com
Hi Brian,

The error suggests that at least one transcript has two or more exons
with different orientations (strands). Can you send me the original GFF3
file? I have a script to convert GFF3 to GTF and I'd like to see if it
works for your GFF3 file.

Best,
Bo

Brian Burger

Mar 2, 2015, 12:41:41 PM
to rsem-...@googlegroups.com
Hi Bo, 

Thanks for your response. The problem resulted from transcript_ids being the same for both chromosomal and plasmid transcripts. I thought the <seqname> field would differentiate between chromosomal and plasmid transcripts. In any case, it was an easy fix and I was able to run rsem-prepare-reference without error after altering the transcript_ids for the plasmid transcripts. 
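
A rough way to spot this kind of collision before running rsem-prepare-reference is sketched below. It assumes a tab-separated GTF with `transcript_id "..."` in the ninth column and only looks at `exon` lines. The function name is hypothetical and this is not part of RSEM.

```python
import re
from collections import defaultdict

# Matches the GTF-style attribute: transcript_id "some_id"
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def find_conflicting_transcripts(gtf_lines):
    """Return {transcript_id: [(seqname, strand), ...]} for every transcript_id
    whose exons appear on more than one seqname/strand combination."""
    seen = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(cols[8])
        if match:
            seen[match.group(1)].add((cols[0], cols[6]))
    return {tid: sorted(locs) for tid, locs in seen.items() if len(locs) > 1}


# Usage sketch (file name taken from the command above):
#   with open("R_palustris_gtf.gtf") as fh:
#       for tid, locs in find_conflicting_transcripts(fh).items():
#           print(tid, locs)
```

Any ID it reports is shared across sequences or strands and would need renaming, as was done here for the plasmid transcripts.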

If you're still interested, I've sent you the gff file to test with your script. If you are able to easily convert the NCBI gff file to an RSEM-compatible gtf I'm sure the community would be very interested. 

Thanks,
Brian
Rhodospeudomonas_palustris.gff

haichao...@gmail.com

Aug 26, 2016, 3:39:56 PM
to RSEM Users
Dear Sir, 
I have run into the same problem. The error looks like this:
      rsem-extract-reference-transcripts ./rnor6 0 ./rnor6.protein_coding_clean.gtf None 0 ./rnor6.fa
Parsed 200000 lines
Parsed 400000 lines
Parsed 600000 lines
Parsing gtf File is done!
 (ASCII code 13), at line 2, position 282763075!ter,
"rsem-extract-reference-transcripts ./rnor6 0 ./rnor6.protein_coding_clean.gtf None 0 ./rnor6.fa" failed! Plasns for the pipeline!
How can I solve this problem? Thank you!
Haichao Wei

On Sunday, March 1, 2015 at 2:03:11 AM UTC-6, Bo Li wrote:

Bo Li

Aug 27, 2016, 3:17:57 PM
to rsem-...@googlegroups.com
Hi Haichao,

This suggests that rnor6.fa may have been created on Windows, so its line
separator is "\r\n" instead of "\n" (ASCII code 13 is "\r"). What you can
do is write a script to remove "\r" from your rnor6.fa file.
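
A minimal sketch of such a script follows. The function name and the output file name are illustrative only. It copies the file in binary mode, drops every carriage return byte, and returns how many it removed, so you can confirm the file really contained them.

```python
def strip_carriage_returns(src_path, dst_path):
    """Copy src_path to dst_path without CR bytes (ASCII 13).

    Returns the number of CR bytes removed.
    """
    removed = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Read in 1 MiB chunks so large genome FASTA files fit in memory.
        for chunk in iter(lambda: src.read(1 << 20), b""):
            removed += chunk.count(b"\r")
            dst.write(chunk.replace(b"\r", b""))
    return removed


# Usage sketch (output name is hypothetical):
#   n = strip_carriage_returns("rnor6.fa", "rnor6.unix.fa")
#   print(n, "carriage returns removed")
```

The cleaned copy is written to a new file, so rsem-prepare-reference has to be pointed at that file rather than the original. A count of zero would mean this file was not the source of the "\r" characters.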

Hope it helps,
Bo

haichao...@gmail.com

Aug 29, 2016, 2:07:41 PM
to RSEM Users
I tried that, but it did not work. I still get the same error:
rsem-extract-reference-transcripts ./rnor6 0 ./rnor6.protein_coding_clean.gtf None 0 ./rnor6.fa
Parsed 200000 lines
Parsed 400000 lines
Parsed 600000 lines
Parsing gtf File is done!
 (ASCII code 13), at line 2, position 282763075!ter,
"rsem-extract-reference-transcripts ./rnor6 0 ./rnor6.protein_coding_clean.gtf None 0 ./rnor6.fa" failed! Plase check if you provide correct parameters/options for the pipeline!


On Saturday, August 27, 2016 at 2:17:57 PM UTC-5, Bo Li wrote: