RSEM + STAR pipeline in yeast (S. cerevisiae) samples

Marco Di Stefano

May 6, 2016, 2:18:23 PM
to RSEM Users
Hi all,

I'm analysing a series of yeast samples. I have the following problem in the quantification step.

1) To generate the index I used:
 

STAR --runMode genomeGenerate --runThreadN 8 --genomeDir . --genomeFastaFiles XXX.fa --sjdbGTFfile XXX.gtf --outFileNamePrefix XXXprefixXXX --genomeSAindexNbases 10


I noticed that I can't use the analogous rsem-prepare-reference command, because I need to change --genomeSAindexNbases from its default value of 14 to 10.
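
For context on where a value like 10 comes from: the STAR manual suggests that, for small genomes, --genomeSAindexNbases be scaled down to min(14, log2(GenomeLength)/2 - 1). The Python sketch below only reproduces that arithmetic. The function names and the approximate 12.1 Mb genome length used for S. cerevisiae are assumptions for illustration, not values taken from this thread.

```python
import math


def sa_index_nbases(genome_length: int) -> int:
    """Suggested --genomeSAindexNbases from the STAR manual's rule of thumb:
    min(14, log2(GenomeLength)/2 - 1), rounded down to an integer."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return min(14, int(math.floor(math.log2(genome_length) / 2 - 1)))


def fasta_length(lines) -> int:
    """Total number of sequence characters in FASTA-formatted lines."""
    return sum(len(line.strip()) for line in lines if not line.startswith(">"))


if __name__ == "__main__":
    # Assumed, approximate genome size for S. cerevisiae (about 12.1 Mb).
    print(sa_index_nbases(12_100_000))
```

In practice the genome length could be taken from the FASTA itself, e.g. `sa_index_nbases(fasta_length(open("XXX.fa")))`, where XXX.fa is the placeholder file name used in the commands above.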


2) Map with STAR.


3) To quantify I used:


rsem-calculate-expression -p 8 --no-bam-output --bam  --paired-end XXXbamfileXXX XXXrefnameXXX XXXsamplenameXXX 


RSEM looks for the *.grp file, which was not generated in step 1 and therefore doesn't exist.


Since I really need to change the option --genomeSAindexNbases at step 1, is there a way to pass this option to rsem-prepare-reference?


Thanks,

Marco

Bo Li

May 6, 2016, 3:31:51 PM
to rsem-...@googlegroups.com
Hi Marco,

Before you run rsem-calculate-expression, you have to run
rsem-prepare-reference. However, you do not need RSEM to build or run
the STAR aligner for you. You just need to make sure that the reference
transcripts extracted by rsem-prepare-reference and by STAR are the
same.
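
One way to read this advice is as two independent preparation steps that share the same FASTA and GTF: rsem-prepare-reference run without --star (so RSEM only writes its own reference files, including the .grp file), and a hand-run STAR genomeGenerate with whatever extra options are needed. The Python sketch below only assembles and prints those command lines; it does not run them. The function names and placeholder file names are hypothetical, and the --quantMode TranscriptomeSAM option in the mapping step (which makes STAR write transcript-coordinate alignments) is an assumption about a step this thread does not spell out.

```python
import shlex


def rsem_prepare_cmd(fasta, gtf, ref_name):
    # No --star here: RSEM only extracts transcripts and writes its own
    # reference files (ref_name.grp, ref_name.transcripts.fa, ...).
    return ["rsem-prepare-reference", "--gtf", gtf, fasta, ref_name]


def star_index_cmd(fasta, gtf, genome_dir, threads=8, sa_index_nbases=10):
    # Hand-run STAR index built from the same FASTA and GTF as RSEM.
    return [
        "STAR", "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--sjdbGTFfile", gtf,
        "--genomeSAindexNbases", str(sa_index_nbases),
    ]


def star_align_cmd(genome_dir, reads1, reads2, prefix, threads=8):
    # Assumption: --quantMode TranscriptomeSAM, so that STAR writes
    # <prefix>Aligned.toTranscriptome.out.bam for RSEM to quantify.
    return [
        "STAR", "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", reads1, reads2,
        "--quantMode", "TranscriptomeSAM",
        "--outSAMtype", "BAM", "Unsorted",
        "--outFileNamePrefix", prefix,
    ]


def rsem_quant_cmd(bam, ref_name, sample, threads=8):
    # Same options as the rsem-calculate-expression call in the question.
    return ["rsem-calculate-expression", "-p", str(threads),
            "--no-bam-output", "--bam", "--paired-end", bam, ref_name, sample]


if __name__ == "__main__":
    # All file names below are hypothetical placeholders.
    steps = [
        rsem_prepare_cmd("XXX.fa", "XXX.gtf", "XXX"),
        star_index_cmd("XXX.fa", "XXX.gtf", "star_index"),
        star_align_cmd("star_index", "r1.fastq", "r2.fastq", "sample_"),
        rsem_quant_cmd("sample_Aligned.toTranscriptome.out.bam", "XXX", "sample"),
    ]
    for cmd in steps:
        print(shlex.join(cmd))
```

The point of the design is that both index-building commands receive the same FASTA and GTF arguments, while only the STAR one carries the non-default --genomeSAindexNbases.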

Hope it helps,
Bo

Marco Di Stefano

May 6, 2016, 5:53:22 PM
to RSEM Users
Hi Bo,

Thanks for your reply.

To implement your suggestion, I compared the files transcriptInfo.tab generated using

1 - STAR --runMode genomeGenerate --runThreadN 8 --genomeDir . --genomeFastaFiles XXX.fa --sjdbGTFfile XXX.gtf --outFileNamePrefix XXXprefix --genomeSAindexNbases 10


and


2 - rsem-prepare-reference --star -p 8 --gtf XXX.gtf XXX.fa XXX


Since they are the same, I can safely use the *.grp file generated by 2 even though the mapping was done with the index generated by 1.

Is this ok?

Excuse me for being pedantic and thanks a lot,
Marco

Bo Li

May 6, 2016, 6:40:06 PM
to rsem-...@googlegroups.com
Hi Marco,

What's this transcriptInfo.tab file?

Thanks,
Bo

Marco Di Stefano

May 7, 2016, 10:53:32 AM
to RSEM Users
Hi Bo,

The transcriptInfo.tab file is part of the output of STAR genome generation. It reports some features of the transcripts mapped onto the reference genome.

E.g., these are the first few lines of one of those files:

7126
YAL069W    334   648   648  1  1  0
YAL068W-A  537   791   648  1  1  1
YAL068C    1806  2168  791  2  1  2


I thought that, since this file is the same whether it comes from rsem-prepare-reference or from STAR, I could conclude that the extracted reference transcripts are the same.

If this is not the case, how can I check whether the reference transcripts extracted by rsem-prepare-reference and by STAR are the same? Thanks a lot for your help.



Cheers,
Marco

Bo Li

May 8, 2016, 1:19:24 AM
to rsem-...@googlegroups.com
Hi Marco,

This is not the case. You want to make sure the GTF file provided to the
STAR aligner is the same as the GTF file provided to
'rsem-prepare-reference'. In particular, this is the STAR command RSEM
uses to build STAR indices:

$command = $star_path . "STAR " .
" --runThreadN $star_nthreads " .
" --runMode genomeGenerate " .
" --genomeDir $out_star_genome_path " .
" --genomeFastaFiles @list " .
" --sjdbGTFfile $gtfF " .
" --sjdbOverhang $star_sjdboverhang " .
" --outFileNamePrefix $ARGV[1]";

Hope it helps,
Bo
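
A quick way to confirm that both tools were given the same annotation is to compare checksums of the GTF files passed to STAR and to rsem-prepare-reference. The sketch below is Python with hypothetical function names. It checks byte-for-byte identity only, so two GTF files that differ in whitespace or comments would be reported as different even if they describe the same transcripts.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_annotation(star_gtf, rsem_gtf):
    """True if the two GTF files are byte-for-byte identical."""
    return file_sha256(star_gtf) == file_sha256(rsem_gtf)
```

For example, `same_annotation("XXX.gtf", "XXX.gtf")` is trivially true when both commands point at the same path; the check is mainly useful when the index and the RSEM reference were built at different times or on different machines.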

Marco Di Stefano

May 8, 2016, 1:20:05 PM
to RSEM Users
Hi Bo,

This clarified what I needed for my analysis.

Thanks a lot,
Marco