Using STAR to quantify known and novel isoform expression levels per sample


kelregan

Oct 8, 2015, 9:45:38 AM
to rna-star
Hello, 

I am wondering if you could recommend a protocol for quantifying the isoform levels (both known and novel) per sample. 

In particular, our group has identified several novel transcript isoforms of a common human gene in a number of different cancer cell lines that include patterns of exon skipping, an exon-exon fusion and incorporation of a novel exon. We determined the sequences for each of these isoforms and designed custom primers to determine expression levels in the original samples.

We have now obtained RNAseq data from an independent patient dataset and are interested in quantifying the expression of all isoforms in each sample. We are not comparing expression levels across samples. 

I received one recommendation to 1) Use samtools to extract the alignments of the region surrounding the gene (including upstream and downstream as far as possible without hitting another gene), 2) Create a gtf gene annotation with the known transcripts of the gene and use this with the -g option of cufflinks, 3) Use cuffmerge to create a custom gtf file to use in cuffquant/cuffnorm. 
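For concreteness, the first three steps of that recommendation could be scripted roughly as below. This is only a sketch that assembles the command lines. It assumes `samtools`, `cufflinks` and `cuffmerge` are on the PATH and that the input BAM is coordinate-sorted and indexed. All file names, the region string and the helper function names are hypothetical placeholders.

```python
import shlex

def region_extract_cmd(bam, region, out_bam):
    """Step 1: pull the alignments overlapping the gene region.
    Region queries need a coordinate-sorted, indexed BAM."""
    return ["samtools", "view", "-b", "-o", out_bam, bam, region]

def cufflinks_guided_cmd(bam, known_gtf, out_dir):
    """Step 2: assemble transcripts, guided (-g) by the known annotation."""
    return ["cufflinks", "-g", known_gtf, "-o", out_dir, bam]

def cuffmerge_cmd(assembly_list, known_gtf, out_dir):
    """Step 3: merge the per-sample assemblies into one custom GTF.
    assembly_list is a text file with one assembled GTF path per line."""
    return ["cuffmerge", "-g", known_gtf, "-o", out_dir, assembly_list]

if __name__ == "__main__":
    # Hypothetical paths and coordinates, for illustration only.
    steps = [
        region_extract_cmd("sample1.bam", "chr1:1000000-1200000", "sample1.region.bam"),
        cufflinks_guided_cmd("sample1.region.bam", "gene_known.gtf", "cufflinks_sample1"),
        cuffmerge_cmd("assemblies.txt", "gene_known.gtf", "merged"),
    ]
    for cmd in steps:
        print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```

The merged GTF from step 3 would then be the annotation given to cuffquant/cuffnorm.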

I am wondering if anyone could comment on this recommendation, or offer an alternative approach using STAR along with other tools? Detailed answers are greatly appreciated. 

Thank you,
Kelly

Alexander Dobin

Oct 9, 2015, 2:32:53 PM
to rna-star
Hi Kelly,

Adding your novel isoforms to the known annotations (GTF) is definitely the way to go. After that, you can use different expression quantifiers; Cufflinks is one of them.
For ENCODE, we are using STAR with --quantMode TranscriptomeSAM and then running RSEM on the Aligned.toTranscriptome.out.bam file (alignments in transcriptomic coordinates).
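
A minimal sketch of how the STAR and RSEM steps fit together is given below. It only builds the command lines and is not the exact ENCODE pipeline. It assumes the novel isoforms have already been added to a single GTF (called `merged.gtf` here), that the same GTF is used for both the STAR genome index and the RSEM reference, and that the FASTQ files are uncompressed. The helper names, paths and thread counts are hypothetical. Recent RSEM releases take `--alignments` for BAM input, while older releases used `--bam`, so check the version you have installed.

```python
import shlex

def star_index_cmd(genome_dir, genome_fasta, merged_gtf, overhang=100, threads=8):
    """Build the STAR genome index with the merged (known + novel) annotation.
    --sjdbOverhang is ideally read length minus 1."""
    return ["STAR", "--runMode", "genomeGenerate",
            "--runThreadN", str(threads),
            "--genomeDir", genome_dir,
            "--genomeFastaFiles", genome_fasta,
            "--sjdbGTFfile", merged_gtf,
            "--sjdbOverhang", str(overhang)]

def star_align_cmd(genome_dir, fastqs, prefix, threads=8):
    """Align reads and also emit alignments in transcriptomic coordinates.
    For gzipped FASTQ, --readFilesCommand zcat would be needed as well."""
    return ["STAR", "--runThreadN", str(threads),
            "--genomeDir", genome_dir,
            "--readFilesIn", *fastqs,
            "--outFileNamePrefix", prefix,
            "--outSAMtype", "BAM", "Unsorted",
            "--quantMode", "TranscriptomeSAM"]

def transcriptome_bam(prefix):
    """File that STAR writes when --quantMode TranscriptomeSAM is set."""
    return prefix + "Aligned.toTranscriptome.out.bam"

def rsem_prepare_cmd(merged_gtf, genome_fasta, rsem_ref):
    """Build the RSEM reference from the same GTF used for the STAR index."""
    return ["rsem-prepare-reference", "--gtf", merged_gtf, genome_fasta, rsem_ref]

def rsem_quant_cmd(bam, rsem_ref, sample, paired=True, threads=8):
    """Quantify isoforms from the transcriptome-coordinate BAM."""
    cmd = ["rsem-calculate-expression", "--alignments", "-p", str(threads)]
    if paired:
        cmd.append("--paired-end")
    return cmd + [bam, rsem_ref, sample]

if __name__ == "__main__":
    prefix = "sample1_"  # hypothetical output prefix
    steps = [
        star_index_cmd("star_index", "genome.fa", "merged.gtf"),
        rsem_prepare_cmd("merged.gtf", "genome.fa", "rsem_ref/merged"),
        star_align_cmd("star_index", ["sample1_R1.fastq", "sample1_R2.fastq"], prefix),
        rsem_quant_cmd(transcriptome_bam(prefix), "rsem_ref/merged", "sample1"),
    ]
    for cmd in steps:
        print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```

RSEM then writes `sample1.isoforms.results`, which lists expected counts and TPM per transcript, including the novel isoforms that were added to the GTF.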

Cheers
Alex