RNAseq in time-series

nelcaster

Nov 13, 2015, 5:34:18 PM
to trinityrnaseq-users
Hi there,

We have wheat RNA-seq data generated from eight time points (10 min, 1 h, 2 h, 4 h, 1 d, 2 d, 4 d, 7 d), three replicates each, after bacterial infection. The aim is to find both mapped and unmapped sequences (the latter being novel sequences assembled using Trinity) relative to the wheat genome that respond to infection, and to monitor each gene throughout the time course. I was wondering how I can analyze (i.e. monitor) the expression level of each gene at each time point. I was able to generate abundance estimates using Trinity-RSEM, but realized that DE analysis (using edgeR) does not give this kind of information (if I am right).

I searched the Google group, but it looks like information on time-course studies with RNA-seq is scarce. Any suggestions or pointers would be helpful.


Thanks,


Nelcaster


Tiago Hori

Nov 13, 2015, 6:44:32 PM
to nelcaster, trinityrnaseq-users
Well, you could do two-by-two (pairwise) comparisons. Here is the challenge you are going to face: time-course multivariate analysis is a strange beast. However, there are some models that were developed for microarrays. I think limma has one and so does MANNOVA, both implemented in R.

T.

"Profanity is the only language all programmers understand"
Sent from my iPhone, the universal excuse for my poor spelling.
--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-u...@googlegroups.com.
To post to this group, send email to trinityrn...@googlegroups.com.
Visit this group at http://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.

Ken Field

Nov 14, 2015, 6:52:49 AM
to Tiago Hori, nelcaster, trinityrnaseq-users
Dear Nelcaster-
I think that Tiago's suggestion of doing pairwise comparisons would work, but you would be losing a lot of the information that you want about the pattern of gene expression change over time. Instead, I think you should check out the vignette for DESeq2 where time series experiments are directly addressed. 
Ken
--
Ken Field, Ph.D.
Associate Professor of Biology
Program in Cell Biology/Biochemistry
Bucknell University
Room 203A Biology Building

Tiago Hori

Nov 14, 2015, 7:00:38 AM
to Ken Field, nelcaster, trinityrnaseq-users
Just be aware that controlling for false discovery in multivariate and multifactorial models is a very complicated and muddy statistical problem. The multi-factor aspect creates a fairly burdensome simulation or permutation issue, which most often leads to the use of other methods of FDR estimation, which MAY not be as accurate.
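
[Editor's sketch] The post does not name a specific alternative FDR method. As an illustration only, the Benjamini-Hochberg procedure is one widely used adjustment that needs no permutation; it is shown here as an assumption, not as the method the poster had in mind. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from some time-course test
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
```

Genes would then be called significant when their adjusted value falls below the chosen FDR threshold (for example 0.05).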

If the pattern of expression is what matters to you, then you can also use clustering strategies such as k-means or hierarchical clustering to look into this.

I am not at all saying that Ken's suggestion is not a good one. It is a great one; I totally forgot about DESeq2! However, you may want to approach this issue from many fronts!

T.

Dan Browne

Nov 14, 2015, 4:05:31 PM
to trinityrnaseq-users
I have time-course data, and it is indeed challenging to analyze. The pairwise comparison of time points for DE analysis really didn't prove terribly useful to me. I found the most interesting information by identifying candidate genes with BLAST and then extracting and analyzing the expression patterns for each candidate across the time course. When you have a set of interesting genes, you can look for patterns in the expression profiles.

If you have a group of genes that show a consistent expression pattern, and you want to find more genes that might be co-expressed with your group, you can use Pearson's correlation coefficient (easily done in Excel), or k-means clustering (I used MapMan to do this).
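
[Editor's sketch] The co-expression search described above could look roughly like this outside of Excel. This is a minimal sketch, assuming expression values are already normalized (for example log2-scaled) and held in a dictionary keyed by gene ID. The gene names, the values, the `coexpressed` helper, and the 0.9 cutoff are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation is undefined, treated as 0 here
    return cov / math.sqrt(var_x * var_y)


def coexpressed(profiles, seed_ids, min_r=0.9):
    """Rank non-seed genes by correlation with the mean profile of the seed genes."""
    n_points = len(next(iter(profiles.values())))
    seed_mean = [
        sum(profiles[g][i] for g in seed_ids) / len(seed_ids)
        for i in range(n_points)
    ]
    hits = []
    for gene, profile in profiles.items():
        if gene in seed_ids:
            continue
        r = pearson(seed_mean, profile)
        if r >= min_r:
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical log2 expression values at eight time points
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0],
    "geneB": [1.1, 2.2, 2.9, 4.1, 5.2, 3.9, 3.1, 2.0],
    "geneC": [2.0, 4.1, 6.0, 8.2, 9.9, 8.0, 6.1, 4.0],
    "geneD": [5.0, 4.0, 3.0, 2.0, 1.0, 2.0, 3.0, 4.0],
}
hits = coexpressed(profiles, seed_ids={"geneA", "geneB"})
```

Here geneC follows the same rise-and-fall shape as the seed genes and is reported, while geneD moves in the opposite direction and is not.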

The starting point for this is deciding what genes you want to use as BLAST queries. Assuming you're looking at protein-coding genes (as opposed to miRNAs, siRNAs, etc.), then I'd suggest always using the protein sequence as a query and the TBLASTN algorithm to search through the transcriptome. Choosing interesting queries is not an easy task: you have to dig through the literature. But I think that's a good activity to do, because then you become familiar with the body of research for each query.

I would call the above strategy a "targeted" approach to expression pattern analysis. If you want to do an "untargeted" approach, then you could maybe try expression-based k-means clustering of the whole transcriptome. That should give you groups of genes with similar expression patterns. You could then test for GO term enrichment in each group. Might give you insight into what cellular processes are changing in response to infection.
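
[Editor's sketch] For readers who want to see what expression-based k-means involves, here is a minimal pure-Python version. It is not the MapMan implementation mentioned above. Two choices are assumptions of this sketch: each profile is z-scored so genes group by shape rather than by absolute level, and the starting centroids are picked with a deterministic farthest-point rule instead of at random. Gene names and values are hypothetical.

```python
import math


def zscore(profile):
    """Scale a profile to mean 0 and unit variance so clustering reflects shape."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in profile) / n)
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in profile]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, max_iter=100):
    """Cluster gene profiles into k groups; returns {gene: cluster index}."""
    genes = sorted(profiles)
    points = [zscore(profiles[g]) for g in genes]
    # Deterministic start: first profile, then the profile farthest from
    # every centroid chosen so far
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every profile
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return dict(zip(genes, labels))


# Hypothetical expression values at eight time points
profiles = {
    "up1": [1, 2, 3, 4, 5, 6, 7, 8],
    "up2": [10, 12, 13, 15, 17, 18, 20, 22],
    "down1": [8, 7, 6, 5, 4, 3, 2, 1],
    "down2": [30, 27, 25, 22, 20, 17, 15, 12],
}
clusters = kmeans(profiles, k=2)
```

The gene lists per cluster could then be passed to a GO enrichment tool. For a whole transcriptome, a library implementation would be faster than this loop-based version.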

However, I think the biggest and most important part of an experiment like yours is the control. Do you have a corresponding time-course from a plant that was not subjected to infection? That would establish your baseline and allow you to compare against your infection experiment. Furthermore, how many replicates do you have? The more replicates you have, the more statistical confidence you can have in your data.

Just some thoughts. Hope it helps!

Best,
Dan

nelcaster7

Nov 14, 2015, 6:21:02 PM
to Dan Browne, trinityrnaseq-users
Hi all,
First, thank you all for these insights.

Dan - With regard to your question, yes, we do have a control (mock inoculated) at each time point. I forgot to state that in my first email message.
The number of replicates is three (though in a few time points, two).
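
[Editor's sketch] With a mock control at every time point, one simple way to follow a single gene across the time course is a per-time-point log2 ratio of mean infected to mean mock expression. This is a minimal sketch, assuming normalized abundance values (for example from the RSEM estimates) and a pseudocount of 1 to avoid dividing by zero. The function name, the time-point labels used as keys, and the numbers are hypothetical. It is descriptive only and does not replace a statistical test.

```python
import math

TIME_POINTS = ["10min", "1h", "2h", "4h", "1d", "2d", "4d", "7d"]


def log2fc_profile(infected, mock, pseudocount=1.0):
    """Per-time-point log2(infected / mock) for one gene.

    `infected` and `mock` map a time-point label to a list of replicate
    values; the lists may differ in length (two or three replicates).
    """
    profile = []
    for tp in TIME_POINTS:
        inf_mean = sum(infected[tp]) / len(infected[tp])
        mock_mean = sum(mock[tp]) / len(mock[tp])
        ratio = (inf_mean + pseudocount) / (mock_mean + pseudocount)
        profile.append(math.log2(ratio))
    return profile


# Hypothetical normalized expression values for a single gene
infected = {tp: [3.0, 3.0, 3.0] for tp in TIME_POINTS}
mock = {tp: [3.0, 3.0, 3.0] for tp in TIME_POINTS}
infected["1h"] = [6.0, 7.0, 8.0]  # induced one hour after infection
mock["1h"] = [1.0, 1.0, 1.0]
infected["4d"] = [3.0, 3.0]  # a time point with only two replicates
profile = log2fc_profile(infected, mock)
```

The resulting eight-value profile per gene is the kind of input that correlation or clustering approaches work on.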

Thanks,
Nelzo



--
Nelzo C. Ereful
Post-doctoral Researcher
NIAB-TAG-Cambridge Univ Farm
Cambridge, UK
