Using rowVars instead of rowMeans when making the heatmap

Irishtsany Indira

Mar 27, 2023, 1:37:21 AM
to SAMSA bioinformatics group
Hi Sam, 

I'm comparing your R script for creating the heatmap (https://github.com/transcript/samsa2/blob/master/R_scripts/make_DESeq_heatmap.R) with the heatmap code in the DESeq2 vignette (http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#data-transformations-and-visualization).

I'm curious: why did you use the row variance (rowVars) instead of the row mean (rowMeans) to select which features to display on the heatmap? What are the considerations, and how does the choice change the way the heatmap should be interpreted?

Thank you so much in advance

Sam Westreich

Mar 27, 2023, 6:05:48 PM
to Irishtsany Indira, SAMSA bioinformatics group
Hi Irishtsany,

Great question! I'll admit it's been a few years since I last looked at this code in depth, but I believe the answer is that I wanted to focus specifically on the genes/organisms/features that vary most across the samples (and so are most likely to differ between the two compared groups), rather than just on means. Selecting by mean favors the genes that are most active overall, which can mask genes that are expressed at lower abundance but vary more from sample to sample.

This is, of course, not a rule that everyone has to stick with variance. You could certainly swap this to rowMeans if you prefer to focus on the most highly expressed genes/organisms/features overall.
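
To make the difference concrete, here is a minimal sketch in plain Python (it is not the SAMSA2 R code). The counts are made-up numbers and top_features is a hypothetical helper; the only point is that ranking by mean and ranking by variance can pick different features.

```python
from statistics import mean, variance

# Toy normalized counts: feature -> values across 4 samples.
# (Made-up numbers, purely for illustration.)
counts = {
    "geneA": [1000, 1010, 990, 1005],  # highly expressed, nearly constant
    "geneB": [900, 905, 895, 910],     # highly expressed, nearly constant
    "geneC": [10, 12, 200, 210],       # lower overall, large shift between samples
    "geneD": [5, 6, 5, 4],             # low and nearly constant
}

def top_features(counts, score, n):
    """Return the n feature names with the highest score(values)."""
    return sorted(counts, key=lambda name: score(counts[name]), reverse=True)[:n]

by_mean = top_features(counts, mean, 2)          # favors the most abundant features
by_variance = top_features(counts, variance, 2)  # favors the most variable features

print("top 2 by mean:    ", by_mean)
print("top 2 by variance:", by_variance)
```

With these toy numbers, geneC does not make the top 2 by mean but ranks first by variance, which is the kind of feature the variance-based selection is meant to surface.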

Best,
Sam

--
Sam Westreich, PMP, PhD
Microbiome Scientist, DNAnexus, 

Sam Westreich

Apr 11, 2023, 1:43:44 PM
to Irishtsany Indira, SAMSA bioinformatics group
Hi Irishtsany,

You should be able to use pairwise comparisons to figure out the effects of some treatment on your samples, as long as you have at least 2 samples for each condition (to determine the amount of variation within-group versus between-group).  So, if you had:

4 samples from groups treated with drug A
5 samples from groups treated with drug B
3 control samples

You could do pairwise comparisons with DESeq2 to look at A vs. control, B vs. control, and A vs. B.
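
As a rough sketch of how those comparisons enumerate (plain Python, not part of SAMSA2 or DESeq2; the names groups, MIN_SAMPLES, and pairwise_comparisons are just for illustration):

```python
from itertools import combinations

# Sample counts per condition, taken from the example above.
groups = {"drug A": 4, "drug B": 5, "control": 3}

MIN_SAMPLES = 2  # at least 2 per condition, to estimate within-group variation

def pairwise_comparisons(groups, min_samples=MIN_SAMPLES):
    """List every pair of conditions in which both have enough samples."""
    usable = [name for name, n in groups.items() if n >= min_samples]
    return list(combinations(usable, 2))

pairs = pairwise_comparisons(groups)
for first, second in pairs:
    print(f"{first} vs. {second}")
```

A condition with only one sample would be dropped from the list, since there would be no way to estimate its within-group variation.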

There is a run_DESeq_stats R script included with SAMSA2 to handle this; you just need to name your inputs with the "control_" or "experimental_" prefix to designate the two groups that DESeq2 compares. Since the script compares two groups at a time, you would run it once per pairwise comparison.

Does this make sense?  Is it helpful?

Best,
Sam

On Fri, Apr 7, 2023 at 9:09 AM Irishtsany Indira <iris...@gmail.com> wrote:
Dear Sam,

Thank you so much for your answer! I'd like to ask another question, about statistical analysis that is compatible with the SAMSA2 pipeline. I've done the annotation and aggregation steps and now have data on the functional genes and microorganisms expressed in my samples. I want to see how a treatment affects the expression of the genes and microorganisms in my samples. Do you have any ideas on how I could do this, or what kind of statistical test I could use?

Again, thank you so much. Hope you have a nice day.
