Hi all,
I asked about this on Reddit before coming across this help page (crosspost link: https://www.reddit.com/r/bioinformatics/comments/1lk6m8a/looking_for_advice_on_gsea_setup_with_unique/). After running DESeq2 for differential gene expression, I have continued building my sequencing analysis pipeline and am now focusing on gene set enrichment analysis. For reference, here are the replicates in my normalized counts file (.gct, exported directly from DESeq2):
0% stenosis - 6 replicates (3 from the upstream region of the blood vessel, 3 from the downstream region)
70% stenosis - 6 replicates (3 from the upstream region of the blood vessel, 3 from the downstream region)
90% stenosis - 6 replicates (3 from the upstream region of the blood vessel, 3 from the downstream region)
100% occlusion - 6 replicates (3 from the upstream region of the blood vessel, 3 from the downstream region)
Main question to address for now: How does stenosis/occlusion alone affect these vessels?
The issue is that the upstream and downstream samples within each group are neither technical nor biological replicates of one another, because they come from different regions. In DESeq2 this was not a problem, since I set up the design formula to test for the effect of stenosis while accounting for regional effects:
~region + stenosis
For GSEA, however, I need to choose two groups to compare. What is the best way to do this? From what I gather, and from the advice I received on Reddit, the answer is to use GSEA Preranked. In the future I might be interested in comparing regional differences, but for now I am only interested in the differences due purely to the effect of stenosis.
In the past I always used standard GSEA, but I now believe GSEA Preranked is my only option. That means I will undoubtedly have to choose a ranking metric, which I know is a common question and one I am unsure how to approach. My understanding is that lfcShrink() helps here, because the shrunken log2 fold change can then be used on its own to rank genes, but I am open to other perspectives on that.
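To make the ranking question concrete, here is a minimal Python sketch of three commonly discussed ranking metrics: the Wald statistic, the log2 fold change, and a signed -log10(p-value). It assumes the results for a single stenosis contrast were exported to a table that uses DESeq2's usual column names (log2FoldChange, stat, pvalue) plus a gene column. The function names, the toy rows, and the p-value floor are illustrative assumptions, not an established recipe.

```python
import math

# Toy rows standing in for an exported DESeq2 results table (values are made up).
rows = [
    {"gene": "GeneA", "log2FoldChange": "2.0", "stat": "5.0", "pvalue": "0.001"},
    {"gene": "GeneB", "log2FoldChange": "-1.5", "stat": "-3.0", "pvalue": "0.01"},
    {"gene": "GeneC", "log2FoldChange": "0.2", "stat": "0.5", "pvalue": "0.6"},
    {"gene": "GeneD", "log2FoldChange": "NA", "stat": "NA", "pvalue": "NA"},
]

METRICS = ("stat", "lfc", "signed_p")
P_FLOOR = 1e-300  # assumption: clamp p-values of 0 so log10 stays finite


def rank_genes(rows, metric="stat"):
    """Return (gene, score) pairs sorted from highest to lowest score."""
    if metric not in METRICS:
        raise ValueError(f"metric must be one of {METRICS}")
    ranked = []
    for row in rows:
        try:
            lfc = float(row["log2FoldChange"])
            if metric == "stat":
                score = float(row["stat"])
            elif metric == "lfc":
                score = lfc
            else:
                # Significance as magnitude, direction taken from the fold change.
                p = max(float(row["pvalue"]), P_FLOOR)
                score = math.copysign(-math.log10(p), lfc)
        except (KeyError, ValueError):
            continue  # skip genes with NA or missing values
        if math.isnan(score):
            continue
        ranked.append((row["gene"], score))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


def format_rnk(ranked):
    """Format ranked genes as two tab-separated columns: gene, score."""
    return "".join(f"{gene}\t{score}\n" for gene, score in ranked)


print(format_rnk(rank_genes(rows, "stat")))
```

The string returned by format_rnk is the two-column, tab-separated layout that GSEA Preranked reads from a .rnk file. Under this approach, each stenosis level compared against 0% would get its own ranked list.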
Thanks!
Alex