Gene Set Size and Expected Value


nhuas

Jan 15, 2021, 6:38:12 AM
to webgestalt
Hello, 
I am new to WebGestalt and to ORA/GSEA in general.
I was wondering how the "Gene Set Size" and the "Expected" value in the output are calculated.
Is the gene set size the number of genes that fall into a certain GO term and that are also present in the background or in the gene list of interest?

Furthermore, I wanted to ask what the best way to compare two gene lists is (unranked gene list 1 was treated with A and unranked gene list 2 was treated with B). I want to see whether certain pathways are enriched in one of the gene lists, but I am really unsure which background I should choose for this kind of analysis. The gene lists were generated by WES.

Any help is much appreciated!
Cheers

Yuxing Liao

unread,
Jan 15, 2021, 9:18:18 PM
to nhuas, webgestalt
In WebGestalt, the gene set size is the number of genes that fall into a certain GO term and that are also present in the background. Other tools may not take the intersection with the background.

Expect = (gene set size / #background) * #input
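
For readers who want to check these numbers by hand, here is a minimal Python sketch of the two quantities described in this reply. It is not WebGestalt's own code. The function and argument names are hypothetical, and it assumes that the input list is also restricted to the background before counting, which the reply does not state.

```python
def gene_set_size_and_expect(gene_set, background, input_genes):
    """Return (gene set size, expected overlap, observed overlap).

    Assumption: both the gene set and the input list are intersected
    with the background before counting.
    """
    background = set(background)
    gene_set_in_bg = set(gene_set) & background   # genes counted in "gene set size"
    input_in_bg = set(input_genes) & background   # assumed: input restricted to background

    size = len(gene_set_in_bg)
    # Expect = (gene set size / #background) * #input
    expect = size / len(background) * len(input_in_bg)
    observed = len(gene_set_in_bg & input_in_bg)
    return size, expect, observed


# Toy data, invented purely for illustration.
background = [f"g{i}" for i in range(100)]
gene_set = [f"g{i}" for i in range(10)] + ["x1", "x2"]   # x1, x2 are not in the background
input_genes = [f"g{i}" for i in range(5)] + [f"g{i}" for i in range(50, 65)]

size, expect, observed = gene_set_size_and_expect(gene_set, background, input_genes)
print(size, expect, observed)
```

In this toy case the two gene-set members outside the background are dropped, so the size is 10 rather than 12, and the expected overlap is 10 / 100 * 20.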

ORA is not for comparing two DE lists. You need a differential list derived from comparing two experimental conditions. Or, better, calculate a rank score and use GSEA.

Yuxing


nhuas

Jan 16, 2021, 4:35:57 PM
to webgestalt
Thank you so much for your quick response. This clears things up a lot!

Regarding my second question:
I do not have a list of DE genes. What I have are lists of genes from different biopsies, which carry different mutations (WES was used to determine the mutations).
My idea was to compare whether some pathways are more affected in biopsy A than in biopsy B.
Am I right in assuming that WebGestalt is suitable for that kind of analysis?
Furthermore, to perform such an analysis I would have to upload the gene list of biopsy A as my "gene list of interest" and the combination of the gene lists of biopsy A and biopsy B as my background. Is that correct?
Thank you very much!

Yuxing Liao

Jan 16, 2021, 9:36:53 PM
to nhuas, webgestalt
OK, I see. ORA is just a statistical test and is biology-agnostic. It assumes the gene list is already derived from a comparison of two conditions, and it tests whether the overlap with a gene set is larger than expected by chance. So it is not designed for your case. You can still analyze the lists one by one, where the background is all the genes that could potentially be measured. I am actually not sure what the best method is for pathway analysis with just two gene lists.
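
To make the "one by one" suggestion concrete, here is a minimal Python sketch of an over-representation test for a single gene list against a single gene set. It assumes a one-sided hypergeometric test, which is the usual choice for ORA but is not named in this thread. The function and parameter names are hypothetical and this is not WebGestalt's implementation.

```python
from math import comb


def ora_p_value(overlap, gene_set_size, background_size, input_size):
    """P(X >= overlap) when input_size genes are drawn at random,
    without replacement, from a background that contains
    gene_set_size genes belonging to the gene set.
    """
    total = comb(background_size, input_size)
    upper = min(gene_set_size, input_size)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(gene_set_size, i) * comb(background_size - gene_set_size, input_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Run once per list (biopsy A, then biopsy B), each against the same
# background of all genes that could potentially be measured.
p = ora_p_value(overlap=5, gene_set_size=10, background_size=100, input_size=20)
print(p)
```

Running this separately for each list gives one p-value per list and gene set. As the reply notes, that does not amount to a direct statistical comparison between the two lists.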
