help regarding input


mehwish wahid

May 22, 2023, 1:33:24 PM
to gsea-help
Hello,
I am doing RNA-seq analysis. While running GSEA, I saw online that an input file containing the columns log2FoldChange, lfcSE, stat, pvalue, padj, etc. for a defined comparison (treatedAsample vs. untreatedAsample), generated in R, can also be used.
My question is: what information will the phenotype file then contain?

I have run GSEA using the normalized expression values for the different samples, but my results do not match the results previously generated by my lab (I have to reproduce that work).

Can you please guide me?


Thank you,
Mehwish

mehwish wahid

May 22, 2023, 2:24:42 PM
to gsea-help
My question is: if we use these input values (log2FoldChange, lfcSE, stat, pvalue, padj), what does the phenotype file contain? The input file does not list the samples; instead, it holds the values mentioned above for each gene in the specified comparison.

Castanza, Anthony

May 22, 2023, 3:17:42 PM
to gsea...@googlegroups.com

Hi Mehwish,

For GSEA, we generally recommend an expression file containing the full normalized read counts for all expressed genes across all samples. The phenotype file then contains the mapping of each sample to the phenotypes being studied. See the file specs in our Data formats wiki: https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats
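As a rough illustration of how the two files relate, here is a minimal Python sketch that writes a GCT expression file and a matching categorical CLS phenotype file. The sample names, gene names, and values are made up, and the helper functions `write_gct` and `write_cls` are hypothetical; the Data formats wiki remains the authoritative description of both formats.

```python
import io


def write_gct(samples, rows, out):
    """Write a GCT expression table.

    rows is a list of (gene, description, values) tuples, with one value
    per sample in the same order as `samples`.
    """
    out.write("#1.2\n")                                # GCT version line
    out.write(f"{len(rows)}\t{len(samples)}\n")        # number of genes, number of samples
    out.write("NAME\tDescription\t" + "\t".join(samples) + "\n")
    for gene, description, values in rows:
        out.write(f"{gene}\t{description}\t" + "\t".join(str(v) for v in values) + "\n")


def write_cls(sample_classes, out):
    """Write a categorical CLS phenotype file.

    sample_classes holds one class label per sample, in the same order
    as the sample columns of the expression file.
    """
    class_names = list(dict.fromkeys(sample_classes))  # unique labels, first-seen order
    out.write(f"{len(sample_classes)} {len(class_names)} 1\n")
    out.write("# " + " ".join(class_names) + "\n")
    out.write(" ".join(sample_classes) + "\n")


# Hypothetical example data: four samples, two phenotypes, two genes.
samples = ["untreated_1", "untreated_2", "treated_1", "treated_2"]
labels = ["untreated", "untreated", "treated", "treated"]
rows = [
    ("GENE_A", "na", [10.5, 11.0, 25.2, 24.8]),
    ("GENE_B", "na", [7.1, 6.9, 3.0, 3.4]),
]

gct_buf = io.StringIO()
cls_buf = io.StringIO()
write_gct(samples, rows, gct_buf)
write_cls(labels, cls_buf)
print(gct_buf.getvalue())
print(cls_buf.getvalue())
```

The point of the sketch is that the phenotype file only makes sense alongside a per-sample expression table: each label in the CLS file corresponds, by position, to one sample column in the expression file.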

 

The file you’ve described appears to be the output of a differential expression analysis. You can take one of the metrics from that file and use it for GSEA Preranked, but that is generally considered an “advanced” analysis that we don’t provide much direct support for.
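For readers who do go the Preranked route, the following is a minimal Python sketch of turning a differential expression table into a two-column ranked list (gene, metric) of the kind GSEA Preranked takes as an RNK file. It assumes a tab-delimited table whose gene identifier column is named `gene`, and it ranks by the `stat` column; both the column name and the choice of metric are assumptions for illustration, not a recommendation from this thread. Note that no phenotype file is involved, because the comparison is already baked into the metric.

```python
import csv
import io
import math


def results_to_rnk(results_file, rnk_file, gene_col="gene", metric_col="stat"):
    """Convert a tab-delimited differential expression table to a ranked list.

    Rows whose metric is missing or not numeric (for example "NA") are skipped.
    Returns the list of (gene, value) pairs that were written.
    """
    reader = csv.DictReader(results_file, delimiter="\t")
    ranked = []
    for row in reader:
        try:
            value = float(row[metric_col])
        except (TypeError, ValueError):
            continue  # metric missing or not numeric
        if math.isnan(value):
            continue
        ranked.append((row[gene_col], value))
    # Order from most positive to most negative metric.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    for gene, value in ranked:
        rnk_file.write(f"{gene}\t{value}\n")
    return ranked


# Hypothetical input with made-up genes and values.
example = io.StringIO(
    "gene\tlog2FoldChange\tlfcSE\tstat\tpvalue\tpadj\n"
    "GENE_B\t-2.0\t0.5\t-4.0\t0.0001\t0.001\n"
    "GENE_A\t1.5\t0.3\t5.0\t0.000001\t0.00001\n"
    "GENE_C\tNA\tNA\tNA\tNA\tNA\n"
)
rnk_buf = io.StringIO()
ranked = results_to_rnk(example, rnk_buf)
print(rnk_buf.getvalue())
```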

 

Another factor that could be influencing your results is the use of a different version of MSigDB than was used in the original analysis. My advice is to contact the person who did the original analysis and get as much detail about what they did as possible (ideally the full results files, as these should contain some indication of the specific parameters used). Do be aware, though, that because the null distribution is generated randomly, GSEA results are expected to vary slightly from run to run. This can only be “fixed” by supplying the exact random seed that was used in the previous run.
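To illustrate why the seed matters, here is a toy Python sketch of a permutation-based p-value. It is not GSEA's algorithm; the function `permutation_pvalue`, the data, and the seed value are all made up. It only shows the general behavior: the null distribution is built from random shuffles, so the estimate is exactly repeatable only when the same seed is supplied.

```python
import random


def permutation_pvalue(values, labels, n_perm=1000, seed=None):
    """Estimate a two-sided p-value for a difference in group means.

    labels contains 0 or 1 for each value. The null distribution is built
    by shuffling the labels n_perm times with a seeded random generator.
    """
    rng = random.Random(seed)

    def mean_difference(current_labels):
        group1 = [v for v, lab in zip(values, current_labels) if lab == 1]
        group0 = [v for v, lab in zip(values, current_labels) if lab == 0]
        return sum(group1) / len(group1) - sum(group0) / len(group0)

    observed = abs(mean_difference(labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(mean_difference(shuffled)) >= observed:
            hits += 1
    return hits / n_perm


# Made-up data: five samples per group.
values = [5.1, 4.8, 5.5, 5.0, 4.9, 5.9, 6.1, 5.4, 6.0, 5.7]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

first_run = permutation_pvalue(values, labels, seed=149)
second_run = permutation_pvalue(values, labels, seed=149)
unseeded_run = permutation_pvalue(values, labels)

print(first_run == second_run)  # same seed, same shuffles, same estimate
print(unseeded_run)             # may differ slightly between invocations
```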

 

If you get the specific details about the previous run that you’re trying to replicate, I might be able to provide more detailed advice.

 

-Anthony

 

Anthony S. Castanza, PhD

Curator, Molecular Signatures Database

Mesirov Lab, Department of Medicine

University of California, San Diego
