Hi Felicia,
If your data has already been ranked by a metric such as LogFC, it needs to be formatted as a .RNK file (https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#RNK:_Ranked_list_file_format_.28.2A.rnk.29) and run through the GSEAPreranked function.
-Anthony
Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego
--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
gsea-help+...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/gsea-help/9a42155f-f8c6-46f4-b3b2-2567f670c831n%40googlegroups.com.
Hi,
I would recommend converting it programmatically so that the gene symbols are preserved. If that’s not possible, you can use Excel if you’re careful: import the data with the File > Import function and be sure to specify the gene symbol column as “Text” and not “General”. You’ll then want to save the reformatted file as tab-delimited text and change the file extension to .RNK.
Since it also looks like your genes are from mouse, you’ll need to use the Mouse_Gene_Symbol orthology CHIP file that matches the version of MSigDB you’re selecting gene sets from, with “Collapse” set, when running GSEA Preranked.
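For the programmatic conversion mentioned above, here is a minimal Python sketch. It assumes a tab-delimited input table with a header row; the function name `write_rnk` and the column names `gene_symbol` and `logFC` are placeholders for illustration, not anything GSEA defines. Because the csv module reads every field as plain text, symbols such as Sept2 or March1 are never reinterpreted as dates.

```python
import csv
import math


def write_rnk(in_path, out_path, gene_col="gene_symbol", score_col="logFC"):
    """Write a two-column, tab-delimited ranked list from a tab-delimited table.

    Column names are hypothetical defaults; adjust them to match your file.
    Returns the number of genes written.
    """
    rows = []
    with open(in_path, newline="") as src:
        reader = csv.DictReader(src, delimiter="\t")
        for row in reader:
            gene = (row.get(gene_col) or "").strip()
            try:
                score = float(row.get(score_col))
            except (TypeError, ValueError):
                continue  # skip genes without a numeric score (blank, "NA", ...)
            if not gene or not math.isfinite(score):
                continue  # skip blank symbols and NaN/inf scores
            rows.append((gene, score))

    # Sorted from most positive to most negative, purely for readability here.
    rows.sort(key=lambda pair: pair[1], reverse=True)

    with open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)
    return len(rows)
```

Calling something like `write_rnk("de_results.tsv", "ranked.rnk")` (both file names are examples) would produce a file that already carries the .rnk extension, so no renaming step is needed afterwards.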
That is the CHIP file for mouse symbols from the 7.0 release of MSigDB. The current release of MSigDB is 7.2, and the corresponding mouse CHIP file is: Mouse_Gene_Symbol_Remapping_Human_Orthologs_MSigDB.v7.2.chip
Hi Felicia,
Glad you were able to get the analysis to run! Yes, na_pos is the upregulated side of your list and na_neg is the downregulated side. In GSEA’s standard mode these would be populated by the phenotype data and directions from the CLS file, but in Preranked mode we don’t have that information, so we fill in placeholder values and base them solely on the directions supplied in the ranked list.
Feel free to reach out with other questions.
Hi Felicia,
Yes, you can output GSEA’s images in SVG format. The option is called “Create SVG plot images” and is found under “Advanced fields”.
One thing to note: you have to rerun GSEA in order to generate new plots, and this will result in a new random seed being used for permutation testing, which will cause some variance in the results. If you want to reproduce identical results, you will need to copy the random seed value from the index.html page of your results. It should be at the very bottom, under “Comments”, as “Timestamp used as the random seed: [value]”. Copy that value into GSEA’s “Seed for permutation” field, replacing where it says “timestamp”.
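If you would rather not scroll through the report by hand, the seed can be pulled out of index.html with a short Python sketch like the one below. It assumes the seed is a run of digits that directly follows the label text quoted above, with no markup in between; the function name `find_seed` is made up for this example.

```python
import re

# Assumption: the label is immediately followed by the numeric seed value.
SEED_PATTERN = re.compile(r"Timestamp used as the random seed:\s*(\d+)")


def find_seed(html_text):
    """Return the seed as an int, or None if the label is not found."""
    match = SEED_PATTERN.search(html_text)
    return int(match.group(1)) if match else None
```

You would read the report with something like `find_seed(open("index.html").read())` and paste the returned number into the seed field. If it returns None, the page layout differs from what this sketch assumes and the value needs to be copied manually.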
Hi Felicia,
If you open the GSEA output directory for that analysis run in your file browser, the SVG images should have been written there.
For a ranked list, all the genes present in the file need to have a numeric value.
For gct files being run through standard GSEA, not GSEA preranked, a sample with a missing value for a given gene should have that missing value left blank.
For a RNK file, you need to delete the entire row for any gene that has no value.
If you’ve done this and are still getting error messages, please include the full text of the error.
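As a rough illustration of that row deletion, the Python sketch below keeps only the lines of a ranked list that have a gene name and a finite numeric value in the second tab-separated column. The function name `drop_unscored_rows` is hypothetical. Note that a non-numeric header row would be dropped by this check as well, so handle it separately if your file has one.

```python
import math


def drop_unscored_rows(lines):
    """Keep only rows shaped like 'GENE<TAB>number'; drop everything else."""
    kept = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2 or not fields[0].strip():
            continue  # no score column at all, or no gene name
        try:
            score = float(fields[1])
        except ValueError:
            continue  # blank, "NA", or other non-numeric text
        if math.isfinite(score):
            kept.append("\t".join(fields[:2]))
    return kept
```

For example, passing the lines of a file that contains a row such as `Actb<TAB>` or `Gapdh<TAB>NA` returns the list without those rows, ready to be written back out with one row per line.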
The only way to delete error runs is to relaunch GSEA, which clears all runs from the session log.
For interpreting the enrichment plot: “phenotype” in the documentation refers to the phenotype parameters used when running in the standard GSEA mode. In GSEA Preranked this is just the order of the ranked list as you created it (so whichever side was the positive side of the distribution and was assigned to na_pos is the positive phenotype, and vice versa).
A “hit” refers to a gene in the gene set that is also in your ranked list.
This is caused by one of a couple different errors;
2 and 3 are technically possible causes of the root issue which in turn cause #1.
GSEA expects a dataset of 10,000 to 20,000 (or more) genes. This corresponds to all of the expressed genes of an entire microarray or RNA-seq experiment. 25 genes is not sufficient to run GSEA.
Hi Ram,
Please double-check for hidden file extensions. This error occurs when a RNK file has inadvertently been loaded as a .TXT file. Once it is loaded with the correct parser, it should show up under the GSEA Preranked option in the left sidebar.
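One quick way to spot a hidden extension is to list the real file names from a script instead of relying on the file browser. The Python sketch below flags files such as ranked.rnk.txt, where ".rnk" appears in the name but is not the final extension. The function name `find_disguised_rnk` is only illustrative.

```python
from pathlib import Path


def find_disguised_rnk(folder):
    """Return names of files that contain '.rnk' but end in another extension."""
    suspects = []
    for path in Path(folder).iterdir():
        suffixes = [s.lower() for s in path.suffixes]
        if path.is_file() and ".rnk" in suffixes and suffixes[-1] != ".rnk":
            suspects.append(path.name)
    return sorted(suspects)
```

Any name it returns should be renamed so that .rnk is the last extension before the file is loaded into GSEA again.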
-Anthony