Create a .rnk file

Marina Canyelles

Aug 30, 2022, 1:09:09 PM
to gsea-help
Hello,
I created a ranked gene list for running GSEA-P, but when I tried to upload it into the software, an error message appeared.
I created the list in Excel, saved it as .txt, and manually changed the extension to .rnk.
Can you help me?
Thank you

Anthony Castanza

Aug 30, 2022, 1:11:36 PM
to gsea...@googlegroups.com

Hello,

Could you please provide the exact text of the error message you received?

The specification for the .RNK file format is here: https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#RNK:_Ranked_list_file_format_.28.2A.rnk.29

A common error is having a header row that does not begin with a # character, which can cause the software to attempt to interpret the header text as a number (which fails).
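As a rough illustration of the header pitfall, here is a minimal Python sketch that flags lines likely to fail parsing. It assumes the two-column, tab-delimited layout (feature name, then numeric score) described in the linked specification. `check_rnk_lines` is a hypothetical helper, not part of GSEA, and skipping blank lines is an assumption of this sketch.

```python
def check_rnk_lines(lines):
    """Return (line_number, problem) pairs for a two-column ranked list.

    Hypothetical helper for illustration only; not part of GSEA.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not text.strip() or text.startswith("#"):
            continue
        fields = text.split("\t")
        if len(fields) != 2:
            problems.append(
                (number, f"expected 2 tab-separated fields, found {len(fields)}")
            )
            continue
        try:
            float(fields[1])  # the score column must parse as a number
        except ValueError:
            problems.append((number, f"score {fields[1]!r} is not a number"))
    return problems


# Header row without a leading '#': 'Beta' cannot be read as a number.
bad = ["GeneSymbol\tBeta\n", "TP53\t0.42\n", "BRCA1\t-0.17\n"]
# Same content with the header row commented out.
good = ["#GeneSymbol\tBeta\n", "TP53\t0.42\n", "BRCA1\t-0.17\n"]

print(check_rnk_lines(bad))   # reports a problem on line 1
print(check_rnk_lines(good))  # reports no problems
```

To check a real file, the lines could be read with something like `open("my_list.rnk", encoding="utf-8").readlines()` (the file name is a placeholder).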

 

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gsea-help/85d87885-373d-4fe6-8d31-a8f70e0004fbn%40googlegroups.com.

Marina Canyelles

Aug 30, 2022, 1:30:18 PM
to gsea...@googlegroups.com
[Attached image: image.png]

In the first row I put only a # just above the gene list.
Could it be a problem that the Betas use a comma instead of a dot as the decimal separator?

Thank you


Anthony Castanza

Aug 30, 2022, 1:31:38 PM
to gsea-help
GSEA expects a dot, not a comma, as the decimal separator, so yes, that could definitely be the issue.
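For a list exported from Excel with decimal commas, a minimal Python sketch of the conversion might look like the following. It assumes a tab-delimited export with the gene in the first column and the score in the second. `to_rnk_text` is a hypothetical helper, not a GSEA function, and commenting out a non-numeric first row as a header is a choice made for this sketch.

```python
def to_rnk_text(raw_text):
    """Convert tab-delimited 'gene<TAB>score' text with decimal commas
    into .rnk-style text with decimal dots.

    Hypothetical helper for illustration only; not part of GSEA.
    """
    out_lines = []
    seen_data = False
    for line in raw_text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines are dropped
        if line.startswith("#"):
            out_lines.append(line)  # keep existing comment/header lines
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"expected gene and score, got: {line!r}")
        gene, score = fields[0].strip(), fields[1].strip()
        try:
            value = float(score.replace(",", "."))
        except ValueError:
            if seen_data:
                raise ValueError(f"non-numeric score for {gene!r}: {score!r}")
            # A non-numeric first row is treated as a header and commented out.
            out_lines.append(f"#{gene}\t{score}")
            continue
        seen_data = True
        out_lines.append(f"{gene}\t{value}")
    return "\n".join(out_lines) + "\n"


raw = "Gene\tBeta\nTP53\t0,42\nBRCA1\t-0,17\n"
print(to_rnk_text(raw), end="")
```

The returned text can then be written to a file whose name ends in .rnk. Values that also use a thousands separator (for example `1.234,5`) are rejected by this sketch instead of being guessed at.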


-Anthony


Marina Canyelles

Sep 8, 2022, 4:04:37 PM
to gsea...@googlegroups.com
Hello,

Which would be the best parameter to use for ranking a gene list obtained from a GWAS?
I used the Beta, but I'm not sure whether it would be better to use the p value. I could upload a list with 3 columns (Gene Symbol, Beta, p Value) instead of 2 in order to give the analysis more power.
I ran a GSEAPreranked analysis of almost 17,000 genes with their Betas, but I obtained only 3 gene sets with FDR < 25%.

Thank you in advance for your help

Best,

Marina

Anthony Castanza

Sep 9, 2022, 11:41:17 PM
to gsea-help
Hi Marina,

Unfortunately, I really can't give a hard answer here because GSEA isn't really designed to be run with GWAS data. The general expectation is that the ranking metric correlates with the biological impact of the feature (i.e., gene) being ranked, so I think the Beta might be most appropriate. It is difficult to say why you're getting so few significant sets without a much deeper dive into the data than we're really able to do here.

Sorry I couldn't be of more help

-Anthony


Marina Canyelles

Oct 5, 2022, 1:07:30 PM
to gsea...@googlegroups.com
Hello,

Thank you for your response.
I ran the analysis and would like to know whether it is possible to obtain the real nominal p value. The software reports a nominal p value of 0.000.

Thank you again for your time
Best regards

Anthony Castanza

Oct 5, 2022, 1:44:56 PM
to gsea...@googlegroups.com
Hi Marina,

GSEA generates the pValue from an empirical null distribution generated by the permutation method. If no permutation in the null distribution ever yields a stronger enrichment score than the observed one, GSEA will return a true nominal pValue of 0. This isn't a mistake or a rounding error; it is the accurate value derived from the observed distribution. You can attempt to resolve this to a finer level of detail by increasing the number of permutations (the default is 1000; you can try increasing this to 10000), but there are no guarantees that this will change the value observed.
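To make the "true 0" concrete, here is a simplified Python sketch of an empirical p value computed from a permutation null. It is not GSEA's actual code. `empirical_p_value` and the toy null distribution are hypothetical, and comparing only against null scores of the same sign as the observed score is an assumption of this sketch.

```python
def empirical_p_value(observed_es, null_es):
    """Fraction of same-signed null scores at least as extreme as observed.

    Simplified illustration only; not GSEA's implementation.
    """
    if observed_es >= 0:
        same_side = [es for es in null_es if es >= 0]
        extreme = [es for es in same_side if es >= observed_es]
    else:
        same_side = [es for es in null_es if es < 0]
        extreme = [es for es in same_side if es <= observed_es]
    if not same_side:
        return float("nan")  # no null scores on this side to compare with
    return len(extreme) / len(same_side)


# Toy null: 1000 evenly spaced scores from -0.5 up to 0.499 (made-up data).
null = [i / 1000 for i in range(-500, 500)]

# No null score reaches 0.6, so the empirical p value is exactly 0.
print(empirical_p_value(0.6, null))
# Some null scores reach 0.45, so the p value is a nonzero fraction.
print(empirical_p_value(0.45, null))
# Smallest nonzero p value this null can resolve.
print(1 / len([es for es in null if es >= 0]))
```

With N same-signed null scores, the smallest nonzero value such an estimate can take is 1/N. This is why more permutations give finer resolution but cannot guarantee a nonzero result.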

-Anthony


Marina Canyelles

Oct 31, 2022, 4:53:28 PM
to gsea...@googlegroups.com
Hello,

Thank you for your response.
Is the p value that GSEA reports calculated from the ES or from the NES? The GSEA user guide says "The nominal p value estimates the statistical significance of the enrichment score for a single gene set. However, when you are evaluating multiple gene sets, you must correct for gene set size and multiple hypothesis testing"; however, the Supplementary Material of Subramanian et al. 2005 (PNAS) says "glob.p.val: A global nominal P value for each gene set’s NES estimated by the percentage of all (S, π) with NES(S, π) ≥ NES(S). Theoretically, for a given level of significance (e.g., 0.05), this quantity measures whether the shift of the tail of the distribution of observed values is extreme enough to declare the observed distribution as different from the null"

Thank you again
Best, 

Marina

Anthony Castanza

Oct 31, 2022, 5:07:09 PM
to gsea...@googlegroups.com
Hi Marina,

The NOM pValue reported by GSEA is based on the ES, not the NES. The Global Nom pValue referred to in the paper was replaced by the FDR.

-Anthony
