error in importing expression file, .txt format


PAULA DIANA

Jan 30, 2021, 5:16:16 PM
to gsea-help
Hello,
I am having great difficulty importing an expression file (.txt) into the GSEA desktop.

The program gave me the following error message:
---- Complete error message ----
Errors occurred: ERROR (S) #: 1
Analysis problems
java.lang.NumberFormatException: ...

The file follows all the requirements for .txt formatting, according to: https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats



image.png

Anthony Castanza

Jan 30, 2021, 9:35:52 PM
to gsea...@googlegroups.com

Hi Paula,

 

It looks like somewhere in your text file there are values for a gene’s expression of “na”. GSEA supports missing values, but they need to be left blank, not replaced with filler text (like NA, N/A, na, etc.). We’re working on a fix for this, but in the meantime just replace your na values with blanks and the data should load.
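A minimal sketch of that replacement in Python, assuming a tab-delimited file whose first two columns are NAME and Description. The function name `blank_missing_values` and the list of filler tokens are illustrative assumptions, not part of GSEA.

```python
# Tokens treated as "missing"; this list is an assumption, extend it as needed.
MISSING_TOKENS = {"na", "n/a", "nan", "null"}


def blank_missing_values(text, n_label_columns=2):
    """Blank out filler tokens in the value columns of a tab-delimited table.

    The first `n_label_columns` columns (assumed to be NAME and Description)
    are left untouched, so a Description of "NA" is preserved.
    """
    lines = text.splitlines()
    if not lines:
        return ""
    cleaned = [lines[0]]  # header row is copied unchanged
    for line in lines[1:]:
        fields = line.split("\t")
        labels = fields[:n_label_columns]
        values = fields[n_label_columns:]
        # Replace filler text with an empty field; keep everything else as-is.
        values = ["" if v.strip().lower() in MISSING_TOKENS else v for v in values]
        cleaned.append("\t".join(labels + values))
    return "\n".join(cleaned) + "\n"
```

To apply it to a file, read the contents with `open(path).read()`, pass them through the function, and write the result to a new file so the original stays intact.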

 

Let me know if this doesn’t work for you,

 

-Anthony

 

Anthony S. Castanza, PhD

Curator, Molecular Signatures Database

Mesirov Lab, Department of Medicine

University of California, San Diego

http://gsea-msigdb.org/


PAULA DIANA

Jan 31, 2021, 11:32:40 AM
to gsea...@googlegroups.com
Hi Anthony. 
Thanks for your help.

Unfortunately, the error persists. I removed the missing values from my file, but I get the same error when I try to load it into GSEA.
The problem must be in my "Description" column. According to the .txt format, this column must be present. I don't have any description information, so I filled it with "NA", as described on the GSEA website.
I'm sending a preview of my expression file for you to check ;/


Thank you
Paula.

log_norm_counts.txt

Anthony Castanza

Jan 31, 2021, 1:32:44 PM
to gsea...@googlegroups.com
Hi Paula,

The treatment of the description field you describe should be fine. Looking at your file it appears that there are quotation marks around all the text fields. This frequently occurs when files are written out from R without setting quote=FALSE in the write.table() command. The quotation marks prevent GSEA from parsing the text fields properly. Try removing the quotes and then give it another go. Let me know if that works.
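For a file that has already been written with quotes, here is a rough sketch of stripping them in Python. It assumes a tab-delimited file with standard double-quote quoting, and `strip_quotes` is a hypothetical helper name. If the file came from R, re-exporting with quote=FALSE as described above avoids the problem at the source.

```python
import csv
import io


def strip_quotes(text):
    """Return tab-delimited text with the surrounding double quotes removed.

    csv.reader interprets the quoting, so each field comes back unquoted;
    the rows are then re-joined with plain tabs.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar='"')
    return "".join("\t".join(row) + "\n" for row in reader)
```

For example, a row like `"GeneA"<TAB>"NA"<TAB>5.2` comes back as `GeneA<TAB>NA<TAB>5.2`.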


-Anthony


Anthony Castanza

Jan 31, 2021, 1:36:04 PM
to gsea...@googlegroups.com
One other thing: the file name you sent indicates that the data has been log transformed. We don't really recommend log transforming data for GSEA, at least not with the default ranking metrics. The transformation compresses the signal and can reduce GSEA's detection power. If you have access to non-log-transformed (but still normalized) counts, you might get a better result.


-Anthony


PAULA DIANA

Feb 1, 2021, 12:25:49 PM
to gsea...@googlegroups.com
Hi Anthony,
I changed how the file is saved and switched to the normalized (not log-transformed) expression data, and it worked.

I appreciate your help :)
