Gaj Stan (BIGCAT)
Jul 4, 2013, 5:34:47 AM
to go-e...@googlegroups.com
Hello all,
I am having trouble generating custom gene lists to analyze with GO-Elite v1.2.5-Py. The online documentation refers twice to a 'custom gene set' and mentions only once that it can be either a txt file containing 2 columns or one of the other supported formats (e.g. gpml). However, I have tried several approaches (explained below), most of which end with a "<FILE> not formatted properly" error.
My main aim is to perform an over-representation analysis (ORA) on several custom gene lists that I generated. These gene lists are based on EntrezGene IDs.
Below are the attempts I made:
1) After examining an existing relationship file in the /database/ directory (e.g. EntrezGene-KEGG.txt), I created a table with the same 3 columns: ID, a column with an empty header (which is essentially the SystemCode), and OntologyID. An example is shown below:
EntrezGene OntologyID
1244 En GeneList1
182 En GeneList1
80270 En GeneList1
954 En GeneList1
9971 En GeneList2
1581 En GeneList2
4853 En GeneList3
...
GO-Elite appeared to run fine (no error message was displayed on screen) and a custom_gene_set.txt file was generated in a new folder. However, there were no results, and the custom_gene_set.txt file was empty apart from the three column headers.
2) I created a separate file per gene list containing only the ID and SystemCode. This resulted in a "<FILE> not formatted properly" error. The content of such a file looks like this:
EntrezGene SystemCode
1244 En
182 En
...
3) A variation of (1) with the 2nd column removed. This again resulted in a "not formatted properly" error.
4) I followed the structure of the generated custom_gene_set.txt file, but then got the error "1244 is not a valid systemcode".
Therefore, I assume that the first approach is closest to the solution, but I wonder what I did wrong. Does anyone have suggestions on how to proceed? Could it be that custom gene sets rely on Ensembl annotations rather than EntrezGene?
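For reference, the three-column layout from attempt (1) can be reproduced with a short script along these lines. This is only a sketch: the function name `write_relationship_table`, the in-memory buffer, and the assumption that the columns are tab-separated are illustrative and not taken from the GO-Elite documentation. The gene lists contain only the IDs shown above.

```python
import csv
import io


def write_relationship_table(gene_lists, handle, system_code="En"):
    """Write one row per gene: ID, system code, gene list name.

    Hypothetical helper; the tab delimiter is an assumption about
    what GO-Elite expects, not a documented requirement.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # Header mirrors attempt (1): the middle column name is left empty.
    writer.writerow(["EntrezGene", "", "OntologyID"])
    for list_name, gene_ids in gene_lists.items():
        for gene_id in gene_ids:
            writer.writerow([gene_id, system_code, list_name])


# Only the IDs shown in the example table; the real lists are longer.
gene_lists = {
    "GeneList1": ["1244", "182", "80270", "954"],
    "GeneList2": ["9971", "1581"],
    "GeneList3": ["4853"],
}

buffer = io.StringIO()
write_relationship_table(gene_lists, buffer)
table_text = buffer.getvalue()
print(table_text)
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` in place of the buffer.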
Best,
-- Stan