GO-Elite v1.19 - Minor bug using Custom Affy IDs and denominator gene list


Gaj Stan (BIGCAT)

Oct 3, 2008, 8:05:12 AM
to go-e...@googlegroups.com

Hi Nathan,

 

As the topic suggests, I’m currently importing human Custom CDF Affymetrix IDs into GO-Elite. These custom CDFs are based on EntrezGene annotation, so it’s fairly easy to strip the “_at” suffix from the IDs (perhaps a feature that could be implemented too?).

 

I’ve received some data from a colleague of mine and running them on both Linux and Windows gave the same error. In total there are 9 experimental conditions she wants to check out. I’ve put the input files online at http://ftp2.bigcat.unimaas.nl/~stan.gaj/goelite/bugs/GOElite_119_IDNotFoundInDenominatorSet.zip

 

I altered these input files so that they now contain ProbeID, SystemCode and Mean columns. As mentioned before, I removed the “_at” suffix from the custom Affy IDs to obtain the EntrezGene IDs and used SystemCode “L”, as is used in the GenMAPP gene relationship database.
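For readers preparing similar files, the conversion described above could be sketched roughly as follows. This is only an illustration, not part of GO-Elite: the function names `strip_at_suffix` and `to_goelite_row` are hypothetical, and the sketch assumes every custom CDF ID is an EntrezGene ID followed by "_at".

```python
def strip_at_suffix(probe_id: str) -> str:
    """Return the ID without a trailing '_at' and without surrounding spaces."""
    probe_id = probe_id.strip()
    if probe_id.endswith("_at"):
        return probe_id[: -len("_at")]
    return probe_id


def to_goelite_row(probe_id: str, mean: float, system_code: str = "L") -> str:
    """Build one tab-delimited row: ProbeID, SystemCode, Mean.

    The column order follows the post above; "L" is the EntrezGene
    system code mentioned there.
    """
    return "\t".join([strip_at_suffix(probe_id), system_code, str(mean)])


if __name__ == "__main__":
    print("ProbeID\tSystemCode\tMean")  # header row
    print(to_goelite_row("153579_at", 1.5))
```

Stripping whitespace in the same step also guards against stray spaces around the IDs.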

 

Linux:

After choosing default parameters (including EntrezGene as primary database), I get an error saying: “Identifier: 153579 not found in Denominator set”, and “WARNING!! Job stopped... Denominator Gene List does not match the input gene list for 1.GOElite_wat_visolie_sig.txt” and the program quits. I find this error very strange, because if I look up that identifier in the Denominator list, it is present there.

 

Any suggestions?

 

  -- Stan

 

--------------------------------------------------

 

 


Nathan Salomonis

Oct 4, 2008, 10:47:06 AM
to go-e...@googlegroups.com
Hi Stan,

As you mentioned, everything sounds like it's done correctly. I'm having problems accessing the link you sent and can't find either http://ftp2.bigcat.unimaas.nl or ftp://ftp2.bigcat.unimaas.nl.  Usually the cause of the error you see is blank spaces in the IDs of either the denominator or the input file, unless there is some other formatting issue (OS-specific text formatting rather than simple tab-delimited text).
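A quick way to check a file for the two causes mentioned here (stray spaces in the ID column and OS-specific line endings) might look like the sketch below. It is a standalone helper, not GO-Elite code; the name `find_id_problems` is hypothetical, and it assumes the ID is the first tab-delimited column.

```python
def find_id_problems(raw: bytes) -> list:
    """Report line-ending style and IDs that contain whitespace."""
    problems = []
    # Line endings other than a plain "\n" hint at OS-specific formatting.
    if b"\r\n" in raw:
        problems.append("Windows (CRLF) line endings")
    elif b"\r" in raw:
        problems.append("classic Mac (CR) line endings")

    text = raw.decode("utf-8", errors="replace")
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line:
            continue
        first = line.split("\t")[0]  # assumed ID column
        if first != first.strip() or " " in first:
            problems.append(f"line {lineno}: whitespace in ID {first!r}")
    return problems


if __name__ == "__main__":
    sample = b"ID\tSystemCode\n153579 \tL\n338339\tL\n"
    for problem in find_id_problems(sample):
        print(problem)
```

Reading the file in binary mode (`open(path, "rb").read()`) keeps the original line endings visible to the check.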

Can you just send the two files as a zip (remove .zip from the name)?
NS

Nathan Salomonis

Oct 4, 2008, 10:53:46 PM
to go-e...@googlegroups.com
Hey Stan,
 
I was able to download the files on another computer and analyze them. I am glad you brought this up because I may not explicitly address this issue anywhere in the documentation.  The first issue was that the denominator file did not have a header row, so the first gene ID in the denominator file is ignored (which is the ID reported as missing). When I add a header row, this file processes just fine. Another file ("8.GOelite_pbmc_fibraat_sig") gave the same error, but in that case the gene ID (338339) was actually missing from the denominator file.
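Both situations can be checked before running a job. The sketch below only imitates the behavior described above (the first row of each file is treated as a header and skipped); it is not GO-Elite's actual code. The function names are hypothetical, and the header heuristic assumes EntrezGene IDs, which are purely numeric.

```python
def read_ids(lines, has_header=True):
    """Collect IDs from the first tab-delimited column, skipping blank lines."""
    ids = [ln.split("\t")[0].strip() for ln in lines if ln.strip()]
    return set(ids[1:] if has_header else ids)


def looks_like_missing_header(lines):
    """Guess that the header is missing if the first field is all digits."""
    for ln in lines:
        if ln.strip():
            return ln.split("\t")[0].strip().isdigit()
    return False


def missing_from_denominator(input_lines, denominator_lines):
    """Return input IDs absent from the denominator, first rows skipped."""
    return sorted(read_ids(input_lines) - read_ids(denominator_lines))


if __name__ == "__main__":
    input_lines = ["ProbeID\tSystemCode\tMean", "153579\tL\t1.2"]
    no_header = ["153579\tL", "338339\tL"]
    print(looks_like_missing_header(no_header))
    print(missing_from_denominator(input_lines, no_header))
```

With a headerless denominator, the first ID is swallowed as the header and then reported as missing, which matches the symptom in this thread.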
 
Also, while it doesn't hurt to have the number preceding the filename, it is no longer necessary unless you have two denominator files (e.g., 1. and 2.) and you want specific files to match these denominators (by having the same preceding number in the name). Also, make sure you have the latest version of GO-Elite (9-27-08 or 9-30-08 date of GO_Elite.exe).
 
Best,
Nathan


Gaj Stan (BIGCAT)

Oct 5, 2008, 12:15:33 PM
to go-e...@googlegroups.com
Hey Nathan,

Thanks again for your quick reply. I'm feeling utterly stupid now. (-;

I received the files from someone who uses a Mac. All files she sent to me contained that extra space in each column. I'm not a Mac user and I don't know for sure whether this is introduced when exporting the files to a .txt file. I thought I had corrected all <space> issues before I started the job, but it seems I was wrong.

And I was aware of the necessary header row in the raw input files, but not in the denominator list. Thanks for pointing that out!

Best wishes,

-- Stan