Latest CHEBI.obo causes problems for Phenote and OBO-Edit

Nomi Harris

May 18, 2010, 2:00:41 PM
to obo-ed...@googlegroups.com, OBO phenote dev, Nomi Harris
Many of the Phenote configurations require the CHEBI ontology.
Yesterday, I found that the version of CHEBI.obo from 07:04:2010
13:24 (which was the one on the local mirror, which hadn't yet been
updated to the May 3rd version) had a line with too many quotes that
broke the OBO/Phenote parser. The new version of CHEBI.obo (dated
03:05:2010 06:47) doesn't have the quote problem, but it has other
issues. Phenote runs out of memory while trying to load it. OBO-Edit
errors out:

Error: 714 unrecognized parent terms:
line 170212: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 162966: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 126140: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 104676: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 113278: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 105168: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 298889: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 226894: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 75120: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 177451: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 208588: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 217779: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 132608: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 226730: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 282613: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 243458: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 97117: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 121000: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 76530: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
line 177910: CHEBI:648589 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo

on line: 177910 of file:/Users/nomi/Documents/workspace/Phenote/CHEBI.may3.obo
charnum: -1
line: relationship: is_conjugate_base_of CHEBI:648589
    at org.obo.dataadapter.DefaultOBOParser.endParse(DefaultOBOParser.java:1594)
    at org.obo.dataadapter.AbstractParseEngine.parse(AbstractParseEngine.java:74)
    at org.obo.dataadapter.OBOFileAdapter.doOperation(OBOFileAdapter.java:268)
    at org.bbop.dataadapter.DataAdapterOperationTask.execute(DataAdapterOperationTask.java:43)
    at org.bbop.util.AbstractTaskDelegate.run(AbstractTaskDelegate.java:60)
    at org.bbop.swing.BackgroundEventQueue$BackgroundEventThread.executeTask(BackgroundEventQueue.java:137)
    at org.bbop.swing.BackgroundEventQueue$BackgroundEventThread.run(BackgroundEventQueue.java:79)
org.bbop.dataadapter.DataAdapterException: Load Error, line 177910
    at org.obo.dataadapter.OBOFileAdapter.doOperation(OBOFileAdapter.java:280)
    at org.bbop.dataadapter.DataAdapterOperationTask.execute(DataAdapterOperationTask.java:43)
    at org.bbop.util.AbstractTaskDelegate.run(AbstractTaskDelegate.java:60)
    at org.bbop.swing.BackgroundEventQueue$BackgroundEventThread.executeTask(BackgroundEventQueue.java:137)
    at org.bbop.swing.BackgroundEventQueue$BackgroundEventThread.run(BackgroundEventQueue.java:79)


Any thoughts on what to do? As a temporary fix for Phenote, I can
bundle with it a fixed version of the April 7th CHEBI.obo (I only had
to change that one line) and set the configurations so they don't
download a new version of CHEBI.obo from the repository. That's
obviously not a good long-term solution, though.
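
For anyone who wants to check a CHEBI.obo download before handing it to Phenote or OBO-Edit, here is a minimal Python sketch of the kind of check that fails above: it collects every `id:` in the file and reports `is_a:` and `relationship:` targets that are never defined. This is not the OBO-Edit parser's logic, only an approximation of it. The function name and the sample stanzas are made up for illustration, and the sketch ignores imports, obsolete terms, and other details of the OBO format.

```python
import re
from collections import defaultdict

# "id: CHEBI:123" at the start of a line (does not match "alt_id:").
ID_PATTERN = re.compile(r"^id:\s*(\S+)")
# "is_a: TARGET" or "relationship: TYPE TARGET"; trailing "! comments" are ignored.
PARENT_PATTERN = re.compile(r"^(?:is_a:\s*|relationship:\s*\S+\s+)(\S+)")


def find_dangling_parents(lines):
    """Map each referenced-but-undefined ID to the line numbers that reference it."""
    defined = set()
    references = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        id_match = ID_PATTERN.match(line)
        if id_match:
            defined.add(id_match.group(1))
            continue
        parent_match = PARENT_PATTERN.match(line)
        if parent_match:
            references.append((line_number, parent_match.group(1)))

    # Resolve only after the whole file is read, since a parent may be
    # defined later in the file than the term that points to it.
    dangling = defaultdict(list)
    for line_number, target in references:
        if target not in defined:
            dangling[target].append(line_number)
    return dict(dangling)


# Hypothetical two-term sample; only the last line mirrors the error above.
sample = """[Term]
id: CHEBI:1
name: example acid

[Term]
id: CHEBI:2
name: example base
is_a: CHEBI:1
relationship: is_conjugate_base_of CHEBI:648589
""".splitlines()

print(find_dangling_parents(sample))
```

To run it on a real file, pass an open file handle instead of `sample`, e.g. `find_dangling_parents(open("CHEBI.may3.obo", encoding="utf-8"))`. That call reads the file line by line, so memory use is dominated by the set of IDs.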

Nomi

Nomi Harris

May 20, 2010, 12:56:15 PM
to obo-ed...@googlegroups.com, OBO phenote dev, Nomi Harris
I hate to do this, but for now I'm going to just commit to the Phenote
repository the last CHEBI.obo that doesn't break OBO-Edit and Phenote
(my fixed version of the 7 April 2010 release, with the extra quotes
removed from line 287172), and make the configs that use CHEBI get it
locally rather than downloading a new one.

I think the OBO parser was just not designed to handle such large
files, and fixing this out-of-memory problem is going to involve a
major rewrite. But I would be happy to be wrong about that!

Nomi

Amina Abdulla

May 26, 2010, 4:50:30 PM
to obo-ed...@googlegroups.com, OBO phenote dev, Nomi Harris
Hi Nomi,
Sorry about the late response. I was out of the office.

The problem here is that you have not assigned enough memory to
Phenote for it to load all the ontologies correctly. Can you
tell me what your memory settings are?

OBO-Edit loads chebi quickly with -Xmx1400M:

CheckMemoryThread: max heap size = 1389 MB; warn if available memory < 138 MB
Loading took 4055 ms
Reading [http://obo.cvs.sourceforge.net/*checkout*/obo/obo/ontology/chemical/chebi.obo]
(allowDangling = false)
(followImports = true)
Done parsing file
Reloaded OTE in 424 ms (expanding took 0 ms)

1.4 gigs is a reasonable amount, and most systems now are equipped to
handle that kind of requirement. Let me know which other ontologies
Phenote users need to load and I can get back to you with some
recommended memory settings.
Hope this helps,
Amina

Nomi Harris

May 27, 2010, 5:08:01 PM
to Amina Abdulla, obo-ed...@googlegroups.com, OBO phenote dev
No, I have OBO-Edit using 1600M of memory--that wasn't the problem.

The problem turns out to be due to CHEBI, and not due to OBO-Edit or
Phenote.

I was puzzled that Amina was able to load http://obo.cvs.sourceforge.net/*checkout*/obo/obo/ontology/chemical/chebi.obo
(which is dated 03:05:2010 06:47) in OBO-Edit and yet I couldn't
load my saved copy of chebi.obo with the same date. Well, it turned
out that the CHEBI folks have changed something in chebi.obo
(actually, a whole bunch of lines, though they're all the same thing)
without changing the date stamp, so the version I had saved on May
18th is not the same as the one at that URL. The new one at the URL
does indeed load successfully in OBO-Edit. And Phenote can now load
this latest CHEBI.obo as well, so I can stop making Phenote use the
April version of CHEBI.obo.

When you depend on outside sources for updated data files, you're
always vulnerable to problems when the owners release a buggy
version. Still, I think this incident confirms my assertion that
it's very important to have a unique date stamp and/or version number
in each ontology file. Otherwise, one can end up wasting a lot of
time trying to figure out why things seem broken one way (loading a
saved ontology file) and yet work another way (via the URL).
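
Until every release carries a unique stamp, one workaround is to compare a saved copy against the current download by content hash rather than by the `date:` header. Here is a minimal Python sketch of that idea. The function names are hypothetical, and the two sample files are invented stand-ins that share a date stamp but differ in one line; they are not the actual change CHEBI made.

```python
import hashlib
import re

# Matches the OBO header tag "date: ..." at the start of a line only,
# so tags such as "creation_date:" are not picked up.
DATE_HEADER = re.compile(r"^date:\s*(.+)$", re.MULTILINE)


def header_date(obo_text):
    """Return the value of the first 'date:' header tag, or None if absent."""
    match = DATE_HEADER.search(obo_text)
    return match.group(1).strip() if match else None


def content_fingerprint(obo_text):
    """SHA-256 of the file content; changes whenever any byte changes."""
    return hashlib.sha256(obo_text.encode("utf-8")).hexdigest()


def describe_difference(saved_text, current_text):
    """Classify how a saved copy relates to the currently published file."""
    if content_fingerprint(saved_text) == content_fingerprint(current_text):
        return "identical"
    if header_date(saved_text) == header_date(current_text):
        return "content changed but date stamp did not"
    return "different release"


# Invented samples: same date stamp, one differing relationship line.
saved = (
    "format-version: 1.2\ndate: 03:05:2010 06:47\n\n"
    "[Term]\nid: CHEBI:2\nrelationship: is_conjugate_base_of CHEBI:648589\n"
)
current = (
    "format-version: 1.2\ndate: 03:05:2010 06:47\n\n"
    "[Term]\nid: CHEBI:2\nrelationship: is_conjugate_base_of CHEBI:1\n"
)

print(describe_difference(saved, current))
```

Storing the fingerprint next to each cached ontology file would make this kind of silent update visible right away, whatever the header says.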

I'll mention this to the CHEBI folks.

Nomi
