[Obo-format] Problem converting from OWL to OBO


Steffen Neumann

Mar 4, 2015, 10:37:59 AM
to obo-f...@lists.sourceforge.net, Philippe Rocca-Serra
Dear all,

In the EU project http://www.cosmos-fp7.eu/
we're developing a controlled vocabulary / ontology
http://nmrml.org/cv/ for use in metabolomics / NMR.

The current nmrCV ontology is located at
https://github.com/nmrML/nmrML/blob/master/ontologies/nmrCV.owl

Since we are using tools from the http://open-ms.sourceforge.net/
team, we need to convert the nmrCV.owl into the OBO format nmrCV.obo.
Generally, this should work, as we do not use
any concepts from OWL which are not available in OBO.

We have done the conversion successfully through the export in Protege,
but since we need this in an automated manner, Philippe Rocca-Serra
recommended in https://github.com/nmrML/nmrML/issues/42#issuecomment-25201303
to use obolib-owl2obo.

However, I get some weird exceptions (https://github.com/nmrML/nmrML/issues/43)
which I am unable to interpret. This happened both with the obolib SVN checkout
from September 2013 and with the version updated today:

I use: ./obolib-owl2obo nmrCV.owl -o /tmp/nmrCV.obo
and get

Exception in thread "main" org.obolibrary.oboformat.model.FrameStructureException:
multiple is_obsolete tags not allowed. in frame:Frame(null ontology(
....
[and then a whole lot of metadata from our ontology]
....
Ontology(
AnnotationAssertion(rdfs:comment <http://nmrML.org/nmrCV#NMR:1000409> "")
){}))
at org.obolibrary.oboformat.model.Frame.checkMaxOneCardinality(Frame.java:255)
at org.obolibrary.oboformat.model.Frame.check(Frame.java:235)
at org.obolibrary.oboformat.model.OBODoc.check(OBODoc.java:235)
at org.obolibrary.oboformat.writer.OBOFormatWriter.write(OBOFormatWriter.java:126)
at org.obolibrary.oboformat.writer.OBOFormatWriter.write(OBOFormatWriter.java:120)
at org.obolibrary.cli.OBORunner.runConversion(OBORunner.java:171)
at org.obolibrary.cli.OBORunner.main(OBORunner.java:88)

The full trace is below. My question:

=> How do I debug / find out *where* these "multiple is_obsolete tags"
come from ? I don't see them in
sneumann@acryl:~/nmrML/code/ontologies (master)$ grep is_obsolete nmrCV.owl
sneumann@acryl:~/nmrML/code/ontologies (master)$
I am also not aware that we actually import any other resource,
in my local copy of nmrCV.owl I have commented out the only import we have:
<!-- <owl:imports rdf:resource="&obo;bfo.owl"/> -->


Or did I miss some constraints which would make my attempts futile
in the first place? If not, should I open an obolib issue on this?
Anything else I could check ?

Thanks in advance,
yours,
Steffen

sneumann@acryl:~/src/oboformat-read-only/bin$ ./obolib-owl2obo /home/sneumann/nmrML/code/ontologies/nmrCV.owl -o /tmp/nmrCV.obo
CMDARGS= /home/sneumann/nmrML/code/ontologies/nmrCV.owl -o /tmp/nmrCV.obo
/usr/bin/java -d64 -Xmx2048M -Xms2048M -DentityExpansionLimit=512000 -DlauncherDir=. -jar ./oboformat-all.jar --owl2obo /home/sneumann/nmrML/code/ontologies/nmrCV.owl -o /tmp/nmrCV.obo
2015-03-04 15:47:26,368 INFO (OBORunner:159) saving to /tmp/nmrCV.obo
Exception in thread "main" org.obolibrary.oboformat.model.FrameStructureException: multiple is_obsolete tags not allowed. in frame:Frame(null import( http://purl.obolibrary.org/obo/bfo.owl{})ontology( http://nmrML.org/nmrCV{})created_by( Daniel Schober{})property_value( http://purl.org/dc/elements/1.1/contributor Since this is a prolonged effort spanning a larger time period, there naturally were many people involved in the creation over the years and during different times.

People involved in the term creation from ID >1400000 :

This part of the NMR ontology was originally developed by the ontology working group (http://msi-ontology.sourceforge.net/) of the msi-metabolomicssociety (msi-workgroups.sf.net):

Daniel Schober (EBI)
Chris Taylor (EBI and HUPO-PSI)
Dennis Rubtsov (Un of Cambridge, UK)
Helen Jenkins (Un of Wales, Aberystwyth, UK)
Irena Spasic (Center for Integrative Systems Biology, Manchester, UK)
Larissa Soldatova (University of Wales, Aberystwyth, UK)
Philippe Rocca-Serra (EBI and MGED Society)
Susanna-Assunta Sansone (EBI)

People involved in the term creation from ID<1400000:

Joseph Cruz
Michael Wilson
David Wishard

Terms with IDs ID<1400000 that were NOT asserted in the original Wishard obo file were created by Daniel Schober (COSMOS WP2). Its IDs were autogenerated with the Protege ID generator.

Other people that substantially helped in revising the latest and Cosmos governed CV additions were:

Michael Wilson, Wishard Group, Edmonton, Alberta, Canada
Daniel Jacobs, INRA, Bordeaux, France
Reza Salek, EBI, Hinxton, UK
Philippe Rocca-Serra, University of Oxford, Oxford, UK
Andrea Porzel, IPB-Halle, Germany
and the COSMOS WP2 team xsd:string{})property_value( http://purl.org/dc/elements/1.1/coverage Nuclear magnetic resonance (NMR) data annotation as required by the nmrML developed by the COSMOS EU project. xsd:string{})property_value( maintainer http://www.cosmos-fp7.eu/WP2 xsd:string{})format-version( 1.2{})def( This artefact is an MSI approved controlled vocabulary developed under COSMOS WP2 governance. This CV was derived from two predecessors (The NMR CV from the David Wishard Group, developed by Joseph Cruz) and the MSI nmr CV developed by Daniel Schober at the EBI.
This simple taxonomy of terms (no DL semantics used, for a short overview on the differences of Cv, taxonomy and ontology look at http://infogrid.org/trac/wiki/Reference/PidcockArticle) serves the nuclear magnetic resonance markup language (nmrML) with meaningful descriptors to amend the nmrML xml. Metabolomics scientists are encouraged to use this CV to annotrate their raw and experimental context data. The approach to have an exchange syntax mixed of an xsd and CV stems from the PSI mzML effort. The reason to branch out from an xsd into a CV is, that in areas where the terminology is likely to change faster than the nmrML xsd could be updated and aligned, an externally and possably decentrallised maintained CV can accompensate for such dynamics in a more flexible way. A second reason for this set-up is that semantic validity of CV terms used in a valid nmrML XML instance (allowed CV terms, position/relation to each other, cardinality) can be validated by rule-based proprietary validators:
By means of cardinality specifications and XPath expressions defined in an XML mapping file (an instances of the CvMappingRules.xsd ), one can define what ontology terms are allowed in a specific location of the data model.{})auto-generated-by( OBO-Edit 2.2{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400032 xsd:string{})property_value( defaultLanguage en xsd:string{})property_value( http://purl.org/dc/elements/1.1/title nuclear magnetic resonance CV xsd:string{})is_obsolete( http://www.metabolomicscentre.ca/nmrML/msi-nmr.obo{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400073 xsd:string{})property_value( owl:versionInfo 1.0.0 xsd:string{})xref( In case we like to be able to convert this owl CV back into the obo format, we should only use DL/owl constructs that are supported by obo. Hence, editors of this CV should take care not to use any higher descriptrion logics semantics, i.e. cardinality restrictions or defined terms using constructors. We should start to build the taxonomic backbone first and later connect the main axis via relations.
If we want to use restrictions, we should only use existential quantifiers as the OBO format does not support universal quantification.

List of terms required by current XSD (August 2013): these were bookmarked in CV (annotation property) and are visible in the new nmrTab:

CVTerm occurrences:
buffer-->buffer
solvent-->solvent
concentration standard type-->calibration compound , what is chemical shift reference ? What calibration_reference_shift under calibration compound ?
concentration standard name we here see a use-mention problem arising for the CV. The xsd should probably change here to avoid this.
encoding method (Quadrature detection method) is this the same as encoding method ?
sample container-->NMR_sample_holder
(spectrum) y axis type-->coordinate system descriptor
post acquisition solvent suppression method Two usages in xsd, but with differrent type ? -->solvent suppression method
calibration compound Two usages in xsd, but with differrent type ?-->calibration compound
data transformation method-->data transformation method
(spectral) projection method-->projection method
spectral denoising method-->spectral denoising method
window function method-->window function method
baseline correction method-->baseline correction
sample type-->NMR sample

CVParam occurrences:
file content-->data file content
software type-->software
source file type-->data file attribute (needs refactoring)
instrument configuration type-->instrument configuration
processing method type-->data processing method

CVParamType occurrences:
chemical shift standard-->chemical shift standard
solvent suppression method-->solvent suppression method
encoding scheme (Quadrature detection method)-->encoding method
window function parameter-->window function parameter

CVParamWithUnitType occurrences:
CVParamWithUnitType is currently not used in the xsd and dangling ! I assume ValueWithUnitType substitutes it ?

UserParamType occurrences:
No CV terms needed

ValueWithUnitType occurrences:
These will have to be used from the Unit ontology.{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400042 xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1000013 xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400068 xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1002010 xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1000524 xsd:string{})created_by( COSMOS consortium{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400033 xsd:string{})property_value( bug-database https://github.com/nmrML/nmrML/issues?labels=enhancement&state=open
Please label your CV issues in the git 'CVenhancement' or add the "CV:" prefix into the subject line. For more general critisism and recommendations please use our nmrML email list. xsd:string{})remark( This version uses the Basic Formal Ontology (BFO) as its top level ontology. All previosous versions used the BiotopLight (btl2) bio upper level ontology. We might close the resulting semantic gap by using OBI and IAO as intermediate bridges later.{})property_value( implements https://github.com/nmrML/nmrML xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400074 xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1000014 xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400043 xsd:string{})property_value( mailing-list https://groups.google.com/forum/?hl=en#!forum/nmrml/join xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1000531 xsd:string{})property_value( http://purl.org/dc/elements/1.1/rights Creative Commons Public Domain Mark 1.0 xsd:string{})property_value( http://xmlns.com/foaf/0.1/homepage http://nmrml.org/cv/ xsd:string{})saved-by( dschober{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400096 xsd:string{})is_obsolete( http://bioportal.bioontology.org/ontologies/1033{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400128 xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1000011 xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400075 xsd:string{})property_value( bookmark http://nmrML.org/nmrCV#NMR:1400062 xsd:string{})property_value( location https://github.com/nmrML/nmrML/tree/master/ontologies xsd:string{})creation_date( 03.03.2015{})property_value( audience This CV is to be used by metabolomics researchers who apply the nmrML xml to store their NMR experimental results(primarily raw data) and (limited) basic metadata. xsd:string{})property_value( http://purl.org/dc/elements/1.1/format Rather flat CV in OWL syntax. 
Taxonomic backbone with few relations used. No OWL DL complexity such as cardinalities, blank nodes, nested class definitions. The Semantic Validator used an OBO converted file format due to historic reasons. The OBO file is auto-generated-by the OWL API (version 3.4.2). xsd:string{})property_value( documenter Daniel Schober xsd:string{})owl-axioms( Prefix(owl:=<http://www.w3.org/2002/07/owl#>)
Prefix(rdf:=<http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
Prefix(xml:=<http://www.w3.org/XML/1998/namespace>)
Prefix(xsd:=<http://www.w3.org/2001/XMLSchema#>)
Prefix(rdfs:=<http://www.w3.org/2000/01/rdf-schema#>)


Ontology(
AnnotationAssertion(rdfs:comment <http://nmrML.org/nmrCV#NMR:1000409> "")
){}))
at org.obolibrary.oboformat.model.Frame.checkMaxOneCardinality(Frame.java:255)
at org.obolibrary.oboformat.model.Frame.check(Frame.java:235)
at org.obolibrary.oboformat.model.OBODoc.check(OBODoc.java:235)
at org.obolibrary.oboformat.writer.OBOFormatWriter.write(OBOFormatWriter.java:126)
at org.obolibrary.oboformat.writer.OBOFormatWriter.write(OBOFormatWriter.java:120)
at org.obolibrary.cli.OBORunner.runConversion(OBORunner.java:171)
at org.obolibrary.cli.OBORunner.main(OBORunner.java:88)
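
One way to see which clauses trip the cardinality check is to pull the repeated tag out of the frame dump itself. The following is only a sketch: it assumes the exception text is available as a string and that every clause is printed as `tag( value{})`, as in the trace above. The function name `clause_values` is hypothetical, and the `dump` string is a shortened excerpt of the header frame shown in the trace.

```python
import re
from typing import List


def clause_values(frame_dump: str, tag: str) -> List[str]:
    """Return the value of every `tag( value{})` clause found in a frame dump."""
    # The lookbehind stops e.g. "xis_obsolete(" from matching; DOTALL lets
    # values span several lines, as the long comments in the trace do.
    pattern = re.compile(
        r"(?<![\w-])" + re.escape(tag) + r"\(\s*(.*?)\{\}\)", re.DOTALL
    )
    return [m.group(1).strip() for m in pattern.finditer(frame_dump)]


# Shortened excerpt of the header frame printed in the exception.
dump = (
    "property_value( http://purl.org/dc/elements/1.1/title "
    "nuclear magnetic resonance CV xsd:string{})"
    "is_obsolete( http://www.metabolomicscentre.ca/nmrML/msi-nmr.obo{})"
    "property_value( bookmark http://nmrML.org/nmrCV#NMR:1400073 xsd:string{})"
    "is_obsolete( http://bioportal.bioontology.org/ontologies/1033{})"
)

for value in clause_values(dump, "is_obsolete"):
    print(value)
```

For this excerpt the two printed values are the URLs attached to the ontology header. Searching nmrCV.owl for those URLs, rather than for the string is_obsolete, may locate the offending annotations: the tag name is produced on the OBO side and need not appear literally in the OWL file. In the OBO format specification is_obsolete corresponds to owl:deprecated, so that annotation property is a plausible candidate to look for, although the trace alone does not confirm it.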





--
IPB Halle AG Massenspektrometrie & Bioinformatik
Dr. Steffen Neumann http://www.IPB-Halle.DE
Weinberg 3 http://msbi.bic-gh.de
06120 Halle Tel. +49 (0) 345 5582 - 1470
+49 (0) 345 5582 - 0
sneumann(at)IPB-Halle.DE Fax. +49 (0) 345 5582 - 1409

Chris Mungall

Mar 4, 2015, 12:31:20 PM
to Steffen Neumann, obo-f...@lists.sourceforge.net, Philippe Rocca-Serra
Try

owltools nmrCV.owl -o -f obo --no-check nmrCV.obo

You'll need https://github.com/owlcollab/owltools

But as a general rule, an owl file in the wild will need massaging to get
into something usable. The obo spec is a superset of what legacy tools
typically expect, and I assume the obo conversion is for a legacy tool.
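
Since the conversion is needed in an automated manner, the owltools call above could be wrapped in a small script. This is a minimal sketch, assuming `owltools` is installed and on the PATH; the function names are hypothetical, and the arguments are exactly those of the command line quoted above.

```python
import subprocess
from typing import List


def build_owltools_command(owl_path: str, obo_path: str) -> List[str]:
    """Build the owltools invocation quoted above for the given file paths."""
    return ["owltools", owl_path, "-o", "-f", "obo", "--no-check", obo_path]


def convert(owl_path: str, obo_path: str) -> None:
    """Run the conversion; raises CalledProcessError if owltools exits non-zero."""
    subprocess.run(build_owltools_command(owl_path, obo_path), check=True)


# Show the command that convert() would run, without executing it.
print(" ".join(build_owltools_command("nmrCV.owl", "nmrCV.obo")))
```

Calling `convert("nmrCV.owl", "nmrCV.obo")` from a build step would then fail loudly if the conversion breaks. Note that `--no-check` skips the structure check that raised the exception, so the resulting file may still contain the repeated clauses.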
> Obo-format mailing list
> Obo-f...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/obo-format


Steffen Neumann

Mar 4, 2015, 1:53:35 PM
to Chris Mungall, obo-f...@lists.sourceforge.net, Philippe Rocca-Serra
Dear Chris,

thanks for the prompt answer, owltools did the job!

Yours,
Steffen