HPO and Gene Ontology Licenses


Peter Ansell

Aug 7, 2012, 9:03:08 PM
to Chris Mungall, Michel Dumontier, HCLS, bio2rdf, Peter Robinson
On 8 August 2012 02:46, Chris Mungall <cjmu...@lbl.gov> wrote:
> Hi Michael
>
> I can't seem to connect to the triplestore.
>
> Have you considered adding associations between OMIM and phenotype ontology
> classes? These can be downloaded from
> http://www.human-phenotype-ontology.org/ as tab delimited files that can
> trivially be converted to an rdf model of choice (we will be providing OWL
> for this ourselves in the future, it will likely differ in modeling and URIs
> from bio2rdf).
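A minimal sketch of the kind of conversion being described, in Python. It assumes a two-column tab-delimited layout (disease identifier, phenotype identifier); the real HPO annotation files may be laid out differently. The predicate URI, the lowercase namespace prefixes, and the sample row are all made up for illustration.

```python
import csv
import io

BIO2RDF = "http://bio2rdf.org/"
# Hypothetical predicate; a real conversion would pick one from an agreed vocabulary.
HAS_PHENOTYPE = "http://example.org/vocab/has_phenotype"


def curie_to_uri(curie):
    """Turn 'OMIM:100000' into 'http://bio2rdf.org/omim:100000'."""
    prefix, local_id = curie.split(":", 1)
    return f"{BIO2RDF}{prefix.lower()}:{local_id}"


def rows_to_ntriples(tsv_text):
    """Yield one N-Triples line per (disease, phenotype) row."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip comment lines and rows with too few columns
        subject, obj = curie_to_uri(row[0]), curie_to_uri(row[1])
        yield f"<{subject}> <{HAS_PHENOTYPE}> <{obj}> ."


# Made-up identifiers, used only to show the shape of the output.
sample = "#disease\tphenotype\nOMIM:100000\tHP:0000001\n"
triples = list(rows_to_ntriples(sample))
```

Whether such a conversion is permitted under the licenses is a separate question, which the reply below takes up.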

The HPO files cannot be modified, though, given the following license condition:

"That neither the content of the HPO file(s) nor the logical
relationships embedded within the HPO file(s) be altered in any way."
[1]

in the same way that Gene Ontology files cannot legally be modified,
given a very similar license condition:

"That neither the content of the GO file(s) nor the logical
relationships embedded within the GO file(s) be altered in any way."
[2]

Therefore Bio2RDF should not be converting the HPO classes to RDF
itself, and I doubt that it is legal for anyone, including Bio2RDF,
to do so for Gene Ontology either.

I understand that ontology authors want to avoid confusion by avoiding
having multiple possibly inconsistent versions of an ontology
available from different locations under the same name.

However, as a consequence Bio2RDF and other Linked Data providers
really should not be republishing these and other ontologies that are
released under similar conditions, as we require the ability to change
the URIs and convert blank nodes to Bio2RDF URIs to make them
available as Linked Data for reuse and possibly modification by any
downstream user.

I recently did a quick evaluation of the openness of the bioportal
ontologies on The DataHub [3] and found that very few of those with
current websites actually provide any license statement there.
Since HPO and GO both clearly define
their licenses I switched them from Open Data to Not open data, as
they fail the Reuse condition in the Open Data Definition [4]. In
addition, any license incorporating an "academic-only" or
"non-commercial" clause fails the other clauses in the Open Data
Definition. It is not clear why bioportal [5] has loaded all of the
bioportal ontologies into The DataHub as if they are all "Open Data
with unknown conditions", by default, as bioportal does not seem to
internally verify the openness of any of the ontologies it provides
for download.

Peter

[1] http://www.human-phenotype-ontology.org/contao/index.php/legal-issues.html
[2] http://www.geneontology.org/GO.cite.shtml
[3] http://thedatahub.org/group/bioportal
[4] http://opendefinition.org/okd/
[5] http://thedatahub.org/user/bioportal

Peter Ansell

Aug 8, 2012, 1:36:26 AM
to Robinson, Peter, Chris Mungall, Michel Dumontier, HCLS, bio2rdf
Hi Peter,

I understand completely. The usage policy is very liberal in terms of
distribution and we are glad for that!

Would it be possible for us (Michel and me) to make suggestions with
the goal of publishing a version that matches the no-blank-node policy
that Bio2RDF attempts to follow and uses URI structures that can
be resolved via http://bio2rdf.org/? We don't want to make material
changes to any of the terms, but we would like to make the resulting
RDF graph browsable as Linked Data as far as possible. To enable that,
we need URIs for items to resolve directly to their definitions, for
example by replacing fragment/hash identifiers with
http://bio2rdf.org/ns:identifier equivalents.
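
As a rough illustration of that rewriting, here is a Python sketch that maps a hash URI to the http://bio2rdf.org/ns:identifier pattern. The input URI, the `hp` namespace label, and the assumption that fragments look like PREFIX_LOCALID are hypothetical; the actual rules would be whatever the Bio2RDF guidelines specify.

```python
from urllib.parse import urldefrag

BIO2RDF_BASE = "http://bio2rdf.org/"


def to_bio2rdf(uri, namespace):
    """Map a hash URI such as 'http://example.org/onto#HP_0000118'
    to 'http://bio2rdf.org/<namespace>:0000118'."""
    _, fragment = urldefrag(uri)
    if not fragment:
        raise ValueError(f"no fragment identifier in {uri!r}")
    # Assumed convention: the fragment looks like PREFIX_LOCALID; keep the local part.
    local_id = fragment.split("_", 1)[-1]
    return f"{BIO2RDF_BASE}{namespace}:{local_id}"


# Hypothetical source URI, for illustration only.
rewritten = to_bio2rdf("http://example.org/onto#HP_0000118", "hp")
```

Only the identifier changes in this sketch; labels, definitions, and relationships attached to the term would be carried over untouched.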

Thanks,

Peter Ansell

On 8 August 2012 15:20, Robinson, Peter <peter.r...@charite.de> wrote:
> Hi Peter,
>
> Given that the HPO is being used by medical groups for real patient data, we think it is potentially dangerous to allow external groups to change the data and present it elsewhere, given some of the notorious difficulties in actually understanding what some medical terms mean (even for us MDs). This was the reason for the license statement, which is otherwise quite liberal. However, we would be happy to work with you to find a solution; we could foresee providing RDF on our website, which you could import.
>
> BW Peter
>
>
>
>
>
> PD Dr. med. Peter N. Robinson, MSc.
> Institut für Medizinische Genetik und Humangenetik
> Charité - Universitätsmedizin Berlin
> Augustenburger Platz 1
> 13353 Berlin
> Germany
> +4930 450566006
> Mobile: 0160 93769872
> peter.r...@charite.de
> http://compbio.charite.de
> http://www.human-phenotype-ontology.org
> Introduction to Bio-Ontologies: http://www.crcpress.com/product/isbn/9781439836651
> ________________________________________
> From: Peter Ansell [ansell...@gmail.com]
> Sent: Wednesday, 8 August 2012 03:03
> To: Chris Mungall
> Cc: Michel Dumontier; HCLS; bio2rdf; Robinson, Peter
> Subject: HPO and Gene Ontology Licenses

Robinson, Peter

Aug 8, 2012, 1:20:27 AM
to Peter Ansell, Chris Mungall, Michel Dumontier, HCLS, bio2rdf
Hi Peter,

Given that the HPO is being used by medical groups for real patient data, we think it is potentially dangerous to allow external groups to change the data and present it elsewhere, given some of the notorious difficulties in actually understanding what some medical terms mean (even for us MDs). This was the reason for the license statement, which is otherwise quite liberal. However, we would be happy to work with you to find a solution; we could foresee providing RDF on our website, which you could import.

BW Peter





PD Dr. med. Peter N. Robinson, MSc.
Institut für Medizinische Genetik und Humangenetik
Charité - Universitätsmedizin Berlin
Augustenburger Platz 1
13353 Berlin
Germany
+4930 450566006
Mobile: 0160 93769872
peter.r...@charite.de
http://compbio.charite.de
http://www.human-phenotype-ontology.org
Introduction to Bio-Ontologies: http://www.crcpress.com/product/isbn/9781439836651
________________________________________
From: Peter Ansell [ansell...@gmail.com]
Sent: Wednesday, 8 August 2012 03:03
To: Chris Mungall
Cc: Michel Dumontier; HCLS; bio2rdf; Robinson, Peter
Subject: HPO and Gene Ontology Licenses

Robinson, Peter

Aug 8, 2012, 2:07:31 AM
to Peter Ansell, Chris Mungall, Michel Dumontier, HCLS, bio2rdf
Hi Peter,

Please do make some suggestions and/or send me a link to some documentation.

cheers Peter


________________________________________
From: Peter Ansell [ansell...@gmail.com]
Sent: Wednesday, 8 August 2012 07:36
To: Robinson, Peter
Cc: Chris Mungall; Michel Dumontier; HCLS; bio2rdf
Subject: Re: HPO and Gene Ontology Licenses

Alison Callahan

Aug 8, 2012, 9:52:44 AM
to peter.r...@charite.de, bio...@googlegroups.com, Peter Ansell, Chris Mungall, Michel Dumontier, HCLS
Hi Peter,

We have made a preliminary RDFization guide for Bio2RDF, available here: http://github.com/bio2rdf/bio2rdf-scripts/wiki/RDFization-Guide-v1.0

This guide will give you a general idea of how we generate Bio2RDF URIs when converting datasets to Bio2RDF linked data.

If you have any questions, please feel free to email the Bio2RDF mailing list at bio...@googlegroups.com.

Cheers,

Alison



Michel Dumontier

Aug 8, 2012, 12:12:02 PM
to Alan Ruttenberg, Peter Ansell, Robinson, Peter, Chris Mungall, HCLS, bio2rdf
Hi Alan,
  

On Wed, Aug 8, 2012 at 10:00 AM, Alan Ruttenberg <alanrut...@gmail.com> wrote:
We have discussed that the OBO Foundry policy is to use CC0 or CC-BY
and it has been put to the GO that we would like to migrate to that
license. I don't know the status of that discussion.


We certainly welcome the adoption of standard licenses for OBO ontologies.

 
That said, I would be strongly discouraging of (but unable to prevent)
any "no-blank-node" rendering of GO ontologies, and in particular
would note that such a transformation would render any OWL we publish
unsyntactic.

Not sure what you mean by "unsyntactic". The objective here would be to provide an RDF- and SPARQL-friendly version of OBO ontologies. It would reduce the ontological commitment to RDF, so in this sense it would be a semantic loss, but it makes it easier to retrieve relations between entities. We could always provide links to OWL versions, if they are available.
 

Further, the OBO ID Policy has, for the most part, been put in
place and we do not use hash URIs and are moving to having all OBO
URIs resolving to page per view. See for example
http://purl.obolibrary.org/obo/IAO_0000032


Does the OBO Foundry automatically check conformance? Is there a report page for each ontology?


 
So the Foundry is already in the process of making all of the OBO
available as linked ontology data. I would suggest other groups join
this effort rather than setting out to duplicate and add confusion by
having a parallel set of identifiers for the same set of entities.

I know about Berkeley's download page -
http://www.berkeleybop.org/ontologies/
is this what you are referring to?

 
In fact, there have been a number of OBO participants who prefer the
current GO license precisely because it prevents this kind of
duplicative, confusing practice, a practice that is discouraged even
by the W3C standards these groups are chartered to work with.

For more information about OBO efforts in this area, please see
http://code.google.com/p/oboformat/  and
http://code.google.com/p/owltools/

-Alan


I don't see RDF or SPARQL endpoints being provided at either of those links.


m.



--
Michel Dumontier
Associate Professor of Bioinformatics, Carleton University
Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group

Michel Dumontier

Aug 8, 2012, 5:42:04 PM
to Alan Ruttenberg, Peter Ansell, Robinson, Peter, Chris Mungall, HCLS, bio2rdf


> The objective here would be to provide an RDF- and SPARQL-friendly version of OBO ontologies. It would reduce the ontological commitment to RDF, so in this sense it would be a semantic loss, but it makes it easier to retrieve relations between entities. We could always provide links to OWL versions, if they are available.

There have been discussions of, e.g. Skolemization of existing resources, and those transformations are destructive.

Reducing to RDF would not change the ontological commitment, but would lose information. In any case such a transformation should not entail minting new URIs for all resources. In addition, I question the value of transforming the ontologies in this way, given the disadvantages of not encouraging a uniform, author-provided view on the resources. The OWL resources are SPARQL friendly enough to build the Ontobee displays. If you look at the bottom of the page there are links to see the SPARQL queries used to construct the pages. A more constructive effort, in my opinion, would be to build SPARQL parser extensions along the lines of TERP that make it easier to query the ontology as it is.
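
For readers unfamiliar with the term, Skolemization here means replacing blank nodes with minted IRIs. A minimal Python sketch, assuming triples are plain tuples of strings and blank nodes are written as `_:label`; the `example.org` base and the sample terms are hypothetical, and the `.well-known/genid/` path follows the convention suggested in the RDF 1.1 Concepts document.

```python
import uuid

GENID_BASE = "http://example.org/.well-known/genid/"  # hypothetical authority


def skolemize(triples, base=GENID_BASE):
    """Replace every blank node label ('_:x') with a freshly minted IRI.

    Returns the rewritten triples and the blank-node-to-IRI mapping.
    Every term is treated as a plain string, so this sketch does not
    distinguish literals from IRIs.
    """
    minted = {}

    def swap(term):
        if term.startswith("_:"):
            if term not in minted:
                minted[term] = base + uuid.uuid4().hex
            return minted[term]
        return term

    return [(swap(s), p, swap(o)) for s, p, o in triples], minted


# A made-up class with an anonymous restriction as its superclass.
graph = [
    ("ex:Term", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
]
skolemized, mapping = skolemize(graph)
```

After the rewrite the restriction node carries an IRI like any named resource, so a consumer can no longer tell that it was anonymous unless the mapping is kept. That is one way to read the "destructive" remark above.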

It's certainly a step forward to have software like Ontobee to fetch and render OWL ontologies from triplestores. But it's also important to acknowledge that there are many uses beyond humans looking at it. As you know, we and others have demonstrated that alternative representations and reformulations of knowledge are desirable for certain kinds of scientific inquiry.

  In any case, the issue at hand is that of control - and what constitutes open and free software.  It's worth noting that more than one OBO ontology has come about because all there was available was non-open, non-free or "proprietary" terminologies and ontologies. Well, if one isn't free to copy, modify, distribute, change and improve the ontology, then it isn't free and it isn't open.


Making ontologies available in RDF is no different than providing data in SQL database dumps, XML files, flat files, REST APIs or whatever format makes it easier for somebody to reuse content. 


>> Further, the OBO ID Policy has, for the most part, been put in
>> place and we do not use hash URIs and are moving to having all OBO
>> URIs resolving to page per view. See for example
>> http://purl.obolibrary.org/obo/IAO_0000032
>>
>
> does the OBO Foundry automatically check conformance?  Is there a report page for each ontology?

Conformance to what? To the OWL Spec? Yes, I believe it does, through the OORT and Jenkins build tools, but I'll leave it to Chris to detail that. If there is something you are looking specifically for I expect it could be provided. Or you could collaborate with us to build such services. I believe that collaboration towards building a stronger single distribution is a much better way to spend effort, in the long run.

I personally like and support the NCBO's bioportal as a central repository for accessing and downloading ontologies. They poll for the latest, bring it into their system, mint URIs, index, map - lots of added value.  Lots of the stuff you mention below has already been done, and has the funding to continue supporting this. It might be worth investigating how OntoBee technology can add pretty rendering of OBO/OWL ontologies in BioPortal.


>> So the Foundry is already in the process of making all of the OBO
>> available as linked ontology data. I would suggest other groups join
>> this effort rather than setting out to duplicate and add confusion by
>> having a parallel set of identifiers for the same set of entities. 
>>
> I know about berkeley's download page -
> http://www.berkeleybop.org/ontologies/
> is this what you are referring to?

We are moving towards completely using Web standards. Eventually, we will have all OBO ontologies available at http://purl.obolibrary.org/obo/<namespace>.owl . This, and other information about deployment is in http://obofoundry.org/obo/id-policy.shtml, which describes what we are  trying to put in place. Again, we are making progress in this effort, but help could certainly be used. Chris pays attention to where ontologies are before they arrive at their documented location. I attend to ensuring that once there they behave as expected according to web standards. If you look at http://www.ontobee.org and select an ontology there should be metadata about where the ontology was downloaded from to get it into Ontobee.

>> In fact, there have been a number of OBO participants who prefer the
>> current GO license precisely because it prevents this kind of
>> duplicative, confusing practice, a practice that is discouraged even
>> by the W3C standards these groups are chartered to work with.
>>
>> For more information about OBO efforts in this area, please see
>> http://code.google.com/p/oboformat/  and
>> http://code.google.com/p/owltools/
>>
>> -Alan
>>
>
> I don't see RDF or SPARQL endpoints being provided at either of those links.

Indeed. They are not where you would expect to find them. 

There are two SPARQL endpoints at the moment, each with a different approach. We are working toward deciding on and documenting expected behavior and then ensuring we can provision them well enough to stand up to regular use.

Understand also that we are coming to the end of a multi-year effort to regularize our URIs, defining a proper OWL translation of OBO, and providing a new BFO to be the basis for these ontologies. We are not quite finished. I would be most comfortable publishing a stable endpoint once this transition was over. Again, assistance in deploying mirrors and in helping with all the various loose ends needed before the resource can be considered stable would be very much welcomed.

http://sparql.obo.neurocommons.org/ intended to serve ontologies using the legacy URIs (needs to be reviewed - hasn't been in a while.)
http://sparql.obodev.neurocommons.org/ intended to serve ontologies using the current URIs (same as above)
http://sparql.hegroup.org/sparql serves the ontobee server (not meant for wide consumption, but useful for prototyping)

 
Don't forget the NCBO's SPARQL endpoint - I've already started using it with much success. Bio2RDF also has an endpoint, but it's subject to being revisited and added to our growing pipeline.
 
Members of HCLS who wish to assist with maintenance of the Neurocommons endpoints would be welcomed. Once they are reviewed and brought up to date on which ontologies they load, the Neurocommons RDF Bundling system will provide an additional distribution mechanism for creating mirrors.
 
So Michel, and other HCLS users, consider this an invitation: The OBO Foundry is very close to providing a stable, well thought through process for semantic web deployment of OBO ontologies. We could very much use technical support in finishing a number of technical loose ends, in providing tools that build on these efforts, and on making it easy to access existing endpoints or provide mirrors of the content. If there is sufficient interest in this within the group perhaps Chris and I can schedule a time when we could meet with those interested and see what the possibilities are.

I think that's a good idea. Let's aim for sometime in September.
 
m.


Sincerely,
Alan Ruttenberg
> --
> Michel Dumontier
> Associate Professor of Bioinformatics, Carleton University
> Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group
> http://dumontierlab.com
>

Peter Ansell

Aug 8, 2012, 8:28:29 PM
to Alan Ruttenberg, Robinson, Peter, Chris Mungall, Michel Dumontier, HCLS, bio2rdf
On 9 August 2012 00:00, Alan Ruttenberg <alanrut...@gmail.com> wrote:
> We have discussed that the OBO Foundry policy is to use CC0 or CC-BY
> and it has been put to the GO that we would like to migrate to that
> license. I don't know the status of that discussion.

Was there a discussion of the merits of CC-BY-SA as a possible license?

> That said, I would be strongly discouraging of (but unable to prevent)
> any "no-blank-node" rendering of GO ontologies, and in particular
> would note that such a transformation would render any OWL we publish
> unsyntactic.

I have always been intrigued as to why OWL was designed to make it
syntactically incorrect to give URIs to restrictions etc. What was the
background behind this choice?

> Further, the OBO ID Policy has, for the most part, been put in
> place and we do not use hash URIs and are moving to having all OBO
> URIs resolving to page per view. See for example
> http://purl.obolibrary.org/obo/IAO_0000032

That is a great step! Along with avoiding blank nodes and annotating resources
with known links in the Linked Data sphere, resolvable URIs are the
main goals of Bio2RDF.

> So the Foundry is already in the process of making all of the OBO
> available as linked ontology data. I would suggest other groups join
> this effort rather than setting out to duplicate and add confusion by
> having a parallel set of identifiers for the same set of entities.

I think you and I have always had a difference of opinion about the
relative benefits and difficulties of a non-unique URI Linked Data
sphere. It is great to have a freely accepted single effort on a
single goal, but disjoint communities need to be Free/Open to explore
other value-added options in my opinion. Where we know another URI for
an item Bio2RDF always attempts to refer back to it in some
semantically useful way.

> In fact, there have been a number of OBO participants who prefer the
> current GO license precisely because it prevents this kind of
> duplicative, confusing practice, a practice that is discouraged even
> by the W3C standards these groups are chartered to work with.

It may be useful while communities are active to synchronise efforts,
but as soon as a community becomes inactive the locked down artifacts
it produced are unable to be revived. In addition, if the community
legitimately has a fundamental disagreement, then there may only be
one possible winner if the data is not Open, which may not be the best
thing in evolutionary terms. Even inside of an organisation as active
as OBO, the community around a particular ontology might become
inactive and discourage people from either contributing suggestions,
or worse, ignore suggestions by not responding, leading eventually to
a completely new effort--with the huge startup costs attached to that
process--*if* the ontology cannot be forked.

I would prefer that there was another method used by GO to encourage
the use of a single definitive source, without requiring it. For
example, one alternative may be to use trademarks to prevent
redistribution under the same or similar names, but not preventing
forking if it becomes necessary due to inactivity or other legitimate
reasons that are not known at the current time.

It would also be nice to have people submit their license terms to
BioPortal, in particular, (if possible in a similar way to when OBO
has a standard across the board of either CC0 or CC-BY/CC-BY-SA) so
people don't have to go fishing to find them. It would also be nice if
people avoided describing things as Open when their intentions are
incompatible with the generally accepted principles of Open'ness in
terms of Open Source [1] and Open Data [2]. Trademarking is designed
to prevent this duplication/confusion--which no one desires--without
impinging on either copyright or Open/copyleft principles.

Cheers,

Peter

[1] http://opensource.org/docs/osd
[2] http://opendefinition.org/okd/

Alan Ruttenberg

Aug 8, 2012, 9:52:58 PM
to Peter Ansell, Robinson, Peter, Chris Mungall, Michel Dumontier, HCLS, bio2rdf
On Wed, Aug 8, 2012 at 8:28 PM, Peter Ansell <ansell...@gmail.com> wrote:
> On 9 August 2012 00:00, Alan Ruttenberg <alanrut...@gmail.com> wrote:
>> We have discussed that the OBO Foundry policy is to use CC0 or CC-BY
>> and it has been put to the GO that we would like to migrate to that
>> license. I don't know the status of that discussion.
>
> Was there a discussion of the merits of CC-BY-SA as a possible license?
>
>> That said, I would be strongly discouraging of (but unable to prevent)
>> any "no-blank-node" rendering of GO ontologies, and in particular
>> would note that such a transformation would render any OWL we publish
>> unsyntactic.
>
> I have always been intrigued as to why OWL was designed to make it
> syntactically incorrect to give URIs to restrictions etc. What was the
> background behind this choice?

I don't have enough information to answer this question. The choice
was made in the first OWL working group, which I did not participate
in. However, I don't really see what all the fuss around blank nodes and
skolemization is about, tbh. I've not found them to be a problem, and I'm a
little suspicious about the technical chops of those who do.

>> Further, the OBO ID Policy has, for the most part, been put in
>> place and we do not use hash URIs and are moving to having all OBO
>> URIs resolving to page per view. See for example
>> http://purl.obolibrary.org/obo/IAO_0000032
>
> That is a great step! Along with avoiding blank nodes and annotating resources
> with known links in the Linked Data sphere, resolvable URIs are the
> main goals of Bio2RDF.

Before there was Bio2RDF, there was Neurocommons, and we have had the
same goal all along. However, none of the efforts we have seen have
done justice to serving up ontology terms, nor worked through what it
means to serve OWL. So I agree that Ontobee is a great step. BTW, view
source for the RDF. Ontobee adopts the design we worked out early in the
Neurocommons effort, which is pretty much a direct implementation of
what was stated as a goal of the semantic web stack. If a machine does
a GET on the resource, it gets a document meant for machine use. If
you view it in a browser, you see something helpful.
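
One common way for a machine client to ask for a machine-oriented document is to send an Accept header with the GET. A small Python sketch using the term IRI cited earlier in the thread; whether the server at that address honours the header is an assumption and is not confirmed by anything in this thread.

```python
from urllib.request import Request

TERM_IRI = "http://purl.obolibrary.org/obo/IAO_0000032"


def rdf_request(iri):
    """Build a GET request that asks for RDF/XML rather than HTML."""
    return Request(iri, headers={"Accept": "application/rdf+xml"})


req = rdf_request(TERM_IRI)
# To actually fetch the document (network access required):
#   from urllib.request import urlopen
#   body = urlopen(req).read()
```

The request is only constructed here, not sent, so the sketch can be run offline.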

>
>> So the Foundry is already in the process of making all of the OBO
>> available as linked ontology data. I would suggest other groups join
>> this effort rather than setting out to duplicate and add confusion by
>> having a parallel set of identifiers for the same set of entities.
>
> I think you and I have always had a difference of opinion about the
> relative benefits and difficulties of a non-unique URI Linked Data
> sphere. It is great to have a freely accepted single effort on a
> single goal, but disjoint communities need to be Free/Open to explore
> other value-added options in my opinion. Where we know another URI for
> an item Bio2RDF always attempts to refer back to it in some
> semantically useful way.

You are free to do what you want.

That said, in a background of freedom of choice I'm looking to be
effective and I've not observed there to be benefit to having multiple
URIs for a single term. Rather the opposite. TBL realized this early
on, which is why the advice from the beginning was to reuse URIs.

In the context of ontologies and resources maintained by no one, or
built and forgot, I can see the merits of perhaps reforming a resource
in order to improve it. That isn't the case with the OBO Foundry. Here
you have an organization with buy-in from a lot of organizations,
funders, and biologists, with expertise in OWL and semweb tools and
deployment, and with a stated commitment to doing work that can last
and is worth lasting. So while you are free to do what you want, in my
opinion the goals we all want to achieve will be best served by
collaboration towards building the best resource we can, rather than
in multiple disjoint efforts. But if you disagree, by all means: knock
yourself out.

>> In fact, there have been a number of OBO participants who prefer the
>> current GO license precisely because it prevents this kind of
>> duplicative, confusing practice, a practice that is discouraged even
>> by the W3C standards these groups are chartered to work with.
>
> It may be useful while communities are active to synchronise efforts,
> but as soon as a community becomes inactive the locked down artifacts
> it produced are unable to be revived.

Not a problem with OBO. We are quite active if you have a look.

> In addition, if the community
> legitimately has a fundamental disagreement, then there may only be
> one possible winner if the data is not Open, which may not be the best
> thing in evolutionary terms.

I am unaware of any fundamental disagreements, only outstanding issues
that need solutions.

> Even inside of an organisation as active
> as OBO, the community around a particular ontology might become
> inactive and discourage people from either contributing suggestions,
> or worse, ignore suggestions by not responding, leading eventually to
> a completely new effort--with the huge startup costs attached to that
> process--*if* the ontology cannot be forked.

A bunch of us have worked for years trying to bring about the OBO
Foundry, which has principles designed to avoid this. While your
concerns are valid in the general case, I would again urge you to
consider how you might most effectively spend your time *in this case,
in this field, with the current situation that exists*. Do you want to
duplicate efforts of many, _just in case_, probably reducing the
effectiveness of the whole effort, or do you want to try to join and
help make the thing more likely to succeed in the long run?

Having the ontology forked is the worst of all outcomes. It raises
doubt about the meaning of every term. For OBO Foundry ontologies,
the principle we're working with now is that if something better comes
along the groups collaborate. Where that needs facilitation we've got
a board that tries to help. Is this a process that is perfect? No. But
I'll offer that in doing this we're making a lot of progress in
understanding where difficulties arise and trying to adjust policies
to make such situations easier in the future.

Frankly, from a technical point of view, I don't understand the need
to "fork" rather than augment. The MIREOT work we've done has
established a pattern that makes the unit of reuse be a term rather
than a whole ontology. If another effort wants to build on (or rebuild
below) some term, they can use that term and then extend, choosing to
ignore terms they don't need.

> I would prefer that there was another method used by GO to encourage
> the use of a single definitive source, without requiring it. For
> example, one alternative may be to use trademarks to prevent
> redistribution under the same or similar names, but not preventing
> forking if it becomes necessary due to inactivity or other legitimate
> reasons that are not known at the current time.

It is not the intention of the current Foundry members to prevent
forking by legal means, only persuasion. I have told you our policy
and the status wrt GO, and the process of making that happen will
continue.

You also have a mistaken view that using the GO means you are getting
information from a single definitive source - there are many many
contributors to the GO. One of my side projects is encouraging that
steps be taken to make that clear by attributing more carefully on a
term by term basis. That'll come. You might even help if you shifted
your focus a bit from rearranging RDF to exposing unavailable
information.

And again, I don't agree that having a single definitive source for a
term is a problem. You can always define *a different term* that means
something different and become the authority for that. Or even means
the same thing (though it is hard to imagine why you would want to -
there's plenty of remaining new work to be done without wasting time
redoing things that don't need to be redone).

> It would also be nice to have people submit their license terms to
> BioPortal, in particular, (if possible in a similar way to when OBO
> has a standard across the board of either CC0 or CC-BY/CC-BY-SA) so
> people don't have to go fishing to find them.

Submit a ticket to them. I would like it if they, as an organization,
even respected the license terms of the ontologies they currently serve.

> It would also be nice if people avoided describing things as Open when their intentions are
> incompatible with the generally accepted principles of Open'ness in
> terms of Open Source [1] and Open Data [2]. Trademarking is designed
> to prevent this duplication/confusion--which no one desires--without
> impinging on either copyright or Open/copyleft principles.

There are distinct issues that are involved with making a license for
artifacts such as ontologies and standards that are also open. Many
believe that there should be an
as-open-as-possible-without-killing-the-goose license specifically
designed for these situations. You'll find that there are issues with
reuse of W3C standards documents. That said, recall the foundry
position in advocating CC0 or CC-BY. And be aware that the only real
pushback we've had to this are those (especially Hilmar) pushing us to
keep to CC0. (warms my heart). Compare, for example, UMLS.

And please don't lecture me or the GO about open. When you have
accomplished even a small fraction of what the GO has for open come
back and we'll have a discussion.


Regards,
Alan

Alan Ruttenberg

Aug 8, 2012, 1:54:39 PM
to Michel Dumontier, Peter Ansell, Robinson, Peter, Chris Mungall, HCLS, bio2rdf


On Wednesday, August 8, 2012, Michel Dumontier <michel.d...@gmail.com> wrote:
> Hi Alan,
>   
>
> On Wed, Aug 8, 2012 at 10:00 AM, Alan Ruttenberg <alanrut...@gmail.com> wrote:
>>
>> We have discussed that the OBO Foundry policy is to use CC0 or CC-BY
>> and it has been put to the GO that we would like to migrate to that
>> license. I don't know the status of that discussion.
>>
>
> We certainly welcome the adoption of standard licenses for OBO ontologies.
>  
>>
>> That said, I would be strongly discouraging of (but unable to prevent)
>> any "no-blank-node" rendering of GO ontologies, and in particular
>> would note that such a transformation would render any OWL we publish
>> unsyntactic.
>
> Not sure what you mean by "unsyntactic".

> The objective here would be to provide an RDF and SPARQL friendly version of OBO ontologies. It would reduce the ontological commitment to RDF, so in this sense, it would be a semantic loss, but makes it easier to retrieve relations between entities. We could always provide links to OWL versions, if they are available.

There have been discussions of, e.g. Skolemization of existing resources, and those transformations are destructive.

Reducing to RDF would not change the ontological commitment, but would lose information. In any case such a transformation should not entail minting new URIs for all resources. In addition, I question the value of transforming the ontologies in this way, given the disadvantages of not encouraging a uniform, author-provided view on the resources. The OWL resources are SPARQL friendly enough to build the Ontobee displays. If you look at the bottom of an Ontobee page there are links to see the SPARQL queries used to construct the pages. A more constructive effort, in my opinion, would be to build SPARQL parser extensions along the lines of TERP that make it easier to query the ontology as it is.
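
To make the Skolemization point concrete, here is a minimal sketch in Python. It is not taken from any OBO or Bio2RDF tool: the `skolemize` helper, the string-tuple representation of triples, and the example.org IRIs are all assumptions made for illustration. The `/.well-known/genid/` path follows the convention described in RDF 1.1 Concepts.

```python
from itertools import count

GENID_BASE = "http://example.org/.well-known/genid/"  # hypothetical base IRI


def skolemize(triples, base=GENID_BASE):
    """Replace each blank node label (_:x) with a freshly minted IRI."""
    minted = {}          # blank node label -> minted IRI
    counter = count(1)

    def swap(term):
        if not term.startswith("_:"):
            return term  # IRIs and literals pass through unchanged
        if term not in minted:
            minted[term] = f"<{base}{next(counter)}>"
        return minted[term]

    return [(swap(s), p, swap(o)) for s, p, o in triples], minted


# A toy "A subClassOf (partOf some B)" axiom; the restriction is a blank node.
triples = [
    ("<http://example.org/A>", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", "<http://example.org/partOf>"),
    ("_:r1", "owl:someValuesFrom", "<http://example.org/B>"),
]

skolemized, minted = skolemize(triples)
for s, p, o in skolemized:
    print(s, p, o, ".")
```

After the rewrite the restriction node carries a minted IRI, so it is no longer an anonymous class expression. That is one way to read the concern that a no-blank-node rendering would no longer parse as OWL, and it also shows where the new identifiers come from.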

>> Further, the OBO ID Policy has, for the most part, been put in
>> place and we do not use hash URIs and are moving to having all OBO
>> URIs resolving to page per view. See for example
>> http://purl.obolibrary.org/obo/IAO_0000032
>>
>
> does the OBO Foundry automatically check conformance?  Is there a report page for each ontology?

Conformance to what? To the OWL Spec? Yes, I believe it does, through the OORT and Jenkins build tools, but I'll leave it to Chris to detail that. If there is something you are looking specifically for I expect it could be provided. Or you could collaborate with us to build such services. I believe that collaboration towards building a stronger single distribution is a much better way to spend effort, in the long run.

>> So the Foundry is already in the process of making all of the OBO
>> available as linked ontology data. I would suggest other groups join
>> this effort rather than setting out to duplicate and add confusion by
>> having a parallel set of identifiers for the same set of entities.
>>
> I know about berkeley's download page -
> http://www.berkeleybop.org/ontologies/
> is this what you are referring to?

We are moving towards completely using Web standards. Eventually, we will have all OBO ontologies available at http://purl.obolibrary.org/obo/<namespace>.owl . This, and other information about deployment, is in http://obofoundry.org/obo/id-policy.shtml, which describes what we are trying to put in place. Again, we are making progress in this effort, but help could certainly be used. Chris pays attention to where ontologies are before they arrive at their documented location. I attend to ensuring that once there they behave as expected according to web standards. If you look at http://www.ontobee.org and select an ontology there should be metadata about where the ontology was downloaded from to get it into Ontobee.

>> In fact, there have been a number of OBO participants who prefer the
>> the current GO license precisely because it prevents this kind of
>> duplicative, confusing practice, a practice that is discouraged even
>> by the W3C standards these groups are chartered to work with.
>>
>> For more information about OBO efforts in this area, please see
>> http://code.google.com/p/oboformat/  and
>> http://code.google.com/p/owltools/
>>
>> -Alan
>>
>
> I don't see RDF or SPARQL endpoints being provided at either of those links.

Indeed. They are not where you would expect to find them.

There are two SPARQL endpoints at the moment, each with different approaches. We are working toward deciding on and documenting expected behavior and then ensuring we can provision them well enough to stand up to regular use.

Understand also that we are coming to the end of a multi-year effort to regularize our URIs, defining a proper OWL translation of OBO, and providing a new BFO to be the basis for these ontologies. We are not quite finished. I would be most comfortable publishing a stable endpoint once this transition was over. Again, assistance in deploying mirrors and in helping with all the various loose ends needed before the resource can be considered stable would be very much welcomed.

http://sparql.obo.neurocommons.org/ intended to serve ontologies using the legacy URIs (needs to be reviewed - hasn't been in a while.)
http://sparql.obodev.neurocommons.org/ intended to serve ontologies using the current URIs (same as above)
http://sparql.hegroup.org/sparql serves the ontobee server (not meant for wide consumption, but useful for prototyping)
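
For anyone who wants to try one of these, a minimal sketch of forming a SPARQL Protocol GET request URL in Python follows. The `sparql_get_url` helper and the label query are illustrative assumptions, and whether a given endpoint is up or has this term loaded is not something the sketch checks. It only builds the URL and does not send anything.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL: the query text goes in `query=`."""
    sep = "&" if urlsplit(endpoint).query else "?"
    return endpoint + sep + urlencode({"query": query})


# Hypothetical query: fetch up to five rdfs:label values for one OBO term.
QUERY = (
    "SELECT ?label WHERE { "
    "<http://purl.obolibrary.org/obo/IAO_0000032> "
    "<http://www.w3.org/2000/01/rdf-schema#label> ?label "
    "} LIMIT 5"
)

url = sparql_get_url("http://sparql.hegroup.org/sparql", QUERY)
print(url)
```

The resulting URL can be pasted into a browser or fetched with any HTTP client. The result format depends on what the endpoint chooses to return.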

Members of HCLS who wish to assist with maintenance of the Neurocommons endpoints would be welcomed. Once they are reviewed and brought up to date on which ontologies they load, the Neurocommons RDF Bundling system will provide an additional distribution mechanism for creating mirrors.

So Michel, and other HCLS users, consider this an invitation: The OBO Foundry is very close to providing a stable, well thought through process for semantic web deployment of OBO ontologies. We could very much use technical support in finishing a number of technical loose ends, in providing tools that build on these efforts, and on making it easy to access existing endpoints or provide mirrors of the content. If there is sufficient interest in this within the group perhaps Chris and I can schedule a time when we could meet with those interested and see what the possibilities are.

Sincerely,
Alan Ruttenberg

>

Alan Ruttenberg

Aug 8, 2012, 10:00:19 AM8/8/12
to Peter Ansell, Robinson, Peter, Chris Mungall, Michel Dumontier, HCLS, bio2rdf
We have discussed that the OBO Foundry policy is to use CC0 or CC-BY
and it has been put to the GO that we would like to migrate to that
license. I don't know the status of that discussion.

That said, I would be strongly discouraging of (but unable to prevent)
any "no-blank-node" rendering of GO ontologies, and in particular
would note that such a transformation would render any OWL we publish
unsyntactic.

Further, the OBO ID Policy has, for the most part, been put in
place and we do not use hash URIs and are moving to having all OBO
URIs resolving to page per view. See for example
http://purl.obolibrary.org/obo/IAO_0000032
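
As a rough illustration of the identifier shape, the sketch below maps an OBO-style ID to a slash (not hash) PURL of the kind shown above. The helper name `term_purl` is hypothetical, and the assumption that the input is written as `PREFIX:NUMBER` (for example `IAO:0000032`) is mine; the ID policy document is the authority on the details.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"


def term_purl(obo_id):
    """Turn an ID such as 'IAO:0000032' into its PURL (assumed PREFIX:LOCAL form)."""
    prefix, sep, local = obo_id.partition(":")
    if not sep or not prefix or not local:
        raise ValueError(f"expected PREFIX:LOCAL, got {obo_id!r}")
    # The colon becomes an underscore; no '#' fragment is involved.
    return f"{OBO_BASE}{prefix}_{local}"


print(term_purl("IAO:0000032"))
```

Because the term is a path segment rather than a fragment, each URI can be resolved on its own to a page or document for that one term.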

So the Foundry is already in the process of making all of the OBO
available as linked ontology data. I would suggest other groups join
this effort rather than setting out to duplicate and add confusion by
having a parallel set of identifiers for the same set of entities.

In fact, there have been a number of OBO participants who prefer the
current GO license precisely because it prevents this kind of
duplicative, confusing practice, a practice that is discouraged even
by the W3C standards these groups are chartered to work with.

For more information about OBO efforts in this area, please see
http://code.google.com/p/oboformat/ and
http://code.google.com/p/owltools/

-Alan

On Wed, Aug 8, 2012 at 1:36 AM, Peter Ansell <ansell...@gmail.com> wrote:

Alan Ruttenberg

Aug 8, 2012, 10:37:02 PM8/8/12
to Michel Dumontier, Peter Ansell, Robinson, Peter, Chris Mungall, HCLS, bio2rdf
Sorry, I'm unaware of such demonstration. Could you cite some references?

What I am aware of are results such as those which have shown that the
*content* of the GO and other sources of prior knowledge can lead to
discoveries and greater understanding.

BTW, view source. Ontobee doesn't only fetch and render OWL for
humans. Ontobee is one leg of a variety of distribution channels for
OBO ontologies. Ontobee is responsible for the "linked data" serving
of ontologies, something I think it does rather well, and which is
still improving. Namely: It is the software that we are increasingly
using to provide resolvable URLs which respond appropriately for
computational as well as browser agents. We also distribute ontologies
as a whole from PURLs, have readable subversion repositories, have and
will have more SPARQL endpoints, and if you ask Chris I'm sure he'll
tell you of another half dozen ways in which the OBOs are available.
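
A minimal sketch of what "respond appropriately for computational as well as browser agents" could look like from the client side, using Python's standard library. The `rdf_request` helper is hypothetical, and `application/rdf+xml` is an assumed media type; which types the server actually honours is not stated in this thread. The request is only constructed here, not sent.

```python
from urllib.request import Request


def rdf_request(term_url, media_type="application/rdf+xml"):
    """Prepare a GET request asking for a machine-readable representation."""
    return Request(term_url, headers={"Accept": media_type})


req = rdf_request("http://purl.obolibrary.org/obo/IAO_0000032")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))

# To actually send it (network access required):
#   from urllib.request import urlopen
#   with urlopen(req) as resp:
#       body = resp.read()
```

A browser sends an Accept header that prefers HTML for the same URL, which is how one identifier can serve both audiences.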

> In any case, the issue at hand is that of control - and what constitutes
> open and free software. It's worth noting that more than one OBO ontology
> has come about because all there was available was non-open, non-free or
> "proprietary" terminologies and ontologies. Well, if one isn't free to copy,
> modify, distribute, change and improve the ontology, then it isn't free and
> it isn't open.
>
> http://www.gnu.org/philosophy/free-sw.html

Boy, run on lecturers to OBO about openness. Amusing. Please see my
remarks to Peter.

> Making ontologies available in RDF is no different than providing data in
> SQL database dumps, XML files, flat files, REST APIs or whatever format
> makes it easier for somebody to reuse content.

It is very different. In doing so you are subverting the design of the
semantic web by proposing to fuzz up the space with copies with
unclear differences from the original, published with different
identifiers than the original. The semweb vision is different than the
vision for other kinds of information technologies. It comes with its
own social contract, and my view is that the kind of effort you are
proposing breaks that contract. Sorry to disagree. Let me be clear:
Just rearranging RDF and minting new IDs is not a social good. I will
keep reminding you and others that there is a lot of work that needs
to be done that you all could be collaborating with us on to make a
stronger resource for everyone instead of spending time in this way.
And by the way you might even find the work more fun!

>> >> Further, the OBO ID Policy has, for the most part, been put in
>> >> place and we do not use hash URIs and are moving to having all OBO
>> >> URIs resolving to page per view. See for example
>> >> http://purl.obolibrary.org/obo/IAO_0000032
>> >>
>> >
>> > does the OBO Foundry automatically check conformance? Is there a report
>> > page for each ontology?
>>
>> Conformance to what? To the OWL Spec? Yes, I believe it does, through the
>> OORT and Jenkins build tools, but I'll leave it to Chris to detail that. If
>> there is something you are looking specifically for I expect it could be
>> provided. Or you could collaborate with us to build such services. I believe
>> that collaboration towards building a stronger single distribution is a much
>> better way to spend effort, in the long run.
>
>
> I personally like and support the NCBO's bioportal as a central repository
> for accessing and downloading ontologies.

Enjoy.

> They poll for the latest, bring it into their system, mint URIs, index, map - lots of added value.

They wouldn't be able to poll for anything were it not for the case of
people putting it there in the first place. They aren't doing a
service if they are minting URIs for resources that already have them.
They aren't doing a service when they don't make it clear they are
trying to add value to existing standards instead of replacing them.

They do certain very useful things, particularly indexing, enabling
search over the space of ontologies, displaying terms. But I want to
see them focus more on these efforts. I still can't reliably view
an individual that happens to be a member of an ontology document. They
still don't reason over ontologies and report issues. There is still
not a visual browser that is particularly helpful.

> Lots of the stuff you mention below has already been done, and has the funding to
> continue supporting this.

None of what we have done was done before we did it. Many of the
things we do are still not done by anybody else. Bioportal will not
forever have funding to continue to support what they are doing. Given
efforts we have made to have Bioportal adopt practices we've explored,
prototyped, and then found useful, I have doubts whether they will
ever have anything as useful as Ontobee is for me.

> It might be worth investigating how OntoBee technology can add pretty rendering of OBO/OWL ontologies in BioPortal.

I have submitted numerous tickets to Bioportal over years pointing out
issues and making constructive suggestions about what would be useful
for ontology developers to have. I think I'm in a pretty good position
to know something about this. Very little has been followed up on.
Where it has been followed up on there is rarely acknowledgement.

I'm happy to compliment Bioportal on what they've done well, and there
are a number of things they have. But I'm not happy to let stand
misinformation about what they've accomplished, or leave stand any
doubt that I and others within the OBO have made anything less than
extraordinary efforts to supply ideas and prototypes to the Bioportal
team.

I'm glad you are getting value out of it. I'll be frank that I haven't
really had a look. I tried to collaborate with Bioportal for years to
encourage them to adopt what we'd developed with Neurocommons,
encourage that they adopt standards, and then proceed to the next
level. Since they didn't, we maintain resources that provide for our
community's needs. Where they don't, we're working with newer
technology to prospect for where we can bring the next level of value.

As far as Bio2RDF goes, I'll have to admit as well that I haven't tried it
recently. Over the couple of years when I periodically did try it I
found a) that endpoints I queried were down a significant fraction of
the time and b) that translation of resources it incorporated was of
highly varying quality, and in many cases the elements of interest
were missing.

It's possible that this has changed in the interim, and if so, I'm
glad to hear it.

However the problem is that none of the efforts we are making have
long term funding and so there is continued risk of more of the same.
So one has to pick a strategy.

My own effort in the last few years has been towards doing my piece of
helping develop what's available in the OBOs, working to improve
quality where possible, (slowly) helping move that to a place where
it uses good standards properly, and contributing to efforts to make
it easy to get the whole load of it, intact, usable as a whole, into
the hands of whoever wants it. I hope that this leads either to a
stable source of funding for this kind of work, or to a set of
artifacts and protocols that make it simple enough to maintain and use
that such funding isn't necessary.

>> Members of HCLS who wish to assist with maintenance of the Neurocommons
>> endpoints would be welcomed. Once they are reviewed and brought up to date
>> on which ontologies they load, the Neurocommons RDF Bundling system will
>> provide an additional distribution mechanism for creating mirrors.
>>
>>
>> So Michel, and other HCLS users, consider this an invitation: The OBO
>> Foundry is very close to providing a stable, well thought through process
>> for semantic web deployment of OBO ontologies. We could very much use
>> technical support in finishing a number of technical loose ends, in
>> providing tools that build on these efforts, and on making it easy to access
>> existing endpoints or provide mirrors of the content. If there is sufficient
>> interest in this within the group perhaps Chris and I can schedule a time
>> when we could meet with those interested and see what the possibilities are.
>
>
> I think that's a good idea. Let's aim for sometime in September.

OK, that should be interesting. Let's talk on the phone some time
before then to see about what we can do to make the meeting
productive.

Regards,
Alan

Phillip Lord

Aug 10, 2012, 7:44:00 AM8/10/12
to Alan Ruttenberg, Michel Dumontier, Peter Ansell, Robinson, Peter, Chris Mungall, HCLS, bio2rdf
Alan Ruttenberg <alanrut...@gmail.com> writes:

>> As you know, we and others have demonstrated that alternative
>> representations and reformulation of knowledge is desirable for certain
>> kinds of scientific inquiry.
>
> Sorry, I'm unaware of such demonstration. Could you cite some references?


http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0012258

A few examples of where multiple representations of the same knowledge
have been used for good reasons:

- multiple syntaxes for RDF
- multiple syntaxes for OWL
- two APIs for XML (DOM and SAX).
- multiple computer languages which are reducible to lambda calculus
- lambda calculus and a Turing Machine
- continued use of Newtonian mechanics, although it's an approximation
  of relativistic mechanics
- multiple statistical techniques for expression of central tendency
- PDFs are still better for reading in the bath than HTML

And so on. Any model is a compromise between accuracy, usability,
convenience and so on. Sometimes having more than one compromise is a
better solution than trying to shoe-horn everything into one bucket.
This is a compromise too.

Phil

Michael Miller

Aug 13, 2012, 12:44:12 PM8/13/12
to Phillip Lord, Alan Ruttenberg, Michel Dumontier, Peter Ansell, Robinson, Peter, Chris Mungall, HCLS, bio2rdf
hi all,

i would also say to take a look at the principles underlying model driven
architecture (MDA) [1] which is a systematic way of describing how a model
can become instantiated in multiple platform specific ways. for instance,
if every RDF representation derives from an original OWL model, then a
mapping between the different RDF representations can be made by going
back through the original mapping from owl to the RDF. this mapping might
not be perfect because not all platforms implement the same features but
the mapping describes what is lost or gained.

michael

[1] http://www.omg.org/mda/

Michael Miller
Software Engineer
Institute for Systems Biology