[Obo-format] request for new tags: modified_by and modification_date


Erick Antezana

Oct 30, 2012, 3:42:26 AM
to obo format
Hi,

I would like to request the addition of two new tags: modified_by and
modification_date (unless there is a better way/practice to keep track
of atomic/term modifications...)

modified_by: erick
modification_date: 2012-10-30T08:40:22Z

I recall those tags were at some point present in TO
(https://groups.google.com/forum/?fromgroups=#!topic/obo-format/de0vV-2KrxQ)...

cheers,
Erick

_______________________________________________
Obo-format mailing list
Obo-f...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/obo-format

Chris Mungall

Oct 30, 2012, 11:42:08 AM
to Erick Antezana, obo format
Hi Erick

I recommend you use property_value tag and request the properties
below from IAO. The modification date would be an xsd:date.

Given that there may be multiple modifications, I suggest requesting
"last modification date", or modeling each modification as a separate
entity (outside what can be done easily in obo-format).

It's not clear how this would be used - at the moment neither Protege
nor OE tracks this (though it may be useful for them to do so).

Melanie Courtot

Oct 30, 2012, 12:29:00 PM
to Chris Mungall, obo format
On a side note, we did something like this with dates in the context of SBO. After a little while, we realized that the date was not enough and that users wanted to actually store what the last modification was (e.g., store the term as it was before the modification). It probably depends on what you want to achieve by storing the modification date.

If I remember correctly, we also store the date at which the term was created. Maybe we could have a similar system where one term could have several "modification_date" properties, with annotations on each one for the name of the modifier and the previous version of the term.

Melanie

Melanie Courtot

Oct 30, 2012, 12:36:37 PM
to Yu Lin, obo format
The Systems Biology Ontology, http://obofoundry.org/cgi-bin/detail.cgi?id=systems_biology

Melanie


On 2012-10-30, at 9:35 AM, Yu Lin wrote:

> Melanie,
>
> What does SBO stand for?
>
> Thanks,
> Asiyah

Chris Mungall

Oct 30, 2012, 12:44:21 PM
to Melanie Courtot, obo format

creation_date and created_by are already built into the obo syntax - they just map to oboInOwl annotation properties at the moment.

Erick Antezana

Oct 31, 2012, 6:07:03 AM
to Chris Mungall, Melanie Courtot, Yu Lin, obo format
Hi,

I will implement (in my pipeline) a way to capture those entries as
property_value's so that the generated ontologies capture the id/name
of the person who modified a term as well as the date of its *latest*
modification (ISO-8601 format as for the creation_date), for instance:

property_value: latest_modification_by "MYO:erick"
property_value: latest_modification_date "2012-10-30T08:40:22Z" xsd:date

where:

MYO = namespace of my ontology holding the instance "erick"
erick = instance
'latest_modification_by' and 'latest_modification_date' = two
relationship type id's to keep track of the latest (not last) entry
modification
"2012-10-30T08:40:22Z" = date in ISO 8601 format
xsd:date = datatype identifier
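
A minimal sketch, assuming Python, of how a pipeline could emit the two lines above. The helper name `modification_tags` is hypothetical, and the two property names are the ones proposed in this message, not registered properties. Since the value carries a time component, `xsd:dateTime` may be the more accurate datatype; the sketch keeps `xsd:date` as the default only to match the example as written.

```python
from datetime import datetime, timezone


def modification_tags(editor, when, datatype="xsd:date"):
    """Return the two proposed property_value lines (hypothetical helper)."""
    # Normalise to UTC and render as ISO 8601 with a trailing 'Z'.
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return [
        f'property_value: latest_modification_by "{editor}"',
        f'property_value: latest_modification_date "{stamp}" {datatype}',
    ]


# Example: the modification from this message.
when = datetime(2012, 10, 30, 8, 40, 22, tzinfo=timezone.utc)
for line in modification_tags("MYO:erick", when):
    print(line)
```

Passing `datatype="xsd:dateTime"` changes only the trailing datatype identifier.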

As mentioned by Chris, there could indeed be several other ways to
capture/model those elements, e.g. storing multiple dates reflecting
when the entry (term/relationship) has been modified... I do not
see many advantages in doing so: terms/relationships would get
"polluted" or overloaded with "unnecessary" data...? Nonetheless, I
recall that many years ago we discussed in a meeting how to give
credit to people/groups who contributed new terms or simply
refined them (e.g. improved definitions, extra references, extra
synonyms,...). With such "multiple-date/multiple-user" modelling we
could clearly give credit to ALL the contributors. In my particular
case, we are interested in keeping track of the latest date an entry
has been modified as well as the person who performed that
modification. Having that latest person's name/id is like having
somebody *responsible* for the current contents of that entry. I would
be curious to see how other teams have approached those aspects... any
thoughts/ideas are very welcome!

On the other hand, it would indeed be great if Protege and OBOEdit
could handle those properties somehow; nevertheless, w.r.t. Melanie's
comment (users asking to store how the term was *before* the
modification), I think that goes beyond what we should keep in an
ontology file... In my opinion, those things should be managed by
tools such as OORT (http://code.google.com/p/owltools/wiki/OortIntro).

cheers,
Erick

Chris Mungall

Oct 31, 2012, 2:02:52 PM
to Erick Antezana, Phillip Lord, Melanie Courtot, obo format, Yu Lin, Matthew Brush

Hi Erick,

I cc'd Phil and Matt as this is veering into discussion of assigning credit to ontology contributors, and they have some ideas here. Comments below.

On Oct 31, 2012, at 3:07 AM, Erick Antezana wrote:

> Hi,
>
> I will implement (in my pipeline) a way to capture those entries as
> property_value's so that the generated ontologies capture the id/name
> of the person who modified a term as well as the date of its *latest*
> modification (ISO-8601 format as for the creation_date), for instance:
>
> property_value: latest_modification_by "MYO:erick"

you should explicitly model yourself as a string (use xsd:string as an initial arg) or as an entity (no additional arg, but unquote yourself)

This is not well standardized across ontologies yet. GO has a registry of editors http://www.geneontology.org/doc/GO.curator_dbxrefs who are denoted by a GOC:nnn ID. In future we may add researcherids/orcids to this or encourage using such globally unique IRIs directly

> property_value: latest_modification_date "2012-10-30T08:40:22Z" xsd:date
>
> where:
>
> MYO = namespace of my ontology holding the instance "erick"
> erick = instance
> 'latest_modification_by' and 'latest_modification_date' = two
> relationship type id's to keep track of the latest (not last) entry
> modification
> "2012-10-30T08:40:22Z" = date in ISO 8601 format
> xsd:date = datatype identifier

ok

> As mentioned by Chris, there could indeed be several other ways to
> capture/model those elements, e.g. storing multiple dates reflecting
> when the entry (term/relationship) has been modified... I do not
> see many advantages in doing so: terms/relationships would get
> "polluted" or overloaded with "unnecessary" data...?

The only advantage is if you are tracking *each* modification plus person-modifying (n per term), rather than last_modification (a functional property, 1 per term)
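
The difference between the two models can be sketched as follows. This is only an illustration in Python; the class name `TermEdits` and the IDs in the usage note are hypothetical. The full history holds n entries per term, while the "latest" view is derived from it and yields at most one value per term.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Edit = Tuple[str, str]  # (editor id, ISO 8601 UTC timestamp)


@dataclass
class TermEdits:
    term_id: str
    history: List[Edit] = field(default_factory=list)  # n per term

    def record(self, editor: str, when: str) -> None:
        # Keep every modification together with the person who made it.
        self.history.append((editor, when))

    def latest(self) -> Optional[Edit]:
        # Functional view: at most one value per term.
        # Same-format UTC timestamps sort correctly as plain strings.
        return max(self.history, key=lambda e: e[1], default=None)
```

For instance, after `record("MYO:erick", "2012-10-30T08:40:22Z")` and `record("MYO:chris", "2012-10-31T14:02:52Z")`, `latest()` returns the second pair while `history` still holds both.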

> Nonetheless, I
> recall that many years ago we discussed in a meeting how to give
> credit to people/groups who contributed new terms or simply
> refined them (e.g. improved definitions, extra references, extra
> synonyms,...). With such "multiple-date/multiple-user" modelling we
> could clearly give credit to ALL the contributors. In my particular
> case, we are interested in keeping track of the latest date an entry
> has been modified as well as the person who performed that
> modification. Having that latest person's name/id is like having
> somebody *responsible* for the current contents of that entry. I would
> be curious to see how other teams have approached those aspects... any
> thoughts/ideas are very welcome!

The current practice would be to add these to the definition dbxref field, but this doesn't really indicate the nature of the contribution, the time of the contribution etc.

In these cases, sometimes a separate GO_REF is created, e.g.
http://www.geneontology.org/cgi-bin/references.cgi#GO_REF:0000026

These can be treated as mini publications, and associated with relevant metadata such as authors, author contributions, dates, etc.

The GO_REF system is not perfect - I find the abstracts too high level, and they don't describe the actual rationale for modeling something a certain way in the ontology. Plus they are low visibility as far as credit is concerned.

I would advocate a kind of 4 level process for ontology maintainers:

1. As a minimum, each class and the ontology as a whole can be annotated using a dc:contributor or dc:creator property. This is now possible in obo-format (but not oboedit, as yet)

2. Keep ontology discussions on public fora such as mailing lists and trackers, and link to these from the ontology where appropriate (also - get a purl for your tracker)

3. Any ontology contribution involving a piece of unique non-trivial biology or ontology modeling is often deserving of at least a small article published on a pre-print site. The ontogenesis kblog ( http://knowledgeblog.org/ ) is ideal for this. The article can then be linked from the originating ontology, or indeed from other ontologies, publications and so on. I recommend that anyone involved in making ontologies (everyone on this list I expect) read Robert Stevens' and others' excellent ontology articles published on this forum ( http://ontogenesis.knowledgeblog.org/ ), and consider writing an article on their own ontology for publication and community comments here.

By all means follow the traditional publication route too but it's harder to publish ontology papers and the traditional review process can be at odds with the goal of getting comprehensive documentation out there, even in venues like PLoS ONE.

4. As in software engineering, an ontology should include comprehensive in-line documentation, stating why things are modeled a certain way. The primary goal here is maintainability and long-term sustainability, but it also serves as an audit trail and potential mechanism of credit. Note that there is some level of overlap between 4 and 3 - it may be the case that these are combined, with a high level summary article on the kblog accompanying detailed ontology-specific documentation embedded in the ontology.

As yet the mechanism for achieving #4 is under discussion, but inspired by systems developed by David OS and Matt Brush, the basic idea is:

* Maintain a separate 'ontology' that contains individual articles or design decisions; for owl users, this can be directly imported.

* Each article entity has a dc:description field that contains text formatted in a lightweight format such as markdown

* Each article entity has all the metadata you'd expect such as creator, contributor, date, article type, etc

* Articles are linked to from the main ontology via a to-be-decided list of properties. For example, there might be an article type of "modeling pattern", and individual classes may be tagged as either conforming to or in the process of being refactored to conform to this modeling pattern. Matt's REO ontology has some good examples of this.

* HTML can be generated from this for end users, and ontology editing environments may in future better support creation of this kind of metadata, but in the absence of these the system is still extremely useful.
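
The bullet points above could be modelled roughly as below. This is a sketch only, in Python rather than OWL; every name in it (`Article`, `ClassLink`, `classes_for`, and any relation names used with it) is a hypothetical placeholder, since the actual list of linking properties is still to be decided.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Article:
    id: str
    article_type: str   # e.g. "modeling pattern"
    creator: str
    date: str
    description: str    # text in a lightweight format such as markdown
    contributors: List[str] = field(default_factory=list)


@dataclass
class ClassLink:
    class_id: str       # class in the main ontology
    relation: str       # placeholder for a to-be-decided property
    article_id: str     # article in the separate documentation 'ontology'


def classes_for(article_id: str, links: List[ClassLink]) -> List[str]:
    """Return the IDs of all classes linked to the given article."""
    return [link.class_id for link in links if link.article_id == article_id]
```

Keeping the links separate from the articles mirrors the idea that the documentation lives in its own file and only points at classes in the main ontology.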

We have a demo of a lightweight system stitched together from some 3rd party components (e.g. using pandoc to generate the html) using some of the documentation on the uberon multi-species ontology.

An example article:

* http://purl.obolibrary.org/obo/uberon/references/reference_0000014

Index of articles:

* http://purl.obolibrary.org/obo/uberon/references/index.html

(note that many of these are currently placeholders)

Meta-article describing documentation system:

* http://purl.obolibrary.org/obo/uberon/references/reference_0000000

> On the other hand, it would indeed be great if Protege and OBOEdit
> could handle those properties somehow; nevertheless, w.r.t. Melanie's
> comment (users asking to store how the term was *before* the
> modification), I think that goes beyond what we should keep in an
> ontology file... In my opinion, those things should be managed by
> tools such as OORT (http://code.google.com/p/owltools/wiki/OortIntro).

Yes, the ontology editing environment should automatically generate all required edit info. OE currently stamps the created_by/date fields, but no others. It would be useful to extend to last_modification

One problem with the current property_value system is that currently these are neither visible nor editable within OE, which limits some of the above, requiring some workarounds.

Cheers
Chris

Phillip Lord

Nov 1, 2012, 10:58:07 AM
to Chris Mungall, Melanie Courtot, obo format, Yu Lin, Matthew Brush


I would agree with this. Currently, we see publication as a way of
describing work that has already been done. But writing short articles
that explain design decisions as you go is, I think, a very valuable
thing.

I've been using my own blog this way for a while; it's not quite a lab
notebook; the articles are a little bit more abstracted from the
immediate process of what I have done. The problem with this at the
current time is that this keeps the history of what has been done, but
doesn't confer "credit", as science is still tied up with the idea that
credit comes only from Impact Factor. Not much can be done about
this.

I have some concerns about putting author information too deeply into
the ontology since this is going to produce potentially bloated work. I
think we need to avoid duplicating the knowledge that is already being
stored in the versioning system.

Phil

Chris Mungall <cjmu...@lbl.gov> writes:
> 3. Any ontology contribution involving a piece of unique non-trivial biology
> or ontology modeling is often deserving of at least a small article published
> on a pre-print site. The ontogenesis kblog ( http://knowledgeblog.org/ ) is
> ideal for this. The article can then be linked from the originating ontology,
> or indeed from other ontologies, publications and so on. I recommend that
> anyone involved in making ontologies (everyone on this list I expect) read
> Robert Stevens' and others' excellent ontology articles published on this forum
> ( http://ontogenesis.knowledgeblog.org/ ), and consider writing an article on
> their own ontology for publication and community comments here.
>
> By all means follow the traditional publication route too but it's harder to
> publish ontology papers and the traditional review process can be at odds with
> the goal of getting comprehensive documentation out there, even in venues like
> PLoS ONE.
>
> 4. As in software engineering, an ontology should include comprehensive
> in-line documentation, stating why things are modeled a certain way. The
> primary goal here is maintainability and long-term sustainability, but it also
> serves as an audit trail and potential mechanism of credit. Note that there is
> some level of overlap between 4 and 3 - it may be the case that these are
> combined, with a high level summary article on the kblog accompanying detailed
> ontology-specific documentation embedded in the ontology.
>
> As yet the mechanism for achieving #4 is under discussion, but inspired by
> systems developed by David OS and Matt Brush, the basic idea is:

Phillip Lord

Nov 1, 2012, 12:37:36 PM
to Yu Lin, Melanie Courtot, obo format, Matthew Brush

Think the issue is whether you *really* want to know the history, or you
just want an explanation. The problem with history is that you get all
the twists and turns.

If I may make a bad analogy, when students write dissertations almost
universally, they start off with it like a diary.

The problem at the moment is that we have a single sort of
documentation; that is comments that users are supposed to read. We need
developer documentation also. This split is understood when writing a code
base; it needs to happen for ontology building also.

Phil

Yu Lin <lini...@gmail.com> writes:

> Hi, all,
>
> One thing we have encountered in our ontology development is that sometimes
> we would like to know the history of one specific term: for example, in what
> context or circumstances it was created, and what kind of discussion the
> developers went through on the definition and modeling of this term.
> Different ontology developer groups may use mailing lists, issue trackers or
> Skype call memos for debating or discussing.
>
> Sometimes one mail leads to two or more different issues, which are
> hard to separate.
> Sometimes people don't put all the discussions into the issue tracker.
> Or sometimes people put the meeting memo on a webpage.
>
> There must be a better organized way to coordinate all those efforts for
> sustainable ontology development.
>
> What I want is a more focused method for tracking down all the references
> to one specific term from one specific ontology, which is somewhat narrower
> than what KnowledgeBlog is doing.
>
> I wonder if anybody has some ideas or suggestions on this.
>
> Thanks,
> Asiyah Yu Lin
>
> On Thu, Nov 1, 2012 at 10:58 AM, Phillip Lord
> <philli...@newcastle.ac.uk>wrote:
>
>>
>>
>> I would agree with this. Currently, we see publication as a way of
>> describing work that has already been done. But writing short articles
>> that explain design decisions as you go is, I think, a very valuable
>> thing.
>>
>> I've been using my own blog this way for a while; it's not quite a lab
>> notebook; the articles are a little bit more abstracted from the
>> immediate process of what I have done. The problem with this at the
>> current time is that this keeps the history of what has been done, but
>> doesn't confer "credit", as science is still tied up with the idea that
>> credit comes only from Impact Factor. Not much can be done about
>> this.
>>
>> I have some concerns about putting author information too deeply into
>> the ontology since this is going to produce potentially bloated work. I
>> think we need to avoid duplicating the knowledge that is already being
>> stored in the versioning system.
>>
>> Phil
>>
>> Chris Mungall <cjmu...@lbl.gov> writes:
>> > 3. Any ontology contribution involving a piece of unique non-trivial biology
>> > or ontology modeling is often deserving of at least a small article published
>> > on a pre-print site. The ontogenesis kblog ( http://knowledgeblog.org/ ) is
Phillip Lord, Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: philli...@newcastle.ac.uk
School of Computing Science, http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower, skype: russet_apples
Newcastle University, msn: m...@russet.org.uk
NE1 7RU twitter: phillord

Yu Lin

Oct 30, 2012, 12:04:11 PM
to Erick Antezana, obo format
Erick, 

I requested a similar thing on the protege-discussion mailing list. Currently WebProtege will be able to track your changes.
Somebody said Protege 3 is able to do so.
However, I haven't tested it.

The following is a copy from the mailing list.
******************************************************************************************************
On Thu, Oct 18, 2012 at 5:20 AM, Jonathan Carter <jonatha...@e-asolutions.com> wrote:
I think the "Track Changes" capability in Protege 3 does capture this information, by user in the Changes Ontology.

Jonathan
_______________________________________

Jonathan Carter 
Enterprise Architecture Solutions Ltd
_______________________________________

Proud sponsors of The Essential Project
The free open-source Enterprise Architecture Management Platform
www.enterprise-architecture.org
_______________________________________

Enterprise Architecture Solutions Ltd, Registered in England and Wales: 04097721.
Registered Office: 76 High Street, Newport Pagnell, Milton Keynes, MK16 8AQ.

On 17 Oct 2012, at 18:33, Yu Lin wrote:

Why not do it? To whom should I send the request?
It is very useful for evaluating and aligning ontologies.

Best,
Asiyah Yu Lin @ University of Michigan

On Wed, Oct 17, 2012 at 1:30 PM, Timothy Redmond <tred...@stanford.edu> wrote:

To my knowledge I don't think that either Protege 3 or Protege 4 have this capability out of the box.

-Timothy


On 10/17/12 10:10 AM, Yu Lin wrote:
Hi,  Protege developers,

As a user, I wonder whether Protege 4 or 3 has a function to automatically save a time stamp when a new term is created.
Please let me know, thanks.

Asiyah Yu Lin @ University of Michigan


_______________________________________________
protege-discussion mailing list
protege-d...@lists.stanford.edu
https://mailman.stanford.edu/mailman/listinfo/protege-discussion

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03



**********************************************************************************************************

Yu Lin

Oct 30, 2012, 12:35:02 PM
to Melanie Courtot, obo format
Melanie,

What does SBO stand for?

Thanks,
Asiyah

On Tue, Oct 30, 2012 at 12:29 PM, Melanie Courtot <mcou...@gmail.com> wrote:

Yu Lin

Nov 1, 2012, 11:27:33 AM
to Phillip Lord, Melanie Courtot, obo format, Matthew Brush
Hi, all,

One thing we have encountered in our ontology development is that sometimes we would like to know the history of one specific term: for example, in what context or circumstances it was created, and what kind of discussion the developers went through on the definition and modeling of this term. Different ontology developer groups may use mailing lists, issue trackers or Skype call memos for debating or discussing.

Sometimes one mail leads to two or more different issues, which are hard to separate.
Sometimes people don't put all the discussions into the issue tracker.
Or sometimes people put the meeting memo on a webpage.

There must be a better organized way to coordinate all those efforts for sustainable ontology development.

What I want is a more focused method for tracking down all the references to one specific term from one specific ontology, which is somewhat narrower than what KnowledgeBlog is doing.

I wonder if anybody has some ideas or suggestions on this.

Thanks,
Asiyah Yu Lin

On Thu, Nov 1, 2012 at 10:58 AM, Phillip Lord <philli...@newcastle.ac.uk> wrote:

Yu Lin

Nov 1, 2012, 12:46:22 PM
to Phillip Lord, Melanie Courtot, obo format, Matthew Brush
Good point, Phillip!
So we will need user documentation and developer documentation.
And both will need version control.

Best,
Asiyah

Matthew Brush

Nov 1, 2012, 1:02:01 PM
to Yu Lin, Phillip Lord, obo format, Melanie Courtot

Hi all. A note on an approach for separating user and developer documentation. In developing the Reagent Ontology (http://code.google.com/p/reagent-ontology), we store developer documentation in a separate owl file (reo-dev.owl), which imports the core reo.owl file. The core model and general annotations (definitions, comments, examples, etc) meant for public viewing live in the reo core. The reo-dev layer is used only to hold annotations about things like term histories, design decisions, and action items. The core file is what is released and seen on ontobee, while the reo-dev file is used by developers, as we can make changes to the core and record notes in the dev layer from this reo-dev file. This has worked well, and we hope to begin to standardize some of the developer annotations for re-use by the community using this approach.

 

Matt

 

---

Matthew H. Brush

OHSU Ontology Development Group

Department of Medical Informatics and Clinical Epidemiology

Oregon Health and Science University

phone :  cell 919-452-6914

fax :  503-346-6815

bru...@ohsu.edu

Erick Antezana

Nov 7, 2012, 9:34:02 AM
to Chris Mungall, Phillip Lord, Melanie Courtot, obo format, Yu Lin, Matthew Brush
Chris,

do you mean something like either:

   property_value: latest_modification_by "MYO:erick" xsd:string

or this:
 
   property_value: latest_modification_by MYO:erick

   where 'erick' is an instance in an ontology with 'MYO' as IDSpace, such as:

     [Instance]
     id: MYO:erick
     name: Erick Antezana
     instance_of: person

I think I will go initially for the string solution, so that I don't have to create instances for my users....
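
To make the difference between the two forms concrete, here is a rough Python sketch of a reader that tells them apart. The function name `parse_property_value` is hypothetical, and the regular expressions are a simplification that ignores escaped quotes, trailing qualifiers and comments; it is not taken from any existing OBO parser.

```python
import re

# property_value: <relation> "<literal>" [<datatype>]   (string/literal form)
# property_value: <relation> <entity-id>                (entity form)
_LITERAL = re.compile(r'^property_value:\s+(\S+)\s+"([^"]*)"(?:\s+(\S+))?\s*$')
_ENTITY = re.compile(r'^property_value:\s+(\S+)\s+([^\s"]+)\s*$')


def parse_property_value(line):
    """Split one property_value line into relation, value and datatype."""
    line = line.strip()
    m = _LITERAL.match(line)
    if m:
        return {"relation": m.group(1), "value": m.group(2),
                "datatype": m.group(3), "is_entity": False}
    m = _ENTITY.match(line)
    if m:
        return {"relation": m.group(1), "value": m.group(2),
                "datatype": None, "is_entity": True}
    raise ValueError(f"not a property_value line: {line!r}")
```

With this reading, the quoted variant yields a plain string with datatype `xsd:string`, while the unquoted variant yields an identifier that a consumer would have to resolve against the `MYO` ID space.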

On the other hand, I prefer to use "latest" instead of "last", since 'last' conveys the idea that there will be no more modifications; on the contrary, 'latest' keeps the door open for further improvements/changes.

I agree with the 4 points you suggested for ontology maintainers; however, I am afraid that without the appropriate tooling for ontology maintainers, it won't be possible to encourage people to capture all that information...which is a pity since in the future only the contributors who will have the appropriate tools will be duly acknowledged (the rest simply ignored...).

cheers,
erick

Chris Mungall

Nov 7, 2012, 2:48:03 PM
to Erick Antezana, Phillip Lord, Melanie Courtot, obo format, Yu Lin, Matthew Brush
On Nov 7, 2012, at 6:34 AM, Erick Antezana wrote:

Chris,

do you mean something like either:

   property_value: latest_modification_by "MYO:erick" xsd:string

or this:
 
   property_value: latest_modification_by MYO:erick

either is valid

   where 'erick' is an instance in an ontology with 'MYO' as IDSpace, such as:

     [Instance]
     id: MYO:erick
     name: Erick Antezana
     instance_of: person

I don't recommend instances in obo format. This could be a separate owl document

I think I will go initially for the string solution, so that I don't have to create instances for my users....

ok

On the other hand, I prefer to use "latest" instead of "last", since 'last' conveys the idea that there will be no more modifications; on the contrary, 'latest' keeps the door open for further improvements/changes.

Seems fine - will you request from or add to iao/ontology-metadata?

I agree with the 4 points you suggested for ontology maintainers; however, I am afraid that without the appropriate tooling for ontology maintainers, it won't be possible to encourage people to capture all that information...which is a pity since in the future only the contributors who will have the appropriate tools will be duly acknowledged (the rest simply ignored...).

All the tooling already exists for 2-4 (and #1 - your original request - should not be hard to support). For documentation the rate limiting factor is usually time - but documentation usually pays off in the long run.

Erick Antezana

Nov 12, 2012, 5:24:54 AM
to Chris Mungall, Phillip Lord, Melanie Courtot, obo format, Yu Lin, Matthew Brush
Chris,

I have implemented a prototype in ONTO-perl (http://search.cpan.org/dist/ONTO-PERL/) to deal with property_values (parse OBO files with property_values, export them, etc...). I left a couple of TODO's to be implemented once the spec becomes more clear to me regarding the instances...For the time being, this API does the trick...

where can I find the latest iao/ontology-metadata? how can I request it?

cheers,
Erick

Chris Mungall

Nov 12, 2012, 11:59:56 AM
to Erick Antezana, Phillip Lord, Melanie Courtot, obo format, Yu Lin, Matthew Brush

Thanks for adding this Erick. Will this also handle property_value tags in the ontology header?

For links to the stable and development version of ontology-metadata, see:

You can place requests here: