OGG updates. RE: [Obo-discuss] Change name to OGG from GGO.

He, Yongqun

Apr 29, 2014, 1:44:28 PM
to ogg-d...@googlegroups.com, obo-d...@lists.sourceforge.net, ggo-d...@googlegroups.com, Liu, Yue, Cathy Wu
Dear colleagues and friends,

I would like to take this chance to provide some updates on the development of the Ontology of Genes and Genomes (OGG):

Currently we have developed and loaded the main OGG OWL document in Ontobee:
http://www.ontobee.org/browser/index.php?o=OGG
(note: This OGG document covers human, HIV, influenza virus, and four bacteria)

We have also developed many OGG subsets and loaded them in Ontobee:

- OGG-C. elegans: http://www.ontobee.org/browser/index.php?o=OGG-Ce
- OGG-Fruit Fly: http://www.ontobee.org/browser/index.php?o=OGG-Dm
- OGG-Mouse: http://www.ontobee.org/browser/index.php?o=OGG-Mm
- OGG-P. falciparum: http://www.ontobee.org/browser/index.php?o=OGG-Pf
- OGG-Yeast: http://www.ontobee.org/browser/index.php?o=OGG-Sc
- OGG-Zebrafish: http://www.ontobee.org/browser/index.php?o=OGG-Dr

Some new features that were not introduced before:
- We have added layers of genes based on gene type. For example, here is the human rRNA gene:
http://purl.obolibrary.org/obo/OGG_2020009606

- For each gene, we have added the GO IDs and PubMed IDs (PMIDs) associated with the gene, if there are any.
For example, human casp2:
http://purl.obolibrary.org/obo/OGG_3000000835

Bin and Yue from my group have contributed a lot to this project.

Any suggestions and comments are welcome. Thanks!

Oliver

Yongqun "Oliver" He, DVM, PhD
Associate Professor
Unit for Laboratory Animal Medicine
Department of Microbiology and Immunology
Center for Computational Medicine and Bioinformatics
and Comprehensive Cancer Center
University of Michigan Medical School
Mail: 018 ARF, 1150 W. Medical Center Dr.
Ann Arbor, MI 48109
Email: yong...@med.umich.edu
Tel: 734-615-8231 (O)
http://www.hegroup.org/


-----Original Message-----
From: He, Yongqun [mailto:yong...@med.umich.edu]
Sent: Tuesday, October 29, 2013 7:47 PM
To: obo-d...@lists.sourceforge.net; Lindsay Cowell
Cc: ggo-d...@googlegroups.com; ogg-d...@googlegroups.com; Liu, Yue; Cathy Wu; Ramona Walls
Subject: [Obo-discuss] Change name to OGG from GGO. RE: [Obi-devel] Announce a Genome and Gene Ontology (GGO)

Dr. Barry Smith kindly commented to me that the GGO (Genome and Gene Ontology) is not well named, since the name might suggest a confusing relation to the Gene Ontology (GO). I agreed with the comment and proposed a new name, "Ontology of Genes and Genomes (OGG)". Barry agreed. Asiyah also thought that it would be a good change. I have also communicated with PRO. Darren commented that with the new OGG ontology, "there is no clash with PRO". This is all good news. I look forward to close collaboration with BFO, PRO, and the OBO Foundry community. Thanks!

Therefore, we will change GGO to the new name OGG or "Ontology of Genes and Genomes" from now on.

Corresponding to this name change, I have done a few things:
(1) Generated a new OGG OWL version. Current OGG is the same as previous GGO except that all "GGO_" and "ggo.owl" have been changed to "OGG_" and "ogg.owl".

(2) Generated a new OGG Google code project:
https://code.google.com/p/ogg/
I have also committed the new OGG OWL file and svn hierarchy to the Google code project.

(3) Generated a new ogg-discuss Google Group:
https://groups.google.com/forum/#!forum/ogg-discuss
It is open to the public for registration and posting. I am very sorry that many of you have already registered for the ggo-discuss group and now need to switch to ogg-discuss. Thanks for your support!
Anyone is welcome to register and post in this Google Group. We may soon post fewer emails to the obo-discuss mailing list and instead primarily use the ogg-d...@googlegroups.com email address for discussion. If you have a continuing interest in OGG development and applications and want to receive related emails and take part in more discussions, please register an account in ogg-discuss. Thank you!

(4) Loaded OGG into Ontobee. Here is the Ontobee site for OGG:
http://www.ontobee.org/browser/index.php?o=OGG
With this new site available now in Ontobee, I have removed GGO from Ontobee.

My strong feeling is that the enthusiasm for OGG is very high. We will keep working on it and keep you updated. Thanks, all!

Oliver He
University of Michigan Medical School

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues

Erick Antezana

Apr 29, 2014, 4:50:12 PM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Cathy Wu, Liu, Yue
Hi,

a few questions:

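- any plans to include plant species (e.g. A. thaliana)?

- what's the source of the gene names? NCBI? HGNC?

- what is the NEWENTRY ?:
http://www.ontobee.org/browser/rdf.php?o=OGG&iri=http://purl.obolibrary.org/obo/OGG_3002829857

- http://www.ncbi.nlm.gov/gene is not a valid URL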

cheers,
Erick


_______________________________________________
Obo-discuss mailing list
Obo-d...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/obo-discuss

He, Yongqun

Apr 29, 2014, 8:33:42 PM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Liu, Yue, Cathy Wu

Hi Erick,

Thanks for your questions. Here are my answers:

- any plans to include plant species (e.g. A. thaliana)?
Answer: As you requested, I have just added Arabidopsis thaliana as an OGG subset:
http://www.ontobee.org/browser/index.php?o=OGG-At

- what's the source of the gene names? NCBI? HGNC?
Answer: It’s NCBI, primarily the NCBI Gene database.

- what is the NEWENTRY ?:
http://www.ontobee.org/browser/rdf.php?o=OGG&iri=http://purl.obolibrary.org/obo/OGG_3002829857
Answer: The source of the NEWENTRY definition is here:
http://www.ncbi.nlm.nih.gov/gene/?term=2829857

- http://www.ncbi.nlm.gov/gene is not a valid URL
Answer: Thank you for finding this out. We have now updated the URL to:
http://www.ncbi.nlm.nih.gov/gene

Please let me know if you have any more comments, questions, or suggestions. Thanks!

Oliver

Erick Antezana

Apr 30, 2014, 4:53:11 AM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Cathy Wu, Liu, Yue
Hi Oliver,

Many thanks for considering A. thaliana and for opening the door to other plant species.

- are you planning to have any cross-links between genes across species (based on homology, for instance)? I ask because my users would be interested in navigating through the ontology: from gene G1 (in species S1) to gene G2 (in species S2), where G1 and G2 share some properties, for instance.

- how often do you update OGG?

- would it be possible to have the NCBI IDs instead of the OGG ones? (PURL)

- is there a way to get a local copy (VM image?) of Ontobee?

cheers,
Erick

Pankaj Jaiswal (OSU)

Apr 30, 2014, 1:25:34 PM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Liu, Yue, Cathy Wu
http://obo.cvs.sourceforge.net/viewvc/obo/obo/ontology/genomic-proteomic/gene/genes-3702-as-class.obo

For Arabidopsis

On 04/29/2014 1:50 PM, Erick Antezana wrote:
> Hi,
>
> a few questions:
>
> - any plans to include plant species (e.g. A thaliana)?
>
> - what's the source of the gene names? NCBI? HGBC?
>
> - what is the NEWENTRY ?:
> http://www.ontobee.org/browser/rdf.php?o=OGG&iri=http://purl.obolibrary.org/obo/OGG_3002829857
>
> - http://www.ncbi.nlm.gov/gene is not a valid URL
>
> cheers,
> Erick

Scheuermann, Richard

Apr 30, 2014, 1:14:06 PM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Cathy Wu, Liu, Yue
It's a little unclear to me what OGG is. Rather than an ontology, it
seems to be simply a database of genes with a list of attributes derived
from other database resources, making it largely redundant to Entrez Gene
except that it is represented in obo format. The only hierarchy I could
find was a list of gene types that appear to be redundant to the Sequence
Ontology (e.g. OGG_0000000010 protein-coding gene type versus SO:0001217
protein coding gene). I am very concerned with this type of development
since it does not appear to include extensive community involvement, it
does not respect the boundaries of other OBO Foundry candidate ontologies,
it blurs the distinction between ontologies and databases, and it does not
appear to be driven by compelling use cases.

Sincerely,

Richard

--------------------------------------------
Richard H. Scheuermann, Ph.D.
Director of Informatics
J. Craig Venter Institute
4120 Torrey Pines Rd.
La Jolla, CA 92037

rscheu...@jcvi.org
858-200-1876

He, Yongqun

Apr 30, 2014, 2:57:40 PM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Liu, Yue, Cathy Wu
Hi Richard,

I am sorry that I did not provide some background about OGG, the Ontology of Genes and Genomes. It has been about half a year since I announced the ontology (initially named GGO) on the obo-discuss email list. Some of the previous discussion threads can be found here:

https://groups.google.com/forum/#!searchin/ogg-discuss/GGO/ogg-discuss/wy0132CCdNA/zFWQ8EjWU7EJ
https://groups.google.com/forum/#!topic/ogg-discuss/Woi05g0nf0c

These email communications also introduced the relation between OGG and SO.

Here I would like to summarize some information related to your questions:

Rationale of OGG development and use cases:
-- My typical use case (scenario) starts with the fact that I need to represent a lot of genes in the Vaccine Ontology (VO) and Brucellosis Ontology (IDOBRU). Hundreds of individual genes from different organisms (e.g., various bacteria, viruses, and parasites) have been used for vaccine development or for gene mutant generation. However, since there is no ontology representing individual genes, I had to generate VO-specific IDs and IDOBRU-specific IDs to represent these genes. Some genes represented in VO and IDOBRU are really the same genes from the same organisms. Using different ontology IDs to represent the same genes does not make much sense according to OBO principles.
Therefore, we need an ontology of individual genes from individual organisms, such as Brucella gene x, human gene y, and HIV gene z.
GO does not do this because GO represents the molecular functions, biological processes, and cellular components of genes or gene products. SO represents general gene sequence information. So neither GO nor SO represents individual genes from individual organisms.
This is why I initiated the OGG development.

How to develop the OGG:
-- Yes. NCBI provides the largest and most comprehensive gene and genome information. This is why I started to rely on NCBI resources to generate OGG. We cannot directly use NCBI Gene IDs as PURL ontology IDs, the way NCBITaxon IDs are used, because there may be ID conflicts. For example, NCBI Gene ID 1 is "A1BG alpha-1-B glycoprotein", while NCBI Genome ID 1 is "Pyropia yezoensis". They have the same ID "1" but different meanings. Therefore, I designed a naming strategy that combines the NCBI Gene IDs with other factors so that there are no conflicts. The detailed scheme is described in a previous email thread.
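
The ID clash described above can be sketched in a few lines of Python. The `ogg_gene_id` helper and its digit layout (a leading `3` followed by the NCBI Gene ID zero-padded to nine digits) are assumptions made for illustration only. The layout merely reproduces the example in this thread where NCBI Gene 2829857 corresponds to OGG_3002829857; the actual OGG scheme is described in the earlier thread and may differ.

```python
# Bare NCBI numeric IDs are only unique within one database.
ncbi_gene = {1: "A1BG alpha-1-B glycoprotein"}
ncbi_genome = {1: "Pyropia yezoensis"}

# Merging both into one ID space keyed by the bare number loses a record.
merged = {**ncbi_gene, **ncbi_genome}
collision = len(merged) < len(ncbi_gene) + len(ncbi_genome)


def ogg_gene_id(ncbi_gene_id: int) -> str:
    """Hypothetical layout: '3' + Gene ID zero-padded to 9 digits.

    Assumption inferred from example IDs quoted in this thread,
    not the documented OGG algorithm.
    """
    if not 0 < ncbi_gene_id < 10**9:
        raise ValueError("Gene ID does not fit the assumed 9-digit field")
    return f"OGG_3{ncbi_gene_id:09d}"


if __name__ == "__main__":
    print(collision)             # True: both records share the key 1
    print(ogg_gene_id(2829857))  # OGG_3002829857
```

Under this assumed layout the gene record keeps its NCBI number in a recognisable form while living in its own numeric range, which is the property the naming strategy is meant to provide.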

OGG is "largely redundant to Entrez Gene except that it is represented in obo format"
-- Yes, at least at the current stage. We do this on purpose so that the ontology is reliable and contains a large amount of data. The key part is to wrap the data up in an ontology format by aligning it with BFO and other OBO Foundry ontologies, so that we don't miss an important part of our ontology development as a whole. After this first stage, we may include more features to make it more useful.
-- This technique is called "data wrangling". Data wrangling is basically a "process of manually converting or mapping data from one "raw" form into another format that allows for more convenient consumption of the data with the help of semi-automated tools. This may include further munging, data visualization, data aggregation, training a statistical model, as well as many other potential uses." (Cited from Wikipedia: http://en.wikipedia.org/wiki/Data_wrangling )
-- Note: "Data wrangling" is part of a recent NIH U01 program under the Big Data to Knowledge (BD2K) initiative ( http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-14-020.html ).

Usages of OGG:
-- Now we can use OGG gene IDs to represent the genes in VO, IDOBRU, and possibly other ontologies. The same gene IDs can be used to represent the same genes in different ontologies. This supports data integration and sharing.
-- Support for SPARQL ontology queries. For example, a few lines of SPARQL code can retrieve a lot of information, such as the total number of RNA genes or genes of unknown type in the human genome. We can also query the GO IDs and PubMed IDs associated with specific gene(s) in OGG. Later I will post some simple SPARQL examples for some of these tasks.
-- We can possibly use OGG gene IDs for instance data representation, for example, the expression level of a gene with a specific OGG gene ID. If we all use the same gene IDs, we can compare results from different studies.

Your Comment: "The only hierarchy I could find was a list of gene types that appear to be redundant to the Sequence Ontology (e.g. OGG_0000000010 protein-coding gene type versus SO:0001217 protein coding gene)."
-- Yes, currently we use the gene types as the hierarchy structure. Later we can add more hierarchies, for example, by chromosome ID or other features.
-- OGG_0000000010 is 'protein-coding gene type'. It is an OGG 'gene type', which is a BFO 'disposition'. Different genes have the disposition of having different gene types. URL: http://purl.obolibrary.org/obo/OGG_0000000010
-- SO:0001217 is 'protein_coding_gene'. It is a gene, which is a "sequence feature" in SO. URL: http://purl.obolibrary.org/obo/SO_0001217
-- So these two terms are different.
-- Further discussion of term sharing between SO and OGG can be found in a previous email thread.

The OGG scope and development method were discussed for a few days on the obo-discuss email list last October, and it was agreed that it would be a good effort to go ahead with. Therefore, OGG was not developed without discussion and consensus.

I hope that I have addressed your concerns. Any comments and suggestions are more than welcome and appreciated.

Oliver He

Judith Blake

Apr 30, 2014, 3:21:23 PM
to <obo-discuss@lists.sourceforge.net>, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Cathy Wu, Liu, Yue
Why are you not using a prefix to distinguish NCBIgene:## from NCBIgenome:##, so that you don't need to build yet another system of IDs?
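
As a minimal sketch of this suggestion, the Python below keys records by a prefixed ID instead of a bare number. The `Curie` type and `parse_curie` helper are hypothetical names introduced only for illustration; the two prefixes are the ones written in the question above, and the two labels are the examples given earlier in the thread.

```python
from typing import NamedTuple


class Curie(NamedTuple):
    """A prefixed identifier such as 'NCBIgene:1' (illustrative type)."""
    prefix: str
    local_id: str

    def __str__(self) -> str:
        return f"{self.prefix}:{self.local_id}"


def parse_curie(text: str) -> Curie:
    """Split 'prefix:local_id' at the first colon."""
    prefix, sep, local_id = text.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefixed ID: {text!r}")
    return Curie(prefix, local_id)


# The same local number no longer collides once the prefix is part of the key.
records = {
    parse_curie("NCBIgene:1"): "A1BG alpha-1-B glycoprotein",
    parse_curie("NCBIgenome:1"): "Pyropia yezoensis",
}

if __name__ == "__main__":
    for curie, label in records.items():
        print(curie, "->", label)
```

Because the prefix carries the source database, both records survive in one dictionary and the original numeric ID stays untouched.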


Erick Antezana

Apr 30, 2014, 3:31:36 PM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Liu, Yue, Cathy Wu
Hi,

I do not see any issue with the development of an *application* ontology (in contrast to domain ontologies such as GO or the ones under the OBO umbrella). However, it could indeed be seen as a knowledge base (= ontology backbone + specific data) and not as an 'ontology' sensu OBO. Moreover, I don't see how this type of resource fails to respect the boundaries of other OBO Foundry (candidate) ontologies. On the contrary, I believe such an exercise shows how ("real") data could be glued together (using domain ontologies such as BFO, SO) so that researchers can take advantage of standardisation, data integration, exploration, etc.

cheers,
Erick



He, Yongqun

Apr 30, 2014, 6:45:28 PM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com

Hi Erick,

Yes. It’s possible to use the NCBI HomoloGene resource to construct cross-links between genes of different species. We may try it later. Thanks for the suggestion.

For your three other questions:

- how often do you update OGG?
Answer: We will update OGG periodically, probably monthly. Later, we plan to label each OGG update with its date as well.

- would it be possible to have the NCBI id's instead of the OGG ones ? (PURL)
Answer: It’s an alternative. We provide cross-linked NCBI IDs for the OGG terms whenever we can find them. However, for many OGG terms, we cannot find corresponding NCBI IDs, for example, ‘bacterial gene’ (or ‘gene of Bacteria’), ‘virus gene’ (or ‘gene of Viruses’), HIV protein-coding gene, protein-coding gene type, etc. Also, there may be political issues if we want to use NCBI IDs. For example, we may update our gene records later. If we use NCBI IDs, should we ask for permission every time we update OGG? What will happen if NCBI does not like our updates? Also, OGG genes cross-link to many other resources, some of which may not belong to NCBI. Anyway, I think we have designed a smart algorithm to manage the ontology IDs. Our program can automatically manage the IDs well. We will see how it works in the future.

- is there a way to get a local copy (VM image?) of Ontobee?
Answer: Sure. You are welcome to get a local copy of Ontobee. The Ontobee source is openly available (see the download information on our website). You can try it out and let us know if you need help.

Thanks!

Oliver

From: Erick Antezana [mailto:erick.a...@gmail.com]
Sent: Wednesday, April 30, 2014 3:18 PM
To: obo-d...@lists.sourceforge.net
Subject: Re: [Obo-discuss] OGG updates. RE: Change name to OGG from GGO.

Hi,

I had in mind http://www.ncbi.nlm.nih.gov/homologene

cheers,
Erick

On 30 April 2014 15:50, Asiyah Yu Lin <lini...@gmail.com> wrote:

Erick,
Homolog is about protein.
What relation you will propose for the genes of homologs?
Thanks,
Asiyah

He, Yongqun

Apr 30, 2014, 7:22:36 PM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Liu, Yue, Cathy Wu
This is a good question. It is certainly possible, and we thought about this option. We did not use it for several reasons.

First, for a large number of OGG terms, we cannot find corresponding NCBI IDs, for example, 'bacterial gene' (or 'gene of Bacteria'), 'virus gene' (or 'gene of Viruses'), HIV protein-coding gene, protein-coding gene type, etc. We would need to design our own algorithm to assign non-redundant IDs to these terms anyway.

Second, we would like to stay away from possible political issues as much as possible. We may update our gene records later. If we use NCBI IDs, should we ask for permission every time we update OGG? What will happen if NCBI does not like our updates? Also, OGG genes cross-link to many other resources, some of which may not belong to NCBI. What would those other resources think about our choice of using NCBI numbers? Unlike NCBI Taxonomy, gene-related information may be continuously updated. For example, new literature reports may associate some genes with important diseases. Some groups (which may be independent from NCBI) may want us to include these types of information in OGG.
Therefore, it may be better to reference the NCBI resources when using their data, and later reference other resources that provide new information to our ontology.

Anyway, I think we have designed a smart algorithm to manage the OGG ontology IDs. Our program can automatically generate the IDs based on our algorithm design. I am now preparing some text to describe our algorithm and program in detail. Any future discussion of this can be resumed if needed.

-----
As promised in my previous email, Bin and I have written two simple SPARQL queries that retrieve OGG information and posted them online:
-- Query the number of human tRNA genes:
http://www.ontobee.org/tutorial/tutorial_sparql.php#ex6
-- Count the number of mouse genes associated with GO:mitochondrial DNA repair (or know which genes they are):
http://www.ontobee.org/tutorial/tutorial_sparql.php#ex7

Each query is only a few lines.
You can execute the queries using the Ontobee SPARQL web interface by clicking Example 6 and Example 7:
http://www.ontobee.org/sparql/index.php

You can also modify the queries or generate new queries to get what you would like to retrieve. Please feel free to let us know if you have questions or would like to ask for help.
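
For readers who want to generate such queries programmatically, here is a minimal Python sketch that builds a counting query as a string. It is not a copy of Example 6 or Example 7. It assumes that individual genes are modeled as (direct or indirect) subclasses of a gene-type class such as the human rRNA gene mentioned earlier in this thread, and that the endpoint accepts SPARQL 1.1 property paths and aggregates; neither assumption is confirmed here. The function name `count_subclasses_query` is hypothetical.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# IRI of the human rRNA gene class quoted earlier in this thread.
HUMAN_RRNA_GENE = "http://purl.obolibrary.org/obo/OGG_2020009606"


def count_subclasses_query(parent_iri: str) -> str:
    """Build a SPARQL 1.1 query counting all subclasses of parent_iri."""
    # Reject characters that would break out of the <...> IRI delimiters.
    if any(ch in parent_iri for ch in '<>" \n'):
        raise ValueError("IRI contains characters not allowed inside <...>")
    return (
        f"PREFIX rdfs: <{RDFS}>\n"
        "SELECT (COUNT(DISTINCT ?gene) AS ?n)\n"
        "WHERE {\n"
        f"  ?gene rdfs:subClassOf+ <{parent_iri}> .\n"
        "}"
    )


if __name__ == "__main__":
    # Paste the printed text into the SPARQL web interface to try it.
    print(count_subclasses_query(HUMAN_RRNA_GENE))
```

The sketch only prints the query text, so it can be pasted into the web interface above without depending on any particular endpoint URL or client library.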

Thanks!

Oliver

Chris Mungall

May 1, 2014, 2:41:21 AM
to He, Yongqun, obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com


On 30 Apr 2014, at 15:45, He, Yongqun wrote:

> Hi Erick,
>
> Yes. It’s possible to use the NCBI HomoloGene resource to construct
> the information for gene cross-link between species. We may try it
> later. Thanks for the suggestion.
>
> For your three other questions:
>
> - how often do you update OGG?
> Answer: We will update OGG periodically, probably monthly? Later, we
> plan to label the date of every update in OGG as well.
>
> - would it be possible to have the NCBI id's instead of the OGG ones ?
> (PURL)
> Answer: It’s an alternative. We provided cross-linked NCBI IDs for
> the OGG terms when we can find NCBI IDs. However, for many OGG terms,
> we cannot find corresponding NCBI IDs, for example, ‘bacterial
> gene’ (or ‘gene of Bacteria’), ‘virus gene’ (or ‘gene of
> Viruses’), HIV protein-coding gene, protein-coding gene type, etc.

Sure, of course you won't find NCBI gene IDs for groupings like this. If
these kinds of query classes are really so useful to you, use a
different ID space for them.

(with the exception of "protein coding gene", which is in SO)

> Also, there may be political issues if we want to use NCBI IDs. For
> example, we may update our gene records later. If we use NCBI IDs,
> should we ask for permission every time we update OGG? What will
> happen if NCBI does not like our updates? Also, OGG genes cross-link
> to many other resources, some of which may not belong to NCBI
> resources. Anyway, I think we have designed a smart algorithm to
> manage the ontology IDs. Our program can automatically manage the IDs
> well. We will see how it works in the future.

I'm not sure what scenario or kinds of modifications you have in mind
but I think most people would rather take NCBI gene as a trusted source
for genes rather than ontologists.

Why would cross-linking to other resources be a problem? This is common.

Your programs may be able to manage IDs well, but why make everyone else
deal with yet another ID mapping issue? This is maybe not as bad as
some resources that mint their own duplicate IDs; at least OGG/GGO
attempts to retain some semblance of the original numeric portion of the
ID, but why complicate things?

>
> - is there a way to get a local copy (VM image?) of Ontobee?
> Answer: Sure. You are welcome to get a local copy of Ontobee. The
> Ontobee source is openly available (see the download information in
> our website). You can try it out and let us know if you need help.
>
> Thanks!
>
> Oliver
>
>
> From: Erick Antezana [mailto:erick.a...@gmail.com]
> Sent: Wednesday, April 30, 2014 3:18 PM
> To: obo-d...@lists.sourceforge.net
> Subject: Re: [Obo-discuss] OGG updates. RE: Change name to OGG from
> GGO.
>
> Hi,
>
> I had in mind http://www.ncbi.nlm.nih.gov/homologene
>
> cheers,
> Erick
>
> On 30 April 2014 15:50, Asiyah Yu Lin
> <lini...@gmail.com<mailto:lini...@gmail.com>> wrote:
>
> Erick,
> Homolog is about protein.
> What relation you will propose for the genes of homologs?
> Thanks,
> Asiyah

Darren Natale

May 1, 2014, 10:54:12 AM
to obo-d...@lists.sourceforge.net, Scheuermann, Richard, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Liu, Yue, Cathy Wu
Regarding use case, the Protein Ontology does have need to reference
genes. We have thus far done so by creating our own OBO set in much the
same way as OGG, though with far less annotation. Aside from that, the
only differences I see are that we:

1) Take as the primary identifier the one from the model organism
databases (HGNC, MGI, PomBase, etc.) where possible, and use NCBI Gene
when those fail (or have yet to be imported). Thus, it is similar to
what is done for NCBI Taxonomy.

2) Use SO as the source of the parent term for each defined gene.

Thus, in this stopgap method, the OBO stanza for CASP2 in human is:

[Term]
id: HGNC:1503
name: CASP2 (human)
def: "A protein coding gene CASP2 in human." [PRO:DNx]
comment: Category=external.
is_a: SO:0001217 ! protein_coding_gene
relationship: only_in_taxon NCBITaxon:9606 ! Homo sapiens

Not too elegant, admittedly, and not something we'd like to maintain
(hence the comment about the category being 'external').
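
To make the precedence rule and the stanza layout above concrete, here is a minimal Python sketch. It is only an illustration, not the Protein Ontology's actual tooling: the function names, the `NCBIGene` prefix, and the precedence list are assumptions, and only the stanza fields shown above are reproduced.

```python
# Hypothetical precedence order: model organism databases first,
# NCBI Gene as the fallback. The "NCBIGene" prefix is an assumption.
PRECEDENCE = ["HGNC", "MGI", "PomBase", "NCBIGene"]


def primary_id(xrefs):
    """Return the highest-precedence identifier from a dict such as {"HGNC": "1503"}."""
    for prefix in PRECEDENCE:
        if prefix in xrefs:
            return f"{prefix}:{xrefs[prefix]}"
    raise ValueError("no usable identifier in xrefs")


def obo_stanza(symbol, species, taxon_id, taxon_name, xrefs):
    """Render a [Term] stanza for a protein-coding gene, mirroring the CASP2 example."""
    lines = [
        "[Term]",
        f"id: {primary_id(xrefs)}",
        f"name: {symbol} ({species})",
        f'def: "A protein coding gene {symbol} in {species}." [PRO:DNx]',
        "comment: Category=external.",
        "is_a: SO:0001217 ! protein_coding_gene",
        f"relationship: only_in_taxon NCBITaxon:{taxon_id} ! {taxon_name}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # HGNC takes precedence over NCBI Gene when both are present.
    print(obo_stanza("CASP2", "human", 9606, "Homo sapiens",
                     {"NCBIGene": "835", "HGNC": "1503"}))
```

With both identifiers supplied, the sketch picks the HGNC one; if only an NCBI Gene identifier were available, it would fall back to that.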

-
Darren Natale

Asiyah Yu Lin

May 1, 2014, 12:08:05 PM
to obo-d...@lists.sourceforge.net, He, Yongqun, ogg-d...@googlegroups.com
The gene as an entity needs to be modeled in an ontology, so that links between genes and other entities can be formally established.
For example: gene with protein, homologous genes, gene with promoters, gene with variants, gene with transcripts, gene with mRNAs.

However, the approach can be different. 
I think NCBI's resource should be clarified in such an ontology or knowledge base, as well as many other resources, for example Ensembl.

I have discussed this with Oliver. One use case of OGG is that in many bioinformatics applications or papers, a gene is mentioned by its general name without specifying a particular strain.
But in NCBI the gene is always strain-specific, so when we need a general gene entity, we need a term for it.
Of course, this situation is different for human genes.

My two cents,

Best,
Asiyah



################################################
Jedi Order:
There is no emotion, there is peace.
There is no ignorance, there is knowledge.
There is no passion, there is serenity.
There is no chaos, there is harmony.
There is no death, there is Force.

Our Jedi Code: May peace and force be with you.



He, Yongqun

May 1, 2014, 6:28:03 PM
to obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com, ggo-d...@googlegroups.com, Cathy Wu, Scheuermann, Richard, Liu, Yue
Hi Darren,

Thanks for providing the very good use case. This further demonstrates where OGG and Protein Ontology (PR) can really collaborate.

As we know, a gene is a material entity of DNA sequence for synthesis of a functional protein or RNA molecule. Therefore, the relation between a gene and a protein is:

a protein 'encoded by' some gene

With the usage of both PR and OGG, we can say that:

Caspase-2 protein is 'encoded by' a casp2 gene

As another example, in our Brucellosis Ontology use case, we have many gene mutations. These gene mutations lead to Brucella mutants, which do not express the corresponding proteins (virulence factors) and therefore become attenuated. It would be better to have both genes and proteins recorded in the ontology using OGG and PR. Similar cases occur in the Vaccine Ontology.

Oliver


-----Original Message-----
From: Darren Natale [mailto:da...@georgetown.edu]
Sent: Thursday, May 01, 2014 10:54 AM
To: obo-d...@lists.sourceforge.net

He, Yongqun

May 2, 2014, 7:11:42 PM
to Asiyah Yu Lin, obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com

I may not have caught everything that Asiyah said. Here I want to provide some clarification on those issues based on my understanding:

Yes, the genes in the NCBI Gene resource represent strain-specific genes. This is indeed what I want in my Vaccine Ontology and Brucellosis Ontology use cases. The genes used in vaccines and in mutant generation are also strain-specific. Strain-specific genes are the genes that belong to a strain, which can be cultured in different labs and described in different studies. Therefore, the strain-specific genes do represent gene entities or classes that are the abstract form of the instances of genes in the tubes or stocks in real physical labs.

A higher level of gene entities can be the homolog genes that are shared by different strains or even different species. The NCBI HomoloGene resource records the information of many such genes among eukaryotic organisms. The homolog genes are useful for various studies as Erick also indicated in an earlier email. OGG may represent such homolog genes later.

So in my mind, OGG is not generated to represent the data in the NCBI Gene resource per se. It is generated to represent the gene entities as indicated above. We don’t want to have a situation where there is one “ontology” representing the NCBI Gene resource data, and there is another “ontology” representing the Ensembl Gene resource data. As long as they represent the same gene entities for the same strain (or homolog genes among strains/species), they should be represented using the same ontology terms (in OBO Foundry). According to the ontological realism principle, ontology terms represent the entities in reality. In comparison, the data in a resource represent what that resource has stored about those entities. The two are different. This is another reason why I don’t want to use the NCBI Gene namespace per se to represent the gene entities at the class level. Ontology terms do differ from resource data items at the philosophical level.

Oliver  

Chris Mungall

May 2, 2014, 10:23:35 PM
to He, Yongqun, Asiyah Yu Lin, obo-d...@lists.sourceforge.net, ogg-d...@googlegroups.com


On 2 May 2014, at 16:11, He, Yongqun wrote:

> I cannot catch all that Asiyah has said. Here I want to provide some
> clarification on those issues based on my understanding:
>
> Yes, the genes in the NCBI Gene resource represents strain-specific
> genes.

It depends on the species

> This is indeed what I want in my Vaccine Ontology and Brucellosis
> Ontology use cases. Those genes used in vaccines and mutant
> generations are also strain-specific. Strain-specific genes represent
> those genes belong to a strain, which can be cultured in different
> labs and described studies. Therefore, the strain-specific genes do
> represent gene entities or classes that are the abstract form of the
> instances of genes in the tubes or stocks in real physical labs.
>
> A higher level of gene entities can be the homolog genes that are
> shared by different strains or even different species. The NCBI
> HomoloGene resource records the information of many such genes among
> eukaryotic organisms. The homolog genes are useful for various studies
> as Erick also indicated in an earlier email. OGG may represent such
> homolog genes later.

I'm not sure we should think of these as being in OGG, or in PRO, or in
anything other than the source database.

Instead I would focus on documenting some simple standard rules anyone
can run to produce an RDF/OWL representation of their data resource of
choice. Then people can use standard web services to obtain whatever
subset of the genomics universe they need for a purpose, perhaps via
different implementations of the same rules.
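
As a rough illustration of what one such rule might look like, here is a minimal Python sketch that turns a gene record into N-Triples. Everything specific in it is an assumption made for the example: the record fields, the function name, the base IRI, and the choice to type every record as a subclass of the SO protein_coding_gene term.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# OBO PURL for SO:0001217 (protein_coding_gene).
SO_PROTEIN_CODING_GENE = "http://purl.obolibrary.org/obo/SO_0001217"


def record_to_ntriples(record, base_iri):
    """Map one gene record to N-Triples lines.

    record   -- dict with 'id' and 'symbol' keys (hypothetical input shape)
    base_iri -- IRI prefix of the source database (an assumption of this sketch)
    """
    subject = f"<{base_iri}{record['id']}>"
    # Escape backslashes and double quotes so the label is a valid literal.
    label = record["symbol"].replace("\\", "\\\\").replace('"', '\\"')
    return [
        f'{subject} <{RDFS}label> "{label}" .',
        f"{subject} <{RDFS}subClassOf> <{SO_PROTEIN_CODING_GENE}> .",
    ]


if __name__ == "__main__":
    for line in record_to_ntriples({"id": "835", "symbol": "CASP2"},
                                   "http://example.org/gene/"):
        print(line)
```

Because the rule is a plain function of the record and a base IRI, different implementations over different resources could in principle emit comparable triples.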

> So in my mind, OGG is not generated to represent the data in the NCBI
> Gene resource per se. It is generated to represent the gene entities
> as indicated above. We don’t want to have a situation where there is
> one “ontology” representing the NCBI Gene resource data, and there
> is another “ontology” representing the Ensembl Gene resource data.
> As long as they represent the same gene entities for the same strain
> (or homolog genes among strains/species), they should be represented
> using the same ontology terms (in OBO Foundry).

No one resource will suffice for all purposes. Darren's approach of
having levels of precedence seems reasonable. But there will always be a
use for ID mappings (translated to owl sameAs or equivalentTo axioms).
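
For illustration, such ID mappings could be emitted as axioms with a few lines of Python. This is a sketch under assumptions: the function name is made up, and the NCBI Gene IRI used in the example is only a plausible choice, not an agreed convention. The OWL property IRIs are the standard ones.

```python
OWL = "http://www.w3.org/2002/07/owl#"


def mapping_triples(pairs, as_classes=True):
    """Turn (iri_a, iri_b) pairs into N-Triples mapping axioms.

    as_classes=True  -> owl:equivalentClass (both IRIs denote classes)
    as_classes=False -> owl:sameAs (both IRIs denote individuals)
    """
    predicate = OWL + ("equivalentClass" if as_classes else "sameAs")
    return [f"<{a}> <{predicate}> <{b}> ." for a, b in pairs]


if __name__ == "__main__":
    # The OGG IRI is the human casp2 term from this thread; the NCBI Gene
    # IRI on the right-hand side is an illustrative assumption.
    pairs = [("http://purl.obolibrary.org/obo/OGG_3000000835",
              "http://www.ncbi.nlm.nih.gov/gene/835")]
    for line in mapping_triples(pairs):
        print(line)
```

Whether a mapping should be an equivalentClass or a sameAs axiom depends on whether the identifiers are treated as classes or as individuals, which is why the sketch leaves that as a switch.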

> According to the ontological realism principle, ontology terms
> represent the entities in reality. In comparison, a resource data
> represent the data of entities stored in the resource. They are
> different. This is another reason why I don’t want to use the NCBI
> Gene namespace per se to represent the gene entities at the class
> level. Ontology terms do have difference from the resource data items
> at the philosophical level.

I don't understand the philosophy of wanting to introduce an extra layer
of indirection. If you want to represent the biological entity that an
NCBI gene ID refers to, use the NCBI gene ID.
>> <lini...@gmail.com>
>> wrote:
>>
>> Erick,
>> Homolog is about protein.
>> What relation you will propose for the genes of homologs?
>> Thanks,
>> Asiyah
>> On Apr 30, 2014 4:53 AM, "Erick Antezana"
>> <erick.a...@gmail.com>
>> wrote:
>> Hi Oliver,
>>
>> Many thanks for considering A thaliana and for opening the door to
>> other plant species.
>>
>> - are you planing to have any cross-link between genes across
>> species?
>> (based on homology for instance) I ask that since my users would be
>> interested in navigating thru the ontology: from gene G1 (in species
>> S1) to gene G2 (in species S2) -- where G1 and G2 share some
>> properties for instance...
>>
>> - how often do you update OGG?
>>
>> - would it be possible to have the NCBI id's instead of the OGG ones
>> ?
>> (PURL)
>>
>> - is there a way to get a local copy (VM image?) of Ontobee?
>>
>> cheers,
>> Erick
>>
>> On 30 April 2014 02:33, He, Yongqun
>> <yong...@med.umich.edu>
>> ogg-d...@googlegroups.com;
>> ggo-d...@googlegroups.com;
>> Liu, Yue; Cathy Wu
>> Subject: Re: [Obo-discuss] OGG updates. RE: Change name to OGG from
>> GGO.
>>
>> Hi,
>>
>> a few questions:
>>
>> - any plans to include plant species (e.g. A thaliana)?
>>
>> - what's the source of the gene names? NCBI? HGBC?
>>
>> - what is the NEWENTRY ?:
>> http://www.ontobee.org/browser/rdf.php?o=OGG&iri=http://purl.obolibrary.org/obo/OGG_3002829857
>>
>> - http://www.ncbi.nlm.gov/gene is not a valid URL
>>
>> cheers,
>> Erick
>>
>> On 29 April 2014 19:44, He, Yongqun
>> <yong...@med.umich.edu>