Secondary ID support for BridgeDb

Manas Awasthi

Jun 6, 2019, 3:58:41 AM
to bridgedb-discuss
Hello Everyone,

As part of my Google Summer of Code project, I am adding secondary ID support. I am having trouble accessing secondary IDs. Could any developer who has worked with secondary IDs before give me a lead on how to access them? Any help is appreciated. Initially, we thought that the MySQL database would contain the secondary IDs, but after checking the schema we found that it does not.


Example of a Secondary Id:


Here the primary ID is the ChEBI ID CHEBI:17992, whereas the secondary IDs are CHEBI:45795, CHEBI:9314, CHEBI:15128, and CHEBI:26812.

Regards,
Manas Awasthi
Sophomore Year
University of Delhi.

Egon Willighagen

Jun 6, 2019, 4:56:44 AM
to bridgedb...@googlegroups.com
Hi all,

On Thu, Jun 6, 2019 at 9:58 AM Manas Awasthi <marv...@gmail.com> wrote:
> I am having trouble accessing secondary IDs. Could any developer who has worked with secondary IDs before give me a lead on how to access them? Any help is appreciated.

The schema Nuno recovered (PR 111) shows the schema of a "datanode":


The term "datanode" likely comes from the GPML format used by WikiPathways, but read it as a cross-reference (an Xref in the Java library).

So, every primary and every secondary identifier *is* a datanode. They are just datanodes with a special property: are they primary or not (and thus secondary)?

So, I think the datanode SQL schema just needs a boolean field "isPrimary".

Does that make sense? If not, what am I missing in your concerns?

Egon

--
Hi, do you like citation networks? Already 51% of all citations are available for innovative new uses. Join me in asking the American Chemical Society to join the Initiative for Open Citations too. SpringerNature, the RSC and many others already did.

-----
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: https://www.zotero.org/egonw
ORCID: 0000-0001-7542-0286
ImpactStory: https://impactstory.org/u/egonwillighagen

Nuno M

Jun 6, 2019, 6:06:05 AM
to bridgedb...@googlegroups.com
Hi Manas,

> Could any developer who has worked with secondary IDs before give me a lead on how to access them?
> Any help is appreciated. Initially, we thought that the MySQL database would contain the secondary IDs, but after
> checking the schema we found that it does not.

Can you give a bit more context on this? 
What did you expect to find, in terms of structure, and what do you want to achieve, as a goal?


--
You received this message because you are subscribed to the Google Groups "bridgedb-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bridgedb-discu...@googlegroups.com.
To post to this group, send email to bridgedb...@googlegroups.com.
Visit this group at https://groups.google.com/group/bridgedb-discuss.
To view this discussion on the web visit https://groups.google.com/d/msgid/bridgedb-discuss/ffb5930d-984f-4e5b-b7c4-7996e4b6c783%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Manas Awasthi

Jun 7, 2019, 6:16:45 AM
to bridgedb-discuss


On Thursday, June 6, 2019 at 2:26:44 PM UTC+5:30, Egon Willighagen wrote:
> Hi all,
>
> On Thu, Jun 6, 2019 at 9:58 AM Manas Awasthi <marv...@gmail.com> wrote:
>> I am having trouble accessing secondary IDs. Could any developer who has worked with secondary IDs before give me a lead on how to access them? Any help is appreciated.
>
> The schema Nuno recovered (PR 111) shows the schema of a "datanode":
>
> The term "datanode" likely comes from the GPML format used by WikiPathways, but read it as a cross-reference (an Xref in the Java library).
>
> So, every primary and every secondary identifier *is* a datanode. They are just datanodes with a special property: are they primary or not (and thus secondary)?

This makes sense and would fulfill our purpose, but how exactly will the database be populated with the values of isPrimary?

> So, I think the datanode SQL schema just needs a boolean field "isPrimary".

Once we can populate the database with correct values for the isPrimary attribute, we can write the function in the code that returns it.

Manas Awasthi

Jun 7, 2019, 6:22:43 AM
to bridgedb-discuss
Hello Sir,

We are trying to create an attribute that distinguishes secondary IDs from primary IDs, since we may come across legacy data annotated with these old IDs while mapping. We also want to create a function, getPrimaryID, that returns the primary ID associated with a secondary ID.
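
For illustration, here is a minimal Java sketch of what such a getPrimaryID lookup could look like, assuming an in-memory map filled from the ChEBI example in this thread. The class name PrimaryIdIndex and its register method are hypothetical and not part of the BridgeDb API; a real implementation would presumably query the mapping database instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, NOT part of the BridgeDb API: maps every known
// identifier (primary or secondary) to its primary identifier.
final class PrimaryIdIndex {
    private final Map<String, String> primaryById = new HashMap<>();

    // Register a primary identifier together with its secondary identifiers.
    void register(String primaryId, String... secondaryIds) {
        primaryById.put(primaryId, primaryId); // a primary ID maps to itself
        for (String secondaryId : secondaryIds) {
            primaryById.put(secondaryId, primaryId);
        }
    }

    // Return the primary identifier for any known identifier,
    // or an empty Optional if the identifier was never registered.
    Optional<String> getPrimaryID(String id) {
        return Optional.ofNullable(primaryById.get(id));
    }
}
```

With the example data, registering CHEBI:17992 with its four secondary IDs means a lookup of any of the five identifiers yields CHEBI:17992, while an unregistered identifier yields an empty result.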


On Thursday, June 6, 2019 at 3:36:05 PM UTC+5:30, Nuno M wrote:
> Hi Manas,
>
>> Could any developer who has worked with secondary IDs before give me a lead on how to access them?
>> Any help is appreciated. Initially, we thought that the MySQL database would contain the secondary IDs, but after
>> checking the schema we found that it does not.
>
> Can you give a bit more context on this?
> What did you expect to find, in terms of structure, and what do you want to achieve, as a goal?



Egon Willighagen

Jun 7, 2019, 6:24:50 AM
to bridgedb...@googlegroups.com
On Fri, Jun 7, 2019 at 12:16 PM Manas Awasthi <marv...@gmail.com> wrote:
> On Thursday, June 6, 2019 at 2:26:44 PM UTC+5:30, Egon Willighagen wrote:
>> So, I think the datanode SQL schema just needs a boolean field "isPrimary".
>
> Once we can populate the database with correct values in isPrimary attribute we can create the function in the code to return the same

Can you please elaborate? I am not sure I understand what you mean by "once we can populate", because I think you already can: I suggest making a JUnit test with some mock data and going ahead. Just pick the data from the example you gave.


Check this link (never mind that the method probably needs renaming too, since its name is genomics-focused): https://github.com/bridgedb/create-bridgedb-metabolites/blob/master/createDerby.groovy#L244

After your updates are done, that line will read: database.addGene(shortRef, isPrimary);

The second, new parameter is the boolean that needs to propagate to the database, Derby or SQL.
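
As a rough sketch of the mock-data idea, the Java class below stands in for the "datanode" table and stores the flag with each row. Everything here is an assumption for illustration: MockDataNodeTable, its Row type, and this three-argument addGene are hypothetical and do not reflect the real BridgeDb database classes, and the "ChEBI" string is just a label for the data source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the "datanode" table. The real
// BridgeDb database classes and their addGene signature may differ.
final class MockDataNodeTable {
    // One row: identifier, data source label, and the proposed isPrimary flag.
    static final class Row {
        final String id;
        final String source;
        final boolean isPrimary;

        Row(String id, String source, boolean isPrimary) {
            this.id = id;
            this.source = source;
            this.isPrimary = isPrimary;
        }
    }

    private final List<Row> rows = new ArrayList<>();

    // Mirrors the proposed addGene(shortRef, isPrimary): the caller decides
    // the flag, and it is stored alongside the identifier.
    void addGene(String id, String source, boolean isPrimary) {
        rows.add(new Row(id, source, isPrimary));
    }

    // Collect the identifiers whose flag matches the requested value.
    List<String> idsWithPrimaryFlag(boolean wantPrimary) {
        List<String> result = new ArrayList<>();
        for (Row row : rows) {
            if (row.isPrimary == wantPrimary) {
                result.add(row.id);
            }
        }
        return result;
    }
}
```

A test could add CHEBI:17992 with the flag set to true and the four secondary identifiers with the flag set to false, then check that exactly one primary and four secondary rows come back.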

Egon

Manas Awasthi

Jun 7, 2019, 6:38:43 AM
to bridgedb-discuss
Hello Sir,

What I mean by populating the database is this: when we create the database and add values to it, how will the values of the isPrimary attribute get stored? To elaborate, we have 4 secondary IDs for sucrose in the ChEBI database, so when all 4 secondary IDs become datanodes, how will the isPrimary attribute take the value false for them?

Regards,
Manas Awasthi 

Egon Willighagen

Jun 7, 2019, 7:04:49 AM
to bridgedb...@googlegroups.com
Hi all,

On Fri, Jun 7, 2019 at 12:38 PM Manas Awasthi <marv...@gmail.com> wrote:
> What I mean by populating the database with the values is when we are creating the database and adding the values to it how will the values in the attribute isPrimary get stored?

If it is a primary identifier (which is determined by ChEBI, in the source code line I gave), then the code would use:

database.addGene(shortRef, true);

And otherwise (if not primary):

database.addGene(shortRef, false);

Or, alternatively, and in retrospect better:

Xref shortRef = new Xref(shortid, BioDataSource.CHEBI, false); // for secondary identifiers
Xref shortRef = new Xref(shortid, BioDataSource.CHEBI, true); // for primary identifiers
addError = database.addGene(shortRef);
 
> To elaborate it further, we have 4 secondary ids for sucrose in CheBI database so when all the 4 secondary ids will datanodes, how will the attribute isPrimary take false as a value in it?

So, each identifier (an Xref in the source code) will now have three properties: the data source, the identifier string, and the isPrimary boolean.
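
To make the three properties concrete, here is a minimal Java sketch of such a value class. It is only a model of the idea: SketchXref is a hypothetical name, not the real org.bridgedb.Xref, and the data source is a plain string rather than a DataSource object. Leaving isPrimary out of equals and hashCode is an assumption made here so that a lookup by identifier and source finds the entry whichever flag it carries; the real library may choose differently.

```java
import java.util.Objects;

// Hypothetical value class modelled on the three properties named above.
// It is NOT the real org.bridgedb.Xref.
final class SketchXref {
    private final String id;          // the identifier string, e.g. "CHEBI:17992"
    private final String dataSource;  // label of the data source, e.g. "ChEBI"
    private final boolean primary;    // true for primary, false for secondary

    SketchXref(String id, String dataSource, boolean primary) {
        this.id = Objects.requireNonNull(id);
        this.dataSource = Objects.requireNonNull(dataSource);
        this.primary = primary;
    }

    String getId() { return id; }
    String getDataSource() { return dataSource; }
    boolean isPrimary() { return primary; }

    // Assumption: identity is the (id, dataSource) pair; the flag is
    // descriptive metadata and does not take part in equality.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof SketchXref)) return false;
        SketchXref that = (SketchXref) other;
        return id.equals(that.id) && dataSource.equals(that.dataSource);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, dataSource);
    }
}
```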

Did I understand your question correctly now?

Egon

Nuno M

Jun 7, 2019, 8:36:25 AM
to bridgedb...@googlegroups.com
Hi Egon, Manas,

I think what Manas is asking is where to store, persistently, the property "isPrimary" / "isSecondary" at the database level.
Is that correct, Manas?


--
You received this message because you are subscribed to the Google Groups "bridgedb-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bridgedb-discu...@googlegroups.com.

To post to this group, send email to bridgedb...@googlegroups.com.
Visit this group at https://groups.google.com/group/bridgedb-discuss.
Reply all
Reply to author
Forward
0 new messages