GSoC 2020 Project Introduction


hemant yadav

May 22, 2020, 11:58:26 PM
to COBRA Toolbox

Dear COBRA community,


Over the next few months, I will be contributing to COBRApy through the project "Implementation of the SBML-JSON converter and SBML-JSON scheme" as part of Google Summer of Code 2020. The project aims to develop an updated JSON representation that fixes many known issues and shortcomings in COBRApy's JSON representation of models and improves its compliance with SBML.


My primary mentors are Dr. Andreas Dräger and Dr. Matthias König. Here is the link to the proposal, and the link to the blog summarizing my work.


A key part of this effort is getting feedback from the users and communities that depend on the JSON representation of COBRA models. Everybody concerned with the current JSON schema is welcome to join the discussions on the issues labeled JSON. At the end of this project, we will have version 2 of the JSON schema, with complete support for annotations, notes, and additional SBML packages such as FBC and Groups. All existing issues related to the schema will be resolved.


Importantly, we will still support the current version 1.0 of the COBRApy JSON schema and be backward-compatible wherever possible with the new version 2.0.


I am looking forward to working with you all!


With best regards

Hemant

hy27...@gmail.com

Indian Institute of Technology, Roorkee, India


Ronan M.T. Fleming

May 26, 2020, 8:04:08 AM
to COBRA Toolbox, Brett Olivier, Sarah Keating, Mike Hucka, Andreas Dräger, koen...@hu-berlin.de
Dear Hemant,

I would be happy to see more interoperability between COBRApy and the
COBRA Toolbox. At the moment, we fully support SBML input and output,
and will continue to do so. It is not yet clear to me from your
proposal what the added advantage of JSON is, compared to SBML. Maybe
it would be good to add some explanation of that to your proposal so
that the rationale is more clear. As I understand it, both JSON and
SBML are data-interchange formats, so there seems to be some
redundancy involved in developing something new that reproduces
functionality that is already there.

More generally, what I really think the COBRA community needs is
a standardised "compute" format for storing models. (To the computer
scientists amongst you: please suggest a more appropriate term than
"compute" format, if there is one.) Your Fig. 1 shows the COBRApy
model format in the middle, but what should really be in the middle
is a language-agnostic, standardised "compute" format. That way, a
user of any COBRA modelling software that supported this format
could build a COBRA pipeline, e.g., in a Jupyter notebook, that
leveraged COBRA code in multiple languages. I discussed this concept
with Brett Olivier, in cc, some time ago and he had similar ideas.
Such a standardised compute format could be, for example, an SQL
database.

In case it helps, here is a definition of the existing standard COBRA
model fields in the COBRA Toolbox:
https://github.com/opencobra/cobratoolbox/blob/master/docs/source/notes/COBRAModelFields.md

Regards,

Ronan


--
--
Mr. Ronan MT Fleming B.V.M.S. Dip. Math. Ph.D.
----------------------------------------------------------------------------
Assistant Professor,
Division of Systems Biomedicine and Pharmacology,
Leiden Academic Centre for Drug Research,
Faculty of Science,
Leiden University.
https://www.universiteitleiden.nl/en/staffmembers/ronan-fleming
&
H2020 Project Coordinator,
Systems Medicine of Mitochondrial Parkinson’s Disease,
http://sysmedpd.eu
&
Adjunct Lecturer,
School of Medicine,
National University of Ireland, Galway.
----------------------------------------------------------------------------
Peer-reviewed publications: https://goo.gl/FZPG23
Mobile: +353 873 413 072
Skype: ronan.fleming
----------------------------------------------------------------------------
(This message is confidential and may contain privileged information.
It is intended for the named recipient only. If you receive it in
error please notify me and permanently delete the original message and
any copies.)

Andres Mauricio Pinzon Velasco

May 26, 2020, 8:13:57 AM
to cobra-toolbox, Brett Olivier, Sarah Keating, Mike Hucka, Andreas Dräger, koen...@hu-berlin.de
Dear Dr. Fleming,

Is there any further documentation on the proposed idea of a "compute" format?
It sounds really interesting.

Regards,

Andrés M. Pinzón Ph.D.
Associate Professor
Instituto de Genética - Universidad Nacional de Colombia
+57 (1) 3165000 Ext. 11618 Office: 218





Ronan M.T. Fleming

May 26, 2020, 8:51:04 AM
to COBRA Toolbox, Brett Olivier, Sarah Keating, Mike Hucka, Andreas Dräger, koen...@hu-berlin.de, Christopher Henry
Hi Andrés,

At the moment, it is at the conversational concept stage!

I am not a computer scientist, so I am not sure what the best
technical implementation might be, but I had imagined a scalable
solution using an in-memory SQL database that would be a compromise
between the database representations in, e.g., VMH
(https://academic.oup.com/view-large/figure/201828864/gky992fig1.jpeg,
schema here: https://orbilu.uni.lu/handle/10993/35530), BiGG
(https://academic.oup.com/nar/article/48/D1/D402/5614178), etc.
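
To make the idea concrete, here is a minimal sketch of what an in-memory SQL "compute" format might look like, using Python's built-in sqlite3 module. The table names, column names, and the single example reaction are assumptions chosen for illustration; they are not taken from VMH, BiGG, or any agreed schema.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
conn = sqlite3.connect(":memory:")  # the database lives entirely in memory
conn.executescript(
    """
    CREATE TABLE metabolites (
        id TEXT PRIMARY KEY, name TEXT, compartment TEXT
    );
    CREATE TABLE reactions (
        id TEXT PRIMARY KEY, name TEXT, lower_bound REAL, upper_bound REAL
    );
    CREATE TABLE stoichiometry (
        reaction_id TEXT REFERENCES reactions(id),
        metabolite_id TEXT REFERENCES metabolites(id),
        coefficient REAL NOT NULL,
        PRIMARY KEY (reaction_id, metabolite_id)
    );
    """
)

conn.executemany(
    "INSERT INTO metabolites VALUES (?, ?, ?)",
    [
        ("glc__D_c", "D-Glucose", "c"),
        ("atp_c", "ATP", "c"),
        ("g6p_c", "D-Glucose 6-phosphate", "c"),
        ("adp_c", "ADP", "c"),
        ("h_c", "H+", "c"),
    ],
)
conn.execute(
    "INSERT INTO reactions VALUES (?, ?, ?, ?)",
    ("HEX1", "Hexokinase", 0.0, 1000.0),
)
# Negative coefficients are substrates, positive ones are products.
conn.executemany(
    "INSERT INTO stoichiometry VALUES (?, ?, ?)",
    [
        ("HEX1", "glc__D_c", -1.0),
        ("HEX1", "atp_c", -1.0),
        ("HEX1", "g6p_c", 1.0),
        ("HEX1", "adp_c", 1.0),
        ("HEX1", "h_c", 1.0),
    ],
)


def reaction_stoichiometry(connection, reaction_id):
    """Return {metabolite_id: coefficient} for one reaction."""
    cursor = connection.execute(
        "SELECT metabolite_id, coefficient FROM stoichiometry "
        "WHERE reaction_id = ? ORDER BY metabolite_id",
        (reaction_id,),
    )
    return dict(cursor.fetchall())


hex1 = reaction_stoichiometry(conn, "HEX1")
```

Because SQLite bindings exist for most languages, any tool that can open the same database could, in principle, read the same tables; whether this scales to genome-scale models is an open question that this sketch does not address.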

Regards,

Ronan

hemant yadav

May 26, 2020, 9:25:18 AM
to COBRA Toolbox
Hi Ronan,

SBML is, for sure, the most widely used format for the exchange of biological models. JSON, however, is the main data exchange format on the web.

Currently, there are many issues with the JSON representation of SBML models, so tools that depend on the JSON format for biological models run into them. My project aims to add complete support for SBML models in the JSON format. To implement this, we (those who rely on JSON formats) need to agree on a single format for the SBML components that JSON does not yet support, so that we can develop a single JSON schema that all tools can use. Although the project will be part of COBRApy, the JSON schema will provide a general representation of the model that everybody can use.

So this was just a gentle invitation to all communities that depend on the JSON format to take part in the discussion on the COBRApy issue list for the JSON format. This will help us complete the project within the specified time and with a general JSON schema that everybody can use.

Regards,
Hemant

Andres Mauricio Pinzon Velasco

May 26, 2020, 9:28:37 AM
to cobra-toolbox, Brett Olivier, Sarah Keating, Mike Hucka, Andreas Dräger, koen...@hu-berlin.de, Christopher Henry
Hi Ronan,
In that sense, I think the JSON schema could be an appropriate way to represent the data "in memory",
and that could be an innovative use of JSON in the field.
If you think we could all start thinking about how to approach this, initially as a "mailing list project", please count me in.

Best,

Matthias König

May 27, 2020, 3:30:41 AM
to COBRA Toolbox
Hi all,

I am one of the mentors of the JSON-SBML project.

The main idea is to have a simple, web-enabled exchange format that can be used and interacted with directly in JavaScript and on the web.
This will not replace SBML, which will remain the defining standard for models, but it will enable additional use cases by providing a better conversion between SBML <-> JSON. There is a current JSON schema, used by tools like Escher and the BiGG database, but it has many limitations. The plan is to improve the encoding of information in the JSON format so that, for instance,
- annotations are handled correctly
- meta-data can be exchanged
- group information (formerly subsystems) can be encoded and exchanged
- additional constraints that are encoded in models (besides the stoichiometric constraints) but not currently exchanged can be exchanged.
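
As a rough illustration of the points above, the following sketch shows how annotations and group information might sit in a JSON document. Every field name here ("version", "annotation", "groups", "members", and so on) is an assumption for illustration only; the actual version 2 schema is still to be agreed in the issue discussions.

```python
import json

# Hypothetical "version 2"-style document; field names are assumptions,
# not the agreed schema.
model = {
    "id": "toy_model",
    "version": "2",
    "metabolites": [
        {
            "id": "glc__D_c",
            "name": "D-Glucose",
            "compartment": "c",
            # Annotations as namespace -> list of identifiers.
            "annotation": {"bigg.metabolite": ["glc__D"]},
        }
    ],
    "reactions": [
        {
            "id": "HEX1",
            "name": "Hexokinase",
            "lower_bound": 0.0,
            "upper_bound": 1000.0,
            "metabolites": {"glc__D_c": -1.0},
            "annotation": {"bigg.reaction": ["HEX1"]},
        }
    ],
    # Groups replace the single "subsystem" string on each reaction.
    "groups": [
        {"id": "glycolysis", "name": "Glycolysis", "members": ["HEX1"]}
    ],
}


def group_members(document, group_id):
    """Return the reaction entries listed as members of one group."""
    by_id = {rxn["id"]: rxn for rxn in document["reactions"]}
    for group in document["groups"]:
        if group["id"] == group_id:
            return [by_id[m] for m in group["members"] if m in by_id]
    return []


# Serialising and parsing again should leave the document unchanged.
round_tripped = json.loads(json.dumps(model))
glycolysis = group_members(round_tripped, "glycolysis")
```

Keeping groups as a separate top-level list, rather than a string on each reaction, would allow a reaction to belong to several groups, which is one possible reading of the move from subsystems to groups.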

The JSON is basically an in-memory representation: in JavaScript it can easily be converted into a first-class object, and the same holds in other languages. It is also very easy to manipulate, because it is basically nested hash maps and lists. Given a JSON schema, it is easy to generate the corresponding classes in any programming language (schema -> class converters). So yes, this is one of the direct use cases enabled by the JSON format. Often one only wants to do simple things with a model, and full SBML parsing is overhead if one just wants to read attributes from a hash map.
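
A small sketch of that use case, using only the Python standard library: the JSON text is parsed into nested dicts and lists, attributes are read directly, and the same data is mapped onto a class. The field names are loosely modelled on the current COBRApy JSON layout, and the `Reaction` class is a hand-written, hypothetical stand-in for what a schema -> class converter might generate.

```python
import json
from dataclasses import dataclass, field

# Illustrative model fragment; field names are assumptions for this sketch.
raw = """
{
  "id": "toy_model",
  "reactions": [
    {
      "id": "HEX1",
      "name": "Hexokinase",
      "lower_bound": 0.0,
      "upper_bound": 1000.0,
      "metabolites": {"glc__D_c": -1.0, "g6p_c": 1.0}
    }
  ]
}
"""

# Parsing gives plain nested dicts and lists, no SBML parser involved.
doc = json.loads(raw)
first = doc["reactions"][0]
bounds = (first["lower_bound"], first["upper_bound"])


@dataclass
class Reaction:
    """Hypothetical class mirroring one reaction entry."""

    id: str
    name: str = ""
    lower_bound: float = 0.0
    upper_bound: float = 1000.0
    metabolites: dict = field(default_factory=dict)


# Each JSON object maps directly onto the class via keyword arguments.
reactions = [Reaction(**entry) for entry in doc["reactions"]]
```

Unpacking with `**entry` fails on unknown keys, so a real converter would need to decide how to treat fields that the class does not declare.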

It is basically an additional format for special use cases, similar to MAT files for models (which also do not replace SBML).

If you have any questions please let me know. The proposal describes what is planned for the project.

Best Matthias