Need your comments and support about BioModels OMEX new metadata.rdf

Tung V. N. Nguyen

Oct 11, 2024, 3:37:01 PM
to combine-annot
Dear all,

We recently developed a module that regenerates an OMEX archive's metadata.rdf, adding as much as possible of the metadata returned by the REST endpoint (for example, https://www.ebi.ac.uk/biomodels/BIOMD0000000500?format=json).
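
For readers who want to look at the same JSON, here is a minimal Python sketch that builds the URL following the pattern quoted above and downloads the document. The helper names `model_json_url` and `fetch_model_json` are made up for illustration, and nothing is assumed about the fields inside the returned JSON.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Host and path taken from the example URL in this message.
BASE_URL = "https://www.ebi.ac.uk/biomodels"


def model_json_url(model_id: str) -> str:
    """Build the JSON URL for one model, e.g. BIOMD0000000500."""
    return f"{BASE_URL}/{quote(model_id)}?format=json"


def fetch_model_json(model_id: str, timeout: float = 30.0) -> dict:
    """Download and decode the JSON document (needs network access)."""
    with urlopen(model_json_url(model_id), timeout=timeout) as response:
        return json.load(response)


# Only the URL is printed here, so the sketch runs without a network.
print(model_json_url("BIOMD0000000500"))
```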

The new version generated by our service includes more information about each model, such as the metadata you see when browsing models in the BioModels repository.
For example, for BIOMD0000000500 on wwwdev (https://wwwdev.ebi.ac.uk/biomodels/model/generate-omex-metadata-rdf/BIOMD0000000500), the metadata.rdf looks very different from the older version.

I have tried my best to respect RDF grammar and the OMEX specs, but I am not sure whether the new content will be useful to you. Is it compatible with existing tools and software? Could you give us your comments and thoughts?

It would be helpful if someone with experience of Apache Jena (the library for building Semantic Web and Linked Data applications) could share with me how to reorder properties when adding them to the RDF graph, how to reorder namespaces by user-defined rules, and how to add some vital attributes to a Property.
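
To make the property-ordering question concrete: RDF itself attaches no meaning to the order of statements, so one possible workaround is to reorder the serialized RDF/XML after it has been written. The sketch below does that in Python with the standard library. It is not a Jena API and says nothing about how Jena orders its output; `reorder_properties`, the sample document and the chosen order are all illustrative assumptions, and namespace ordering is not handled.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Keep readable prefixes when the tree is written back out.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def reorder_properties(rdf_xml: str, preferred: list[str]) -> str:
    """Sort the property elements of each rdf:Description by `preferred`.

    `preferred` holds tags in ElementTree's "{namespace}local" form. Tags that
    are not listed keep their relative order after the listed ones, because
    sorted() is stable.
    """
    rank = {tag: i for i, tag in enumerate(preferred)}
    root = ET.fromstring(rdf_xml)
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        description[:] = sorted(
            description, key=lambda el: rank.get(el.tag, len(rank))
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical input: two properties in the "wrong" order.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dcterms="{DCTERMS_NS}">
  <rdf:Description rdf:about="http://example.org/model">
    <dcterms:created>2024-10-11</dcterms:created>
    <dcterms:title>Example model</dcterms:title>
  </rdf:Description>
</rdf:RDF>"""

ORDER = [f"{{{DCTERMS_NS}}}title", f"{{{DCTERMS_NS}}}created"]
print(reorder_properties(SAMPLE, ORDER))
```

Post-processing keeps the ordering rules outside the graph-building code, at the cost of a second parse of the file.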

We highly appreciate your comments and suggestions.

Cheers,
Tung




Lucian Smith

Oct 15, 2024, 7:50:50 PM
to Tung V. N. Nguyen, combine-annot
The biosimulations web site does indeed rely on some information being present in the RDF file, which I believe your auto-generated file does not have. So at the very least, your autogeneration scheme should add to any existing metadata file, not replace it. The metadata file for 500, for example, is downloadable from BioModels, or viewable at https://github.com/sys-bio/temp-biomodels/blob/main/final/BIOMD0000000500/metadata.rdf. The one that's there is *very* bare-bones; ideally, *all* of the annotation information from the SBML file would make its way into the metadata file.
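
As a rough illustration of "add to, not replace", here is a small Python sketch that treats each metadata file as a list of (subject, predicate, object) triples and appends only the generated triples that are not already present. The function name `merge_metadata` and all triple values are placeholders, not the contents of any real file.

```python
Triple = tuple[str, str, str]


def merge_metadata(existing: list[Triple], generated: list[Triple]) -> list[Triple]:
    """Return every existing triple, followed by generated triples that are new."""
    seen = set(existing)
    merged = list(existing)
    for triple in generated:
        if triple not in seen:
            seen.add(triple)
            merged.append(triple)
    return merged


# Placeholder triples, for illustration only.
existing = [
    ("BIOMD0000000500", "dcterms:title", "Existing title"),
]
generated = [
    ("BIOMD0000000500", "dcterms:title", "Existing title"),   # already present
    ("BIOMD0000000500", "dcterms:hasVersion", "2"),           # new information
]

for triple in merge_metadata(existing, generated):
    print(triple)
```

Nothing from the existing file is dropped or rewritten, so a tool that depends on a particular triple keeps finding it.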

Also, I have the script that generated all the new metadata.rdf files we just updated the entries with, so incorporating that also seems like a good idea?

Looking at the actual metadata file that you created, you seem to have created new terms for a lot of things that have existing terms.  You should definitely use existing terms instead of new ones if at all possible.

Critique aside, however, I do think it would be nice to include more information from the REST API into the metadata file; that seems like a reasonable place to put it, and I think it would be useful to have it there.

-Lucian

Rahuman Sheriff

Oct 16, 2024, 3:32:11 AM
to Lucian Smith, Tung V. N. Nguyen, combine-annot
Hi Lucian,
Thanks for the feedback and for sharing the example.
We will use the existing terms instead of new ones. Could you point us to where we can find the list of those?
Best regards
Sheriff


Goksel Misirli

Oct 16, 2024, 4:53:15 AM
to Rahuman Sheriff, Lucian Smith, Tung V. N. Nguyen, combine-annot
Thanks for sharing the examples. The service looks very helpful.

As Lucian mentioned, if the JSON file is about metadata, then all of it can be represented in RDF.

Some suggestions:
https://wwwdev.ebi.ac.uk/biomodels/BIOMD0000000500#Files is not referenced in https://wwwdev.ebi.ac.uk/biomodels/BIOMD0000000500

You can consider adding the following. Rather than hasFiles there might be a better term!
BIOMD0000000500 hasFiles BIOMD0000000500#Files

Alternatively, you can remove BIOMD0000000500#Files completely. Instead, you can use the following pattern for each file:
BIOMD0000000500 hasFile manifest.xml

Similarly, you can link the history as well. Again, double-check whether there is an existing term to use instead of hasHistory.

BIOMD0000000500 hasHistory BIOMD0000000500#History

Also, consider whether you need a separate History resource.

Other suggestions:

<bmTerms:file>manifest.xml</bmTerms:file>
-->
<bmTerms:file rdf:resource="manifest.xml"/>

<bmTerms:description>
-->
<rdfs:comment>

<bmTerms:name>
-->
<rdfs:label>

bmTerms:revision --> dcterms:hasVersion
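
The term replacements above can be written down as a lookup table and applied mechanically. The Python sketch below does this over (subject, predicate, object) triples using prefixed names only; the function name `use_existing_terms` and the sample values are hypothetical, and the RDFS property is spelled `rdfs:comment` (singular), which is its name in RDF Schema.

```python
# Mapping taken from the suggestions above (prefixed names, not full URIs).
TERM_MAP = {
    "bmTerms:description": "rdfs:comment",
    "bmTerms:name": "rdfs:label",
    "bmTerms:revision": "dcterms:hasVersion",
}


def use_existing_terms(triples):
    """Swap mapped bmTerms predicates for existing terms; leave the rest alone."""
    return [(s, TERM_MAP.get(p, p), o) for s, p, o in triples]


# Placeholder triples, for illustration only.
triples = [
    ("BIOMD0000000500", "bmTerms:name", "Example name"),
    ("BIOMD0000000500", "bmTerms:revision", "2"),
    ("BIOMD0000000500", "bmTerms:file", "manifest.xml"),  # no mapping suggested
]

for triple in use_existing_terms(triples):
    print(triple)
```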

I hope this helps.

Goksel

Tung V. N. Nguyen

Oct 17, 2024, 12:25:15 PM
to combine-annot
Dear Lucian and Goksel,

Thank you so much for your valuable comments and suggestions.

I have tried my best to reuse existing terms as much as possible.
Inevitably, we have had to propose some new definitions.
We have had to postpone this work for ages because of the lack of resources and controlled vocabularies.

Cheers,
Tung

David Nickerson

Oct 17, 2024, 5:58:37 PM
to combine-annot
Hi Tung,

Just to throw a few more comments into the mix - and to agree with the others that it's great to see movement in this direction.

I'm curious to know why you'd include a list of files in the RDF? The OMEX manifest file is generally the way such a list would be defined and includes more information about file types etc. So if you were looking to include these metadata files into a COMBINE Archive then I can't see a need to also list the files in there. Or perhaps there is another integration worth exploring to import manifest files into an RDF graph for enhanced querying?
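
As a rough idea of what importing a manifest into a graph could look like, here is a Python sketch that reads an OMEX manifest and turns each `content` entry into triples. The sample manifest is made up, and the choice of `dcterms:hasPart` and `dcterms:format` as predicates is an assumption for illustration, not something the OMEX specifications prescribe.

```python
import xml.etree.ElementTree as ET

OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"

# Made-up manifest for illustration.
MANIFEST = f"""<omexManifest xmlns="{OMEX_NS}">
  <content location="." format="http://identifiers.org/combine.specifications/omex"/>
  <content location="./model.xml" format="http://identifiers.org/combine.specifications/sbml"/>
  <content location="./metadata.rdf" format="http://identifiers.org/combine.specifications/omex-metadata"/>
</omexManifest>"""


def manifest_triples(manifest_xml: str, archive: str):
    """Turn each manifest entry into (subject, predicate, object) triples."""
    root = ET.fromstring(manifest_xml)
    triples = []
    for content in root.iter(f"{{{OMEX_NS}}}content"):
        location = content.get("location")
        if location == ".":
            continue  # the entry describing the archive itself
        # Predicates below are an illustrative choice (see the note above).
        triples.append((archive, "dcterms:hasPart", location))
        triples.append((location, "dcterms:format", content.get("format")))
    return triples


for triple in manifest_triples(MANIFEST, "BIOMD0000000500"):
    print(triple)
```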

The OMEX Metadata Specification (https://identifiers.org/combine.specifications:omex-metadata.1.2) has a bunch of recommended vocabularies to use. For example, Dublin Core for descriptions, Biomodels Qualifiers for linking to external references, foaf for linking to people, etc.

If there was a mapping between the bmTerms vocabulary and existing ones, then that could be used...but generally we find it easier just to adopt the existing terms unless you actually intend the annotation to mean something different.

Also, for the model history there has been quite a bit of work presented/discussed at recent HARMONY and COMBINE meetings about using PROV. See https://dl.acm.org/doi/abs/10.5555/3586210.3586387 (uses BioModels as the example repository) or https://doi.org/10.1007/978-3-319-98379-0_17 for a couple of examples in this direction. I'm not sure if anything has been formalised beyond the OMEX Metadata Spec which has very simple provenance annotation recommendations (for example, bqmodel:isDerivedFrom to link to previous versions of a model).
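
To show how simple that recommendation is, here is a Python sketch that links each revision of a model to the one before it with bqmodel:isDerivedFrom. The `<id>.<revision>` naming of revisions and the function name `revision_links` are assumptions made for the example; only the qualifier itself comes from the BioModels qualifiers vocabulary.

```python
BQMODEL_NS = "http://biomodels.net/model-qualifiers/"


def revision_links(model_id: str, latest_revision: int):
    """Link revision n to revision n - 1 for every n from 2 upwards."""
    predicate = BQMODEL_NS + "isDerivedFrom"
    return [
        (f"{model_id}.{n}", predicate, f"{model_id}.{n - 1}")
        for n in range(2, latest_revision + 1)
    ]


# Hypothetical model with three revisions.
for subject, predicate, obj in revision_links("BIOMD0000000500", 3):
    print(subject, predicate, obj)
```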

Cheers,
David.

Tung V. N. Nguyen

Oct 21, 2024, 8:12:05 AM
to David Nickerson, combine-annot
Hi David,

Thank you for all your constructive comments! I have some inline comments below.

Cheers,
Tung
On Thursday 17 October 2024 at 22:58:37 UTC+1 david.n...@gmail.com wrote:
> Hi Tung,
>
> Just to throw a few more comments into the mix - and to agree with the others that it's great to see movement in this direction.
>
> I'm curious to know why you'd include a list of files in the RDF? The OMEX manifest file is generally the way such a list would be defined and includes more information about file types etc. So if you were looking to include these metadata files into a COMBINE Archive then I can't see a need to also list the files in there. Or perhaps there is another integration worth exploring to import manifest files into an RDF graph for enhanced querying?

The list of files has other information, such as checksums, file descriptions and sizes, that is not included in manifest.xml. I haven't seen any specific use cases for those attributes yet. However, I believe that adding more data alongside the existing fields in the RDF won't affect any running applications or software.
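
For illustration, the kind of per-file attributes mentioned here can be computed in a few lines of Python. This is only a sketch: the SHA-256 algorithm, the field names and the function name `file_attributes` are assumptions, not a description of what BioModels stores.

```python
import hashlib


def file_attributes(name: str, data: bytes) -> dict:
    """Collect the size and a checksum for one file's content."""
    return {
        "name": name,
        "size": len(data),  # size in bytes
        "sha256": hashlib.sha256(data).hexdigest(),  # algorithm is an assumption
    }


# Placeholder content standing in for a real file.
print(file_attributes("manifest.xml", b"<omexManifest/>"))
```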

> The OMEX Metadata Specification (https://identifiers.org/combine.specifications:omex-metadata.1.2) has a bunch of recommended vocabularies to use. For example, Dublin Core for descriptions, Biomodels Qualifiers for linking to external references, foaf for linking to people, etc.

Yes, I am reusing some of the recommended vocabularies. I have unavoidably defined new terms where I could not find an existing one anywhere, or where the existing terms do not express our context closely enough.
 
> If there was a mapping between the bmTerms vocabulary and existing ones, then that could be used...but generally we find it easier just to adopt the existing terms unless you actually intend the annotation to mean something different.

Certainly, no one wants to reinvent the wheel unless it is hard to find suitable controlled vocabularies. For example, the curation status, modelling approach, model format, and model tags (a.k.a. keywords) are things for which I cannot think of an existing term off the top of my head.
 
> Also, for the model history there has been quite a bit of work presented/discussed at recent HARMONY and COMBINE meetings about using PROV. See https://dl.acm.org/doi/abs/10.5555/3586210.3586387 (uses BioModels as the example repository) or https://doi.org/10.1007/978-3-319-98379-0_17 for a couple of examples in this direction. I'm not sure if anything has been formalised beyond the OMEX Metadata Spec which has very simple provenance annotation recommendations (for example, bqmodel:isDerivedFrom to link to previous versions of a model).

Thanks for these suggested publications! After scanning the two publications, my understanding is that they deal with provenance: the origins of models, how one model reuses others, and how models evolve.
Meanwhile, the History tab in BioModels is slightly different: it records what has been changed in the model in question. In provenance terms we could, reluctantly, say that, for example, MODEL2410200001.2 (revision 2) is a better version of MODEL2410200001.1 (revision 1). It seems that there is no concrete implementation at all like the ones we have for biomodel-qualifiers or biology-qualifiers. Please correct me if I am wrong.

Lucian Smith

Oct 23, 2024, 1:08:10 PM
to Tung V. N. Nguyen, David Nickerson, combine-annot
Maybe we could help best if you provided a list of new vocabulary terms you're adding, and we could help find existing ones?  It might also be that while you believe the intent might not be close enough to use someone else's vocabulary term, others might disagree, and believe it would be better to go ahead and use the existing term, even though the definition might not be a perfect match.
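
One way such a list could be pulled out of a generated file is sketched below in Python: it collects every element name that lives in the custom namespace. The bmTerms namespace URI is not given in this thread, so `http://example.org/bmTerms#` is a placeholder, and the sample document and the function name `terms_in_namespace` are likewise made up.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"
BMTERMS_NS = "http://example.org/bmTerms#"  # placeholder, real URI not known here


def terms_in_namespace(rdf_xml: str, namespace: str) -> list[str]:
    """Return the sorted local names of all elements in `namespace`."""
    prefix = f"{{{namespace}}}"
    root = ET.fromstring(rdf_xml)
    return sorted(
        {el.tag[len(prefix):] for el in root.iter() if el.tag.startswith(prefix)}
    )


# Made-up document mixing an existing vocabulary with custom terms.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dcterms="{DCTERMS_NS}"
         xmlns:bmTerms="{BMTERMS_NS}">
  <rdf:Description rdf:about="http://example.org/model">
    <dcterms:title>Example model</dcterms:title>
    <bmTerms:name>Example name</bmTerms:name>
    <bmTerms:file rdf:resource="manifest.xml"/>
    <bmTerms:file rdf:resource="model.xml"/>
  </rdf:Description>
</rdf:RDF>"""

print(terms_in_namespace(SAMPLE, BMTERMS_NS))
```

Running this over a handful of generated files would give a short, deduplicated list that could be posted to the group for review.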

-Lucian
