Hi Aaron and Xiaoxia, co-authors if you accept the offer ....
Jeremy and I are hoping to submit a paper to PSB (Pacific Symposium on
Biocomputing).
http://psb.stanford.edu/cfp-semweb.html
The deadline is totally unrealistic, July 18th, but I felt
we already have enough material for a paper, if we could just 'edit it.'
I started by recasting our notes from the tutorial we gave at ISMB. The
notes and slides can be found here:
http://www.biopathways.org/ismb2005tutorial-am6/
The recast paper is attached.
During our tutorial we found we needed to introduce people to RDF
and the semantic web. So, my notion for the paper is to take the integration
case studies and show what kinds of issues arise and then show how they were
resolved and how they could more easily be addressed with RDF and semantic
web technologies.
For example, we already have text that discusses how SBML can be extended
(annotated) with BioPAX; something Jeremy and I worked out a while ago,
but have not yet published.
Recently there has been a lot of discussion on the biopax-discuss list
about representing generic entities. I haven't caught up with it yet,
but I've got a sense of some of the issues. The discussions led to
the text Jeremy wrote below. Jeremy thought it would be good to start with
the generic idea and I agreed to recast that into PSB format. It's below,
but I'm not sure how to structure it in the new doc.... Jeremy???
Joanne
-----Original Message-----
From: Jeremy Zucker [mailto:zuc...@research.dfci.harvard.edu]
Sent: Tuesday, July 12, 2005 1:05 PM
To: jluc...@predmed.com
Subject: Categories of Generic entity representation issues
Hello folks,
After discussions with Xiaoxia and Aaron yesterday, it seems there are
several representation issues that are being confounded.
The overall goal is to use BioCyc data to develop complete and consistent
metabolic flux models. Issues arise when the data is incorrect or the
representation scheme is ambiguous.
Below is a discussion of some of the ambiguities that currently exist in
BioCyc.
BioCyc organizes compounds into a class hierarchy from more general to more
specific. Instances of these classes contain information such as molecular
weights, chemical formulas, and/or atomic structure. This information is
useful for determining whether a reaction is balanced, so that it can be
incorporated into a metabolic flux model. As long as all the participants of
a reaction are instances, the meaning is clear.
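As an illustration of the balance check mentioned above, here is a minimal sketch in Python. It assumes each instance carries a plain chemical formula with no parentheses or charges; the function names are made up for this example and are not part of BioCyc. The glucose combustion equation is used only because its formulas are well known.

```python
import re
from collections import Counter

def parse_formula(formula):
    """Count atoms in a simple formula such as 'C2H6O' (no parentheses or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def side_total(side):
    """Sum atom counts over one side, given as (formula, coefficient) pairs."""
    total = Counter()
    for formula, coefficient in side:
        for element, n in parse_formula(formula).items():
            total[element] += coefficient * n
    return total

def is_balanced(reactants, products):
    """A reaction is element-balanced when both sides have the same atom totals."""
    return side_total(reactants) == side_total(products)

# C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O
print(is_balanced([("C6H12O6", 1), ("O2", 6)], [("CO2", 6), ("H2O", 6)]))
```

A check like this only works when every participant is an instance with a formula, which is why class participants are a problem.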
However, BioCyc also permits classes of compounds to participate in a
reaction. We call this a generalized reaction. Generalized reactions
contain several ambiguities that need to be resolved.
1. A generalized reaction is typically shorthand for several specific
reactions. The challenge in this case is to infer the specific reactions
from the general reaction. We present several categories of generalized
reactions and discuss a strategy for recognizing each category and inferring
the correct specific reactions.
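To make the inference problem concrete, here is a rough Python sketch that enumerates candidate specific reactions by replacing each class with one of its instances. The INSTANCES table and the candidates function are hypothetical, and the table holds only a few compounds for illustration; it is not taken from BioCyc.

```python
from itertools import product

# Hypothetical class-to-instance table; a real class hierarchy is far larger.
INSTANCES = {
    "an alcohol": ["ethanol", "methanol"],
    "an aldehyde": ["acetaldehyde", "formaldehyde"],
}

def candidates(reactants, products):
    """Yield (reactants, products) pairs with every class replaced by an instance."""
    names = list(reactants) + list(products)
    # A name that is not a known class is treated as an instance and kept as-is.
    choices = [INSTANCES.get(name, [name]) for name in names]
    for combo in product(*choices):
        yield combo[:len(reactants)], combo[len(reactants):]

for left, right in candidates(["an alcohol"], ["an aldehyde"]):
    print(" + ".join(left), "->", " + ".join(right))
```

Plain enumeration also produces pairings such as ethanol -> formaldehyde, so the candidates still have to be filtered, and how to filter depends on the category of generalized reaction.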
To describe the various categories, it is important to have a prototypical
example of each type of generic.
Atomic structure matching:
EC# 1.1.1.1: an alcohol -> an aldehyde or ketone
Polymerization reaction:
Glycogens + Glucose -> Glycogens
Symbol/name matching:
NAD(P)H -> NAD(P)
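For the symbol/name matching case, one possible sketch in Python follows. It assumes that a single parenthesised part of a name marks an optional piece, and that variants on the two sides pair up in the same order; both are assumptions for illustration, and the function names are made up here. Names where parentheses mean something else, such as stereo prefixes, would need separate handling.

```python
import re

def expand_optional(name):
    """Expand one parenthesised optional part: 'NAD(P)H' -> ['NADH', 'NADPH']."""
    match = re.search(r"\(([^()]+)\)", name)
    if match is None:
        return [name]
    without = name[:match.start()] + name[match.end():]
    with_part = name[:match.start()] + match.group(1) + name[match.end():]
    return [without, with_part]

def expand_reaction(reactant, product):
    """Pair variants position by position, so NADH goes with NAD and NADPH with NADP."""
    # zip stops at the shorter list, so a side without an optional part
    # yields only one pair; that case would need its own rule.
    return list(zip(expand_optional(reactant), expand_optional(product)))

print(expand_reaction("NAD(P)H", "NAD(P)"))
```

Pairing by position keeps NADH from being matched with NADP, which a full cross product of the two sides would allow.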
More to follow...
Jeremy