Additional class for other analyses of models


Jonathan Karr

May 17, 2021, 8:00:03 AM
to sed-ml-...@googlegroups.com
I'd like to propose another addition to L1V4, an additional class for representing other analyses of models. This proposal is particularly important for qualitative modeling. Such analyses are also common in kinetic modeling.

Examples:
  • Calculation of eigenvalues
  • Calculation of Lyapunov exponents
  • Reachability analysis
Proposal
  • Add an additional child class of `Simulation` called `Analysis`.
  • The new class would have no additional attributes.
  • While this class would be functionally equivalent to `SteadyState`, this class is needed because `SteadyState` conveys a specific meaning that is awkward for some analyses.
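
A minimal sketch of how the simple proposal might look in a document, assuming the new class follows the usual SED-ML naming convention (class `Analysis`, element `analysis`). The element name, the ids, and the KiSAO placeholder are illustrative assumptions; none of them are part of L1V4.

```xml
<!-- Hypothetical: the `analysis` element is only proposed here, not in any SED-ML spec -->
<listOfSimulations>
  <!-- Same shape as steadyState: an id plus an algorithm, no extra attributes -->
  <analysis id="analysis0">
    <!-- Placeholder: substitute the KiSAO term of the intended method -->
    <algorithm kisaoID="KISAO:XXXXXXX"/>
  </analysis>
</listOfSimulations>
<listOfTasks>
  <!-- model0 is an assumed model id; the task references the analysis like any simulation -->
  <task id="task0" modelReference="model0" simulationReference="analysis0"/>
</listOfTasks>
```

Because the element would sit in `listOfSimulations` and be referenced through `simulationReference`, existing tasks would need no structural change under this reading.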
Extended proposal
  • The above is a simple way to address this need, but awkwardly makes `Analysis` a child of `Simulation`.
  • Better, but more extensive change:
    • Rename `Simulation` to something like `AbstractAnalysis` or `AbstractModelOperation` (`Simulation` could be kept as an alias)
Jonathan

Frank Bergmann

May 17, 2021, 8:25:56 AM
to sed-ml-discuss
I was actually thinking that some of that would already be covered by the `DependentVariable` construct. The idea is that this construct would make it possible to extract certain (computed) properties at a certain point in time, such as the Jacobian or the eigenvalues of the Jacobian at different times (either while a time course simulation is run or after bringing the model to steady state).

Additionally, I wonder whether it would still be fine to have this potential `Analysis` class in a `listOfSimulations`.

Frank

Jonathan Karr

May 17, 2021, 10:08:29 AM
to sed-ml-...@googlegroups.com
I agree `DependentVariable` addresses some use cases, but it seems strange to me to have to think of all analyses as observables (SED variables) of simulations. How do you envision encoding analyses of the initial state (more generally, the current state) that don't involve simulating any steps or calculating a steady state?
  • For example, the COPASI UI and format don't present Lyapunov exponents as an observable of a simulation. Another example from COPASI is MCA, which doesn't need to be coupled to a steady-state simulation.
  • Another example is network randomization. This is something that can't currently be expressed as a model change and would be awkward to express as a steady state simulation. Another example of a transformation is the calculation of a reduced model.
As part of the "extended" proposal, ideally `listOfSimulations` would also be renamed appropriately.

Jonathan

--
You received this message because you are subscribed to the Google Groups "sed-ml-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sed-ml-discus...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/sed-ml-discuss/c9ae08ea-2d7f-4c20-9080-ab2ef5a6c381n%40googlegroups.com.

Frank Bergmann

May 17, 2021, 11:05:26 AM
to sed-ml-discuss
Hello Jonathan,

Indeed, I would allow the dependent variable to be used at initial time as well. In that case I'd use the modelReference rather than a taskReference, to indicate that I want the model without any tasks applied to it.

As far as COPASI is concerned, the individual exponents would indeed be available to select for reports / plot items in the report definitions / plot specifications (selected from the results section), just as the results from MCA are selectable from there. But I guess how COPASI implements it is not as important here. I was just hoping that we were closer to finalizing L1V4.

Cheers
Frank

Jonathan Karr

May 17, 2021, 12:49:40 PM
to sed-ml-...@googlegroups.com
Hi Frank,

I'm not sure I understand your suggestion. Could you sketch it out with a little more detail?

Jonathan

Lucian Smith

May 17, 2021, 5:46:37 PM
to sed-ml-...@googlegroups.com
I've been working on this, and this is what I've come up with:

<?xml version="1.0" encoding="UTF-8"?>
<sedML xmlns="http://sed-ml.org/sed-ml/level1/version4" level="1" version="4">
  <listOfModels>
    <model id="model0" language="urn:sedml:language:sbml.level-3.version-1" source="case_01.xml"/>
  </listOfModels>
  <listOfSimulations>
    <uniformTimeCourse id="sim0" initialTime="0" outputStartTime="0" outputEndTime="10" numberOfSteps="10">
      <algorithm kisaoID="KISAO:0000019"/>
    </uniformTimeCourse>
  </listOfSimulations>
  <listOfTasks>
    <task id="task0" modelReference="model0" simulationReference="sim0"/>
  </listOfTasks>
  <listOfDataGenerators>
    <dataGenerator id="jacobian">
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <ci> j </ci>
      </math>
      <listOfVariables>
        <variable id="j" symbol="urn:sedml:symbol:jacobian:full" taskReference="task0" modelReference="model0"/>
      </listOfVariables>
    </dataGenerator>
  </listOfDataGenerators>
  <listOfOutputs>
    <report id="report">
      <listOfDataSets>
        <dataSet id="Jacobian_report" label="Jacobian" dataReference="jacobian"/>
      </listOfDataSets>
    </report>
  </listOfOutputs>
</sedML>

The main problem that I can see is that the 'j' in the MathML is actually a matrix instead of a vector, but that seems... tolerable?  Add to this the issue that the report itself is reporting a matrix instead of a vector, which is weird, and there's no way to really label the axes...

I guess in the end the whole system is janky and I don't like it, and I hope we can come up with something better for L2, but it seems at least somewhat workable for now.

(I'm also happy to change the symbol URN to a KiSAO term.)

I have other ideas about other analyses, but I'll save those for a separate email--this one is more 'this is what we've been working towards with the DependentVariable construct'.

-Lucian


Jonathan Karr

May 17, 2021, 6:05:17 PM
to sed-ml-...@googlegroups.com
This seems like a reasonable way to request a time course of Jacobians. Still, it seems like it should be possible to describe a Jacobian computation, and other analyses, without simulations.

The matrix issue seems no worse to me than data generators for repeated tasks or spatial simulations. In each case, the implied values of data generators aren't vectors, and SED-ML doesn't provide a scheme to label the other axes or their elements.

Jonathan

Lucian Smith

May 17, 2021, 6:17:53 PM
to sed-ml-...@googlegroups.com
Well, hmm.  Actually, it was intended to request a *post-time course* Jacobian, not a time course *of* Jacobians.

However, the only way to ensure this is if we define a Jacobian as something you can't get a time course of.  This is kind of tricky.

Lucian Smith

May 17, 2021, 6:27:36 PM
to sed-ml-...@googlegroups.com
On Mon, May 17, 2021 at 5:00 AM Jonathan Karr <jonr...@gmail.com> wrote:
I'd like to propose another addition to L1V4, an additional class for representing other analyses of models. This proposal is particularly important for qualitative modeling. Such analyses are also common in kinetic modeling.

Examples:
  • Calculation of eigenvalues
  • Calculation of Lyapunov exponents
  • Reachability analysis
As mentioned, calculation of eigenvalues is something we were already moving towards with the 'DependentVariable' construct.

Interestingly, however, I think this touches on something that we hadn't yet fully grasped, or at least haven't expressed in the specs yet.  In basic terms:

* A simulation (right now) changes the model state, according to a set of rules.
* A DataGenerator (up until now) has two modes:  in the first mode, it collects data *as the model state is being changed*, and in the second mode, it collects data *based on the current model state*.

As evidenced by the move to allowing calculation of the Jacobian, eigenvalues, etc. by using a DependentVariable, these all fall into the second 'mode' of a DataGenerator:  a computation that is performed on a model state and that does not change that state.  (Also, as noted, my initial assumption was that these analyses would *always* be 'current model state' analyses, and never 'as you change the model state' analyses.  Hrm hrm hrm.  Anyway.)

At any rate, it makes sense to allow people to perform 'mode two' calculations on 'pristine' models that have not yet been manipulated by any simulation.  It should be possible to calculate the Jacobian on the initial state of a model, for example.  This could be easily accomplished in our current setup by making the 'taskReference' on a Variable optional:

<?xml version="1.0" encoding="UTF-8"?>
<sedML xmlns="http://sed-ml.org/sed-ml/level1/version4" level="1" version="4">
  <listOfModels>
    <model id="model0" language="urn:sedml:language:sbml.level-3.version-1" source="case_01.xml"/>
  </listOfModels>
  <listOfDataGenerators>
    <dataGenerator id="jacobian">
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <ci> j </ci>
      </math>
      <listOfVariables>
        <variable id="j" symbol="urn:sedml:symbol:jacobian:full" modelReference="model0"/>
      </listOfVariables>
    </dataGenerator>
  </listOfDataGenerators>
  <listOfOutputs>
    <report id="report">
      <listOfDataSets>
        <dataSet id="Jacobian_report" label="Jacobian" dataReference="jacobian"/>
      </listOfDataSets>
    </report>
  </listOfOutputs>
</sedML>

If we do that, then I think we can move the rest of your proposed analyses to the DataGenerator class as well... I think.  The only issue would be if either analysis required a change in the model state, and I don't know the underlying mathematics well enough to say one way or another if that's true.  My suspicion is that calculating Lyapunov exponents is something that is performed without changing the model state, but it may well be that performing a reachability analysis would change the model state.  If so, it really would belong in the 'Simulation' category, but maybe for now we could hack it this way?  As I said before, the whole system is pretty awkward and I would like to move away from it, but I think it could be made to work in these cases.

-Lucian

Jonathan Karr

May 17, 2021, 6:42:44 PM
to sed-ml-...@googlegroups.com
I agree that the current structure makes many things awkward. Personally, I feel it's important to maintain a distinction between computations (tasks) and recording outputs (data generators). I think it's OK to mix trivial calculations like unit conversions into recording outputs. But I think it's important to separate out more involved computations that involve significant algorithms of their own, especially when multiple methods could be used to compute the same information. Taking the example of Lyapunov exponents, COPASI files note that COPASI uses the Wolf method. To me, a natural way to think of this is that (a) the Wolf method is an algorithm, (b) the exponents are the output (variable of a data generator), and (c) the execution of this algorithm is separate from a steady-state or time course simulation. To me, network randomization is similar: there are multiple ways (algorithms) to do it, and networks can be randomized to different degrees (a parameter of an algorithm).
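
The (a)/(b)/(c) split above could be sketched roughly as follows. This is only an illustration under several assumptions: the `analysis` element is the class proposed earlier in this thread, the KiSAO id is a placeholder for whatever term would denote the Wolf method, and the symbol URN `urn:sedml:symbol:lyapunovExponents` is invented for the example and is not a defined SED-ML symbol.

```xml
<!-- Hypothetical sketch; `analysis`, the KiSAO placeholder, and the symbol URN are assumptions -->
<listOfSimulations>
  <!-- (a) the method (e.g. Wolf) is carried by the algorithm -->
  <analysis id="lyapunov">
    <algorithm kisaoID="KISAO:XXXXXXX"/>
  </analysis>
</listOfSimulations>
<listOfTasks>
  <!-- (c) the analysis runs as its own task, separate from any time course or steady state -->
  <task id="lyapunovTask" modelReference="model0" simulationReference="lyapunov"/>
</listOfTasks>
<listOfDataGenerators>
  <!-- (b) the exponents are recorded as a variable of a data generator -->
  <dataGenerator id="exponents">
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <ci> e </ci>
    </math>
    <listOfVariables>
      <variable id="e" symbol="urn:sedml:symbol:lyapunovExponents" taskReference="lyapunovTask"/>
    </listOfVariables>
  </dataGenerator>
</listOfDataGenerators>
```

In this arrangement the choice of method would live in the `algorithm` element, while the data generator would only name which result of the task to record.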

Jonathan


Lucian Smith

May 17, 2021, 8:40:05 PM
to sed-ml-...@googlegroups.com
That's a good point--the Algorithm is definitely the place we currently store things like 'use the Wolf method'.

We need to also remember, though, that you'll still need to figure out some sort of DataGenerator for these new analyses, which may well duplicate the effort--if we need to put all the same stuff in the DataGenerator anyway, it might not be worth it to additionally try to stuff things in a new Simulation type.

-Lucian
