Grammatical representation of Intra/extra molecular rules

Jean Krivine

Oct 5, 2011, 7:30:09 AM

to kappa-users

I would like to have your opinion on the following syntax used in simplx:

'rule' A,B -> C @ 'k_2' ('k_1')

where 'k_1' is the rate associated with intramolecular (unary) instances of the rule and 'k_2' is the rate associated with bimolecular (binary) instances of the rule.
Note that this allows one to encode the BNGL operators + and . as follows:

'rule' A.B -> C @ 'k_1'
'rule' A+B -> C @ 'k_2'
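
A minimal Python sketch of this encoding, for illustration only. The `RatePair` record and the helper names `from_bngl` and `instance_rate` are hypothetical and are not part of simplx or KaSim; the sketch assumes the engine already knows whether the two matched agents sit in the same complex.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatePair:
    """Hypothetical record for the 'k_2' ('k_1') annotation."""
    binary: float  # k_2: rate of instances whose agents are in different complexes
    unary: float   # k_1: rate of instances whose agents are in the same complex


def from_bngl(operator: str, k: float) -> RatePair:
    """Encode a BNGL-style '+' or '.' rule as a comma rule with two rates."""
    if operator == "+":
        return RatePair(binary=k, unary=0.0)  # only bimolecular instances fire
    if operator == ".":
        return RatePair(binary=0.0, unary=k)  # only intramolecular instances fire
    raise ValueError(f"unknown operator: {operator!r}")


def instance_rate(rates: RatePair, same_complex: bool) -> float:
    """Pick the rate that applies to one concrete instance of the rule."""
    return rates.unary if same_complex else rates.binary
```

For example, `from_bngl(".", 3.0)` yields a pair whose binary rate is zero, so instances spanning two complexes never fire.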

I would very much like to reuse the 'k_2' ('k_1') representation in KaSim. It is less expressive than . and + (for instance, you cannot define the observable %obs: 'ab' A.B), but it has the advantage of being simpler to implement (it avoids having to deal with A.B + C.D or weird stuff like A,B -> C+D), and I like the idea of making the user pay the price of using nasty constructs :)

So, two questions:
- do you think the k2(k1) representation is enough, or do you need explicit + and .?
- do you like the order k2, then k1?

This is for the 1.09 version of KaSim that will be released in the unstable branch for beta testing.
J

Bill

Oct 9, 2011, 3:11:36 AM

to kappa-users

I prefer + and . over k2(k1), as + especially seems consistent w/
standard notation for (bio)chemical reactions.
A question...
In the k2(k1) syntax..., if I don't want to allow intramolecular
reactions, what would I write?
'rule' A,B -> C @ k2(0)
...or...
'rule' A,B -> C @ k2
...or does the latter generate both intra- and intermolecular reactions
with rate constant k2?
If I don't want to allow intermolecular reactions, what would I write?
'rule' A,B -> C @ 0(k1)
...or...
'rule' A,B -> C @ (k1)
Not sure I understand the weirdness of A.B+C.D or A,B->C+D. Could you
give more specific examples of nasty constructs?
--Bill

Walter Fontana

Oct 9, 2011, 8:29:27 PM

to kappa-users

I'm traveling (and short on time), but I wanted to chime in. Warning:
this is a bit long and probably less clear than it could or should be.

* The comma and the plus/dot (Bill)

I favor the comma over the plus/dot, because it separates "mechanism"
from "kinetics". Let me explain.

I propose a distinction between a local mechanism of action and a
mechanism for modulating the effective reaction volume. By "local
mechanism" of action I mean processes that occur within a small region
- small compared to the size of the context (typically proteins or
protein complexes) necessary for providing the resources (i.e.
specific atoms) for an interaction. By a "mechanism for modulating
the effective reaction volume" I mean something like a scaffolding
architecture whereby proteins A and B are linked together (directly or
indirectly) in a way that either enhances the frequency of encounters
between A and B (thus diminishing the effective volume of interaction)
or impedes encounters between A and B by rigidly positioning them
away from each other (thus increasing the effective volume of
interaction). In an extreme version of the former case we have only
intra-molecular interactions between A and B; in an extreme version of
the latter case we have only inter-molecular interactions. When framed
by the above distinction, the bimolecular and unimolecular cases
warrant distinct rate constants because of differences in effective
reaction volume, not because of differences in the local interaction
mechanism. Kappa makes that explicit by having a single agent
composition operator - the comma - and two rate constants (either of
which might be zero) for rules with two components on the left hand
side.

This point of view takes the stance that the dependency of the rate
constant on volume should be separated from those aspects of the rate
constant that are intrinsic to a mechanism of action. The latter
should be expressed in the rule proper, while the former should reside
in the rule's rate annotation (which is an execution directive to the
simulation engine). The plus and the dot blur this distinction by
providing notation that appears to be "mechanistic" (by showing up in
the rule expression), when it should be "kinetic" annotation. This is
simply a point of view for rationalizing the comma and the absence of
plus and dot in Kappa. This is not an unassailable truth.

There might be cases in which connectivity should be considered as
part of the *mechanism*. If it is an essential aspect of some docking
mechanism for A and B to be in the same complex prior to interaction,
then the modeler should make that aspect explicit; for example, by
stating that A and B are both docked to the same scaffold C through
such and such sites, in which case we don't need a dot because we
should represent the situation with an explicit path through the site
graph. If different connectivities warrant different rate constants,
then we should make that explicit too, even if that means writing many
more rules. Kappa's primary purpose is to represent knowledge, not to
enable "lazy" rule writing. If writing many rules by hand is tedious,
one can write them programmatically. Incidentally, this is why we view
Kappa like "assembler code" and why we advocate the development of a
variety of meta-languages or macro-constructs, which are then compiled
into Kappa. Seen from this angle, the plus and the dot of BNGL are
useful macros in certain applications.

* weirdness of A,B -> C+D

A,B means A.B or (inclusively) A+B. If upon application A,B embeds
into an instance that can be described as A.B, then some
intramolecular bond must be broken that is not broken if A,B were to
embed into an instance of A+B. Thus, in addition to whatever other
modifications the rule imparts to A,B, there is a true mechanistic
difference between the two instances that the notation A,B -> C+D
brilliantly succeeds in obscuring. I think that justifies calling such
an expression "weird".

Bill, the comma is not a silly syntactical difference between Kappa
and BNGL. It reveals a difference in viewpoint. Writing in Kappa
requires the user to be more deliberate and thus aware of the
mechanistic knowledge s/he is expressing with a rule. BioNetGen /
BNGL appears to be designed for easy general purpose modeling. There
is nothing wrong with either objective.

The + notation in chemistry is rooted in a representation that sees
reactions as central. There is no necessity to adopt it within a
system that sees rules as central.

As far as the computational costs of the +/. and the two-constants
notations go, both are potentially costly, but in different regimes.
(This will be analyzed elsewhere.)

* Jean's notation k2 (k1)

I'm fine with the proposal. I would prefer the following notational
equivalences

A,B -> C @ k2 or
A,B -> C @ k2 (0) for only bimolecular instances,
and
A,B -> C @ (k1) or
A,B -> C @ 0 (k1) for only unimolecular instances.

This is also "backward compatible" with past models in which rules of
arity 2 and a single rate constant were meant to be bimolecular - in
part because users made sure that such rules would only trigger in a
bimolecular way.

* general

Some programmatic assistance in determining whether an "A,B ->
something" rule has ambiguous molecularity (i.e. induces both uni- and
bi-molecular reactions in a given mixture) would be highly useful, not
least because with a few more rules, a user might be able to
resolve such ambiguity, thereby allowing KaSim to avoid the
potentially costly two-constants mode. Jerome's old static analyzer
(complx) appears to already compute all the necessary information for
such a service, but we never deployed it. Perhaps Jerome's new static
analyzer will surprise us... :) when Jerome gets to it... :)
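
A rough Python sketch of what such a check could look like on a concrete mixture. This is not how complx works; it is a simplification that matches agents by type name only (no sites or internal states) and inspects one given mixture rather than performing static analysis. The function names and the mixture encoding (a list of agent types plus a list of bonds between agent indices) are assumptions made for illustration.

```python
def component_ids(n_agents, bonds):
    """Label each agent with a representative of its connected component (union-find)."""
    parent = list(range(n_agents))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in bonds:
        parent[find(a)] = find(b)
    return [find(i) for i in range(n_agents)]


def molecularity(agent_types, bonds, type_a, type_b):
    """Return (has_unary, has_binary) for instances of 'type_a,type_b -> ...'."""
    comp = component_ids(len(agent_types), bonds)
    a_ids = [i for i, t in enumerate(agent_types) if t == type_a]
    b_ids = [i for i, t in enumerate(agent_types) if t == type_b]
    has_unary = has_binary = False
    for a in a_ids:
        for b in b_ids:
            if a == b:
                continue  # an embedding needs two distinct agents
            if comp[a] == comp[b]:
                has_unary = True   # both agents in the same complex
            else:
                has_binary = True  # agents in different complexes
    return has_unary, has_binary


def is_ambiguous(agent_types, bonds, type_a, type_b):
    """True when the rule has both uni- and bimolecular instances in this mixture."""
    has_unary, has_binary = molecularity(agent_types, bonds, type_a, type_b)
    return has_unary and has_binary
```

With agents `["A", "B", "A", "B"]` and a single bond `(0, 1)`, the rule is ambiguous: agents 0 and 1 give a unimolecular instance, while agents 0 and 3 give a bimolecular one.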

I hope this helps.
w

Bill

Oct 9, 2011, 11:02:41 PM

to kappa-users

Responding to Walter's comments...
The suggested notational equivalences make sense to me.
On plus/dot... for A+B-> and A.B-> reactions, let us assume that the
same interface is involved. In this case, yes, there is really only
one "mechanism" of binding and yes the difference is "kinetic," in
that k1 and k2 are different because of different effective reaction
volumes. But... unless I'm out of the loop, KaSim implements a
stochastic simulation method based on the well-mixed assumption, so
under this assumption, the only way to capture the difference in
effective volumes is to make k1 and k2 be different. If you were doing
spatial calculations, then you would not need different rate
constants. Given that within the methodological framework, it's
important to distinguish between intramolecular and intermolecular
interactions, even for cases that involve the same molecular
interface, I don't see it being an advantage to use A,B to refer to
both intra and inter, as there is a distinction and it's important.
The k2(k1) notation is a way of making the distinction and my point is
that this way is a little less intuitive (at least to me) than plus/
dot. I guess I would see A,B usage only as OK if there were spatial
calculations happening.
Another thought... differences between BNGL and Kappa are minor, and
the plus/dot issue is one of the differences. I think it would make
sense to eliminate all differences. It would be nice if KaSim could
read .bngl files and execute the same simulation as, say, BioNetGen
would. And vice versa. It wouldn't matter if this happened at the
assembler level or at a higher level, so long as the feature is
provided for users and there is an unambiguous reversible mapping
between the two languages.

Jean Krivine

Oct 10, 2011, 5:43:28 AM

to Walter Fontana, kappa-users

To complement Walter's comment about weirdness: a rule of the form A(x!1).B(x!1) -> A(x)+B(x) implies that the rule must be rejected for any instance that leaves A not totally disconnected from B, as in A(x!1,y!2).B(x!1,y!2) -> A(x,y!2).B(x,y!2). So in theory you would have to force the user to write both versions of the rule (probably with no real biological reason for doing so).
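
To illustrate the check that a "+" on the right-hand side would impose, here is a small Python sketch. It assumes a hypothetical encoding of bonds as `(agent, site, agent, site)` tuples and is not taken from KaSim; it only tests whether two agents remain connected once a given bond is removed.

```python
from collections import deque


def connected_after_unbinding(bonds, broken, start, goal):
    """Breadth-first search over the bonds that remain once `broken` is removed."""
    neighbours = {}
    for bond in bonds:
        if bond == broken:
            continue  # this is the bond the rule breaks
        a, _site_a, b, _site_b = bond
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    seen, queue = {start}, deque([start])
    while queue:
        agent = queue.popleft()
        if agent == goal:
            return True
        for nxt in neighbours.get(agent, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# The instance A(x!1,y!2).B(x!1,y!2): breaking the x-x bond leaves the y-y bond.
bonds = [("A", "x", "B", "x"), ("A", "y", "B", "y")]
rejected = connected_after_unbinding(bonds, ("A", "x", "B", "x"), "A", "B")
```

Here `rejected` is `True`: A and B are still linked through their y sites, so the instance cannot produce A(x)+B(x).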

> I'm fine with the proposal. I would prefer the following notational
> equivalences
>
> A,B -> C @ k2 or
> A,B -> C @ k2 (0) for only bimolecular instances,
> and
> A,B -> C @ (k1) or
> A,B -> C @ 0 (k1) for only unimolecular instances.

The problem is that I would like to keep A,B -> C @ k as a rule that applies to both intra- and intermolecular instances. So indeed, as Bill suggests, if you want to write something that resembles the + of BNGL, you should write A,B -> C @ k2 (0).
The reason is that simulation with no unary/binary constraints will be more efficient. So I would prefer the default semantics to be agnostic with respect to molecularity, and let the modeler write explicit binary/unary rules instead of deciding molecularity at runtime.

If we want to make BNGL models compatible we can always push A+B constructs to meta-kappa.
J

Anatoly Sorokin

Oct 10, 2011, 8:57:23 AM

to kappa-users

I think that, for backward compatibility, the first rule should be

A,B -> C @ k2 is equivalent to
A,B -> C @ k2(k2)
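
A small Python sketch of how a rate annotation could be normalised under this convention. The function `parse_rates` and its regular expression are hypothetical, not KaSim's parser; the bare "(k1)" shorthand follows Walter's suggestion earlier in the thread and is an assumption about what would be accepted.

```python
import re

# Optional binary rate, optionally followed by a parenthesised unary rate.
_RATE = re.compile(
    r"^\s*(?P<binary>[^()\s]+)?\s*(?:\(\s*(?P<unary>[^()\s]+)\s*\))?\s*$"
)


def parse_rates(annotation):
    """Return (binary, unary) rate tokens for the text following '@'."""
    match = _RATE.match(annotation)
    if match is None or (match["binary"] is None and match["unary"] is None):
        raise ValueError(f"cannot parse rate annotation: {annotation!r}")
    binary, unary = match["binary"], match["unary"]
    if unary is None:
        return binary, binary  # 'k2' is read as 'k2(k2)': agnostic w.r.t. molecularity
    if binary is None:
        return "0", unary      # '(k1)' is read as '0(k1)': unimolecular only (assumed)
    return binary, unary
```

Under this reading, `parse_rates("k2")` gives `("k2", "k2")`, while `parse_rates("k2 (0)")` gives `("k2", "0")` for the bimolecular-only case.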

Anatoly

Jean Krivine

Oct 10, 2011, 9:02:21 AM

to Anatoly Sorokin, kappa-users

Yes precisely.

Walter Fontana

Oct 10, 2011, 5:50:36 PM

to kappa-users

Good point, indeed.
w

Bill

Oct 10, 2011, 6:08:38 PM

to kappa-users

What about the following?
A,B->C@k12 is agnostic w.r.t. molecularity
A+B->C@k2 is bimolecular vs. A,B->C@k2(0)
A.B->C@k1 is unimolecular vs. A,B->C@0(k1) or A,B->C@(k1)

Jean Krivine

Oct 11, 2011, 4:27:52 AM

to Bill, kappa-users

Yes Bill, what you write is actually the equivalence between BNGL and KaSim's next release. But for all the reasons Walter listed (which might seem rather philosophical, I agree), I prefer the leftmost notation.

Jean Krivine

Oct 11, 2011, 9:28:39 AM

to Bill, kappa-users

Sorry, I meant rightmost. Excuse my French :)
