Print all events of a trace with Kappa-TQL

Sébastien Légaré

Feb 21, 2019, 11:39:54 AM

to kappa-users

Hello,

I was wondering if it was possible to use Kappa-TQL to print each and every event of a trace.

Printing every event is arguably not so interesting for analysis, but I think it is essential to debug large models.

In Kappa-TQL, I guess it could be done with some wildcard in the query like:

match e:{ * } return (time[e], rule[e])

We could also ask for things like every event that was a phosphorylation

match e:{ *(*{u/p}) } return (time[e], rule[e])

or every event that involves protein A

match e:{ A(*) } return (time[e], rule[e])

At the moment I use a homemade Python script to print all the events of a trace, but it seems Kappa-TQL would be a better tool for the job.

Thanks,

Sébastien

Jonathan Laurent

Feb 22, 2019, 3:29:08 PM

to kappa-users

Hi Sébastien,

Thanks for your interest in the Trace Query Language!

It is possible to print each and every event of a trace using the following query:

match e return (time[e], rule[e])
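
For comparison, a script doing the same single pass might look like the sketch below. It assumes a simplified, hypothetical trace representation (a list of events, each with a `time` and a `rule` field); the real KaSim trace format is richer and is not reproduced here, and the names `all_events` and `toy_trace` are made up for the illustration.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical, simplified event record: {"time": float, "rule": str}.
# The actual trace format differs; this only illustrates the single pass.
Event = Dict[str, object]

def all_events(trace: Iterable[Event]) -> Iterator[Tuple[object, object]]:
    """Yield (time, rule) for every event, like `match e return (time[e], rule[e])`."""
    for event in trace:
        yield (event["time"], event["rule"])

# Made-up trace used only to exercise the function.
toy_trace = [
    {"time": 0.12, "rule": "bind"},
    {"time": 0.35, "rule": "p"},
    {"time": 0.80, "rule": "unbind"},
]

for time, rule in all_events(toy_trace):
    print(f"{time}\t{rule}")
```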

The two other queries you mentioned are currently not supported, although it would be easy to handle them by making minor additions to the TQL.

For those interested, what follows is a longer explanation of how this could be done and why it involves nontrivial design choices.

--------

We currently support

match e:{ K(x{u/p}) } return (time[e], rule[e])

to match events that phosphorylate the x site of a K, but you cannot replace x and K by wildcards. Adding support for "name wildcards" in patterns may be an interesting addition.

The third query you provided raises a different question. First, there are several interpretations of what it may mean for an event to "involve" an agent of type K:

Does it mean that an agent of type K gets modified?
Does it mean that there is a path between an agent of type K and an agent that gets modified?
Does it mean that the rule r that e is an instance of features an agent of type K?

Even the last formulation is somewhat ambiguous. Indeed, a rule could be written without explicitly mentioning an agent of type K but still require that agents A and B be connected in order to apply. If this connection is realized by an agent K, should we say that K is "involved"?

Second, the following query is syntactically valid:

match t: { K } return rule[t]

However, it has a different semantics from what you may expect. Indeed, the pattern between curly braces is called a "transition pattern" (see the TQL paper) and it matches any transition t such that the mixture before t features a K agent (not necessarily affected by the rule). However, if you run the query, you'll get the following error message:

[Fatal error] Ambiguity detected.

This is because the query engine adds an additional requirement that every agent that appears in a pattern has to be dynamically resolvable in a non-ambiguous way. This is not the case here as a mixture can feature multiple K agents. One of the (many) reasons for this requirement can be understood by looking at the following query:

match t: { k:K } return int_state[.t]{k, "x"}

You may say that k should be resolved to the kinase that is "involved" in transition t, if any. However:

In general, a rule may involve several kinases
More importantly, it is very useful to have the ability for a transition pattern to match agents that are not directly affected by a transition (see examples on the documentation page) and so I would not change the default semantics.
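
The unique-resolution requirement can be pictured with a small sketch. This is only a toy model, assuming a mixture is a flat list of agents; `resolve` and `AmbiguityError` are hypothetical names and do not reflect how the query engine is actually implemented.

```python
from typing import List, Optional

class AmbiguityError(Exception):
    """Raised when a pattern agent has more than one candidate in the mixture."""

def resolve(mixture: List[dict], agent_type: str) -> Optional[dict]:
    """Return the unique agent of the given type, or None if there is none."""
    candidates = [agent for agent in mixture if agent["type"] == agent_type]
    if len(candidates) > 1:
        # Nothing in the pattern `{ k:K }` says which K is meant.
        raise AmbiguityError(f"{len(candidates)} agents of type {agent_type}")
    return candidates[0] if candidates else None

# Made-up mixture with two K agents, as in most realistic simulations.
mixture = [
    {"id": 0, "type": "S"},
    {"id": 1, "type": "K"},
    {"id": 2, "type": "K"},
]

try:
    resolve(mixture, "K")
except AmbiguityError as err:
    print("Ambiguity detected:", err)
```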

--------------

There is another reason why transition patterns have limited expressiveness. Indeed, although many people want to use the TQL to perform very simple queries that match single transitions (in which case the TQL is only a convenient replacement of a script that does a single pass through a trace file), the original main design goal of the TQL was to handle complex "trace patterns" featuring several transition patterns with shared variables. Therefore, making transition patterns too expressive quickly makes the problem of evaluating complex queries intractable. This is the reason I do not allow arbitrary predicates to appear in transition patterns.

However, arbitrary predicates are allowed in the "when" clause of a query. As a reminder, the query:

match e when P return E

is somewhat equivalent to the (invalid) query

match e return (if P then E else nothing)

Therefore, I think that the best way to express your third query would be something like this

match e when involves(rule[e], "K") return (time[e], rule[e])

where involves is a function that takes a rule name R along with an agent type A and returns whether or not the rule R "involves" an agent of type A. Such a function does not currently exist but it could be trivially added.
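
As a rough illustration of what such a function could compute, here is a sketch under the "explicitly mentioned in the rule" interpretation. Everything in it is an assumption: the `RULES` table, the regular expression used to spot agent names, and the toy trace are invented for the example and are not part of the TQL.

```python
import re
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical rule table (rule name -> Kappa text). An assumption for this
# sketch: a real implementation would read the rules from the model.
RULES: Dict[str, str] = {
    "p": "S(x{u/p})",
    "bind": "S(d[/1]), K(d[/1])",
}

# An agent mention is taken to be an identifier directly followed by "(".
AGENT_MENTION = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*\(")

def involves(rule_name: str, agent_type: str) -> bool:
    """True if the text of the named rule explicitly mentions agent_type."""
    return agent_type in AGENT_MENTION.findall(RULES[rule_name])

def events_involving(trace: Iterable[dict], agent_type: str) -> Iterator[Tuple[float, str]]:
    """Mimic `match e when involves(rule[e], A) return (time[e], rule[e])`."""
    for event in trace:
        if involves(event["rule"], agent_type):  # plays the role of the "when" clause
            yield (event["time"], event["rule"])

toy_trace = [{"time": 0.1, "rule": "bind"}, {"time": 0.4, "rule": "p"}]
print(list(events_involving(toy_trace, "K")))
```

Note that this purely textual check would not catch the case where K merely links two other agents, which matches the ambiguity discussed above.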

---------

Ideally, it should be very easy to add custom functions to the TQL such as the involves function above. Currently, you would need to add a couple of lines to 3 different files in the TQL sources (the grammar, the AST and the interpreter). What I think would be cool would be to allow external calls to arbitrary Python scripts. This way, your script would not have to handle the gory details of streaming a JSON trace file, computing matchings, capturing event and agent names... You would just have to specify the core computation to be performed on every matched pattern and the TQL would take care of the rest. I don't know if the current Python interface is mature enough to allow a clean implementation of this idea.
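
To make the division of labor concrete, here is a minimal sketch of the idea. No such interface exists in the TQL today; `for_each_match`, its parameters, and the toy trace are all hypothetical, and the "engine" side is reduced to a plain loop.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")

def for_each_match(
    trace: Iterable[dict],
    matches: Callable[[dict], bool],
    compute: Callable[[dict], T],
) -> Iterator[T]:
    """Stand-in for the engine: it does the streaming and matching,
    then hands each matched event to the user's function."""
    for event in trace:
        if matches(event):
            yield compute(event)

# User side: only the core computation is written by hand.
def describe(event: dict) -> str:
    return f"{event['rule']} fired at t={event['time']}"

toy_trace = [{"time": 0.1, "rule": "bind"}, {"time": 0.4, "rule": "p"}]
for line in for_each_match(toy_trace, lambda e: e["rule"] == "p", describe):
    print(line)
```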

Don't hesitate to give me feedback on how you think the TQL should evolve. Also, I am happy to help anyone willing to contribute to the TQL. :-)

Best,

Jonathan

Sébastien Légaré

Feb 25, 2019, 6:07:41 AM

to kappa-users

Hi Jonathan,

Thank you for your answer. Great, match e return (time[e], rule[e]) is all I really need at the moment.

About the "involvement" of an agent K in a transition, my first idea was effectively that K was explicitly featured in rule r that e is an instance of. If it can eventually be implemented in the "when" clause I think it would be great.

I would leave the case "A modifies B if they are in the same complex, and K happens to link them" for some other special function.

I do not understand well what you mean by "the ability for a transition pattern to match agents that are not directly affected by a transition". To my sense, all agents are directly affected by the transition in the examples from the documentation page. Also, I do not get the expected results when computing the bond lifespan conditioned to K being phosphorylated. The first correct query gives me a [Fatal error] Ambiguity detected. The two following correct queries give me the same result as the wrong query. Anyway, maybe I will understand this last part better after using the TQL with more advanced queries.

Best,

Sébastien

Jonathan Laurent

Feb 25, 2019, 10:08:53 AM

to kappa-users

> I do not understand well what you mean by "the ability for a transition pattern to match agents that are not directly affected by a transition". To my sense, all agents are directly affected by the transition in the examples from the documentation page.

I realized that you're right. There is no good example of this in the documentation. Here is an example. Suppose you have a rule

'p' S(x{u/p})

and you want to match transitions where rule "p" applies to a substrate that is bound to an agent of type "K" through its site "d". Then, you would write the following transition pattern:

{ S(x{u/p},d[1]), K(d[1]) }

As you can see, this pattern matches a K agent that does not appear in rule "p".
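
The check that this pattern performs on the mixture before the transition can be sketched as follows. The mixture model (agent ids mapped to a type and a table of bonds) and the function name are assumptions made for the illustration, not the TQL's internal representation.

```python
from typing import Dict, Optional

# Hypothetical mixture model:
# agent id -> {"type": str, "bonds": {site: (partner_id, partner_site)}}
Mixture = Dict[int, dict]

def substrate_bound_to_kinase(mixture: Mixture, s_id: int) -> Optional[int]:
    """Return the id of the K bound to S's site d through K's site d, if any."""
    bond = mixture[s_id]["bonds"].get("d")
    if bond is None:
        return None
    partner_id, partner_site = bond
    if mixture[partner_id]["type"] == "K" and partner_site == "d":
        return partner_id
    return None

# Made-up mixture just before an instance of rule 'p'.
before = {
    0: {"type": "S", "bonds": {"d": (1, "d")}},
    1: {"type": "K", "bonds": {"d": (0, "d")}},
    2: {"type": "S", "bonds": {}},
}

print(substrate_bound_to_kinase(before, 0))  # the K is found although 'p' never mentions it
print(substrate_bound_to_kinase(before, 2))  # free substrate: the pattern does not match
```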

> Also, I do not get the expected results when computing the bond lifespan conditioned to K being phosphorylated. The first correct query gives me a [Fatal error] Ambiguity detected. The two following correct queries give me the same result as the wrong query.

Could you send me the faulty queries? The TQL evolved many times and the documentation may be out of date!

> About the "involvement" of an agent K in a transition, my first idea was effectively that K was explicitly featured in rule r that e is an instance of. If it can eventually be implemented in the "when" clause I think it would be great.

I will implement this. :-)

Best,

Jonathan

Jonathan Laurent

Feb 25, 2019, 10:12:06 AM

to kappa-users

After some checking, I confirm that the documentation is out of date on many things.

I'll fix this as soon as possible.

Sébastien Légaré

Feb 25, 2019, 11:47:26 AM

to kappa-users

OK, thanks for the example. I see that it could cause trouble. In that case, the transition { S(x{u/p},d[1]), K(d[1]) } would not count as "involving K" according to my initial understanding. This may or may not be what the user would want.

The query that was giving me the [Fatal error] Ambiguity detected was

match b:{ S(d[/1]), k:K(d[/1]) }

and first u:{ k:K(d[/.]) } after b

and u:{ k:K(x{p}) }

return (time[u] - time[b])

One thing I find strange is that "u" is defined twice. Now that you say that some of the documentation is out of date, I guess it will be fixed along with the update.
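
For reference, the computation this query appears to be aiming at can be sketched as below. This is one reading of the intent (time from a binding to the first unbinding of the same K, kept only if that K is phosphorylated when it unbinds), not a statement of the TQL semantics; the event format and the `kind` labels are invented for the example.

```python
from typing import Dict, Iterable, List, Set

def bond_lifespans(trace: Iterable[dict]) -> List[float]:
    """Lifespans of S-K bonds whose K is phosphorylated at unbinding time."""
    bound_since: Dict[int, float] = {}  # K id -> time of the binding event b
    phosphorylated: Set[int] = set()    # K ids whose site x is in state p
    lifespans: List[float] = []
    for event in trace:
        k = event["k"]
        if event["kind"] == "bind":
            bound_since[k] = event["time"]
        elif event["kind"] == "phos":
            phosphorylated.add(k)
        elif event["kind"] == "unbind" and k in bound_since:
            start = bound_since.pop(k)  # first unbinding after b for this K
            if k in phosphorylated:
                lifespans.append(event["time"] - start)
    return lifespans

# Made-up trace: K 1 is phosphorylated before unbinding, K 2 is not.
toy_trace = [
    {"time": 1.0, "kind": "bind", "k": 1},
    {"time": 1.5, "kind": "bind", "k": 2},
    {"time": 2.0, "kind": "phos", "k": 1},
    {"time": 2.5, "kind": "unbind", "k": 2},
    {"time": 4.0, "kind": "unbind", "k": 1},
]

print(bond_lifespans(toy_trace))
```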