Ambiguous IDs in Kappa grammar

Sandro

May 4, 2012, 12:44:01 PM
to kappa-de...@googlegroups.com
Hi!

There is an issue with the Kappa grammar I encountered when writing an Emacs mode for Kappa. Here is what the KaSim manual says about IDs:

> Symbol Id can be any string generated by regular expression [a-zA-Z0-9][a-zA-Z0-9_-]*.

This is ambiguous when used in variable definitions as illustrated below:

    # These variable definitions illustrate the ambiguity between numerals
    # and IDs in the Kappa grammar.
    %var: '1A'      1A    # 1A   is an agent name
    %var: '1E'      1E    # 1E   is an agent name
    %var: '1E+1'    1E+1  # 1E+1 is a numeral
    %var: '200'     200   # 200  is a numeral (no agent signature)
    %var: '100'     100   # 100  is both a numeral and an agent name
    %var: '1E0'     1E0   # 1E0  is both a numeral and an agent name
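
A minimal sketch of the overlap, in Python. The ID pattern is the one quoted from the manual above; the numeral pattern `NUM_RE` is an assumption made for illustration (a common integer/float syntax), not taken from KaSim's lexer, and `classify` is a hypothetical helper. By the regular expressions alone, every all-digit string (200 included) is also a valid ID.

```python
import re

# ID pattern as quoted from the KaSim manual.
ID_RE = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9_-]*")

# ASSUMPTION: a typical numeral syntax (digits, optional fraction,
# optional exponent). This is not copied from the KaSim lexer.
NUM_RE = re.compile(r"[0-9]+(\.[0-9]*)?([eE][+-]?[0-9]+)?")


def classify(token):
    """Report whether a token matches the ID pattern, the numeral pattern, or both."""
    is_id = ID_RE.fullmatch(token) is not None
    is_num = NUM_RE.fullmatch(token) is not None
    if is_id and is_num:
        return "ambiguous"
    if is_id:
        return "id"
    if is_num:
        return "numeral"
    return "neither"


if __name__ == "__main__":
    for tok in ["1A", "1E", "1E+1", "200", "100", "1E0"]:
        print(tok, classify(tok))
```

Under these assumptions, 1A and 1E match only the ID pattern, 1E+1 matches only the numeral pattern, and 200, 100 and 1E0 match both.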

In practice, KaSim doesn't accept agent signatures with "numeric" agent names. E.g. the following code

    # Numeric IDs in agent signatures
    %agent: 1A(x)   # Declaration of agent 1A      => works.
    %agent: 1E(x)   # Declaration of agent 1E      => works.
    %agent: 100(x)  # Declaration of agent 100     => fails!
    %agent: 1E0(x)  # Declaration of agent 1E0     => fails!

will generate a parser error:

    Error (id_bug.ka) line 19, character 11: Malformed agent
    signature, I was expecting something of the form '%agent:
    A(x,y~u~v,z)'

Is there any specific reason for allowing IDs with an initial digit?

Cheers
/Sandro

Jean Krivine

May 4, 2012, 1:49:55 PM
to Sandro, kappa-de...@googlegroups.com
Yes, some proteins are named like that. I don't remember which, though. But they are the reason for allowing names to start with a digit.

Vincent Danos

May 4, 2012, 2:16:10 PM
to Jean Krivine, Sandro, kappa-de...@googlegroups.com
14-3-3

Bill Hlavacek

May 4, 2012, 2:46:39 PM
to Vincent Danos, Jean Krivine, Sandro, kappa-de...@googlegroups.com
The gene name for a 14-3-3 protein would not be # but rather plain text.
--Bill

Jean Krivine

May 4, 2012, 3:54:38 PM
to Bill Hlavacek, Vincent Danos, Sandro, kappa-de...@googlegroups.com
What do you mean Bill?

Bill Hlavacek

May 4, 2012, 4:13:32 PM
to Jean Krivine, Vincent Danos, Sandro, kappa-de...@googlegroups.com
Rather than name the agent for the protein 14-3-3alpha/beta, you could name it YWHAB, which is the gene name for this protein. The gene names are computer friendly.

Sandro Stucki

May 8, 2012, 8:35:51 AM
to kappa-de...@googlegroups.com
Sorry for re-posting. Forgot to include the mailing list in the reply address.

---------- Original message ----------
From: Sandro Stucki <sandro...@gmail.com>
Date: Mon, May 7, 2012 at 1:11 PM
Subject: Re: Ambiguous IDs in Kappa grammar
To: Bill Hlavacek <wshla...@gmail.com>


Are there a lot of people using this "feature"? It seems cleaner to me
to have an unambiguous grammar. The most common solution, I guess, is
to allow only alphabetical characters and "_" as the first character
in an identifier. Unless, of course, that would break a lot of
existing models. Then again,

    sed -e 's/[0-9][0-9]*-[0-9][0-9]*-[0-9][0-9]*/_&/g' oldfile > newfile

would probably do the trick.
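
The same substitution can be sketched in Python. Like the sed command above, this assumes that the only digit-initial names in use follow the 14-3-3 pattern; `prefix_numeric_names` is a hypothetical helper name.

```python
import re

# Same pattern as the sed command: three digit runs joined by hyphens.
PATTERN = re.compile(r"[0-9]+-[0-9]+-[0-9]+")


def prefix_numeric_names(text):
    """Prefix every match of the digit-hyphen pattern with an underscore."""
    return PATTERN.sub(lambda m: "_" + m.group(0), text)


if __name__ == "__main__":
    print(prefix_numeric_names("%agent: 14-3-3(x)"))
```

As with the sed version, the pattern is purely textual, so it would also rewrite an arithmetic expression such as 10-2-3 if one appears in a model file.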

What do you think?

/Sandro

Jean Krivine

May 8, 2012, 11:40:51 AM
to Sandro Stucki, kappa-de...@googlegroups.com
I am happy with it.

Donal Stewart

May 9, 2012, 4:11:52 AM
to kappa-de...@googlegroups.com
Happy with the existing grammar or happy with the suggested modification?

Meta-ambiguity ;)

Jean Krivine

May 9, 2012, 8:22:57 AM
to Donal Stewart, kappa-de...@googlegroups.com
Haha, no: happy with the proposed modification of the grammar.
