What's wrong with this (really) simple grammar?


JCLL

Jul 10, 2014, 9:09:02 AM
to treet...@googlegroups.com
Hi again,

Treetop behaves strangely on this (really) simple grammar:

grammar Shyc

rule functionDef
  type space identifier '('  ')' bloc
end

rule type
  'int'
end

rule bloc
  '{'  '}'
end

rule identifier
  [a-zA-Z]  [a-zA-Z_]*
end
  
rule space
  [\s]+
end

end

It returns an error when parsing "int main(){}":

error at line 1, column 9
failure reason : Expected [a-zA-Z_] at line 1, column 9 (byte 9) after
compiler.rb:25:in `parse': Parse error (RuntimeError)
    from compiler.rb:73:in `<main>'

Any ideas?
JCLL

Paul Madden

Jul 10, 2014, 9:45:32 AM
to treet...@googlegroups.com
It seems to work for me:

irb(main):006:0> ShycParser.new.parse('int main(){}')
=> SyntaxNode+FunctionDef0 offset=0, "int main(){}" (type,space,identifier,bloc):
  SyntaxNode offset=0, "int"
  SyntaxNode offset=3, " ":
    SyntaxNode offset=3, " "
  SyntaxNode+Identifier0 offset=4, "main":
    SyntaxNode offset=4, "m"
    SyntaxNode offset=5, "ain":
      SyntaxNode offset=5, "a"
      SyntaxNode offset=6, "i"
      SyntaxNode offset=7, "n"
  SyntaxNode offset=8, "("
  SyntaxNode offset=9, ")"
  SyntaxNode+Bloc0 offset=10, "{}":
    SyntaxNode offset=10, "{"
    SyntaxNode offset=11, "}"

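# cat g.tt
grammar Shyc

rule functionDef
  type space identifier '('  ')' bloc
end

rule type
  'int'
end

rule bloc
  '{'  '}'
end

rule identifier
  [a-zA-Z]  [a-zA-Z_]*
end

rule space
  [\s]+
end

end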

Jean-Christophe Le Lann

Jul 10, 2014, 9:49:20 AM
to treet...@googlegroups.com
I am using tt v1.5.3 and ruby 2.1.1. Which versions are you using?

Could you send me the generated shyc.rb and your compiler driver?


--
You received this message because you are subscribed to the Google Groups "Treetop Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to treetop-dev...@googlegroups.com.
To post to this group, send email to treet...@googlegroups.com.
Visit this group at http://groups.google.com/group/treetop-dev.
For more options, visit https://groups.google.com/d/optout.

Jean-Christophe Le Lann

Jul 10, 2014, 10:39:10 AM
to Paul Madden, treet...@googlegroups.com
Paul,

Forget my question! I didn't specify how to consume \n...

(however the error message is strange...)

It now works with this:

grammar Shyc

rule functionDef
  type space identifier '('  ')' bloc space?
end

rule type
  'int'
end

rule bloc
  '{'  '}'
end

rule identifier
  [a-zA-Z] [a-zA-Z_]*
end
  
rule space
  [\s\n]+
end

end
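
A side note on the fix above: Treetop hands character classes to Ruby's regex engine, where \s already matches \n, so [\s\n]+ accepts the same input as [\s]+. The change that most likely made the parse succeed is the trailing space? in functionDef, which consumes the newline at the end of the source file. The sketch below uses plain Ruby regexes as a stand-in for the two grammars (it does not call Treetop). The constant and method names are made up for illustration, and it assumes the input was read from a file that ends in a newline.

```ruby
# Regex stand-ins for the two grammars in this thread (NOT Treetop itself).
# STRICT mirrors the first grammar: nothing may follow the closing brace.
STRICT  = /\Aint[\s]+[a-zA-Z][a-zA-Z_]*\(\)\{\}\z/
# LENIENT mirrors the second grammar: optional whitespace after the block.
LENIENT = /\Aint[\s]+[a-zA-Z][a-zA-Z_]*\(\)\{\}[\s]*\z/

# In a Ruby regex, \s inside a character class already covers "\n".
def whitespace_class_matches_newline?
  "\n".match?(/[\s]/)
end

# True when the whole source string is accepted by the given pattern.
def accepts?(pattern, source)
  pattern.match?(source)
end

# Assumption: this is what File.read returns for a one-line source file.
source_from_file = "int main(){}\n"

puts whitespace_class_matches_newline?      # \s alone handles newlines
puts accepts?(STRICT, source_from_file)     # trailing newline is left over
puts accepts?(LENIENT, source_from_file)    # trailing whitespace is consumed
```

This would also explain why the irb test with 'int main(){}' worked: that string literal has no trailing newline. Calling chomp on the file contents before parsing would be another way around it.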


2014-07-10 16:23 GMT+02:00 Jean-Christophe Le Lann <jc.l...@gmail.com>:
Thanks Paul,

I suspect my compiler.rb is wrong. Could you try it?

Thx
JCLL




2014-07-10 15:54 GMT+02:00 Paul Madden <mad...@colorado.edu>:

# gem list treetop

*** LOCAL GEMS ***

treetop (1.5.3)

# ruby --version
ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]

Parser enclosed.

paul



markus

Jul 10, 2014, 1:00:03 PM
to treet...@googlegroups.com
P --

> (however the error message is strange...)

Yep. Getting parsers of any ilk to emit good error messages is darned
close to a black art, and PEGs are no exception. The core problem is
that, for any unparsable string S, there are mind-bogglingly many ways it
could have parsed if only it had been different in some way. Selecting
which subset of these corresponds to the intended parse is a necessary
precursor to producing a good error message. It is also far beyond the
scope of a parser generator, and perhaps even brushes up against AI.

As a more tractable substitute people have tried:

* Just reporting the error as a generic "syntax error" at the
  furthest point that was successfully parsed (would have helped
  in your case)
* Adding special rules to the grammar that only match on an error,
  and then (when the total parse fails) reporting the last of
  these
* Like above, but having the error rules always report and then
  skip ahead to some keyword that lets the parser recover and
  continue
* Building a suite of expected errors (this could be as simple as
  annotating your test cases) and mapping from the failure mode to
  a more human friendly message.
  http://people.via.ecp.fr/~stilgar/doc/compilo/parser/Generating%20LR%20Syntax%20Error%20Messages.pdf
* Half a dozen other techniques, which (from a quick Google) all
  sound even more complicated to implement.

Failing all that, you can do what most people do and just grumble about
how bad the error messages are, secure in the knowledge that your
sentiment is widely shared.

-- M
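
For anyone who wants to try the first technique in the list (a generic syntax error at the furthest point reached), here is a minimal sketch. It is a hand-written matcher for the grammar in this thread, built on Ruby's StringScanner, and is not how Treetop is implemented. The class name FurthestFailureParser and its methods are hypothetical. This grammar has no alternatives, so the bookkeeping is trivial; in a grammar with choices, every alternative that fails at the same furthest position would add its label to the list.

```ruby
require 'strscan'

# Illustrative matcher that remembers the furthest position at which
# anything failed to match, and what was expected there.
class FurthestFailureParser
  attr_reader :furthest, :expected

  # Returns true when the whole input is a valid functionDef.
  def parse(input)
    @scanner  = StringScanner.new(input)
    @furthest = 0
    @expected = []
    function_def && end_of_input
  end

  # A generic message built from the furthest failure seen so far.
  def error_message
    "syntax error at byte #{@furthest}, expected one of: #{@expected.uniq.join(', ')}"
  end

  private

  # functionDef <- type space identifier '(' ')' bloc
  def function_def
    terminal(/int/, "'int'") &&
      terminal(/\s+/, 'whitespace') &&
      terminal(/[a-zA-Z][a-zA-Z_]*/, 'identifier') &&
      terminal(/\(/, "'('") &&
      terminal(/\)/, "')'") &&
      terminal(/\{/, "'{'") &&
      terminal(/\}/, "'}'")
  end

  # Leftover input counts as a failure too, so it gets reported.
  def end_of_input
    return true if @scanner.eos?
    note_failure('end of input')
    false
  end

  # Try to match pattern at the current position; record a failure if not.
  def terminal(pattern, label)
    return true if @scanner.scan(pattern)
    note_failure(label)
    false
  end

  # Keep only the labels that belong to the furthest failing position.
  def note_failure(label)
    pos = @scanner.pos
    if pos > @furthest
      @furthest = pos
      @expected = [label]
    elsif pos == @furthest
      @expected << label
    end
  end
end

parser = FurthestFailureParser.new
puts parser.error_message unless parser.parse("int main(){}\n")
```

With the trailing newline from the original problem, this sketch points at the byte right after the closing brace and says it expected the end of input, which is closer to the real cause than a complaint about [a-zA-Z_].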



Jean-Christophe Le Lann

Jul 10, 2014, 5:57:59 PM
to treet...@googlegroups.com
Thanks, Markus, for this very interesting clarification!

