Help with Java's <<

Erez

Oct 4, 2012, 1:58:37 PM
to ply-...@googlegroups.com
Hi, I'm trying to parse Java and it's going ok so far, except for one problem that I just can't solve, regarding the > token.

Consider the following 3 lines:

    someType<anotherType>> a;
    if (b > c)
        d >> e;

If I define shift-right as:

    SHR: '>>'
    expression: var SHR var

then the lexer will match the '>>' that closes the template code as SHR, and the parser will break.

But if I define:

    GT: '>'
    expression: var GT GT var

then the lexer is fine, but the parser chokes on 'b > c', expecting another > to follow. 
Perhaps there is a way to get the parser to behave?
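
To make the lexer half of the problem concrete, here is a minimal sketch in plain Python. It is not PLY itself: `tokenize`, `with_shr`, and every token name other than GT and SHR are made up for illustration. It only shows how the same input is split depending on whether a '>>' token exists.

```python
import re

def tokenize(text, with_shr):
    """Tiny illustrative tokenizer; returns the list of token type names."""
    spec = [
        ("IDENTIFIER", r"[A-Za-z_]\w*"),
        ("LT", r"<"),
        ("GT", r">"),
        ("SEMI", r";"),
        ("WS", r"\s+"),
    ]
    if with_shr:
        # Alternatives are tried in order, so listing '>>' before '>'
        # mimics a lexer that prefers the longer operator.
        spec.insert(0, ("SHR", r">>"))
    pattern = re.compile("|".join("(?P<%s>%s)" % pair for pair in spec))
    names = []
    pos = 0
    while pos < len(text):
        m = pattern.match(text, pos)
        if m is None:
            raise ValueError("unexpected character %r" % text[pos])
        if m.lastgroup != "WS":  # whitespace is skipped, as in most lexers
            names.append(m.lastgroup)
        pos = m.end()
    return names

# With a '>>' token, the end of the template line is lexed as a shift:
print(tokenize("someType<anotherType>> a;", with_shr=True))
# ['IDENTIFIER', 'LT', 'IDENTIFIER', 'SHR', 'IDENTIFIER', 'SEMI']

# With only '>', both the template line and a real shift yield GT GT:
print(tokenize("someType<anotherType>> a;", with_shr=False))
# ['IDENTIFIER', 'LT', 'IDENTIFIER', 'GT', 'GT', 'IDENTIFIER', 'SEMI']
```

Either way the lexer cannot tell the two uses apart on its own, so the decision has to be made with extra context or in the grammar.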

Here is the relevant grammar:

    relationalexpression :   shiftexpression 
                                  |   shiftexpression GT relationalexpression

    shiftexpression : additiveexpression 
                           | additiveexpression GT GT shiftexpression

And here is the code that breaks it:

    a > b;

And here is the debug info:

    State  : 140
    Stack  : packagedeclaration _anon_0_star modifiers CLASS identifier IMPLEMENTS typelist LBRACE _anon_10_star modifiers VOID identifier formalparameters LBRACE additiveexpression . LexToken(GT,'>',73,2318)
    Action : Shift and goto state 254

    State  : 254
    Stack  : packagedeclaration _anon_0_star modifiers CLASS identifier IMPLEMENTS typelist LBRACE _anon_10_star modifiers VOID identifier formalparameters LBRACE additiveexpression GT . LexToken(IDENTIFIER,'b',73,2320)
    ERROR: Error  : packagedeclaration _anon_0_star modifiers CLASS identifier IMPLEMENTS typelist LBRACE _anon_10_star modifiers VOID identifier formalparameters LBRACE additiveexpression GT . LexToken(IDENTIFIER,'b',73,2320)    


Please help me, thank you!

David Beazley

Oct 5, 2012, 11:48:39 AM
to ply-...@googlegroups.com, David Beazley
I'm not sure I entirely understand the intended syntax of the first line, but dealing with trailing >> symbols in templates is a known tricky problem in other contexts such as C++. For a long time, it was required that you always insert a space, as in "someType<anotherType> > a;". For instance, see http://www.comeaucomputing.com/techtalk/templates/#shiftshift

I'm not entirely sure what to advise in terms of PLY. I think if >> is the right shift operator, you should probably continue to keep recognizing it as such. You might be able to hack a solution using lexer states or some other bit of code to change how lexing works if you're inside template arguments. I would have to experiment with an expression grammar to see how '>' and '>' '>' could be handled in a sane way. Are you getting reduce/reduce conflicts in this grammar?
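
One way the "some other bit of code" idea might look is sketched below. This is only a heuristic, written as a filter over an already-lexed token stream rather than with PLY lexer states. The function name, the (type, value) tuple layout, and all token names except GT and SHR are assumptions made for this example. The filter counts open '<' tokens and splits SHR into two GT tokens only when at least two angle brackets are open.

```python
# Token types assumed to be allowed between '<' and '>' in type arguments.
# A real Java lexer would need more (e.g. extends/super keywords, brackets, '&').
TYPE_ARG_TOKENS = {"IDENTIFIER", "LT", "GT", "SHR", "COMMA", "DOT", "QUESTION"}

def split_shr_in_type_args(tokens):
    """Rewrite SHR as GT GT when it appears to close two nested '<'."""
    depth = 0  # number of '<' currently believed to be open
    out = []
    for tok_type, value in tokens:
        if tok_type == "SHR" and depth >= 2:
            # Looks like the end of nested type arguments: emit two GTs.
            out.append(("GT", ">"))
            out.append(("GT", ">"))
            depth -= 2
            continue
        if tok_type == "LT":
            depth += 1
        elif tok_type == "GT" and depth > 0:
            depth -= 1
        elif tok_type == "SHR" or tok_type not in TYPE_ARG_TOKENS:
            # Anything that cannot be part of type arguments resets the count.
            depth = 0
        out.append((tok_type, value))
    return out

# Map<String, List<String>> m;
nested = [("IDENTIFIER", "Map"), ("LT", "<"), ("IDENTIFIER", "String"),
          ("COMMA", ","), ("IDENTIFIER", "List"), ("LT", "<"),
          ("IDENTIFIER", "String"), ("SHR", ">>"), ("IDENTIFIER", "m"),
          ("SEMI", ";")]
print([t for t, _ in split_shr_in_type_args(nested)])

# a < b >> c;  (a comparison against a shifted value: SHR is kept)
compare = [("IDENTIFIER", "a"), ("LT", "<"), ("IDENTIFIER", "b"),
           ("SHR", ">>"), ("IDENTIFIER", "c"), ("SEMI", ";")]
print([t for t, _ in split_shr_in_type_args(compare)])
```

The heuristic is easy to fool: an expression such as `a < b < c >> d` would be split even though no type arguments are involved, and an unbalanced fragment with a single '<' (like the first line of the original question) is left alone. It also does not handle '>>>'.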

Cheers,
Dave


Erez

Oct 10, 2012, 5:55:08 AM
to ply-...@googlegroups.com
Thanks, that's very useful!

I eventually solved this problem with the following grammar:

    relationalexpression : shiftexpression
        | shiftexpression relationalop relationalexpression
        | additiveexpression relationalop relationalexpression
        | molecule '<' relationalexpression
        ;

    relationalop : '<='
        | '>='
        | '<'
        | '>'
        ;

    shiftexpression : additiveexpression
        | additiveexpression shiftop shiftexpression
        ;

    shiftop : '<<'
        | '>' '>'
        | '>' '>' '>'
        ;
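
One caveat with spelling the shift operators as '>' '>' and '>' '>' '>': because the lexer discards whitespace, the grammar alone would presumably also accept something like `d > > e` as a shift. PLY tokens carry a `lexpos` character offset (the last number in the LexToken(...) debug output above), so a rule action could check that the '>' tokens are adjacent. Below is a minimal sketch of such a check in plain Python. The `Token` tuple and the `shiftop_from_gts` helper are hypothetical stand-ins, not part of PLY.

```python
from collections import namedtuple

# Stand-in for a lexer token; only lexpos matters for this check.
Token = namedtuple("Token", "type value lexpos")

def shiftop_from_gts(gt_tokens):
    """Return '>>' or '>>>' if the '>' tokens are contiguous, else raise."""
    for left, right in zip(gt_tokens, gt_tokens[1:]):
        if right.lexpos != left.lexpos + 1:
            raise SyntaxError(
                "'>' tokens at offsets %d and %d are not adjacent"
                % (left.lexpos, right.lexpos))
    return ">" * len(gt_tokens)

# "d >> e": the two '>' sit at offsets 2 and 3, so this is a real shift.
print(shiftop_from_gts([Token("GT", ">", 2), Token("GT", ">", 3)]))  # >>

# "d > > e": offsets 2 and 4 are separated by a space, so it is rejected.
try:
    shiftop_from_gts([Token("GT", ">", 2), Token("GT", ">", 4)])
except SyntaxError as exc:
    print("rejected:", exc)
```

The check belongs only in the shift operator rule; inside type arguments, `> >` with a space is legal and should stay accepted.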


But your project will be very helpful to me in the future.



On Tuesday, October 9, 2012 10:44:00 PM UTC+2, Werner Hahn wrote:
I also wrote a parser for Java, but I just copied the grammar that the Eclipse Java Development Tools (JDT) use. It turns out that this is actually a fairly complicated problem. Keep in mind that there is also the >>> operator, which complicates this further. On top of that, Java has both type arguments and type parameters, with different syntaxes. The grammar JDT uses creates new symbols in order to keep track of how many brackets have been opened.

I hope it's OK to link my ported solution here. The juicy bits are:

By the way, being fairly new to parsing, I found the Java grammar extremely complex. Kudos for doing well so far.
