Can I resolve this left-recursive grammar like this (requesting formal linguist help)

65 views
Skip to first unread message

Robert Baruch

unread,
Nov 25, 2016, 12:37:13 AM11/25/16
to antlr-discussion
This is a problem I've run into with a grammar that I'm feeding to ANTLR4. It's GREAT that ANTLR now handles explicit left-recursion. The problem with my grammar is that I have two rules which have both explicit left-recursion AND mutual left-recursion.

Here's the actual grammar:

constant_primary : 
constant_primary2
| primary '.' method_call_body
| constant_primary '\'' '(' constant_expr ')'
;

primary : 
primary2
| primary '.' method_call_body
| constant_primary '\'' '(' expr ')'
;

Basically I've moved everything not problematic into the constant_primary2 and primary2 rules.

Now, I'm no grammar expert, not like Ter is. So here's a solution that I think could resolve this, but I'd like to hear from someone who knows formal language theory whether I'm totally nuts or if this is the right way to go.

First thing is, I abstract out the grammar:

A <- A' | BX | AY ; (A = constant_primary, B = primary)
B <- B' | BX | AZ ;

Expand B in rule A once (we know ANTLR can handle the A left-recursion):

A <- A' | B'X | BXX | AZX | AY ;

Expand B again:

A <- A' | B'X | B'XX | BXXX | AZXX | AZX | AY ;

And so on. It seems that, taken to infinity, the limit of this rule is:

A <- A' | B'X+ | AZX+ | AY ;

Problem solved, no mutual left-recursion, at least for A alone.

Similarly with B, expanding A:

B <- B' | BX | A'Z | BXZ | AYZ ;
B <- B' | BX | A'Z | BXZ | A'YZ | BXYZ | AYYZ ;
B <- B' | BX | A'Z | BXZ | A'YZ | BXYZ | A'YYZ | BXYYZ | AYYYZ ;

Taken to infinity:

B <- B' | BXY*Z | A'Y*Z ;

Does this seem correct? If so, I can resolve the mutual left-recursion in my original grammar like this:

constant_primary : 
constant_primary2
| primary2 ('.' method_call_body)+
| constant_primary '\'' '(' expr')'  ('.' method_call_body)+
| constant_primary '\'' '(' constant_expr ')'
;

primary : 
primary2
| primary '.' method_call_body ('\'' '(' constant_expr ')')* '\'' '(' expr ')'
| constant_primary2 ('\'' '(' constant_expr ')')* '\'' '(' expr ')'
;



Sam Harwell

unread,
Nov 25, 2016, 12:41:39 AM11/25/16
to antlr-di...@googlegroups.com

Hi Robert,

 

Have you tried the release of ANTLR I mentioned in your StackOverflow question? It should handle the indirect left recursion without requiring you to rewrite rules. The feature hasn’t been widely used so I’m a bit curious if it works for your case. :)

 

Thanks,

Sam

--
You received this message because you are subscribed to the Google Groups "antlr-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to antlr-discussi...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Robert Baruch

unread,
Nov 25, 2016, 12:59:57 AM11/25/16
to antlr-discussion
Well, the problem is that I'm planning to release the grammar, and I'd much rather point my users to a standard build... no offense intended.

Sam Harwell

unread,
Nov 25, 2016, 1:11:06 AM11/25/16
to antlr-di...@googlegroups.com

No offense taken. Do keep in mind that several high-profile projects do depend on this work directly or indirectly for a variety of reasons. While it’s not the first option for new ANTLR users, it may turn out to be a practical solution to your situation if it means keeping the grammar in a form that more closely resembles your language specification.

 

I’m actually in the middle of creating some long-overdue documentation for the features and optimizations unique to this fork… hopefully I can have this in place by the end of the weekend as a good reference.

Robert Baruch

unread,
Nov 25, 2016, 1:37:06 AM11/25/16
to antlr-discussion
Hmm... Is your fork up-to-date? That is, do you regularly merge the main branch in? If so, I would definitely take a look.

Mike Lischke

unread,
Nov 25, 2016, 4:05:05 AM11/25/16
to antlr-di...@googlegroups.com
I’m actually in the middle of creating some long-overdue documentation for the features and optimizations unique to this fork… hopefully I can have this in place by the end of the weekend as a good reference.

Sam, why not merge this back into the main repo? Have all the optimizations to stay in a fork for some reasons?


Sam Harwell

unread,
Nov 25, 2016, 1:12:26 PM11/25/16
to antlr-di...@googlegroups.com

Answering this question is a major part of the new documentation:

https://github.com/sharwell/antlr4/pull/26

 

Sam

 

From: 'Mike Lischke' via antlr-discussion [mailto:antlr-di...@googlegroups.com]
Sent: Friday, November 25, 2016 3:05 AM
To: antlr-di...@googlegroups.com
Subject: Re: [antlr-discussion] Can I resolve this left-recursive grammar like this (requesting formal linguist help)

 

I’m actually in the middle of creating some long-overdue documentation for the features and optimizations unique to this fork… hopefully I can have this in place by the end of the weekend as a good reference.

 

--

Sam Harwell

unread,
Nov 25, 2016, 1:43:40 PM11/25/16
to antlr-di...@googlegroups.com

The fork stays up-to-date regarding bug fixes. The optimized fork has different long-term goals from the reference release, which affects what most users would consider “regular merges”. Most users of this fork are working on large-scale applications involving ANTLR, where breaking changes between releases cause substantial overhead or can prevent users from upgrading. I’m much more reluctant to make source- or binary-breaking changes in the API or implementation, so each new feature from the reference branch needs to be explicitly evaluated for its impact on existing users of my fork, and sometimes rewritten to provide the same functionality without breaking existing users.

 

In addition, my fork only includes a Java target, so all the work on other targets is stripped out during merges. I have separate repositories for the original (non-fork) C# and TypeScript targets, each of which is self-contained and built from the optimized Java release.

Robert Baruch

unread,
Nov 25, 2016, 5:41:39 PM11/25/16
to antlr-di...@googlegroups.com
Ah, that's a non-starter for me. My target is Go.

But I saw Ter just tweeted that your "new optimization" for "left-recursive rules" has been integrated into 4.6. Would that be this?

Sam Harwell

unread,
Nov 25, 2016, 6:58:42 PM11/25/16
to antlr-di...@googlegroups.com

No, the new optimization doesn’t change any features. It just makes one particular case that used to be very slow go much faster. :)

 

Sam

 

From: antlr-di...@googlegroups.com [mailto:antlr-di...@googlegroups.com] On Behalf Of Robert Baruch
Sent: Friday, November 25, 2016 4:42 PM
To: antlr-discussion <antlr-di...@googlegroups.com>
Subject: Re: [antlr-discussion] Can I resolve this left-recursive grammar like this (requesting formal linguist help)

 

Ah, that's a non-started for me. My target is Go.

Reply all
Reply to author
Forward
0 new messages