Can ANTLR4 still verify that a grammar meets LL(1) constraints?

1,490 views
Skip to first unread message

trijezdci

unread,
Dec 9, 2012, 6:21:34 PM12/9/12
to antlr-di...@googlegroups.com
Since my main use of ANTLR is grammar editing and verification, I'd like to know whether it is still possible in ANTLR4 to verify that a grammar satisfies LL(k) constraints for any given k, or at least for k=1. In other words, does

options { k=1; backtrack=no }

still work in ANTLR4?

Sam Harwell

unread,
Dec 9, 2012, 6:50:49 PM12/9/12
to antlr-di...@googlegroups.com

The ANTLR 4 Tool only performs static lookahead evaluation for k=1. If a grammar is LL(1), then the resulting generated parser will not contain any calls to adaptivePredict. The Tool does not contain any code to check lookahead for any k>1.

 

Also note:

 

·         ANTLR 4 does not require or include support for backtracking. The portion of the DFA necessary for the actual input is computed on-demand by adaptivePredict (at runtime).

·         The static lookahead evaluation is completely disabled if the -Xforce-atn flag is passed to the Tool (forces the use of adaptivePredict for every decision).

 

--

Sam Harwell

Owner, Lead Developer

http://tunnelvisionlabs.com

--
 
 

Terence Parr

unread,
Dec 9, 2012, 10:04:43 PM12/9/12
to antlr-di...@googlegroups.com
ANTLR does not warn you statically about any LL(k) conditions. As Sam pointed out, it does however do static LL(1) analysis as an optimization.

I'm not sure what you mean by "main use of ANTLR is grammar editing and verification". Does that mean you don't use the generated parsers?

Ter

trijezdci

unread,
Dec 10, 2012, 3:24:32 AM12/10/12
to antlr-di...@googlegroups.com
On Monday, December 10, 2012 4:04:43 AM UTC+1, the_antlr_guy wrote:

ANTLR does not warn you statically about any LL(k) conditions. As Sam pointed out, it does however do static LL(1) analysis as an optimization.

For the avoidance of doubt then, if my goal is to establish for a given grammar whether or not the grammar is LL(1), then ANTLR4 won't be the tool for me to do that?

I'm not sure what you mean by "main use of ANTLR is grammar editing and verification". Does that mean you don't use the generated parsers?

I use ANTLR as a grammar development tool. This means I have two use cases:

(1) verify if a grammar meets LL(1) constraints and locate conflicts if it does not.
(2) use the generated parser as a preliminary syntax checker for sample input.

The compiler we work on is written by hand.

I understand that most people don't use ANTLR to build compilers and consequently ANTLR4 is designed for the more commonly observed uses. However, if you do build a compiler and you don't use a parser generator tool, then you are very likely to build a recursive descent parser. This task will be a lot more straightforward if your grammar is LL(1). Also, if you intend to publish your work, then you probably want to provide proof that your grammar is LL(1). During the design phase, you are likely to have a lot of syntax changes. While you don't need a proof until you have finalised your grammar, you will want to make sure it is LL(1) every time you made changes. Verifying this manually is a pretty laborious task. A tool that can do it for you is going to save you quite a bit of time over the course of your project.

This means, even if compiler developers do not use ANTLR to build their compilers, for their design they might still want to use ANTLR as a verification or prototyping tool to test their ideas. Consequently, if ANTLR4 will no longer cover the associated use cases, then it will become a self fulfilling prophecy that they don't use ANTLR.

But of course, we can just keep using ANTLR3 for as long as it doesn't break.

Sam Harwell

unread,
Dec 10, 2012, 9:13:40 AM12/10/12
to antlr-di...@googlegroups.com

One of the driving goals of ANTLR 4 is removing the need for this use case. With ANTLR 4, you can use grammars that written very much like a language specification, allowing compiler developers to spend time verifying that a compiler matches the language instead of verifying that the compiler matches a set of lookahead properties that have no bearing on correctness (and actually pose additional challenges for correctness by forcing grammars to not follow the design of the language specification).

 

--

Sam Harwell

Owner, Lead Developer

http://tunnelvisionlabs.com

 

--
 
 

trijezdci

unread,
Dec 10, 2012, 6:04:06 PM12/10/12
to antlr-di...@googlegroups.com
On Monday, December 10, 2012 3:13:40 PM UTC+1, Sam Harwell wrote:

One of the driving goals of ANTLR 4 is removing the need for this use case. With ANTLR 4, you can use grammars that written very much like a language specification, allowing compiler developers to spend time verifying that a compiler matches the language instead of verifying that the compiler matches a set of lookahead properties that have no bearing on correctness (and actually pose additional challenges for correctness by forcing grammars to not follow the design of the language specification).

Just because you are offering a car with an automatic gear box and some people will find that more convenient doesn't mean that everybody will want to drive a car with an automatic gear box and manual gear boxes will disappear. It doesn't work that way. You are merely offering one more choice, one amongst others.

Also, your logic is flawed when you say that ANTLR4 removes the need to verify LL(1) conditions for implementors of hand written RD parser based compilers. It only makes sense if you actually implement a compiler in ANTLR4.

If I write a compiler by hand then the ANTLR4 grammar would only be useful if my parser can handle that grammar. For it to handle such a grammar I would have to implement capabilities equivalent to those of ANTLR4. This means it would be more complex and more effort than a simple RD parser for an LL(1) grammar. If I do not want that additional complexity and effort, then I will still need an LL(1) grammar and that means I will still need to verify it against LL(1) constraints. The trade-off here is simple: invest a little more effort into the grammar and in return keep the implementation simpler.

For me, the simpler implementation justifies the use of an LL(1) grammar and thereby affirms the use case I described. Yet, leaving that aside, I can think of at least three more strong and valid reasons why one may want an LL(1) grammar or a tool that can verify LL(k) constraints:

(1) Suppose your work on a language or language revision that is going to be published by an academic publishing house and the nature of the work and audience is such that it is expected that you provide a proof your grammar is LL(1). Of course you can do the proof by hand when you have finalised the grammar, but if the end product is intended to be LL(1) then you may as well start out with LL(1) and that means you will want to verify every time you make a change.

(2) Suppose you are teaching CS students parsing techniques and formal language theory. Don't you think you would first introduce them to the basic grammars like LR and LL and let them get their heads around that first before moving on to more complex methods? Yet, for that you need to let them cut their teeth on respective example grammars, including LL(1). In fact, the capability to turn backtracking on and off and to set arbitrary values for k in ANTLR3 is a very good tool for students to learn the consequences of such changes in practise. In fact, over the last few years I have recommended ANTLR and ANTLRworks to quite a few people for that very reason.

(3) Suppose you are implementing a compiler for one of Wirth's languages or a derivative thereof. Or, suppose you are doing a revision of one of those languages. Wirth is a strong proponent of a design philosophy that says a language should be designed for LL(1), not only to keep the parser implementation reasonably simple, but also for the sake of simplicity of the language syntax. Now, you may disagree with that design philosophy, but you cannot simply dismiss it as nonsense, it does carry some weight. And even if you wouldn't choose that design philosophy for a language you designed for yourself, you would probably still want to follow it, or you might be told by your employer or sponsor to follow it when implementing a Wirthian language, if only because the users of those languages would most likely expect it.

None of these reasons are hypothetic. I know of at least one real-world instance for each scenario given.

As I mentioned at the top, a parser generator that can handle less restrictive grammars is just one more choice, one amongst others. Not everybody will have reason enough to embrace that choice.

But like I said in my earlier post, we can always continue to use ANTLR3. It's not the end of the world ;-)

Sam Harwell

unread,
Dec 10, 2012, 11:08:45 PM12/10/12
to antlr-di...@googlegroups.com

I wasn’t trying to say LL(k) analysis isn’t ever useful. I was merely pointing out that ANTLR 4 is designed to solve a very different set of problems than you’re working with, so even though there’s some overlap in the analysis abilities it may never be the tool you’re after for working with hand-written parsers. ANTLR 4 will probably be an excellent candidate for writing compilers, just not hand-written ones or ones where it’s more important for the compiler’s grammar to be LL(k) than to resemble the form given in the language specification.

 

--

Sam Harwell

Owner, Lead Developer

http://tunnelvisionlabs.com

 

From: antlr-di...@googlegroups.com [mailto:antlr-di...@googlegroups.com] On Behalf Of trijezdci
Sent: Monday, December 10, 2012 5:04 PM
To: antlr-di...@googlegroups.com
Subject: Re: [antlr-discussion] Can ANTLR4 still verify that a grammar meets LL(1) constraints?

 

On Monday, December 10, 2012 3:13:40 PM UTC+1, Sam Harwell wrote:

--
 
 

trijezdci

unread,
Dec 11, 2012, 2:46:44 AM12/11/12
to antlr-di...@googlegroups.com


On Tuesday, December 11, 2012 5:08:45 AM UTC+1, Sam Harwell wrote:

ANTLR 4 will probably be an excellent candidate for writing compilers, just not hand-written ones or ones where it’s more important for the compiler’s grammar to be LL(k) than to resemble the form given in the language specification.

If you believe that the simplicity of the parser is the only motivation, then it is understandable why you would see it that way. But with languages that are deliberately designed for LL(1) this is not usually the case. The motivation is both design and implementation. The syntax of such a language is usually leaner, cleaner and more intuitive. This is considered equally important to the simplicity of the implementation. Consequently, there is no such thing as it being "more important for the compiler's grammar to be LL(k) than to resemble the form given in the specification" because the form given in the specification is LL(1).

This may seem puzzling for somebody who has had to fight with grammars of languages not designed in this way and to convert them into grammars that are LL(1) solely for the purpose of implementation. More often than not, such grammars turn out unintuitive and confusing. But this is not a property of LL(1). It is a property of the design. If the language is designed from the ground up for LL(1), the resulting grammar will typically be intuitive and clear.

I hope this puts things into perspective.

ron...@gmail.com

unread,
Feb 4, 2013, 5:01:32 PM2/4/13
to antlr-di...@googlegroups.com
The online grammar checker at:

http://smlweb.cpsc.ucalgary.ca/

can check for (among many things) LL(1)-ness, I think.

Reply all
Reply to author
Forward
0 new messages