Received: by 10.50.202.5 with SMTP id ke5mr3808587igc.3.1338661379686; Sat, 02 Jun 2012 11:22:59 -0700 (PDT) X-BeenThere: lepl@googlegroups.com Received: by 10.231.52.41 with SMTP id f41ls3205068ibg.3.gmail; Sat, 02 Jun 2012 11:22:59 -0700 (PDT) Received: by 10.50.222.168 with SMTP id qn8mr3839744igc.3.1338661379206; Sat, 02 Jun 2012 11:22:59 -0700 (PDT) Received: by 10.50.222.168 with SMTP id qn8mr3839743igc.3.1338661379195; Sat, 02 Jun 2012 11:22:59 -0700 (PDT) Return-Path: Received: from smtp.webfaction.com (mail6.webfaction.com. [74.55.86.74]) by gmr-mx.google.com with ESMTP id bg10si677101igc.3.2012.06.02.11.22.59; Sat, 02 Jun 2012 11:22:59 -0700 (PDT) Received-SPF: neutral (google.com: 74.55.86.74 is neither permitted nor denied by best guess record for domain of and...@acooke.org) client-ip=74.55.86.74; Authentication-Results: gmr-mx.google.com; spf=neutral (google.com: 74.55.86.74 is neither permitted nor denied by best guess record for domain of and...@acooke.org) smtp.mail=and...@acooke.org Received: from acooke.org (190-21-135-25.baf.movistar.cl [190.21.135.25]) by smtp.webfaction.com (Postfix) with ESMTP id 07CE920F4626 for ; Sat, 2 Jun 2012 13:20:31 -0500 (CDT) Received: by acooke.org (Postfix, from userid 1000) id 5C9BB60D26; Sat, 2 Jun 2012 14:20:29 -0400 (CLT) Date: Sat, 2 Jun 2012 14:20:29 -0400 From: andrew cooke To: lepl@googlegroups.com Subject: Re: [LEPL] Detection of ambiguous grammar Message-ID: <20120602182029.GJ4944@acooke.org> References: <318cca21-8a98-4a66-b4c2-735126529aae@5g2000vbf.googlegroups.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <318cca21-8a98-4a66-b4c2-735126529aae@5g2000vbf.googlegroups.com> User-Agent: Mutt/1.5.21 (2010-09-15) Hi, [Sorry seems this was waiting a few days in the Google groups spam filter] Lepl is a recursive descent parser (like pyparsing, I think) and, as such, has a well defined ordering when parsing grammar. So it will resolve ambiguities according to the order implicit in the way the matchers are defined. The only errors it will flag are left recursion and, by default, the failure to match the entire input. You ask how Lepl can be made to reject the grammar. It cannot. However, you can, with care, define the grammar so that the ordering resolves ambiguities in the way you feel most appropriate. For example, Python 2.7.2 (default, Aug 19 2011, 20:41:43) [GCC] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from lepl import * >>> def build(): ... stmt = Delayed() ... with DroppedSpace(): ... expr = Literal('EXPR') ... end = Literal ('endif') ... cond = stmt | end ... stmt += ('if' & expr & 'then' & cond & Optional('else' & cond) > list) ... line = Trace(stmt) & Eos() ... return line.get_parse() ... >>> parser = build() >>> print parser('if EXPR then if EXPR then endif else endif') [['if', 'EXPR', 'then', ['if', 'EXPR', 'then', 'endif', 'else', 'endif']]] where the nesting shows how the ambiguity was resolved. Andrew On Wed, May 30, 2012 at 05:49:25AM -0700, maxime-esa wrote: > Dear LEPL experts, > > In trying to find a good parser for Python to implement a rather > complex grammar. I already know that Pyparsing will dot detect the > ambiguities of my grammar, and so far only ANTLR does. But as I am > trying to avoid having to compile the grammar before I can use it, I > am now looking at LEPL, which seems quite powerful. > > However, on my first tests, I am facing the same issue as with > Pyparsing - I hope someone can tell me if I am doing things wrong or > if LEPL will ever be able to detect my grammar error. > > Here is what I wrote: > > from def build(): > stmt = Delayed() > with DroppedSpace(): > expr = Literal('EXPR') > end = Literal ('endif') > cond = stmt | end > stmt += 'if' & expr & 'then' & cond & Optional('else' & cond) > line = Trace(stmt) & Eos() > > return line.get_parse() > > parser = build() > print parser('if EXPR then if EXPR then endif else endif') > ['if', 'EXPR', 'then', 'if', 'EXPR', 'then', 'endif', 'else', 'endif'] > > Basically, this result is wrong. According to the grammar, the "else" > can belong either to the first "if" statement or to the second one. I > would expect the parser to detect this ambiguity and reject the > grammar. > > The following is the equivalent grammar I wrote in ANTLR: > > cond : stmt | END; > > stmt : 'if' EXPR 'then' cond ('else' cond)* ; > > EXPR : 'EXPR'; > END : 'endif'; > > And when I compile it I get the following error: > > warning(200): test.g:18:42: Decision can match input such as "'else'" > using multiple alternatives: 1, 2 > As a result, alternative(s) 2 were disabled for that input > error(201): test.g:18:42: The following alternatives can never be > matched: 2 > > > I hope this is clear and makes sense - so my question: is my grammar > badly specified? How would I need to write it so that LEPL would raise > an error ? > > Thanks > > -- > You received this message because you are subscribed to the Google Groups "lepl" group. > To post to this group, send email to lepl@googlegroups.com. > To unsubscribe from this group, send email to lepl+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/lepl?hl=en. >