Left and right recursion with subtraction

Andrew Stromme

unread,

May 31, 2011, 3:53:58 PM5/31/11

to le...@googlegroups.com

Hi there,

A part of my parser deals with expressions matching. I'm handling operator precedence as shown on http://www.acooke.org/lepl/intro-4.html

add = group3 & ~symbol('+') & group4 > Add._make

sub = group3 & ~symbol('-') & group4 > Sub._make

group4 += add | sub | group3

Lets assume that I am trying to parse 4 - 3 - 2 to get -1. The above grammar matches expressions incorrectly because ((4 - 3) - 2) is different from (4 - (3 - 2)). The parser generates an AST that corresponds to the 2nd option, which is wrong. I could rewrite the grammar as a left-recursive grammar, but that isn't good because it's best not to memoize. Any ideas?

Thanks,

Andrew

astromme

unread,

May 31, 2011, 5:23:36 PM5/31/11

to lepl

After consulting the dragon book and some online resources I've come
up with

group4end = Delayed()
add = ~symbol('+') & group3 & group4end > List
sub = ~symbol('-') & group3 & group4end > List
group4end += Optional(add | sub)
group4 += group3 & group4end > List

which should associate things as expected, I think? However, this
makes it a lot harder to generate nodes on the fly. I still would like
to have nodes that look sort of like (subtract (subtract 4 3) 2).

Andrew

On May 31, 3:53 pm, Andrew Stromme <andrew.stro...@gmail.com> wrote:
> Hi there,
>
> A part of my parser deals with expressions matching. I'm handling operator

> precedence as shown onhttp://www.acooke.org/lepl/intro-4.html

andrew cooke

unread,

May 31, 2011, 7:13:44 PM5/31/11

to le...@googlegroups.com

Hi,

I am trying to look at this, but am having some "issues" with running any
Python code at all, so it may be some time before I can get back.

FWIW it seems like you are right and that exmple is wrong. What I would like
to do is extend the approach at
http://code.google.com/p/lepl/source/browse/src/lepl/_example/expression.py
which uses repeptition rather than recursion. I thnk that could give a clean
solution, with some work (eg using "sum(...)")

Sorry I can't be more help right now - will try and be more useful in the new
day or two.

Andrew

> --
> You received this message because you are subscribed to the Google Groups "lepl" group.
> To post to this group, send email to le...@googlegroups.com.
> To unsubscribe from this group, send email to lepl+uns...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/lepl?hl=en.
>

Andrew Stromme

unread,

Jun 21, 2011, 9:54:09 PM6/21/11

to le...@googlegroups.com

Hi Andrew,

I'm just checking in with regards to the repetition vs recursion and left recursion/right recursion question I had a while back. The parsing part of my project was on the back burner but I'm picking it up again and remembered that I had this issue to look at. I had a look at the expression.py example but couldn't get it to work myself with Tokens.

Thanks,

Andrew Stromme

andrew cooke

unread,

Jun 21, 2011, 10:36:08 PM6/21/11

to le...@googlegroups.com

curiously, i remembered you just this morning because i stumbled across the
test case i wrote when you last asked. i haven't done any more, but i'll have
a look at using repetition (like the example) in the next few days.

cheers,
andrew

andrew cooke

unread,

Jun 22, 2011, 1:56:29 PM6/22/11

to le...@googlegroups.com

hi,

just muddled through to find this during lunch. posting now in case you are
waiting, will explain tonight. hopefully it's fairly self-explanatory. the
idea is to have sums and products, where sums can contain positive and
negative values, and products can contain values and their inverses. i
haven't implemented the node operations to evaluate, but they have obvious
(and reasonably efficient) solutions (i hope!):

def test_repeat3(self):
'''
Fix the problem above by folding - and / into the value itself.
'''
basicConfig(level=DEBUG)

class Op(List): pass
class Sum(Op): pass
class Neg(Op): pass
class Prd(Op): pass
class Inv(Op): pass

with TraceVariables():

value = Token(UnsignedReal())
symbol = Token('[^0-9a-zA-Z \t\r\n]')
number = Optional(symbol('-')) + value >> float

sum = Delayed()
parens = ~symbol('(') & sum & ~symbol(')')
val = parens | number

inv = ~symbol('/') & val > Inv
eve = ~symbol('*') & val
prd = val & (inv | eve)[0:] > Prd

neg = ~symbol('-') & prd > Neg
pos = ~symbol('+') & prd
sum += prd & (neg | pos)[0:] > Sum

line = sum & Eos()

self.examples([
(lambda: line.parse('4-3-2')[0],
"""Sum
+- Prd
| `- 4.0
+- Neg
| `- 3.0
`- Neg
`- 2.0"""),
(lambda: line.parse('1+2-3-4+5+6+7-8-9-10-11')[0],
"""Sum
+- Prd
| `- 1.0
+- 2.0
+- Neg
| `- 3.0
+- Neg
| `- 4.0
+- 5.0
+- 6.0
+- 7.0
+- Neg
| `- 8.0
+- Neg
| `- 9.0
+- Neg
| `- 10.0
`- Neg
`- 11.0"""),
(lambda: line.parse('1+2*(3-4)+5/6+7')[0],
"""Sum
+- Prd
| `- 1.0
+- Prd
| +- 2.0
| `- Sum
| +- Prd
| | `- 3.0
| `- Neg
| `- Prd
| `- 4.0
+- Prd
| +- 5.0
| `- Inv
| `- 6.0
`- Prd
`- 7.0""")])

On Tue, Jun 21, 2011 at 09:54:09PM -0400, Andrew Stromme wrote:

andrew cooke

unread,

Jun 22, 2011, 6:54:53 PM6/22/11

to le...@googlegroups.com

the code is now at
http://code.google.com/p/lepl/source/browse/src/lepl/_example/expression2.py?spec=svn455ba3ac227d91a0a3ba6a5963118102bc55f20e&r=455ba3ac227d91a0a3ba6a5963118102bc55f20e#203
(and it includes evaluation)

i don't think (but haven't proved) that the natural pair grouping you want is
possible (my argument is that the form implies left-recursion, which
contradicts what you are trying to do). so i had two alternatives: either
rewrite the result tree to re-arrange nodes, or make nested pairs "flat".

since i already had code that parsed to give "flat" trees, that's what i went
with. it took a few attempts (which you can see in the link above) before i
understood what to do - to easily make the flat nodes you need to create a
list in a single "line" of the parser, which means matching just one "kind of
thing". that led to the idea of summing numbers that might be negative, and
the same for products.

hope that all makes sense. i'll update the manual for the next release (which
i am working on, but which is still a long way off).

cheers,
andrew

Andrew Stromme

unread,

Jun 23, 2011, 2:30:46 PM6/23/11

to le...@googlegroups.com

Thanks, that makes sense and my implementation of it is working.

What should I do about other operations with precedence, such as < and >? They don't have the same easy 'inversion' properties that +,- and *,/ have.

I could get a tree structure but without the top level Compare node knowing about how to evaluate its children on a case by cases I can't see how it would work.

Andrew

andrew cooke

unread,

Jun 23, 2011, 2:46:37 PM6/23/11

to le...@googlegroups.com

can you give me an exact example with > and < that shows the problem? i am
not sure i follow. thanks, andrew

andrew cooke

unread,

Jun 23, 2011, 2:53:52 PM6/23/11

to le...@googlegroups.com

do you mean something like:

1 < x < 2

which might in some crazy example be

1 < x > 3

if so, then those aren't binary operators. there some freaky mess. and what
i would do is slurp them up as a sequence of symbols and then try and make
sense of them elsewhere.

so something like:

expression[2:,Or(">","<")] > FreakyMess

where expression matches some sub-expression like "x" or "1".

andrew

Andrew Stromme

unread,

Jun 23, 2011, 3:17:09 PM6/23/11

to le...@googlegroups.com

What about something like

1 < 2 > 3?

i.e.

((1 < 2) > 3).

In c this would result in the value 0 (or false if you stuffed it into a bool). But the way my code currently works we get

(1 < (2 > 3))

which results in the value 1

andrew cooke

unread,

Jun 23, 2011, 3:18:47 PM6/23/11

to le...@googlegroups.com

what does 1<2 return? true? so what is true > 3?

andrew

Andrew Stromme

unread,

Jun 23, 2011, 4:17:49 PM6/23/11

to le...@googlegroups.com

hmm, my example is inconsistent, oops. I tried some things out in the python interpreter, just to see how python interpreted them.

>>> 1 > 1 < 3

False

>>> 1 > (1 < 3)

False

>>> (1 > 1) < 3

True

That looks to me like python is grouping from the right towards the left, I had assumed the other way. I will think about my problem space some more, I think my original issue might not exist. Thanks for your help.

Andrew

andrew cooke

unread,

Jun 23, 2011, 5:03:12 PM6/23/11

to le...@googlegroups.com

i think you're confused about how this works.

> and < are not (always) binary operators.

1 < 2 < 3 is true because AS A WHOLE the sequence is true.

1 < (2 < 3) is false because it is equivalent to 1 < True and in such
comparisons True is taken as 1.

so the two above are not equivalent. because 1 < 2 < 3 is not formed from
binary operations. it's a freaky mess. you cannot decompose 1 < 2 < 3 into
two binary operators.

andrew

Andrew Stromme

unread,

Jun 23, 2011, 9:01:54 PM6/23/11

to le...@googlegroups.com

I very well may be very confused. Humor me one more time, here is a c++ program

#include <iostream>

using namespace std;

int main() {

cout << (1 < 2 < 2) << endl; // outputs 1

cout << ((1 < 2) < 2) << endl; // outputs 1

cout << (1 < (2 < 2)) << endl; // outputs 0

}

And the python version:

>>> (1 < 2 < 2)

False

>>> ((1 < 2) < 2)

True

>>> (1 < (2 < 2))

False

>>>

I agree that it's a freakish mess. I agree that I probably don't understand it quite correctly. But I also believe that the C/C++ style of handling this is different from the python version.

Andrew

andrew cooke

unread,

Jun 23, 2011, 9:49:04 PM6/23/11

to le...@googlegroups.com

for python the three different meanings are:

1 < 2 < 2 is 2 between 1 and 2 (not inclusive)? no!

(1 < 2) < 2 is True less than 2? True is 1. so yes!

1 < (2 < 2) is 1 less than False? False is 0. so no!

see how the first is NOT the same as either of the other two? i imagine that
internally the first is reduced to something like:

(1 < 2) and (2 < 2)

while the other two examples are parsed more or less like you'd expect.

for c++ i really have no idea, but i would guess it's more consistent /
logical - probably left to right with 0 and 1 as truth values.

1 < 2 < 2 == 1 < 2 == 1
(1 < 2) < 2 == 1 < 2 == 1
(1 < (2 < 2)) == 1 < 0 == 0

so c++ works like you were thinking python worked.

if you're asking me how i'd parse c++ so that the grouping worked i think i'd
probably do the following:

term[2:,Or(">", "<")] > secondary_function

where secondary_function is a hand-written function that does what you want.
i don't see a good way at the moment to do it purely in lepl (which is
annoying, but part of the reason lepl is in python is so that you can do
exactly that - you can use python itself to extend it however you want).

andrew

Andrew Stromme

unread,

Jun 23, 2011, 11:44:09 PM6/23/11

to le...@googlegroups.com

Absolutely, thanks for the in-depth rundown. I get to decide how I want my system to work, and after looking at both python and c++ and after scratching my head a bit :) I think that I like the python version better because it's closer to reality.

Andrew

Reply all

Reply to author

Forward