aaron-mac:lepl aaron$ more ptest.py
from lepl import *
namere = '[a-z]+'
symbol = Token('[^0-9a-zA-Z \t\r\n]')
name = Token(namere)
with DroppedSpace(spaces):
    #namelist = Delayed()
    namelist = name & symbol('+') & name
    print name.parse("hello")
    print namelist.parse("hello")
    print namelist.parse("hello+goodbye")
aaron-mac:lepl aaron$ python ptest.py
Traceback (most recent call last):
  File "ptest.py", line 8, in <module>
    with DroppedSpace(spaces):
NameError: name 'spaces' is not defined
aaron-mac:lepl aaron$
What am I doing wrong? Sorry, but it wasn't obvious to me from the
documentation.
btw: Eventually I would like to encode a significant fragment of
something like this: http://savage.net.au/SQL/sql-92.bnf
-- is that a reasonable goal?
Here is the actual error after fixing the NameError as you suggested.
And I've written an SQL parser before ;c). http://gadfly.sourceforge.net/
thanks, -- Aaron Watters
['hello']
Traceback (most recent call last):
  File "ptest.py", line 13, in <module>
    print namelist.parse("hello")
  File "build/bdist.macosx-10.6-universal/egg/lepl/core/config.py", line 858, in parse
  File "build/bdist.macosx-10.6-universal/egg/lepl/core/config.py", line 815, in get_parse
  File "build/bdist.macosx-10.6-universal/egg/lepl/core/config.py", line 723, in get_match
  File "build/bdist.macosx-10.6-universal/egg/lepl/core/config.py", line 675, in _raw_parser
  File "build/bdist.macosx-10.6-universal/egg/lepl/core/parser.py", line 220, in make_raw_parser
  File "build/bdist.macosx-10.6-universal/egg/lepl/lexer/rewriters.py", line 122, in __call__
  File "build/bdist.macosx-10.6-universal/egg/lepl/lexer/rewriters.py", line 76, in find_tokens
lepl.lexer.support.LexerError: The grammar contains a mix of Tokens
and non-Token matchers at the top level. If Tokens are used then
non-token matchers that consume input must only appear "inside"
Tokens. The non-Token matchers include: Any(' \t').
aaron-mac:lepl aaron$
So, the error message is not very good in this case - sorry - as it doesn't
make clear where the conflict is coming from. What is happening is that,
"behind the scenes", DroppedSpace is adding extra matchers (to handle the
spaces), and those are what trigger the error you are seeing.
The underlying issue is that you're mixing two different ways of handling
spaces. If you use tokens, then you should drop spaces in the tokenizer. If
you don't use tokens, then you use DroppedSpace. Using both at the same time
doesn't make much sense.
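To make that concrete, here is a stand-alone sketch of the "drop spaces in
the tokenizer" approach. This is not Lepl code - it uses only Python's re
module, with token patterns borrowed from your grammar - but it shows the
idea: whitespace is matched by the lexer and simply never handed on to the
parser.

```python
import re

# Token patterns, mirroring the grammar in the original post: names are
# runs of lowercase letters, symbols are single punctuation characters,
# and whitespace is matched here but discarded by the lexer.
TOKEN_SPEC = [
    ('NAME',   r'[a-z]+'),
    ('SYMBOL', r'[^0-9a-zA-Z \t\r\n]'),
    ('SPACE',  r'[ \t\r\n]+'),   # matched, but never passed to the parser
]
MASTER = re.compile('|'.join('(?P<%s>%s)' % pair for pair in TOKEN_SPEC))

def tokenize(text):
    """Yield (kind, value) pairs, silently dropping whitespace."""
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if not m:
            raise ValueError('bad input at position %d' % pos)
        pos = m.end()
        if m.lastgroup != 'SPACE':   # drop spaces in the lexer itself
            yield (m.lastgroup, m.group())

print(list(tokenize('hello + goodbye')))
# [('NAME', 'hello'), ('SYMBOL', '+'), ('NAME', 'goodbye')]
```

If I remember the defaults correctly, Lepl's lexer does the equivalent
automatically when you use Tokens - inter-token whitespace is discarded -
so in your program the DroppedSpace block can simply be removed.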
Tokens (ie the lexer) are explained at
http://www.acooke.org/lepl/intro-2.html#tokens-first-attempt and
http://www.acooke.org/lepl/lexer.html#lexer
So, which should you use?
In general, if you have a target that can be handled by Lepl's simple
regular-expression based lexer, then using it simplifies the grammar. But
note that it is a completely separate layer, so any lexing is restricted to
regular expressions. If you have anything "fancy" (eg nested comments or
strings with anything-but-very-simple escapes for quoting) that requires
context information during lexing then you cannot use the lexer (well, you
can, but you will end up frustrated later) (SQL might be simple enough - I
don't know enough about exactly what a literal SQL string can look like to
say).
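As a hypothetical illustration of what "context information during lexing"
means: stripping nested comments needs a depth counter, which is exactly
the state a regular expression cannot keep. The (* ... *) delimiters below
are just an example, nothing SQL-specific.

```python
def strip_nested_comments(text, open_='(*', close='*)'):
    """Remove nested (* ... *) comments.  A regular-expression lexer
    cannot do this, because matching balanced delimiters requires a
    depth counter - ie context carried along while scanning."""
    out, depth, i = [], 0, 0
    while i < len(text):
        if text.startswith(open_, i):
            depth += 1
            i += len(open_)
        elif text.startswith(close, i) and depth:
            depth -= 1
            i += len(close)
        else:
            if depth == 0:           # keep text only outside comments
                out.append(text[i])
            i += 1
    return ''.join(out)

print(strip_nested_comments('a (* x (* y *) z *) b'))  # 'a  b'
```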
Finally, after writing the other reply, I was thinking more about SQL-related
issues. Lepl is generally used for small "pull data out of this mess"
problems. It is not used - to my knowledge - for parsing large languages.
Theoretically, there are no limits, but practically you may hit some issues.
One is the weak lexer (see above!). Another is error handling.
Part of the problem with errors is efficiency. Lepl's recursive descent
nature means that when input contains an error it tends to spend a lot of time
backtracking to find a "solution" that doesn't exist. It is, in a sense, too
flexible. The First() matcher can help avoid this.
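To make the backtracking cost concrete, here is a toy sketch (plain Python,
not Lepl internals - all the combinator names are invented) of an ambiguous
grammar parsed with full backtracking versus a First()-style committed
choice. On input with an error at the end, the committed version does far
less work.

```python
# Minimal recursive-descent combinators.  alt() offers every alternative
# (full backtracking); first() commits to the first success, in the
# spirit of Lepl's First() matcher.

calls = {'n': 0}                     # count matcher invocations

def lit(s):
    def match(text, pos):
        calls['n'] += 1
        if text.startswith(s, pos):
            yield pos + len(s)
    return match

def seq(m1, m2):
    def match(text, pos):
        for mid in m1(text, pos):
            for end in m2(text, mid):
                yield end
    return match

def alt(*ms):                        # yield results of every alternative
    def match(text, pos):
        for m in ms:
            for end in m(text, pos):
                yield end
    return match

def first(*ms):                      # stop after the first success
    def match(text, pos):
        for m in ms:
            for end in m(text, pos):
                yield end
                return
    return match

def make_list(choice):
    # list ::= item list | item ;  item ::= 'a' | 'aa'  (ambiguous)
    item = choice(lit('a'), lit('aa'))
    lst = [None]
    lst[0] = choice(seq(item, lambda t, p: lst[0](t, p)), item)
    def parse(text):                 # succeed only on consuming all input
        return any(end == len(text) for end in lst[0](text, 0))
    return parse

bad = 'a' * 20 + 'X'                 # input with an error at the end

calls['n'] = 0
make_list(alt)(bad)
backtracking_cost = calls['n']

calls['n'] = 0
make_list(first)(bad)
committed_cost = calls['n']

print(backtracking_cost, committed_cost)   # committed is far cheaper
```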
Another part is the informality of the parser - "failure" is often a normal
condition that simply means a different option should be tried via
backtracking, so how do you separate it from a "real" error? (*) - together
with a lack of experience with big projects. You may find, as you write a
parser for SQL, that there is some useful abstraction Lepl doesn't have
that would make handling (specifying, in a sense) errors easier. That would
be interesting, and because Lepl is written in Python you could extend it
to add that abstraction, but it could also mean more work...
Hope that helps.
Andrew
(*) One thing Lepl does have, which can help, is that it tracks the position of
the deepest match within the text. Typically this is very close to where the
error is, which makes the information very useful. What it doesn't do (and I
am not sure how you would do this, but it might be interesting to explore) is
associate that with any kind of metadata about "where in the grammar" it was
when it reached that point.
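For what it's worth, the bookkeeping is roughly this (a hand-written toy
parser, not Lepl's implementation, and the 'name=digit;' grammar is
invented for the sketch): keep a high-water mark of the furthest position
ever reached and report it on failure.

```python
def parse_assignments(text):
    """Parse zero or more 'name=digit;' statements.  On failure, return
    (None, deepest) where deepest is the furthest position reached.
    In a backtracking parser the current position rewinds on failure
    but the high-water mark does not - that is what makes it a good
    hint for where the real error is."""
    pos, deepest = 0, 0
    while pos < len(text):
        start = pos
        while pos < len(text) and text[pos].isalpha():   # name
            pos += 1
        deepest = max(deepest, pos)
        if pos == start or pos >= len(text) or text[pos] != '=':
            return None, deepest
        pos += 1                                         # '='
        deepest = max(deepest, pos)
        if pos >= len(text) or not text[pos].isdigit():  # digit
            return None, deepest
        pos += 1
        if pos >= len(text) or text[pos] != ';':         # ';'
            return None, max(deepest, pos)
        pos += 1
        deepest = max(deepest, pos)
    return True, deepest

text = 'a=1;b=$;c=2;'
ok, where = parse_assignments(text)
print(ok, where, repr(text[where]))   # failure reported right at the '$'
```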
> --
> You received this message because you are subscribed to the Google Groups "lepl" group.
> To post to this group, send email to le...@googlegroups.com.
> To unsubscribe from this group, send email to lepl+uns...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/lepl?hl=en.
>