hi! this is because tokens match as much as possible, so TEXT takes the whole
line (including the '%') and comment fails because nothing is left for
PERCENT.
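for reference, i'm guessing your original definitions looked something like
this (the exact regex may differ; note that this TEXT also matches '%', so
the lexer hands the whole line to TEXT and PERCENT never gets a chance):
>>> PERCENT = Token('%')
>>> TEXT = Token('[^\n]*')
>>> comment = PERCENT & TEXT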
here's one solution:
>>> with TraceVariables():
...     PERCENT = Token('%')
...     TEXT = Token('[^%\n]*')
...     comment = PERCENT & TEXT
...
>>> comment.parse('% ThisIsAComment')
PERCENT = ['%'] stream = ' ThisIsAComment'
TEXT = [' ThisIsAComment'] stream = <EOS>
comment = ['%', ' ThisIsAComment'] stream = <EOS>
['%', ' ThisIsAComment']
now that TEXT excludes '%', the lexer matches PERCENT first and the parse
works. but really it's much better not to use tokens at all:
>>> with TraceVariables():
...     percent = Literal('%')
...     text = AnyBut('\n')[:,...]
...     comment = percent & text
...
>>> comment.parse('% ThisIsAComment')
percent = ['%'] stream = ' ThisIsAComment'
text = [' ThisIsAComment'] stream = ''
comment = ['%', ' ThisIsAComment'] stream = ''
['%', ' ThisIsAComment']
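in case the [:,...] syntax is unfamiliar: the slice repeats the matcher zero
or more times, and the ... joins the repeated matches into a single string
(which is why text gives one value above). a minimal sketch:
>>> AnyBut('\n')[:,...].parse('abc')
['abc']
without the ... you'd get each character separately, as ['a', 'b', 'c'].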
tokens are an "advanced" feature that you shouldn't use unless you need to.
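(as far as i remember, the main reason to need them is efficiency: a separate
lexing pass can make large grammars faster. if you're not hitting performance
problems, plain matchers are simpler.)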
> The second (and last) point; this simple code:
>
> test = Token(String())
> test.parse('"irt"')
[...]
> lepl.lexer.support.LexerError: A Token was specified with a matcher, but the
> matcher could not be converted to a regular expression: And(NfaRegexp,
> Transform, NfaRegexp)
>
> It is as if the regexp behind String() was not recognized by LEPL itself.
> Am I missing something?
again, this is because you're using tokens. tokens work with regular
expressions, and not everything can be converted to a regular expression.
in this case, lepl doesn't know how to convert the String() matcher: you can
see in the error above that it contains a Transform, and presumably that is
the part with no regular expression equivalent.
if you want a token that matches a simple string, you can use:
>>> test = Token('"[^"]*"')
>>> test.parse('"irt"')
['"irt"']
but that's nothing like as flexible as String(): it's a regular expression,
and regular expressions are limited in what they can do. for example, the
pattern above can't handle a quote that is escaped inside the string.
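in particular, String() copes with escaped quotes. a quick sketch (untested;
if i remember right, lepl drops the escape character from the result):
>>> String().parse(r'"a \" b"')
['a " b']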
again, the simplest solution is not to use tokens at all:
>>> test = String()
>>> test.parse('"irt"')
['irt']
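(note that String() also strips the quotes for you, unlike the token version
above, which returned '"irt"' with the quotes included.)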
hope that helps,
andrew