Sorry for not replying today - been busy. Was going to reply tomorrow
(and still will).
I had a quick look and thought what you were doing might be possible, but
it's going to take some thought - the stream types are more complex than
you might expect (because they carry location information).
Looking at what you have below, I am a little surprised it does not work.
That approach should be fine. Maybe try "parse_list" rather than just
"parse"?
But I don't really have time today, sorry. Will look at this tomorrow
(unless Comcast is even suckier than usual and absorbs the whole day just
trying to pay the bill...).
Andrew
I realised (while asleep?!) that your types are wrong here.
Usually a stream is (effectively) a string. So stream[a:b] is a subset of
a string, which is also a string.
However, in this case, your stream is a list and stream[a:b] is a sublist
of a list. So Literal() needs to match a list, not a string. So instead
of Literal('a') you need Literal(['a']).
>>> from lepl import *
>>> class _rawtext(object):
...     def __init__(self, value):
...         self.value = value
...
>>> @function_matcher
... def rawtext(support, stream):
...     if stream and isinstance(stream[0], _rawtext):
...         return ([stream[0]], stream[1:])
...
>>> raw_then_literal = rawtext() & Literal(['a'])
>>> raw_then_literal.parse([_rawtext('some unstructured data'), 'a'])
[<__main__._rawtext object at 0x7fe511864b90>, ['a']]
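The slicing difference behind this can be seen in plain Python, without
Lepl. This is only a sketch of the type issue (the variable names are made
up for illustration), not of how Lepl compares values internally:

```python
# Plain Python only; no Lepl involved.
text_stream = 'ab'
list_stream = ['a', 'b']

# Slicing a string yields a string, so it can equal the string 'a'.
string_slice = text_stream[0:1]

# Slicing a list yields a list, so it can only equal the list ['a'].
list_slice = list_stream[0:1]

print(string_slice == 'a')    # True
print(list_slice == 'a')      # False
print(list_slice == ['a'])    # True
```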
Is that sufficient for what you want? Or would it be better to look again
at your original approach?
Andrew
On Sun, 31 Oct 2010 18:43:42 -0700 (PDT), Ben <cohe...@gmail.com> wrote:
In the example below I create a token, but disable the lexer (I need to
set compiled = True on the token because otherwise a sanity check in the
parser flags an error). Then I short-circuit the stream generation by
providing my own dummy stream. The stream is in the internal format
expected for tokens - each entry is a list of possible token IDs and then
the value.
I specialise the token to test for even, and supply an even and an odd
number. I get a partial match error because only the even value matched.
I'm sorry this isn't more elegant - at some point I should add a simple
interface that lets you do this kind of thing.
from lepl import *
if __name__ == '__main__':
    @function_matcher
    def isEven(support, stream):
        if stream[0] % 2 == 0:
            return stream[0], stream[1:]
    special = Token('Specialised')
    special.compiled = True
    even = special(isEven)
    class DummyStreamFactory(object):
        def auto(self, x):
            return x
    special.config.stream_factory(DummyStreamFactory()).no_lexer()
    print special.parse([([special.id_], 2), ([special.id_], 3)])
Looking at that I am now starting to wonder why you need tokens/lexer at
all, so you may well be right in your other approach...!
Andrew
PS I'm travelling tomorrow through Thursday, probably without my laptop,
but will reply to email when I return.
The code below has an error, in that I used "special" rather than "even"
as the parser. If I correct that then I hit a pile of errors due to
inconsistencies in stream types.
I really don't think the lexer code can do this. Hopefully you can just
use ordinary parsing on a list.
Sorry again,
Andrew
On Mon, 01 Nov 2010 09:22:15 -0500, andrew cooke <and...@acooke.org>
wrote:
Unfortunately, if efficiency is paramount, Lepl isn't really the right
solution.
> I spent some time looking at the Source and Stream classes -- I was
> trying to figure out if it would be easy to write a custom version of
> one of those which would let me pass a sequence of opaque objects and
> 'strings' and write my matchers like this:
>
> opaque_text = rawtext()
> balanced_text = Literal('a') & opaque_text & Literal('b')
>
> instead of
>
> opaque_text = rawtext()
> structured_text = Literal(['a']) & opaque_text & Literal(['b'])
I was wondering about that too (although the way it works now is
consistent, I think - Literal matches a sequence of values from the input
stream; normally that's a sequence of characters in a string, but here it's
a sequence of entries in a list). A simple solution would be to
automatically adapt things. I haven't tried this, but something like:
def Lift(matcher):
    def MyMatcher(text):
        return matcher([text])
    return MyMatcher
which you would use like:
MyLiteral = Lift(Literal)
structured_text = MyLiteral('a') & opaque_text & MyLiteral('b')
although this only works for matchers that take a single argument, etc
etc.
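To make the wrapping concrete, here is a small self-contained sketch that
does not use Lepl at all. prefix_matcher is a hypothetical stand-in for
Literal (it is not a Lepl function); it only shows that Lift turns the
argument 'a' into ['a'] before the matcher is built:

```python
def Lift(matcher):
    """Wrap a one-argument matcher factory so its argument is put in a list."""
    def MyMatcher(text):
        return matcher([text])
    return MyMatcher


def prefix_matcher(expected):
    """Hypothetical stand-in for Literal: match `expected` at the stream start."""
    def match(stream):
        n = len(expected)
        if stream[:n] == expected:
            # Return the matched slice and the remaining stream.
            return stream[:n], stream[n:]
        return None
    return match


lifted = Lift(prefix_matcher)
match_a = lifted('a')                 # same as prefix_matcher(['a'])
result = match_a(['a', 'b'])

# Without Lift, the string 'a' never equals the list slice ['a'].
unlifted_result = prefix_matcher('a')(['a', 'b'])

print(result)
print(unlifted_result)
```

Whether the same wrapper behaves this way around the real Literal is
untested, as noted above.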
Another approach would be something similar that gives a matcher which
receives a modified stream (so the returned matcher modifies the stream to
remove the extra list), but to get that right you would need to use
trampoline_matcher_factory, which is pretty much undocumented.
Andrew