any good racc examples that parse from an IO instead of a String?

Eric Mahurin

Nov 3, 2005, 5:17:22 PM
I wanted to compare my grammar project against racc in terms of
performance. I have yet to find a racc example that parses from an IO.
Everything I've found so far parses from a String. It seems kind of silly if
racc has to read in the whole file before parsing, or if everybody is doing it
that way. Maybe it is because people write their lexers based on regexes, which
require Strings (I noticed discussions about this on another thread).

Steven Jenkins

Nov 4, 2005, 9:08:03 AM

I don't understand the question. The parser operates on a token stream,
neither String nor IO.

I have a real (but arcane) Racc application; you're welcome to see it.
(It's a reverse-engineered parser for the export format of a commercial
system engineering tool.) The lexer pushes tokens onto an array and the
parser shifts from it. The lexer does read the entire file into memory
before parsing, but that could be trivially changed by running the lexer
in a separate thread and using a Queue for the token stream.
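
The thread-and-queue variant described above could look roughly like the sketch below. It is not the lexer from this application: the class name `QueuedTokenSource`, the `:WORD` token type, and the whitespace splitting are all placeholders. `SizedQueue` is used instead of a plain `Queue` only so that the lexer cannot run arbitrarily far ahead of the parser. A pair whose first element is `false` is how a Racc `next_token` method signals end of input.

```ruby
require "stringio"

# Hypothetical token source: a lexer thread reads the IO line by line and
# pushes [type, value] pairs onto a bounded queue; the parser pops them.
class QueuedTokenSource
  def initialize(io, capacity = 64)
    @queue = SizedQueue.new(capacity)
    @producer = Thread.new do
      io.each_line do |line|
        # Placeholder lexing rule: every whitespace-separated word is a token.
        line.split.each { |word| @queue << [:WORD, word] }
      end
      # End-of-input marker (first element false).
      @queue << [false, "$end"]
    end
  end

  # Blocks until the lexer thread has produced the next token.
  # Should not be called again after the end-of-input pair is returned.
  def next_token
    @queue.pop
  end
end

# StringIO stands in for a File here; any object with each_line works.
source = QueuedTokenSource.new(StringIO.new("alpha beta\ngamma\n"))
while (token = source.next_token).first
  p token
end
```

In a Racc parser class, a `next_token` method could simply delegate to such an object, so at most `capacity` tokens are buffered at any time.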

Steve


Eric Mahurin

Nov 4, 2005, 9:34:52 AM

All the example racc parsers I've seen use lexers that parse from a string.
I think this is because they are using regexes, which operate naturally on
strings, but not on IOs.

So, I guess I mainly want to see a racc lexer example that operates on an
IO/File. Can you give me a link to what you have?

Steven Jenkins

Nov 4, 2005, 2:06:53 PM
Eric Mahurin wrote:
> All the example racc parsers I've seen use lexers that parse from a string.
> I think this is because they are using regexes, which operate naturally on
> strings, but not on IOs.
>
> So, I guess I mainly want to see a racc lexer example that operates on an
> IO/File. Can you give me a link to what you have?

OK. It's not exactly published, but not exactly proprietary either. I'll
email it to you.

For the record, the lexer iterates over input lines with IO#each and
then uses regexes to split each line into tokens. So I suppose the lexer
is, strictly speaking, tokenizing a string.
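
A minimal sketch of that line-at-a-time approach follows. It is an illustration only, not the lexer being emailed: the method name `each_token`, the `TOKEN_RE` pattern, and the `:NUMBER` and `:IDENT` token types are made up for the example. Only the current line is held as a String, so the regexes still work while the input as a whole is read from an IO.

```ruby
require "stringio"

# Hypothetical token rules: integers, identifiers, or any single
# non-space character. A real grammar would define its own.
TOKEN_RE = /\s*(?:(\d+)|([A-Za-z_]\w*)|(\S))/

# Reads the IO one line at a time and yields [type, value] pairs.
# Returns an Enumerator when called without a block.
def each_token(io)
  return enum_for(:each_token, io) unless block_given?

  io.each_line do |line|
    line.scan(TOKEN_RE) do |number, ident, punct|
      if number
        yield [:NUMBER, number.to_i]
      elsif ident
        yield [:IDENT, ident]
      else
        yield [punct, punct]
      end
    end
  end
  # End-of-input marker (first element false).
  yield [false, "$end"]
end

# StringIO stands in for a File; File.open(path) { |f| each_token(f) { ... } }
# would be used the same way.
each_token(StringIO.new("x = 42\n")) { |type, value| p [type, value] }
```

A token that spans a line break (a multi-line string or comment, say) would need extra state carried between lines, which this sketch does not handle.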

Steve
