Best way to ignore optional whitespace?


mortench

Apr 22, 2008, 8:33:28 AM
to Treetop Development
What is the easiest way to ignore whitespace in Treetop parsers?

Right now I am doing this explicitly by putting required "SPACE" and
optional "space" rules into the grammar itself, which leads to constructs
like the one below. However, I feel that the grammar becomes a bit too
large and unfocused when I do this.

I could use a filter to remove spaces before calling the parser, but
that would mess up line numbering. In addition, the language currently
has required spaces that can't be filtered out unless I redesign the
underlying language being parsed (which is under consideration).

-----------
rule type_definition
  type_name whitespace space ':=' space 'ENUM' SPACE enum_field_name+
  SPACE 'ENDENUM'
end

# Optional whitespace
rule space
  [ \r\t\n]*
end

# Required whitespace
rule SPACE
  [ \r\t\n]+
end
----------

P.S. I know that ANTLR has a mechanism for ignoring whitespace. Could
this be a possible feature for Treetop as well?

Iñaki Baz Castillo

Apr 22, 2008, 3:57:09 PM
to treet...@googlegroups.com
On Tuesday, 22 April 2008, mortench wrote:
> Right now I am doing this explicitly by putting required "SPACE" and
> optional "space" rules into the grammar itself, which leads to constructs
> like the one below. However, I feel that the grammar becomes a bit too
> large and unfocused when I do this.

Another way is to build the optional spaces into the lowest-level rules.
Some grammars, such as the SIP ABNF, do this. For example:

rule from_header
  [Ff] [Rr] [Oo] [Mm] HCOLON uri parameters
end

rule HCOLON
  space ':' space
end

rule EQUAL
  space '=' space
end

rule LQUOT
  space '<'
end

...

rule space
  [\s\t]*
end


As you can see, this keeps the higher-level rules "clean" of optional
spaces. The trick is to add the spaces to the low-level rules instead.
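
To illustrate the same idea outside Treetop, here is a minimal sketch in plain Ruby using the standard library's StringScanner. It is not Treetop code: the EnumParser class, its token helper, and the identifier pattern are hypothetical and chosen only for illustration, applied to the ENUM grammar from the original question. Each token absorbs the optional whitespace around it, so the top-level parse method never mentions spaces, and a word boundary stands in for the required SPACE after a keyword.

```ruby
require 'strscan'

# Hypothetical hand-written parser (not part of Treetop) showing how
# optional whitespace can live in the token level only.
class EnumParser
  SPACE = /[ \r\t\n]*/ # optional whitespace, same set as the "space" rule

  def initialize(text)
    @scanner = StringScanner.new(text)
  end

  # Parses: type_name ':=' 'ENUM' enum_field_name+ 'ENDENUM'
  # Returns [type_name, [field names]] or nil if the input does not match.
  def parse
    name = token(/[A-Za-z_]\w*/) or return nil
    token(/:=/) or return nil
    token(/ENUM\b/) or return nil # \b rejects input such as "ENUMRED"

    fields = []
    # The negative lookahead keeps ENDENUM from being read as a field name.
    while (field = token(/(?!ENDENUM\b)[A-Za-z_]\w*/))
      fields << field
    end
    return nil if fields.empty?

    token(/ENDENUM\b/) or return nil
    @scanner.eos? ? [name, fields] : nil
  end

  private

  # Matches one token, swallowing optional whitespace on both sides.
  # Restores the scanner position and returns nil when the token is absent.
  def token(pattern)
    start = @scanner.pos
    @scanner.skip(SPACE)
    text = @scanner.scan(pattern)
    if text
      @scanner.skip(SPACE)
      text
    else
      @scanner.pos = start
      nil
    end
  end
end
```

For example, EnumParser.new("Color:=ENUM\n  RED\n\tGREEN\nENDENUM\n").parse returns ["Color", ["RED", "GREEN"]], the same result as for the single-line input "Color := ENUM RED GREEN ENDENUM". Because the input is never filtered beforehand, scanner positions still refer to the original text, so line numbers can be recovered from them.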


--
Iñaki Baz Castillo
