How to simulate a CRLF?


Iñaki Baz Castillo

Mar 18, 2008, 12:47:34 PM
to treet...@googlegroups.com
Hi, I'm going crazy trying to match a CRLF and I don't know why it
doesn't work. I use:

rule input
  'FIRST LINE'
  CLRF
  'SECOND LINE'
end

rule CLRF
  ('\r\n' / '\n\r' / '\r' / '\n' / '\0' ) # <-- what to put here ???
end


Now I declare a variable as:

input = <<-INPUT_END
FIRST_LINE
SECOND_LINE
INPUT_END


and test it:

result = parser.parse(input)

but the CLRF rule never matches. Why? How should I write it?


Thanks.

PS: Sorry if this question is off-topic.


--
Iñaki Baz Castillo
<i...@aliax.net>
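
Before touching the grammar, it is worth checking what the heredoc in the message above really contains. A minimal plain-Ruby sketch (no Treetop involved; the constant names are made up for illustration), assuming the source file is saved with Unix line endings:

```ruby
# Same heredoc text as in the question; INPUT is a hypothetical name.
INPUT = <<-INPUT_END
FIRST_LINE
SECOND_LINE
INPUT_END

p INPUT                         # prints the string with escapes visible
p INPUT.include?("\r\n")        # is there a CRLF pair anywhere?
p INPUT.include?("FIRST LINE")  # does the grammar's literal (with a space) occur?

# To exercise CRLF handling, the input can be built explicitly.
# CRLF_INPUT is also a hypothetical name.
CRLF_INPUT = "FIRST LINE\r\nSECOND LINE"
p CRLF_INPUT.bytes.include?(13) # 13 is the byte value of "\r"
```

Under that assumption the lines are separated by a bare "\n" with no "\r", the text ends with a trailing "\n", and it uses underscores where the grammar's literals use spaces. Any of these could make the parse fail regardless of how the CLRF rule is written.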

Martin Gamsjaeger

Mar 18, 2008, 12:53:10 PM
to treet...@googlegroups.com
Iñaki,

I don't know if this helps, but I have my whitespace rules as follows.

# optional space
rule space
  white*
end

# mandatory space
rule SPACE
  white+
end

# whitespace
rule white
  [ \r\t\n]
end

The main difference I can see is that I use [ ... ] whereas you use
( ... / ... / ... ). Maybe this does the trick?

cheers
Martin

Iñaki Baz Castillo

Mar 18, 2008, 1:12:40 PM
to treet...@googlegroups.com
2008/3/18, Martin Gamsjaeger <gams...@gmail.com>:

> # whitespace
> rule white
> [ \r\t\n]
> end
>
> The main difference I can see is that I use [ ... ] whereas you use (
> ../.../... ) ...
> Maybe this does the trick?

Completely!!!

I've realized that writing:
'\n'
DOESN'T MATCH a line break; I must use:
[\n]

So if I need to match a CRLF I cannot do:
'\r\n'
and I must do:
[\r] [\n]

Anyway, I don't like having to use "[" and "]", since they are meant for
listing character options. Is there no other elegant way to write \n, \t
and so on?


Thanks a lot for your useful help ;)

Martin Gamsjaeger

Mar 18, 2008, 1:55:06 PM
to treet...@googlegroups.com
Interesting!

I'm always confused by whitespace handling :-) I don't know of any
other way, but let us know if you find one!
Anyway, I can really recommend having a look at all the examples under
treetop-1.2.3/examples along with the tests! They helped clarify a lot
for me!

cheers
Martin

Clifford Heath

Mar 18, 2008, 5:45:33 PM
to treet...@googlegroups.com
On 19/03/2008, at 4:55 AM, Martin Gamsjaeger wrote:
> I'm always confused by whitespace handling :-) I don't know of any
> other way, but let us know if you find one!

Here are the basic lexical rules I use, with comments defined for a
C++-like grammar. I hope they help you.

rule id
  alpha alphanumeric*
end

rule alpha
  [A-Za-z_]
end

rule alphanumeric
  alpha / [0-9]
end

rule s # Optional space
  S?
end

rule S # Mandatory space
  (white / comment_to_eol / comment_c_style)+
end

rule white
  [ \t\n\r]+
end

rule comment_to_eol
  '//' (!"\n" .)+
end

rule comment_c_style
  '/*' (!'*/' . )* '*/'
end


Clifford Heath.

Iñaki Baz Castillo

Mar 19, 2008, 5:01:24 AM
to treet...@googlegroups.com
2008/3/18, Clifford Heath <cliffor...@gmail.com>:

> Here are the basic lexical rules I use, with comments defined for a
> C++-like grammar. I hope they help you.

> rule comment_to_eol
> '//' (!"\n" .)+
> end

Please, could you explain the above expression a little? What does the
"." mean in ".)+"?

Thanks a lot.

Clifford Heath

Mar 19, 2008, 12:40:50 PM
to treet...@googlegroups.com
On 19/03/2008, at 8:01 PM, Iñaki Baz Castillo wrote:
> 2008/3/18, Clifford Heath <cliffor...@gmail.com>:

>> rule comment_to_eol
>> '//' (!"\n" .)+
>> end
> Please, could you explain the above expression a little? What does the
> "." mean in ".)+"?

The . matches any character, and !"\n" says "as long as we're not
looking at a newline...". So this matches one or more characters up to
the next newline (or end of file).

You can use any rule after ! (not just a character literal), so you can
have unlimited lookahead before deciding to proceed down the current
path.

Clifford Heath.
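
For readers more at home with regular expressions, the same "lookahead, then consume one character" idea can be sketched in plain Ruby. This is only an analogy to the Treetop rule, not Treetop code, and the constant name is made up for illustration:

```ruby
# Regexp analogue of the Treetop rule:  '//' (!"\n" .)+
# (?!\n) is a negative lookahead: it consumes nothing and fails if the
# next character is a newline. The /m flag lets "." match newlines too,
# so it is the lookahead, not ".", that stops the match at end of line.
COMMENT_TO_EOL = %r{//(?:(?!\n).)+}m

source = "x = 1; // set x\ny = 2;"
p source[COMMENT_TO_EOL]   # => "// set x"

# Because of the "+", at least one character must follow the slashes,
# so an empty comment does not match.
p "//\n"[COMMENT_TO_EOL]   # => nil
```

If empty comments should be accepted, the repetition would need to be "*" rather than "+", both here and in the grammar rule.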

Iñaki Baz Castillo

Mar 20, 2008, 7:19:44 PM
to treet...@googlegroups.com
Ok, thanks a lot for the explanation.

Iñaki Baz Castillo

Mar 24, 2008, 6:23:06 PM
to treet...@googlegroups.com
On Tuesday, 18 March 2008, Iñaki Baz Castillo wrote:
> 2008/3/18, Martin Gamsjaeger <gams...@gmail.com>:
> > # whitespace
> > rule white
> > [ \r\t\n]
> > end
> >
> > The main difference I can see is that I use [ ... ] whereas you use (
> > ../.../... ) ...
> > Maybe this does the trick?
>
> Completely!!!
>
> I've realized that writing:
> '\n'
> DOESN'T MATCH a line break; I must use:
> [\n]

I was wrong about that. I don't need to use:
[\n]
I can use:
"\n"
but cannot use:
'\n'

:)


--
Iñaki Baz Castillo
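
The difference between the two quote styles is the same one plain Ruby makes, which can be checked without Treetop. A minimal sketch (the constant names are made up for illustration):

```ruby
SINGLE_QUOTED = '\n'   # two characters: a backslash and the letter n
DOUBLE_QUOTED = "\n"   # one character: a line feed

p SINGLE_QUOTED.length   # => 2
p DOUBLE_QUOTED.length   # => 1
p SINGLE_QUOTED.bytes    # => [92, 110]
p DOUBLE_QUOTED.bytes    # => [10]

# A CRLF therefore needs double quotes as well.
p "\r\n".bytes           # => [13, 10]
```

So a single-quoted '\r\n' asks for the four literal characters backslash, r, backslash, n, which never occur in ordinary text input.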

Clifford Heath

Mar 24, 2008, 6:30:50 PM
to treet...@googlegroups.com
On 25/03/2008, at 9:23 AM, Iñaki Baz Castillo wrote:
> I can use:
> "\n"
> but cannot use:
> '\n'

Right, Ruby-style. I introduced rigorous tests and corrected this
behaviour a few weeks back, after it was pointed out that #{...} could
be used for injection. All other character escapes except multi-byte
Unicode work the same, including the weird Ruby-isms like "\C-M-c".

Clifford Heath.
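
The injection concern comes from the way Ruby's double-quoted strings evaluate #{...}. A small plain-Ruby sketch of that difference follows; it is purely illustrative, is not taken from Treetop's source, and uses made-up constant names:

```ruby
# Hypothetical illustration only; this is not Treetop's actual code.
LITERAL_TEXT = '#{1 + 1}'   # single quotes: kept exactly as written
INTERPOLATED = "#{1 + 1}"   # double quotes: the expression is evaluated

puts LITERAL_TEXT   # prints #{1 + 1}
puts INTERPOLATED   # prints 2
```

Any text that ends up inside a double-quoted Ruby string literal can run code this way, which is presumably why the handling of grammar literals was tightened.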
