Parser infinite loop

Will Myers

Jan 20, 2013, 8:38:35 PM

to treet...@googlegroups.com

Hi, All

I'm hoping to get some help, as my parser keeps spinning without making progress.

I'm trying to parse a LaTeX document into HTML markup. I'm starting simply: I just want to parse input consisting of `tags` and `text`. When I run the following grammar with :root => :paragraph, I get an infinite loop on this input:

'friend \emph{hello}\n\n'

It seems to be stalling at the first '\n' character. I've compiled the grammar into a .rb file and debugged it, but I can't work out why it never gets past that point.

grammar Latex

  rule document
    (paragraph)* {
      def content
        [:document, elements.map { |e| e.content }]
      end
    }
  end

  rule paragraph
    ( tag / text )* eop {
      def content
        [:paragraph, elements.map { |e| e.content } ]
      end
    }
  end

  rule text
    ( !( tag_start / eop) . )* {
      def content
        [:text, text_value ]
      end
    }
  end

  # Example: \tag{inner_text}
  rule tag
    tag_start tag_type "{" inner_text "}" {
      def content
        [tag_type, inner_text.content]
      end
    }
  end

  # Example: \emph{inner_text}
  rule inner_text
    ( !'}' . )* {
      def content
        [:inner_text, text_value]
      end
    }
  end

  rule eop
    newline 2.. {
      def content
        [:newline, text_value]
      end
    }
  end

  rule tag_type
    "emph" / "texttt"
  end

  rule newline
    "\n"
  end

  rule tag_start
    "\\"
  end

Clifford Heath

Jan 20, 2013, 9:29:09 PM

to treet...@googlegroups.com

Will,

I can't spot anything straight away. In a situation like this, I tend to do the following:

require 'ruby-debug'
Debugger.start
# On Ctrl-C, print the current call stack, then drop into the debugger.
trap "INT" do
  puts caller*"\n\t"
  debugger
end

Run your program and hit ^C.
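
For setups where ruby-debug is not available, the same idea can be sketched with only Ruby's built-in Signal.trap. This is a rough illustration rather than a replacement for the snippet above: the method name install_interrupt_dump is made up for this example, and it only prints the stack without opening a debugger.

```ruby
# Hypothetical helper (not part of ruby-debug or Treetop): on the first
# Ctrl-C, print the call stack of the interrupted code, then restore the
# default handler so that a second Ctrl-C terminates the program as usual.
def install_interrupt_dump(out = $stderr)
  Signal.trap("INT") do
    out.puts caller.join("\n\t")
    Signal.trap("INT", "DEFAULT")
  end
end
```

Calling install_interrupt_dump once before starting the parse should be enough; the first ^C then shows where the parser is spinning.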

Clifford Heath.

> --
> You received this message because you are subscribed to the Google Groups "Treetop Development" group.
> To view this discussion on the web visit https://groups.google.com/d/msg/treetop-dev/-/v9tvR5ToZPQJ.
> To post to this group, send email to treet...@googlegroups.com.
> To unsubscribe from this group, send email to treetop-dev...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/treetop-dev?hl=en.

Will Myers

Jan 22, 2013, 7:56:05 PM

to treet...@googlegroups.com

Thanks for the reply, Clifford.

When I run this and hit ^C I usually hit one of the following three places:

(eval):118:in `call'
(eval):118:in `_nt_text'
(eval):73:in `block in _nt_paragraph'
(eval):67:in `loop'
(eval):67:in `_nt_paragraph'

or

(eval):206:in `call'
(eval):206:in `_nt_tag'
(eval):69:in `block in _nt_paragraph'
(eval):67:in `loop'
(eval):67:in `_nt_paragraph'

or

(eval):68:in `call'
(eval):68:in `block in _nt_paragraph'
(eval):67:in `loop'
(eval):67:in `_nt_paragraph'

I'm afraid that doesn't clear anything up for me; I already knew from pry where it was spinning. Any other thoughts?

Clifford Heath

Jan 22, 2013, 10:01:14 PM

to treet...@googlegroups.com

Ahh, found it. Paragraph matches an unlimited number of tags or texts.
However, text can match zero-length input, so once nothing more can be
consumed the loop keeps collecting empty texts without ever advancing.
Try using + instead of * in text.
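
The diagnosis above can be illustrated with a small plain-Ruby simulation. This is only a sketch under stated assumptions: it is not Treetop's generated code, the names Match, match_text and repeat_text are invented for the example, tags are left out, and the max_iterations guard exists only so the demonstration terminates.

```ruby
# Hypothetical stand-ins for the generated parser; none of these names
# come from Treetop.
Match = Struct.new(:start, :stop)

# Consumes characters until a backslash (tag_start) or a blank line (eop).
# allow_empty: true  behaves like ( !( tag_start / eop ) . )*
# allow_empty: false behaves like ( !( tag_start / eop ) . )+
def match_text(input, index, allow_empty:)
  stop = index
  while stop < input.length && input[stop] != "\\" && input[stop, 2] != "\n\n"
    stop += 1
  end
  return nil if stop == index && !allow_empty
  Match.new(index, stop)
end

# Simplified version of the outer ( tag / text )* loop: keep matching text
# until it fails. The iteration cap is an assumption added for this demo;
# without it, the allow_empty: true case would never return.
def repeat_text(input, index, allow_empty:, max_iterations: 1000)
  matches = []
  max_iterations.times do
    m = match_text(input, index, allow_empty: allow_empty)
    return [matches, :finished] if m.nil?
    matches << m
    index = m.stop
  end
  [matches, :gave_up]
end
```

With the input "friend \n\n", repeat_text(input, 0, allow_empty: true) hits the iteration cap because every match after the first is empty and leaves the index at the newline, while allow_empty: false finishes after a single match and lets the surrounding rule move on to eop.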

Clifford Heath.

Will Myers

Jan 22, 2013, 11:24:27 PM

to treet...@googlegroups.com

Clifford,

That was it! Many thanks.
