How to do strict parsing

Raj

Jul 5, 2009, 3:05:46 PM
to nokogiri-talk
I just want to know whether my HTML is well-formed or not.


s = '<p>ruby on rails</div>'
Nokogiri::HTML.parse(s,nil,nil,0)


I was expecting the parse method to blow up or to raise some kind of
exception.

What is the best way to find out whether an HTML page is well-formed
(all the tags are closed) using Nokogiri?

Aaron Patterson

Jul 6, 2009, 1:18:10 PM
to nokogi...@googlegroups.com
On Sun, Jul 5, 2009 at 12:05 PM, Raj<neera...@gmail.com> wrote:
>
> I just want to know whether my HTML is well-formed or not.
>
>
> s = '<p>ruby on rails</div>'
> Nokogiri::HTML.parse(s,nil,nil,0)
>
>
> I was expecting the parse method to blow up or to raise some kind of
> exception.

The problem is that it doesn't know what you mean by "strict". Is
that HTML 3, HTML 4, or HTML 4.01?

Your best bet is to check the "errors" array on the document. That
will tell you if it decided to correct anything:

require 'nokogiri'
doc = Nokogiri::HTML('<p>ruby on rails</div>')
# doc.errors holds the problems the parser reported while recovering
raise "invalid html" unless doc.errors.empty?

--
Aaron Patterson
http://tenderlovemaking.com/
