<![endif] breaks parser


ristretto.rb

Mar 31, 2009, 7:30:43 PM
to beauti...@googlegroups.com
Has anyone noticed that sites with this Microsoft addition never
finish parsing with BS 3.0.7a?

"Conditional comments only work in Explorer on Windows, and are thus
excellently suited to give special instructions meant only for
Explorer on Windows. They are supported from Explorer 5 onwards, and
it is even possible to distinguish between 5.0, 5.5 and 6.0."

I'm finding that pages containing these conditional comments send
sgmllib.py's goahead() into an infinite loop.

Is there a workaround for this? I don't really need to parse
declarations, so I might fork BS for my own use and strip out the
declaration support. Something less harsh would be nice.
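
One less harsh option might be to strip the conditional comments from
the markup before handing it to the parser. Below is a minimal sketch,
assuming the conditional-comment markers are what trips the parser; I
have not confirmed that it avoids the loop. The function name
strip_conditional_comments and both patterns are my own, not part of BS,
and the patterns only cover the plain <!--[if ...]> ... <![endif]--> and
<![if ...]> ... <![endif]> forms.

```python
import re

# Downlevel-hidden form: <!--[if ...]> ... <![endif]-->
# The whole block is an HTML comment, so it is removed entirely.
HIDDEN_CC = re.compile(
    r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", re.DOTALL | re.IGNORECASE
)

# Downlevel-revealed form: <![if ...]> ... <![endif]>
# Only the markers are removed; the content between them is kept.
REVEALED_CC = re.compile(r"<!\[(?:if[^\]]*|endif)\]>", re.IGNORECASE)


def strip_conditional_comments(markup):
    """Return markup with IE conditional comments removed (hypothetical helper)."""
    markup = HIDDEN_CC.sub("", markup)
    return REVEALED_CC.sub("", markup)


sample = """<head>
<!--[if lt IE 7]>
<link href="aventuria-ie6.css" rel="stylesheet" type="text/css" />
<![endif]-->
<title>t</title>
</head>"""

cleaned = strip_conditional_comments(sample)
```

The cleaned string would then be passed to BeautifulSoup in place of the
raw page. This loses the IE-only stylesheet links, which is fine if you
don't need them.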

To see it in action, try parsing this:

----------
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="fr">
<head>
<title>Escale de 2 jours : Hong Kong une ville à voir : Voyage Hong
Kong - Sejour Hong Kong - Voyage sur mesure Hong Kong - Circuit Hong
Kong - Vacances Hong Kong &ndash; Objectif Asie - AVENTURIA</title>

<!--[if lt IE 7]>
<link href="aventuria-ie6.css" rel="stylesheet" type="text/css"
media="projection, screen, tv" />
<![endif]-->
<!--[if IE]>
<link href="aventuria-ie.css" rel="stylesheet" type="text/css"
media="projection, screen, tv" />
<![endif]-->
</head>
<body>
</body>
</html>
--------------------
The markup above comes from
http://www.meltour.com/voyage_sur_mesure_chine_la-chine-en-version-originale.html
stripped down to show just the problem.

thanks
gene
