It means that what you want to parse here is not valid HTML, i.e. the web
page is broken. The HTMLParser module in the standard library is not made
for parsing broken HTML. Use another tool like html5lib or lxml.html.
Stefan
_______________________________________________
XML-SIG maillist - XML...@python.org
http://mail.python.org/mailman/listinfo/xml-sig
Yes. That's pretty easy, though. They should be readily packaged for your
platform (Linux), so you can just install them like any other software
package. Look out for "python-html5lib" or "python-lxml".
Just noticed this now - you seem to be using BeautifulSoup, likely version
3.1. This version does not handle broken HTML well, so use
version 3.0.8 instead, or switch to the tools I indicated.
Note that switching tools means that you need to change your code to use
them. Just installing them is not enough.
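As a rough sketch of what such a change could look like, assuming lxml is
installed: the snippet below parses a deliberately broken page with
lxml.html directly and pulls out the links. The function name
extract_links and the sample markup are made up for illustration; they are
not taken from the tool discussed in this thread.

```python
import lxml.html

# Deliberately broken markup: unclosed <p>, <b> and <a> tags, an unquoted
# attribute value, and no closing </html>.
BROKEN_HTML = """
<html><body>
<p>First paragraph <b>bold text
<p>Second paragraph <a href=http://example.com/page>a link
</body>
"""


def extract_links(html_text):
    """Return (link text, href) pairs for every <a> that has an href."""
    # lxml.html.fromstring() uses a lenient HTML parser that repairs
    # broken markup instead of raising an error.
    root = lxml.html.fromstring(html_text)
    return [
        (a.text_content().strip(), a.get("href"))
        for a in root.iter("a")
        if a.get("href")
    ]


print(extract_links(BROKEN_HTML))
```

The point is only that the parsing call and the tree API differ from
BeautifulSoup's, so the existing code has to be adapted accordingly.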
sharifah ummu kulthum, 23.02.2010 04:45:
> I am so sorry but I really don't know how to change the code as I have just
> learn python. How am I going to switch the version or to change the code?
> Because I don't really understand the code.
>
> Here is the code:
That's some funny code - it uses BeautifulSoup to parse HTML, and then uses
lxml to build an XML tree from it - instead of using just lxml in the first
place...
Please send an e-mail to the original author of the tool to tell him/her
about the problem. Use the project mailing list for this (if there is one).
If that doesn't help, I'd suggest installing BeautifulSoup 3.0.8 to see if
that helps.
Stefan
You should consider reading the documentation of easy_install. That would
have told you that you can use
# sudo easy_install BeautifulSoup==3.0.8
Note that this (and most of the previous thread) is rather off-topic to
this list. The comp.lang.python newsgroup would have been a better choice.