On the following page I get an exception when calling newDocument:
http://www.chefkoch.de/magazin/artikel/943,0/AEG-Electrolux/Frischer-Saft-aus-dem-Dampfgarer.html
Warning: DOMDocument::loadXML() [domdocument.loadxml]: DOCTYPE improperly terminated in Entity, line: 1 in /home/chroot/wm/home/wm/inc/phpQuery/DOMDocumentWrapper.php on line 239
Warning: DOMDocument::loadXML() [domdocument.loadxml]: Start tag expected, '<' not found in Entity, line: 1 in /home/chroot/wm/home/wm/inc/phpQuery/DOMDocumentWrapper.php on line 239
A quick fix was to uncomment the following lines (240-241) in
DOMDocumentWrapper.php, so that HTML which is not well-formed can
still be parsed:
if (! $return)
    $return = $this->document->loadHTML($markup);
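To make the question concrete, here is a minimal standalone sketch of what I understand those two lines to do: try the strict XML parser first and fall back to the lenient HTML parser if that fails. The function name loadMarkup and the sample markup are made up for illustration and are not part of phpQuery; only the DOMDocument and libxml calls are real PHP.

```php
<?php
// Hypothetical helper, not part of phpQuery: strict XML parse first,
// then a fallback to libxml's lenient HTML parser.
function loadMarkup(string $markup): ?DOMDocument
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $document = new DOMDocument();
    $loaded = $document->loadXML($markup);

    if (!$loaded) {
        // Use a fresh document so nothing from the failed XML attempt carries over.
        $document = new DOMDocument();
        $loaded = $document->loadHTML($markup);
    }

    // Drop the collected errors and restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $loaded ? $document : null;
}

// Sample markup (an assumption): valid enough as HTML, but not well-formed XML.
$doc = loadMarkup('<!DOCTYPE html><html><body><p>Unclosed paragraph<br></body></html>');
echo $doc === null ? "parse failed\n" : "parsed via fallback\n";
```

With libxml_use_internal_errors(true) the warnings quoted above are collected instead of printed, which is roughly the behaviour I am after when crawling many pages.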
I would like to know why this part of the code was commented out. Will
I run into any serious issues with this code being executed? So far
everything looks fine, and I'm running this code against ~20,000
different pages (though not all of them triggered this error).
Thanks in advance,
Max