Loading an HTML fragment

Ryan Mahoney

Sep 29, 2009, 7:32:22 AM
to devel-querypath, techno...@gmail.com
Recently, I've needed to load HTML document fragments into
QueryPath. These fragments may or may not be combined later into a
complete HTML template. The issue I'm having is that if I read a
file called "test.html" that contains the HTML
"<strong>test</strong>", it is treated as HTML and the parser
automatically adds tags to make it a full page. Since we write
XHTML, we would prefer to parse the fragment as XML, but then I
think I'd have to add an XML declaration to the top of each
fragment, which I'm not keen on doing. Below are some string
replacements that I've put in place as a stop-gap solution. Does
anyone on the list know a way to work around this?


- - - - - -

$buffer = ob_get_clean();

// nasty hack to support html fragments :(
$buffer = str_replace('<?xml version="1.0" standalone="yes"?>', '', $buffer);
$buffer = str_replace('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">', '', $buffer);
$buffer = str_replace('<html>', '', $buffer);
$buffer = str_replace('<body>', '', $buffer);
$buffer = str_replace('</html>', '', $buffer);
$buffer = str_replace('</body>', '', $buffer);
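
For comparison, here is a rough sketch of parsing a fragment as XML without going through QueryPath at all, using PHP's built-in DOM extension. It assumes the fragment is well-formed XHTML. The helper name fragment_to_xml() is made up for illustration and is not part of QueryPath. Because the markup is loaded into a DOMDocumentFragment, no XML declaration, doctype, or html/body wrapper is added, so nothing needs to be stripped afterwards.

```php
<?php
// Hypothetical helper (not a QueryPath function): parse a well-formed
// XHTML fragment as XML and serialize it back without any wrapper markup.
function fragment_to_xml($markup) {
    $doc = new DOMDocument('1.0', 'UTF-8');
    $fragment = $doc->createDocumentFragment();

    // appendXML() returns FALSE (and emits a warning, suppressed here)
    // when the markup is not well-formed XML.
    if (!@$fragment->appendXML($markup)) {
        throw new InvalidArgumentException('Fragment is not well-formed XML');
    }

    // Serialize each top-level node on its own; saveXML() with a node
    // argument omits the XML declaration.
    $out = '';
    foreach ($fragment->childNodes as $node) {
        $out .= $doc->saveXML($node);
    }
    return $out;
}

echo fragment_to_xml('<strong>test</strong>');
```

The trade-off is that this only works for input that is valid XML; tag-soup HTML would still need the HTML parser and some form of cleanup.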

Matt Butcher

Oct 12, 2009, 3:48:26 PM
to devel-querypath
I think I sent the reply directly to Ryan... but I believe the
solution is to do something like:

$html = qp('some.html');

$out = $html->top('body')->xml(TRUE); // TRUE means "skip XML