How do I extract the CDATA section out of XML using D6 and TXMLDocument?
In part of my XML document I have some HTML code which I've put
inside a CDATA section so that the XML parser will ignore it.
Now my code is something like this:
<Campaign>
<Webpage><![CDATA[<br><a href="http://www.borland.com">Rocks</a> ]]></Webpage>
<OtherStuff>234345</OtherStuff>
</Campaign>
Now in my Delphi 6 code I have assigned the DocumentElement to a
node, MyRootNode, and I then interrogate each node like so:
CampaignNode := MyRootNode.ChildNodes.Nodes['Campaign'];
Now my CampaignNode contains a number of child nodes and I can
interrogate all of them except the one containing the CDATA section!?
Can anyone shed some light on this for me please?
I am trying to get the value of the CDATA section by doing the
following:
Mystring := CampaignNode.ChildNodes.Nodes['Webpage'].nodeValue;
Anyway, I get the following error when I try to step to the line above:
"EXMLDocError: Element does not contain a single text node"
p.s. I am using OpenXML as the DOMVendor, if that helps??
Thanks,
Jason.
>
> Hi all,
>
> How do I extract the CDATA section out of XML using D6 and TXMLDocument?
>
> In part of my XML document I have some HTML code which I've put
> inside a CDATA section so that the XML parser will ignore it.
The parser won't exactly ignore it. It won't treat markup characters < and &
in it as markup, but it will still read it, and if you output it via an XML
processor (e.g. XSLT or other parsed process) it will by default turn all <
characters into &lt; and all & characters into &amp;
See http://xml.silmaril.ie/authors/cdata/
///Peter
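The behaviour described above can be seen in a small C# sketch. It is only an illustration under stated assumptions: the class and method names (CDataRoundTrip, ReadWebpage, WriteAsText) are hypothetical, and the XML is the sample from the question with its closing tag corrected. It reads the CDATA content through XmlDocument and then writes the same string back as an ordinary text node, at which point the serializer escapes the markup characters.

```csharp
using System;
using System.Xml;

static class CDataRoundTrip
{
    // Parse the document and return the character data held in <Webpage>.
    // InnerText concatenates the text and CDATA children of the element.
    public static string ReadWebpage(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement["Webpage"].InnerText;
    }

    // Write the same string back as a plain text node (not CDATA).
    // The serializer escapes markup characters such as < and &.
    public static string WriteAsText(string content)
    {
        XmlDocument doc = new XmlDocument();
        XmlElement element = doc.CreateElement("Webpage");
        element.InnerText = content;
        return element.OuterXml;
    }

    static void Main()
    {
        string xml =
            "<Campaign>" +
            "<Webpage><![CDATA[<br><a href=\"http://www.borland.com\">Rocks</a> ]]></Webpage>" +
            "<OtherStuff>234345</OtherStuff>" +
            "</Campaign>";

        string html = ReadWebpage(xml);
        Console.WriteLine(html);              // the raw HTML, unescaped
        Console.WriteLine(WriteAsText(html)); // the same HTML with < written as &lt;
    }
}
```

So the parser does hand the CDATA content to the application; the escaping only shows up when the string is serialized again as normal text.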
Reading CDATA contents from XML
----------------------------------------
There is a popular belief that CDATA contents are very difficult to read. In fact it is very simple, and it can be done with performance-enhancing classes. By 'performance-enhancing classes' I mean the XmlTextReader class in .NET 1.1. It is tempting to use XmlDocument for every XML-related requirement, and of course it is easy to use. But we seldom recognize that it is a resource-hungry and time-consuming class. As a rule of thumb, we can follow this guideline:
Use XmlDocument ONLY if
1) You need to update the XML file immediately after reading some part of it.
2) You need to access a considerably large number of the properties / methods provided by the XmlDocument class
Use XmlTextReader ALWAYS if
1) You do not want to write anything back to the XML.
2) You don't want to navigate back and forth in the DOM model. (Remember this is a forward-only reader, hence the performance advantage)
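As an illustration of the first XmlDocument criterion above (updating the XML right after reading part of it), here is a minimal sketch. The class and method names (UpdateCData, ReplaceCData) and the Root element are hypothetical; JunkData is the element name used in the reader code later in this article. A forward-only reader cannot do this kind of in-place edit, which is where the DOM earns its cost.

```csharp
using System;
using System.Xml;

static class UpdateCData
{
    // Load the XML into a DOM, replace the content of every CDATA section
    // directly under the first element with the given name, and return
    // the updated document as a string.
    public static string ReplaceCData(string xml, string elementName, string newContent)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNode element = doc.SelectSingleNode("//" + elementName);
        if (element != null)
        {
            foreach (XmlNode child in element.ChildNodes)
            {
                if (child.NodeType == XmlNodeType.CDATA)
                {
                    // Read the old value, then overwrite it in place.
                    Console.WriteLine("Old content: " + child.Value);
                    child.Value = newContent;
                }
            }
        }
        return doc.OuterXml;
    }

    static void Main()
    {
        string xml = "<Root><JunkData><![CDATA[old <b>text</b>]]></JunkData></Root>";
        Console.WriteLine(ReplaceCData(xml, "JunkData", "new & <i>improved</i>"));
    }
}
```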
The following code shows how to read the CDATA using XmlTextReader
Sample XML
----------
xml removed due to security restriction. Don't ask me why..ask TopXML!!!
Code C#
-------
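// Requires: using System.Xml; using System.Windows.Forms;
XmlTextReader reader = new XmlTextReader("xml path");
reader.WhitespaceHandling = WhitespaceHandling.None;
bool bFlag = false;
while (reader.Read())
{
    XmlNodeType type = reader.NodeType;
    if (type == XmlNodeType.Element && reader.Name == "JunkData") //#1
    {
        bFlag = true;
    }
    if (bFlag) //#2
    {
        if (type == XmlNodeType.CDATA)
        {
            // Value returns the CDATA text without advancing the reader;
            // ReadString() would move on and the next Read() could skip a node.
            string s = reader.Value;
            MessageBox.Show(s); //#3
        }
    }
}
reader.Close();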
Notes
------
#1) Here you can add more logic to ensure you access the right CDATA if you have more than one.
#2) This ensures that no unnecessary code is executed
#3) This should display the contents between "<![CDATA[" and "]]>"
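Note #1 mentions adding logic to pick the right CDATA section when there are several. One possible way to do that is sketched below; it is not part of the original code, and the names CDataCollector and ReadCData as well as the sample XML are hypothetical. It clears the flag again at the matching end tag, so only CDATA sections inside the named element are collected, and it reads from a TextReader so it can be tried without a file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class CDataCollector
{
    // Collect the content of every CDATA section found inside elements
    // with the given name, using the forward-only XmlTextReader.
    public static List<string> ReadCData(TextReader input, string elementName)
    {
        List<string> results = new List<string>();
        XmlTextReader reader = new XmlTextReader(input);
        reader.WhitespaceHandling = WhitespaceHandling.None;

        bool inside = false;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
            {
                // An empty element (<JunkData/>) has no end tag to reset the flag.
                inside = !reader.IsEmptyElement;
            }
            else if (reader.NodeType == XmlNodeType.EndElement && reader.Name == elementName)
            {
                inside = false;
            }
            else if (inside && reader.NodeType == XmlNodeType.CDATA)
            {
                results.Add(reader.Value);
            }
        }
        reader.Close();
        return results;
    }

    static void Main()
    {
        string xml =
            "<Root>" +
            "<Other><![CDATA[skip me]]></Other>" +
            "<JunkData><![CDATA[a<b]]></JunkData>" +
            "<JunkData><![CDATA[c&d]]></JunkData>" +
            "</Root>";

        foreach (string s in ReadCData(new StringReader(xml), "JunkData"))
        {
            Console.WriteLine(s);
        }
    }
}
```

This simple flag does not handle an element nested inside another element of the same name; tracking reader.Depth would be one way to cover that case.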
Regards,
James Poulose
james....@yahoo.co.uk