
How do I extract the CDATA section out of XML using D6 and TXMLDocument?


Jason

Oct 25, 2005, 12:50:27 AM

Hi all,

How do I extract the CDATA section out of XML using D6 and TXMLDocument?

In part of my XML document I have some HTML code which I've put
inside a CDATA section so that the XML parser will ignore it.

My XML is something like this:
<Campaign>
<Webpage><![CDATA[<br><a href="http://www.borland.com">Rocks</a> ]]></Webpage>
<OtherStuff>234345</OtherStuff>
</Campaign>

In my Delphi 6 code I have assigned the DocumentElement to a
node, MyRootNode, and I then interrogate each node like so:

CampaignNode := MyRootNode.ChildNodes.Nodes['Campaign'];

My CampaignNode contains a number of child nodes and I can
interrogate all of them except the one containing the CDATA section.

Can anyone shed some light on this for me please?

I am trying to get the value of the CDATA section by doing the
following:
Mystring := CampaignNode.ChildNodes.Nodes['Webpage'].nodeValue;

I get the following error when I step to the line above:

"EXMLDocError: Element does not contain a single text node"

P.S. I am using OpenXML as the DOMVendor, if that helps.
Thanks,
Jason.

Peter Flynn

Oct 25, 2005, 6:15:02 PM
Jason wrote:

>
> Hi all,
>
> How do I extract the CDATA section out of XML using D6 and TXMLDocument?
>
> In part of my XML document I have some HTML code which I've put
> inside a CDATA section so that the XML parser will ignore it.

The parser won't exactly ignore it. It won't treat the markup characters < and &
inside it as markup, but it will still read it, and if you output it via an XML
processor (e.g. XSLT or another parsing process) it will by default turn all <
characters into &lt; and all & characters into &amp;.

See http://xml.silmaril.ie/authors/cdata/
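
A minimal sketch of the point above, written with Python's standard xml.dom.minidom rather than the Delphi/OpenXML stack from the question (that substitution is an assumption made purely for illustration). The sample string is modelled on the XML in the question with the closing tag corrected. It shows that the parser does read the CDATA section and keeps it as its own child node, and that writing the same characters back out as an ordinary text node escapes the markup.

```python
from xml.dom import minidom

# Hypothetical sample modelled on the XML in the question.
SAMPLE = (
    '<Campaign>'
    '<Webpage><![CDATA[<br><a href="http://www.borland.com">Rocks</a> ]]></Webpage>'
    '<OtherStuff>234345</OtherStuff>'
    '</Campaign>'
)

doc = minidom.parseString(SAMPLE)
webpage = doc.getElementsByTagName("Webpage")[0]

# The CDATA section is read by the parser and kept as a child node of <Webpage>.
cdata = webpage.firstChild
is_cdata = cdata.nodeType == cdata.CDATA_SECTION_NODE
html = cdata.data  # the raw characters, without the CDATA delimiters

# Writing the same characters back as an ordinary text node escapes the markup.
holder = doc.createElement("Escaped")
holder.appendChild(doc.createTextNode(html))
escaped = holder.toxml()

print(is_cdata)
print(html)
print(escaped)
```

The element is not a plain text element here: its content sits in a child node of type CDATA section, so the value has to be read from that child.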

///Peter

James Poulose

Feb 13, 2007, 8:32:38 AM

Reading CDATA contents from XML
----------------------------------------
There is a popular belief that CDATA contents are difficult to read. In fact it is simple, and it can be done with the faster, lighter reader classes. By that I mean the XmlTextReader class in .NET 1.1. It may look easiest to use XmlDocument for every XML-related requirement, and of course it is easy. But we seldom recognize that it is a resource-hungry and time-consuming class. As a rule of thumb, we can follow this guideline:

Use XmlDocument ONLY if
1) You need to update the XML file immediately after reading some part of it.
2) You need to access a considerable number of the properties / methods provided by the XmlDocument class.

Use XmlTextReader ALWAYS if
1) You do not want to write anything back to the XML.
2) You don't need to navigate back and forth in the DOM. (Remember this is a forward-only reader, hence the performance advantage.)

The following code shows how to read the CDATA using XmlTextReader.

Sample XML
----------
xml removed due to security restriction. Don't ask me why..ask TopXML!!!

Code C#
-------

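// Requires: using System.Xml; using System.Windows.Forms;
XmlTextReader reader = new XmlTextReader("xml path");
reader.WhitespaceHandling = WhitespaceHandling.None;
bool bFlag = false;
try
{
    while (reader.Read())
    {
        XmlNodeType type = reader.NodeType;
        if (type == XmlNodeType.Element && reader.Name == "JunkData") //#1
        {
            bFlag = true;
        }
        if (bFlag && type == XmlNodeType.CDATA) //#2
        {
            // Value holds the CDATA text without advancing the reader,
            // so the loop's own Read() does not skip the next node.
            string s = reader.Value;
            MessageBox.Show(s); //#3
        }
    }
}
finally
{
    reader.Close();
}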

Notes
------
#1) Here you can add more logic to ensure you access the right CDATA section if you have more than one.
#2) This ensures that no unnecessary code is executed.
#3) This should display the contents between "<![CDATA[" and "]]>".
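
For comparison, the same forward-only idea can be sketched outside .NET. The example below uses Python's standard xml.parsers.expat, which is an assumption for illustration and not part of the code above. The function name read_cdata and the sample input are hypothetical (the original sample XML is not available); only the element name "JunkData" comes from the code above.

```python
import xml.parsers.expat


def read_cdata(xml_text, element_name):
    """Return the CDATA sections found inside elements named element_name."""
    found = []
    state = {"depth": 0, "in_cdata": False, "parts": []}

    def start_element(name, attrs):
        if name == element_name:
            state["depth"] += 1

    def end_element(name):
        if name == element_name:
            state["depth"] -= 1

    def start_cdata():
        # Only collect CDATA that sits inside the element we care about.
        if state["depth"] > 0:
            state["in_cdata"] = True
            state["parts"] = []

    def end_cdata():
        if state["in_cdata"]:
            found.append("".join(state["parts"]))
            state["in_cdata"] = False

    def char_data(data):
        # expat may deliver the text in several chunks, so accumulate them.
        if state["in_cdata"]:
            state["parts"].append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.StartCdataSectionHandler = start_cdata
    parser.EndCdataSectionHandler = end_cdata
    parser.CharacterDataHandler = char_data
    parser.Parse(xml_text, True)
    return found


# Hypothetical input, made up for this sketch.
SAMPLE = (
    "<Root>"
    "<JunkData><![CDATA[<b>hello</b> & more]]></JunkData>"
    "<Other><![CDATA[skip]]></Other>"
    "</Root>"
)
result = read_cdata(SAMPLE, "JunkData")
print(result)
```

Like the reader loop above, this never builds a tree in memory. Unlike the flag in the loop above, the depth counter is cleared at the closing tag, so CDATA in later elements is not collected.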

Regards,
James Poulose
james....@yahoo.co.uk


