xslt transform speed on huge files.

Sven Bakke

May 23, 2003, 5:27:58 AM

I have a 30 MB xml file, which I want to transform using xslt.
The xslt transformation is working, but my problem is that it's extremely slow
on huge files. Is there anything I can do with the xslt file to speed it up?


The transformation groups all member nodes with the same id under a name
node with the corresponding id.

The xslt file I use to transform the xml file:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes" />
  <xsl:template match="/">
    <person>
      <xsl:apply-templates/>
    </person>
  </xsl:template>
  <xsl:template match="name">
    <xsl:choose>
      <xsl:when test="position() = 1" >
        <xsl:choose>
          <xsl:when test="following-sibling::name[1]/id = id">
            <xsl:text disable-output-escaping="yes"><![CDATA[<name>]]></xsl:text>
            <xsl:copy-of select="* [not(self::name)]"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:copy-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>
      <xsl:when test="position() = last()" >
        <xsl:choose>
          <xsl:when test="preceding-sibling::name[1]/id = id">
            <xsl:for-each select="member">
              <xsl:copy-of select="."/>
            </xsl:for-each>
            <xsl:text disable-output-escaping="yes"><![CDATA[</name>]]></xsl:text>
          </xsl:when>
          <xsl:otherwise>
            <xsl:copy-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>
      <xsl:when test="preceding-sibling::name[1]/id = id">
        <xsl:choose>
          <xsl:when test="following-sibling::name[1]/id = id">
            <xsl:for-each select="member">
              <xsl:copy-of select="."/>
            </xsl:for-each>
          </xsl:when>
          <xsl:otherwise>
            <xsl:for-each select="member">
              <xsl:copy-of select="."/>
            </xsl:for-each>
            <xsl:text disable-output-escaping="yes"><![CDATA[</name>]]></xsl:text>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>
      <xsl:when test="following-sibling::name[1]/id = id">
        <xsl:choose>
          <xsl:when test="preceding-sibling::name[1]/id = id">
            <xsl:for-each select="member">
              <xsl:copy-of select="."/>
            </xsl:for-each>
          </xsl:when>
          <xsl:otherwise>
            <xsl:text disable-output-escaping="yes"><![CDATA[<name>]]></xsl:text>
            <xsl:copy-of select="* [not(self::name)]"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>


The structure of the XML file looks like this:

<person>
  <name>
    <tull>23</tull>
    <id>1</id>
    <member>
      <id>1</id>
      <lastname>Hill</lastname>
      <firstname>John</firstname>
    </member>
  </name>
  <name>
    <tull>23</tull>
    <id>1</id>
    <member>
      <id>1</id>
      <lastname>Andersen</lastname>
      <firstname>Ole</firstname>
    </member>
  </name>
  <name>
    <tull>23</tull>
    <id>1</id>
    <member>
      <id>1</id>
      <lastname>Eriksen</lastname>
      <firstname>Tor</firstname>
    </member>
  </name>
  <name>
    <tull>23</tull>
    <id>2</id>
    <member>
      <id>2</id>
      <lastname>Bryggen</lastname>
      <firstname>Petter</firstname>
    </member>
  </name>
  <name>
    <tull>23</tull>
    <id>5</id>
    <member>
      <id>5</id>
      <lastname>Pedersen</lastname>
      <firstname>Ole</firstname>
    </member>
  </name>
</person>

The result file looks like this:

<person>
  <name>
    <tull>23</tull>
    <id>1</id>
    <member>
      <id>1</id>
      <lastname>Hill</lastname>
      <firstname>John</firstname>
    </member>
    <member>
      <id>1</id>
      <lastname>Andersen</lastname>
      <firstname>Ole</firstname>
    </member>
    <member>
      <id>1</id>
      <lastname>Eriksen</lastname>
      <firstname>Tor</firstname>
    </member>
  </name>
  <name>
    <tull>23</tull>
    <id>2</id>
    <member>
      <id>2</id>
      <lastname>Bryggen</lastname>
      <firstname>Petter</firstname>
    </member>
  </name>
  <name>
    <tull>23</tull>
    <id>5</id>
    <member>
      <id>5</id>
      <lastname>Pedersen</lastname>
      <firstname>Ole</firstname>
    </member>
  </name>
</person>

Any help will be greatly appreciated.

Regards
Sven Bakke.

Sergey Dubinets

May 23, 2003, 6:44:39 PM

Do you use XPathDocument or XmlDocument? You may want to try the other one.

I'm afraid the actual problem is that XslTransform in System.Xml is not smart
enough to notice that following-sibling::name[1] selects only the first node,
so it compares the position of every following-sibling::name node to 1.
That makes the whole transformation an N * N algorithm.

MSXML 4.0 does this optimization.
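
To make the N * N point concrete, here is a rough cost model in Python. It is only a sketch of the reasoning above, not the actual System.Xml or MSXML code, and the function names are made up for illustration. It counts how many sibling nodes get visited when every name node evaluates following-sibling::name[1].

```python
def visits_scanning_all(n):
    """Engine that walks every following sibling and tests its position against 1."""
    visited = 0
    for i in range(n):                      # each of the n name nodes
        for position in range(1, n - i):    # positions 1..(n - i - 1) of its following siblings
            visited += 1                    # visited even though only position 1 can match
    return visited


def visits_first_only(n):
    """Engine that stops as soon as the first following sibling is found."""
    visited = 0
    for i in range(n):
        if i + 1 < n:                       # the last node has no following sibling
            visited += 1
    return visited


print(visits_scanning_all(1000))  # 499500
print(visits_first_only(1000))    # 999
```

With N nodes the first variant visits N * (N - 1) / 2 siblings, the second only N - 1, which is why the difference shows up on a 30 MB file and not on a small one.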

Sergey
--
This posting is provided "AS IS" with no warranties, and confers no rights.

"Sven Bakke" <sven....@itsolutions.no> wrote in message
news:bakpj1$dt1$1...@news.tdcnorge.no...

Oleg Tkachenko

May 25, 2003, 5:03:04 AM

Sven Bakke wrote:

> I have a 30 MB xml file, which I want to transform using xslt.
> The xslt transformation is working, but my problem is that it's extremely slow
> on huge files. Is there anything I can do with the xslt file to speed it up?
>
>
> The transformation groups all member nodes with the same id under a name
> node with the corresponding id.

When grouping on big XML documents it's usually much more effective to use keys.
Try this transformation and measure it:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:key name="idKey" match="name" use="id"/>
  <xsl:template match="person">
    <person>
      <xsl:apply-templates
          select="name[generate-id()=generate-id(key('idKey',id)[1])]"/>
    </person>
  </xsl:template>
  <xsl:template match="name">
    <name>
      <xsl:copy-of select="*[not(self::member)]"/>
      <xsl:copy-of select="key('idKey', id)/member"/>
    </name>
  </xsl:template>
</xsl:stylesheet>
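
For readers who have not used xsl:key, the idea can be sketched in Python with the standard xml.etree.ElementTree module. This is only an analogue of the stylesheet above, not how any XSLT processor is implemented; the function name group_members_by_id and the shortened sample data are assumptions made for the illustration. The point is that the index is built in one pass and each group is then found by lookup, with no sibling scanning.

```python
import xml.etree.ElementTree as ET


def group_members_by_id(person):
    """Return a new <person> with one <name> per distinct id, holding all its members."""
    # Pass 1: build the index, roughly what xsl:key does (id text -> list of <name>).
    index = {}
    for name in person.findall("name"):
        index.setdefault(name.findtext("id"), []).append(name)

    # Pass 2: emit one <name> per id, in order of first occurrence
    # (dicts keep insertion order in Python 3.7+).
    result = ET.Element("person")
    for names in index.values():
        first = names[0]
        grouped = ET.SubElement(result, "name")
        # Non-member children (tull, id) are taken from the first <name> of the group.
        grouped.extend([child for child in first if child.tag != "member"])
        # Then every <member> under any <name> with this id is attached.
        for name in names:
            grouped.extend(name.findall("member"))
    return result


sample = """<person>
<name><tull>23</tull><id>1</id><member><id>1</id><lastname>Hill</lastname></member></name>
<name><tull>23</tull><id>1</id><member><id>1</id><lastname>Andersen</lastname></member></name>
<name><tull>23</tull><id>2</id><member><id>2</id><lastname>Bryggen</lastname></member></name>
</person>"""

for name in group_members_by_id(ET.fromstring(sample)):
    print(name.findtext("id"), [m.findtext("lastname") for m in name.findall("member")])
# 1 ['Hill', 'Andersen']
# 2 ['Bryggen']
```

Note that a key groups every name with the same id wherever it appears in the document, while the original stylesheet only merged neighbouring name nodes. On input that is already ordered by id, as in the sample, the result is the same.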

--
Oleg Tkachenko
http://www.tkachenko.com/blog
Multiconn Technologies, Israel

Sven Bakke

May 27, 2003, 9:49:15 AM

Thanks.

With your xslt transformation file and MSXML 4, the transformation
completed in a few seconds.

Regards
Sven Bakke


"Oleg Tkachenko" <ol...@multiconn.com> wrote in message
news:urVaCPpI...@TK2MSFTNGP10.phx.gbl...