XSLT Processors Comparison


dctrud

Dec 2, 2009, 5:45:44 AM
to spctools-discuss
I've just done a quick comparison of the speeds of various XSLT
processors for transforming .prot.xml to HTML. There is a marked
difference between the processors, and xsltproc, which is the TPP
default, is not the quickest.

Tests were performed on Ubuntu 9.0x 64-bit on a DELL R600 with dual
Xeon 5500 CPUs and 32GB RAM. All processors were installed from their
Ubuntu packages. The input document was a large 200 MB .prot.xml file
resulting from an OMSSA search of the 72-run MaxQuant dataset
downloaded from ProteomeCommons:

xsltproc - 1011.96s
xalan - 1206.95s
saxon-xslt - 491.95s
saxonb-xslt - 132.19s

saxonb-xslt works for me as a direct replacement for xsltproc in the
$xsltproc definition in protxml2html.pl

I've not tried the commercial Saxon-SA / Saxon-EE from Saxonica.com,
but they are supposedly faster still.

DT



dctrud

Dec 2, 2009, 7:28:39 AM
to spctools-discuss
I have obtained a Saxon-EE evaluation to try it. Same .prot.xml file,
same server: 35.04s (3.5% of the xsltproc run-time). The downside is
that it costs 300 GBP per server.

DT

Brian Pratt

Dec 2, 2009, 11:53:18 AM
to spctools...@googlegroups.com
Impressive! I'm unclear, though, on the practicalities of how it replaces xsltproc (which is an executable). Presumably there's a script that invokes Java? In that case we have a TPP Java dependency we didn't have before. That is not necessarily an insurmountable problem, and it is one we'll probably have to address sooner rather than later anyway.
 
Brian Pratt

--

You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
To post to this group, send email to spctools...@googlegroups.com.
To unsubscribe from this group, send email to spctools-discu...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/spctools-discuss?hl=en.



dctrud

Dec 2, 2009, 12:17:09 PM
to spctools-discuss
Brian,

The Ubuntu libsaxonb-java package in the universe repository installs
the script /usr/bin/saxonb-xslt which fires up saxonb under Java. It
expects filenames to be specified as the TPP does to Xalan, i.e.

saxonb-xslt <xml file> <xsl file>

... so it will work if you just replace references to
/usr/bin/xsltproc with /usr/bin/saxonb-xslt in the .pl scripts.
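
As a rough illustration of that substitution, here is a minimal Python sketch that swaps the processor path in the text of a script. The sample line is hypothetical; the thread does not show how protxml2html.pl actually defines $xsltproc, so check the real script before patching anything.

```python
OLD = "/usr/bin/xsltproc"
NEW = "/usr/bin/saxonb-xslt"


def swap_processor(script_text, old=OLD, new=NEW):
    """Return (patched_text, count), replacing every reference to `old` with `new`."""
    return script_text.replace(old, new), script_text.count(old)


# Hypothetical line; the real definition in protxml2html.pl may look different.
sample = 'my $xsltproc = "/usr/bin/xsltproc";\n'
patched, count = swap_processor(sample)
print(count, patched, end="")
```

Returning the count makes it easy to confirm that a script really contained a reference before writing the patched text back.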

My quick tests were done on a command-line xslt transform. Actual
performance when using protxml2html.pl via a web browser doesn't
improve as much since the resulting html is still huge, and takes time
to transfer and for the browser to display. It does seem very useful
for getting huge results files out into text format quickly by
invoking protxml2html on the command line though. Another thing to
note is that on very small files xsltproc is probably still faster due
to the overhead of starting up a JVM for Saxon.
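
For anyone wanting to repeat this kind of command-line measurement, a minimal Python sketch follows. It times whole process runs, so JVM start-up is included in the Saxon figures. The helper name time_command is made up for this example, and the argument order each processor expects should be checked against its own documentation.

```python
import subprocess
import time


def time_command(argv, repeats=1):
    """Return the best wall-clock time in seconds over `repeats` runs of argv."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard the transform output so terminal I/O does not skew the timing.
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        best = min(best, time.perf_counter() - start)
    return best
```

Following the invocation quoted earlier in this message, a call might look like time_command(["saxonb-xslt", "interact.prot.xml", "interact.xsl"]), where both file names are placeholders.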

The commercial Saxon-EE is very impressive with its roughly 29x
speed-up over xsltproc, but the free speed-up of about 7.7x with
saxonb is still very nice. I also tested Saxon-HE (the new open source
version that is replacing Saxonb, but not yet packaged for Ubuntu),
and it's about the same as saxonb.
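
The speed-up factors quoted above can be re-derived from the timings posted earlier in the thread. The sketch below only does that arithmetic; the function name speedups is illustrative.

```python
# Wall-clock times in seconds, copied from the measurements in this thread.
timings = {
    "xsltproc": 1011.96,
    "xalan": 1206.95,
    "saxon-xslt": 491.95,
    "saxonb-xslt": 132.19,
    "Saxon-EE": 35.04,
}


def speedups(times, baseline="xsltproc"):
    """Return how many times faster each processor is than the baseline."""
    base = times[baseline]
    return {name: base / t for name, t in times.items()}


for name, factor in speedups(timings).items():
    print(f"{name:12s} {factor:5.2f}x")
```

A factor below 1 means the processor is slower than xsltproc, which is the case for xalan on this file.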

DT

Luis Mendoza

Dec 14, 2009, 7:34:47 PM
to spctools...@googlegroups.com
Hello Dave,
Thanks for the suggestion and research.  We are planning on developing a completely new prot-xml viewer within the next few months, since the current one has severe limitations when dealing with very large files.

In the meantime, David and I have just checked in a version of protxml2html that performs about 4-5 times faster than the previous one (still using xsltproc).  It would be interesting to see if this speed-up holds with other xslt engines; we may tweak this a bit more in the next few weeks before the full re-write.

Cheers,
--Luis



Dave Trudgian

Dec 15, 2009, 9:14:41 AM
to spctools...@googlegroups.com
Hi Luis,

I fetched the updated script from svn and have obtained timings on our
server for the large test file. They show that the performance of the
interact.xml+interact.xsl transform has indeed changed for all
processors:

xsltproc - 72.41s (prev 1011.96s)
xalan - 111.44s (prev 1206.95s)
saxon-xslt - 82.96s (prev 491.95s)
saxonb-xslt - 20.44s (prev 132.19s)
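
To put the two sets of numbers side by side, this short Python sketch computes how much each processor gained from the update and how far ahead of xsltproc each one is now. It uses only the figures reported in this thread.

```python
# Seconds before and after the protxml2html update, as reported in this thread.
before = {"xsltproc": 1011.96, "xalan": 1206.95,
          "saxon-xslt": 491.95, "saxonb-xslt": 132.19}
after = {"xsltproc": 72.41, "xalan": 111.44,
         "saxon-xslt": 82.96, "saxonb-xslt": 20.44}

# Improvement of each processor against its own earlier run.
gain = {name: before[name] / after[name] for name in before}

# How far ahead of xsltproc each processor is on the updated stylesheet.
lead = {name: after["xsltproc"] / t for name, t in after.items()}

for name in before:
    print(f"{name:12s} gain {gain[name]:5.1f}x   vs xsltproc now {lead[name]:4.2f}x")
```

On this file xsltproc gains roughly 14x and saxonb-xslt roughly 6.5x, so the gap between the two narrows from about 7.7x to about 3.5x.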

Certainly for us, using saxonb-xslt, it's now feasible to look at
protxml results with 4500+ protein IDs. The limitation is now the web
browser coping with the large HTML file rather than slow XSLT
processing.

DT




--
Dr. David Trudgian
Bioinformatician in Proteomics
University of Oxford

Mon-Thu: CCMP, Roosevelt Drive
Tel: (+44) (01865 2)87784

Friday : Dunn School of Pathology, S. Parks Rd.
Tel: (+44) (01865 2)75557



