
XML-WRT 2.0 (XML compressor) has been released


ini...@gmail.com

Jun 14, 2006, 5:22:23 AM
XML-WRT 2.0 (XML compressor) has been released at:
http://www.ii.uni.wroc.pl/~inikep/research/XML/XML-WRT20.zip

New features in XML-WRT 2.0 (14.06.2006):
-internal zlib and LZMA compression
-the input XML file is split into containers according to start-tags and end-tags, so content under the same tag is sent to the same container
-container for dates in the formats 1980-02-31 and 01-MAR-1920
-container for times in the format 11:30pm
-container for numbers from 1900 to 2155 (years)
-container for pages in the format "x-y", where y-x<256, e.g. "120-148", "1480-1600"
-container for numbers in the format "x-y", e.g. "1234-0", "87-623"
-container for numbers with two digits after the period, e.g. "102.00", "12.01"
-container for numbers from 0.0 to 24.9 (one digit after the period), e.g. "12.0", "9.9"
-URLs (starting with "http:"), e-mails (x@y.z), and "&uuml;" are added to the dynamic dictionary
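
To illustrate the container idea from the list above, here is a rough Python sketch. It is not XML-WRT's actual code: the function names, the container names, the regular expressions, and the reading of "y-x<256" as 0 <= y-x < 256 are all assumptions made for the example. How XML-WRT combines per-tag and per-type containers is not stated here, so the sketch keeps the two steps separate.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def classify(value):
    """Return a hypothetical container name for one text value.

    The rules mirror the feature list only loosely; the real
    XML-WRT matching rules may differ.
    """
    # Dates such as 1980-02-31 or 01-MAR-1920 (format only, no calendar check).
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value) or \
            re.fullmatch(r"\d{2}-[A-Z]{3}-\d{4}", value):
        return "date"
    # Times such as 11:30pm.
    if re.fullmatch(r"\d{1,2}:\d{2}(am|pm)", value):
        return "time"
    # Years 1900..2155 (256 distinct values, so one byte would suffice).
    if re.fullmatch(r"\d{4}", value) and 1900 <= int(value) <= 2155:
        return "year"
    # "x-y" values: page ranges when the span is small, number pairs otherwise.
    m = re.fullmatch(r"(\d+)-(\d+)", value)
    if m:
        x, y = int(m.group(1)), int(m.group(2))
        return "pages" if 0 <= y - x < 256 else "number_pair"
    # Numbers with exactly two digits after the period.
    if re.fullmatch(r"\d+\.\d{2}", value):
        return "two_decimals"
    # Numbers from 0.0 to 24.9 with one digit after the period.
    if re.fullmatch(r"\d+\.\d", value) and float(value) <= 24.9:
        return "one_decimal"
    return "text"


def split_by_tag(xml_text):
    """Group element text by tag name, so content under the same tag
    lands in the same container (assumed keying: the bare tag name)."""
    containers = defaultdict(list)
    for elem in ET.fromstring(xml_text).iter():
        text = (elem.text or "").strip()
        if text:
            containers[elem.tag].append(text)
    return dict(containers)


sample = ("<lib><book><year>1999</year><pages>120-148</pages></book>"
          "<book><year>2006</year><pages>10-20</pages></book></lib>")
by_tag = split_by_tag(sample)
```

Grouping similar values this way tends to help a back-end compressor such as zlib or LZMA, because each container holds data with similar statistics.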

best regards,
Przemyslaw

Matt Mahoney

Jun 15, 2006, 4:13:30 PM

Updated compression results for xml-wrt + ppmonstr are posted here:
http://cs.fit.edu/~mmahoney/compression/text.html

Compression ratio is improved slightly, from .1547 to .1542 on enwik9.
I haven't tested the standalone compression yet.
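
For a sense of scale, the ratios above can be turned into byte counts. This is a small sketch that assumes enwik9 is the 10^9-byte benchmark file and that the ratio means compressed size divided by original size; the function name is made up for the example.

```python
ENWIK9_BYTES = 10**9  # assumption: enwik9 is exactly 10^9 bytes


def compressed_size(ratio, original=ENWIK9_BYTES):
    """Compressed size in bytes for a given compressed/original ratio."""
    return round(ratio * original)


old_size = compressed_size(0.1547)
new_size = compressed_size(0.1542)
saved = old_size - new_size       # bytes saved by the new version
relative_gain = saved / old_size  # fraction of the old compressed size
```

Under those assumptions the change is about 500,000 bytes, or roughly 0.3% of the previous compressed size.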

-- Matt Mahoney
