A new version with compressed revision storage is in the works


Luca

Sep 4, 2009, 7:30:41 PM
to wikitrust-devel
Hi all,

I am working flat out on a new release of WikiTrust that implements
high-efficiency revision compression on disk.
"Colored" revisions, that is, revisions annotated with trust,
authorship, and origin information, are first grouped by the page
they belong to, in groups of 20-30, and each group is then
compressed and stored as a unit.
This achieves very large space savings compared to the current setup:
my estimate is that we can reduce the storage needs of the WikiTrust
extension to 1/4 of what they are now.
I am also implementing other tricks and optimizations to reduce the
disk bandwidth that WikiTrust requires.
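
To give a feel for the grouping scheme described above, here is a minimal
sketch in Python. It is an illustration only, not the WikiTrust code: the
function names, the group size of 25, the use of zlib, and the NUL byte
used as a separator are all assumptions made for the example. The idea it
shows is that successive revisions of a page share most of their text, so
compressing them as one blob lets the compressor reuse that shared text.

```python
import zlib
from collections import defaultdict

GROUP_SIZE = 25      # assumption: a value inside the 20-30 range mentioned above
SEPARATOR = b"\x00"  # assumption: revision text never contains a NUL byte


def group_by_page(revisions):
    """Collect (page_id, rev_id, text) tuples into per-page lists."""
    pages = defaultdict(list)
    for page_id, rev_id, text in revisions:
        pages[page_id].append((rev_id, text))
    return pages


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def compress_group(group):
    """Join the texts of one group and compress them as a single blob."""
    blob = SEPARATOR.join(text.encode("utf-8") for _, text in group)
    return zlib.compress(blob, 9)


def decompress_group(data):
    """Inverse of compress_group: return the list of revision texts."""
    return [part.decode("utf-8") for part in zlib.decompress(data).split(SEPARATOR)]


def build_blobs(revisions, size=GROUP_SIZE):
    """Map (page_id, group_index) -> (list of rev_ids, compressed blob)."""
    blobs = {}
    for page_id, revs in group_by_page(revisions).items():
        revs.sort()  # order by rev_id so neighbouring revisions sit together
        for index, group in enumerate(chunks(revs, size)):
            rev_ids = [rev_id for rev_id, _ in group]
            blobs[(page_id, index)] = (rev_ids, compress_group(group))
    return blobs
```

Reading back one revision means decompressing its whole group, so the
group size trades space against read cost; the sketch simply fixes it at
a constant.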

All these changes will make WikiTrust much better suited to the analysis
of very large wikis. Our current work is motivated by our desire to
do a demo on the English Wikipedia. However, all installations should
benefit from this.

The only drawback is that the new version of WikiTrust will use an
incompatible DB layout, so users will have to first remove the
wikitrust tables, then add the new tables and re-compute all
information. This should not be too much trouble (it's three
commands total), and I think the benefits are worth it.

I will keep you up to date when I have something that works in the
luca branch.

Luca