Interesting - in my earlier wiki (which I wrote many years ago) I did the following:
1) save the latest version (as is)
2) save the diff between the latest version and the one before it (latest-1)
3) concatenate the latest version and all the diffs, then compress
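A minimal sketch of these three steps in Python, assuming the versions are held as a list of strings (oldest first). The name build_archive, the NUL separator and the choice of difflib and zlib are assumptions made for illustration; the original wiki may well have used a different diff format and compressor.

```python
import difflib
import zlib


def build_archive(versions):
    """Pack the newest version plus the chain of diffs, then compress.

    `versions` is a list of page texts, oldest first (hypothetical input).
    Returns (raw_bytes, compressed_bytes).
    """
    latest = versions[-1]
    diffs = []
    # Walk backwards: latest vs latest-1, then latest-1 vs latest-2, ...
    for newer, older in zip(reversed(versions), reversed(versions[:-1])):
        diff = difflib.unified_diff(
            older.splitlines(keepends=True),
            newer.splitlines(keepends=True),
            n=0,  # no context lines, so a pure addition costs ~ the added text
        )
        diffs.append("".join(diff))
    # Latest version first, then the diffs, separated by NUL (an assumption).
    raw = "\x00".join([latest] + diffs).encode("utf-8")
    return raw, zlib.compress(raw, 9)


if __name__ == "__main__":
    history = ["first line\n", "first line\nsecond line\n"]
    raw, packed = build_archive(history)
    print(len(raw), len(packed))
```

On very small inputs the compressed form can come out larger than the raw one because of zlib's fixed overhead; the saving only shows up on pages of realistic size.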
Now a funny thing happens - consider pure additions to a file
If we start off with K bytes (vsn1) and add N1 bytes
the size of the new file is K + N1 bytes and the size of the
diff is ~N1 bytes (essentially 'add <string> here')
So the total stored size is K + 2*N1 bytes
After adding N1, N2, N3, ... bytes the total stored size is
K + 2*(N1+N2+N3+...) bytes
Now the size of the final file (as displayed) is
K + N1 + N2 + N3 + ... bytes
Asymptotically N1+N2+N3+... >> K so we can ignore K
And so the stored size = 2 * size of all updates, i.e. about twice the size of the final file
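The arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting argument, assuming each diff costs exactly as many bytes as the text it adds; the function name sizes and the sample numbers are made up for illustration.

```python
def sizes(k, additions):
    """Return (displayed, stored) sizes in bytes for pure additions.

    k         -- size of the first version (K)
    additions -- list of bytes added per edit (N1, N2, N3, ...)
    Assumes each diff is exactly as large as the text it adds.
    """
    total_added = sum(additions)
    displayed = k + total_added        # the latest version on its own
    stored = displayed + total_added   # latest version plus one diff per edit
    return displayed, stored


if __name__ == "__main__":
    displayed, stored = sizes(1000, [500] * 200)
    # The ratio approaches 2 as the additions come to dominate K.
    print(displayed, stored, stored / displayed)
```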
And if we compress everything we win back a lot of space
The conclusion was that we can store all old versions AND the result is quite often smaller than
the uncompressed text of the page itself
This is true for pure additions to a page - removing data is even better
I'll do this when I've got my server going :-)
/Joe