DC was mentioned a few times on comp.compression, so I
thought it might be of some interest if I made it public.
An implementation can be found at:
ftp://ftp.elf.stuba.sk/pub/pc/pack/dc124.zip
Results (default):
732k - Calgary
378k - Canterbury
893k - ACT Text
Results (-cprea, no preprocessing):
769k - Calgary
487k - Canterbury
929k - ACT Text
DC is also quite fast for a full-order blocksorter... (on
text about 250k/sec on my P166)
--
Edgar
Mikael
In article <38F5F805...@T-Online.de>,
mikael.l...@spray.se wrote:
> Hi again Edgar!
> First I must congratulate you for a very well done job!
Thanks... :-)
> Do you use the radixsort by Bentley and McIlroy?
No, I use my own sorting algorithm (AFAIK it isn't published), with
average complexity O(n*log log n) - and of course worst case O(n*n*log log
n) (maybe better than that, but still too slow to be practical on skewed
data)
> How much memory does dc use? 5n + C or 6n + C ... ?
Currently 6n+C, but I can reduce that to 5n+C with some loss of
speed...
> The processor test was a little strange ... why do it?
Well, because I use some CPU instructions that aren't supported on all
processors. For example Szymon's machine doesn't support RDTSC (that's the
reason I introduced this test actually :-). Anyway, you can make it disappear
with the switch -v1..4, or with -n...
> I never got the same results as you though.
I compressed the corpora with dc *.* and dc -cprea *.* respectively.
> But you must have compressed file by file in the corpora.
> It nevertheless gives amazing results!
Thanks!
> I recognize most of it in your coder but I'm not sure about the distance
> coder though. Is it from any of the dcc papers (e.g. dcc98 something)?
> That is, its output is symbol by symbol, using a count of where each
> symbol was last seen. E.g. aaabaacccca may become 00010436000,
> coding abc in this order.
No, it's not that coder; I read that paper long after DC was
already implemented. Still, my distance coder has some similarities with the
one described in the DCC paper (and a nice feature: it reflects exactly the
order-0 entropy of the input, unlike MTF). I'll write a paper about it
when I get the time....
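For readers following along, the quoted aaabaacccca example can be reproduced with a short sketch. This is my own toy reconstruction of the DCC-style per-symbol scheme described in the quote, not DC's actual coder; the function name is made up:

```python
def dcc_style_code(s):
    """Toy per-symbol distance coder: for each distinct symbol, in
    first-appearance order, emit for every occurrence the number of
    characters since its previous occurrence (or since the start)."""
    order = []
    for c in s:
        if c not in order:
            order.append(c)  # first-appearance order: a, b, c for "aaabaacccca"
    out = []
    for sym in order:
        prev = -1  # index of the previous occurrence of sym
        for i, c in enumerate(s):
            if c == sym:
                out.append(i - prev - 1)  # characters in between
                prev = i
    return out

print(dcc_style_code("aaabaacccca"))
# -> [0, 0, 0, 1, 0, 4, 3, 6, 0, 0, 0], i.e. the quoted 00010436000
```

Note the coder makes one pass per symbol (the "256 steps" Edgar mentions avoiding below).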
--
Edgar
DC is a magnificent compressor! Very strong compression
at a good speed.
> No, I use my own sorting algorithm (AFAIK it isn't published), with
>average complexity O(n*log log n) - and of course worst case O(n*n*log log
>n) (maybe better than that, but still too slow to be practical on skewed
>data)
Yes, a file obtained by copy /b file+file file2 made DC think :(
Have you tested quadrants like BZIP2 and BWC?
If your sorting algorithm allows it, of course...
What's strange to me is the small difference between compression and
decompression speed. Compared to my YBS, DC is 20% faster on
compression, but 20% slower on decompression...
> Well, because I use some CPU instructions that aren't supported on all
>processors. For example Szymon's machine doesn't support RDTSC (that's the
>reason I introduced this test actually :-). Anyway, you can make it
>disappear with the switch -v1..4, or with -n...
Does DC use MMX-instructions?
With best regards,
Vadim.
I have an alternative coder which is approximately 1.8x the encoding
speed of the fastest arithmetic coders and more than double the
decoding speed. It can achieve compression ratios that are
approximately the same as arithmetic coders (e.g. the range coder) and
does not require MTF, and eliminates the need for RLE altogether. If
you like, I will make the beta exe available so that we can all see
what the fastest blocksorters can do with my fast coder.
I tried DC out over the weekend. Damn, that's some good work.
-Michael A Maniscalco
michaelm...@homepage.com
Vadim Yoockin wrote:
> DC is a magnificent compressor! Very strong compression
> at a good speed.
Thank you!
> > No, I use my own sorting algorithm (AFAIK it isn't published), with
> >average complexity O(n*log log n) - and of course worst case O(n*n*log log
> >n) (maybe better than that, but still too slow to be practical on skewed
> >data)
>
> Yes, a file obtained by copy /b file+file file2 made DC think :(
> Have you tested quadrants like BZIP2 and BWC?
> If your sorting algorithm allows it, of course...
Not yet. I have at the moment much to do, and just not enough time :-)
> What's strange to me is the small difference between compression and
> decompression speed. Compared to my YBS, DC is 20% faster on
> compression, but 20% slower on decompression...
That's because distance coding isn't symmetrical. The encoding algorithm
differs substantially from the decoding algorithm, and I haven't yet
optimized the latter...
> Does DC use MMX-instructions?
No, not the public version...
> With best regards,
> Vadim.
--
Edgar
> DC was mentioned a few times on comp.compression, so I
> thought it might be of some interest if I made it public.
> An implementation can be found at:
>
> ftp://ftp.elf.stuba.sk/pub/pc/pack/dc124.zip
[...]
kaptke...@my-deja.com wrote:
> I have an alternative coder which is approximately 1.8x the encoding
> speed of the fastest arithmetic coders and more than double the
> decoding speed. It can achieve compression ratios that are
> approximately the same as arithmetic coders (e.g. the range coder) and
> does not require MTF, and eliminates the need for RLE altogether. If
> you like, I will make the beta exe available so that we can all see
> what the fastest blocksorters can do with my fast coder.
Hmm, that sounds interesting. Yes, I'd love to see your alternative
coder...
> I tried DC out over the weekend. Damn, that's some good work.
:-)
--
Edgar
"Pred." wrote:
> Good work Edgar! Can you elaborate briefly on your modelling technique?
Well, the distance coder encodes, as the name says, the distances between
the places where the same symbol appears. The main difference from the
algorithm presented at DCC is that I don't do a step for each symbol (256
steps altogether), but encode everything at once. This leads to some
interesting properties that improve compression substantially...
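Edgar doesn't spell out DC's actual scheme, but the basic idea of coding "distances between the places where the same symbol appears" in a single pass can be sketched like this (a hypothetical toy, not DC itself; the name is made up):

```python
def gap_code(s):
    """One-pass gap coder: every symbol is replaced by the distance
    to the previous occurrence of the same symbol; 0 marks a first
    occurrence (a real gap is always >= 1, so 0 is unambiguous)."""
    last = {}   # symbol -> index of its last occurrence
    out = []
    for i, c in enumerate(s):
        out.append(i - last[c] if c in last else 0)
        last[c] = i
    return out

print(gap_code("aaab"))  # -> [0, 1, 1, 0]: runs become runs of 1s
```

This is only a simplification; Edgar's remark about matching the order-0 entropy refers to his own, more refined construction.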
> No patents pending, I hope :-)
No, and I don't intend to patent it... I find patents horrible, they
should be prohibited :-)
--
Edgar
This is very interesting because it sounds a lot like my M99 coder
technique. A whole lot. But lately I have been thinking of modeling the
data symbol type by symbol type, instead of just using my coder, which
encodes all symbols at the same time. This might have its advantages.
That is, take a stream like AAABBBCCAABBAAAAACCC. We could encode this
by offset, starting with A, like so: 0005020000. This makes the
remaining characters "move together", yielding BBBCCBBCCC, and we then
only have to encode the B's. We don't even need to encode the C's at
all; just make a note in the coded stream of how many characters
were in the stream altogether and which symbol type is the one left
out. There could be an advantage in choosing a wise pattern for which
symbols to code first. For instance, after say an MTF, we could encode
all but the zeros. The reason I think this might work is that with
a coder such as mine, fixed-length codes are used; however, the size of
the code makes no difference whatsoever to the compression ratio, so
it's possible to code large offsets that would be more than one byte
without any penalty.
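The AAABBBCCAABBAAAAACCC walk-through above can be rendered directly as a toy sketch (my own Python reconstruction of the described step, not M99; names are made up):

```python
def encode_symbol_offsets(s, sym):
    """Encode each occurrence of `sym` as the number of other
    characters since its previous occurrence, and drop `sym` from
    the stream so the remaining characters 'move together'."""
    offsets, residual, gap = [], [], 0
    for c in s:
        if c == sym:
            offsets.append(gap)  # other characters since the last sym
            gap = 0
        else:
            gap += 1
            residual.append(c)
    return offsets, "".join(residual)

offsets, rest = encode_symbol_offsets("AAABBBCCAABBAAAAACCC", "A")
# offsets -> [0, 0, 0, 5, 0, 2, 0, 0, 0, 0]  (the quoted 0005020000)
# rest    -> "BBBCCBBCCC"
```

To invert this, a decoder would also need the original stream length and the identity of the left-out final symbol, as the post notes.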
-Michael A. Maniscalco
www.michaelmaniscalco.homepage.com
I'll put the beta on my site. The beta has no BWT component. This is
because I want the beta to demonstrate the codec speed and not blur
this with the speed of my currently less than speedy BWT
implementation. But you can test the coder by encoding/decoding pre-
BWTed streams. This will tell us if my coder can encode differently
modeled BWT data as well as arithmetic in different cases. The only
catch is that the beta M99 coder compresses one byte fixed length
codes, so if the technique used to model the data generates anything
other than one byte codes, this current exe. will probably not do much.
However, the size of the fixed length code is of no importance to the
actual coding method so in the near future, I will code the second
version which will code two byte codes. This will make it possible to
work with data that is modeled by, say, PPM. It will take the
probabilities that would normally go to an arithmetic coder and choose
one two-byte number which falls inside the range to represent it.
But that's a bit in the future.
At any rate, I'll put the beta M99 coder at my site at
www.michaelmaniscalco.homepage.com
-michael maniscalco
kaptke...@my-deja.com wrote:
[DC description]
> This is very interesting because it sounds a lot like my M99 coder
> technique. A whole lot. But I have been thinking of modeling the data
> symbol type by symbol type lately, instead of just using my coder which
> encodes all symbols at the same time. This might have its advantages.
> That is, take a stream like AAABBBCCAABBAAAAACCC. We could encode this
> by offset, starting with A, like so: 0005020000. This makes the
> remaining characters "move together", yielding BBBCCBBCCC, and we then
> only have to encode the B's. We don't even need to encode the C's at
Yes, these are inverted frequencies. Though I don't think this is the
best way to do it....
> There could be an advantage in choosing a wise pattern for which
> symbols to code first. for instance after say a MTF, we could encode
> all but the zeros.
Hmm, I'm not sure if different patterns will give different compression
ratios - however, they give different speeds, so it's worth doing anyway...
--
Edgar
kaptke...@my-deja.com wrote:
[M99]
> I'll put the beta on my site. The beta has no BWT component. This is
Okay, I downloaded M99, and it seems to work correctly, and quite
fast... Good work!
Is there going to be an M99 paper or something describing this method?
--
Edgar
I'm sure at some point I'll document it, but what little time I have
these days I am dedicating to fast and efficient modeling techniques
to use prior to my coder. And perhaps to optimizing the M99 coder.
It's fairly well written as it stands, but there is much room for small
improvements. I think if I ever get the time, an asm version might
gain 20% speed or so.
-Michael Maniscalco
20% ??? My experience is just a couple of percent. It's not worth it!
Mikael
In article <8dlq7c$3q7$1...@nnrp1.deja.com>,
kaptke...@my-deja.com wrote:
> In article <38FC813E...@T-Online.de>,
> Binder Edgar <Binde...@T-Online.de> wrote:
mikael.l...@spray.se wrote:
[asm optimizing the M99 coder]
> Assembler ...
> Yeah, this is great. However, what if most of us want to avoid doing
> work on a slow, limited and generally annoying platform that the
> average wintel machine truly is? What if we want to use say a
> power macintosh or sun ultra sparc or even linux?
> If we really want our software to be used we should make it easily
> portable to at least most platforms!
Yes, that's a strong argument...
> 20% ??? My experience is just a couple of percent. It's not worth it!
Hmm, my experience is 30-50%, and in some rare cases up to 150% (the
permutation filter in DC). One can do much better than the compiler with a
good asm optimization manual (I recommend Agner Fog's manual,
http://www.agner.org)
--
Edgar
Mikael
In article <38FF45A6...@T-Online.de>,
Thanks for the info Edgar. I'll download and digest it today.
-michael
mikael.l...@spray.se wrote:
[using asm]
> Try using Djgpp (Gnu C++).
> I've compared it to Microsuck Visual C++.
> Djgpp compiler generates programs that are 2-3 times faster.
> Sometimes even more.
> So, obviously it's important to choose the right compiler!
Hmm, this is getting off-topic :-) I'm not a C/C++ programmer, but I
don't remember Djgpp being so much faster than Visual C++. Well, you
can probably compare the filters from DC (all written in optimized
ASM) with the Djgpp-compiled filters, and if they're less than 10%
faster, I'll convert everything to C.. :-)
--
Edgar