PAQ7 released

Matt Mahoney

Dec 24, 2005, 5:06:16 PM
I just posted my newest compressor, PAQ7, to
http://cs.fit.edu/~mmahoney/compression/#paq7
Compression is similar to PAQAR but about 3 times faster (still very
slow). It includes models for color .bmp, .tiff, and .jpeg images, so
it gives better compression on these (but not as good as WinRK or Stuffit
- how they do this so well eludes me). It lacks a dictionary and an x86
model, so compression is a bit worse than PAsQDa on English text and
32-bit Windows .exe and .dll files.

This is a complete rewrite of PAQ6. It differs primarily in that it
replaces the gradient descent model mixer with a neural network, which
can be accelerated using MMX assembler (thus the better speed). For
non-x86-32 machines, or if you don't have NASM, you can compile with
-DNOASM (1/3 slower). I tested it under Windows, Linux and Sparc
Solaris for archive compatibility.
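
A minimal sketch of how a neural-network model mixer of this kind can work, assuming the logistic mixing scheme usually described for PAQ7 and its successors: each model's bit probability is mapped into the logit domain, the mixer takes a weighted sum, and the weights are nudged in proportion to the prediction error. The class name, the floating-point arithmetic, the learning rate, and the clamping constants are illustrative assumptions; the released code uses integer arithmetic and MMX and differs in detail.

```python
import math

def stretch(p):
    """Logit of a probability: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def squash(x):
    """Logistic function, the inverse of stretch."""
    x = max(-30.0, min(30.0, x))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-x))

class Mixer:
    """Hypothetical single-layer mixer; not the PAQ7 implementation."""

    def __init__(self, n_inputs, learning_rate=0.02):
        self.weights = [0.0] * n_inputs
        self.learning_rate = learning_rate  # assumed value
        self.inputs = [0.0] * n_inputs
        self.p = 0.5

    def predict(self, probs):
        """Combine the models' probabilities that the next bit is 1."""
        eps = 1e-6  # keep inputs away from 0 and 1 so stretch() is finite
        self.inputs = [stretch(min(max(p, eps), 1.0 - eps)) for p in probs]
        dot = sum(w * x for w, x in zip(self.weights, self.inputs))
        self.p = squash(dot)
        return self.p

    def update(self, bit):
        """Move each weight along its input, scaled by the error."""
        err = bit - self.p
        self.weights = [w + self.learning_rate * err * x
                        for w, x in zip(self.weights, self.inputs)]
```

In use, the compressor would call `predict` with the current model outputs before coding each bit and `update` with the actual bit afterwards, so models that have been predicting well gain weight over time.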

I will let Werner test on the maximumcompression.com corpus but in my
own tests it takes first place on ohs.doc (due to a large embedded
jpeg, which Stuffit missed), and english.dic, and second place on a
couple other files.

I don't know how Stuffit models jpeg (I haven't seen their patent) but
what I did was partially decode the image back to the DCT coefficients
to provide context for the Huffman coded data.

I plan to add more models to PAQ8 but I wanted to get something
released this year.

-- Matt Mahoney

Matt Mahoney

Dec 24, 2005, 7:08:52 PM
Matt Mahoney wrote:
> I just posted my newest compressor, PAQ7, to
> http://cs.fit.edu/~mmahoney/compression/#paq7

Some results on the Calgary corpus. There are 5 memory level settings.
They all run at about the same speed. Tested on a 2.2 GHz Athlon-64
3500+ with 1 GB RAM (in 32-bit mode under WinXP). This is a solid
archive.

paq7 -1 (lowest memory setting, about 56 MB)
111261 BIB: -> 21592
768771 BOOK1: -> 199430
610856 BOOK2: -> 123431
102400 GEO: -> 44832
377109 NEWS: -> 89025
21504 OBJ1: -> 7799
246814 OBJ2: -> 49829
53161 PAPER1: -> 11331
82199 PAPER2: -> 17578
513216 PIC: -> 23303
39611 PROGC: -> 8788
71646 PROGL: -> 10236
49379 PROGP: -> 7294
93695 TRANS: -> 11263
3141622 -> 625924 (1.5939 bpc) in 172.70 sec (18.191 KB/sec)
Time 172.70 sec, memory 56440419 bytes

paq7 -5 (highest memory setting, about 500 MB)
111261 BIB: -> 21493
768771 BOOK1: -> 194933
610856 BOOK2: -> 120373
102400 GEO: -> 44561
377109 NEWS: -> 86395
21504 OBJ1: -> 7755
246814 OBJ2: -> 48216
53161 PAPER1: -> 10809
82199 PAPER2: -> 16812
513216 PIC: -> 23201
39611 PROGC: -> 8628
71646 PROGL: -> 10189
49379 PROGP: -> 7299
93695 TRANS: -> 10827
3141622 -> 611684 (1.5576 bpc) in 177.66 sec (17.684 KB/sec)
Time 177.67 sec, memory 525842019 bytes

paq7 -3 (default setting, 150 MB) on a 750 MHz Duron (192 MB memory)
under WinMe:
111261 BIB: -> 21500
768771 BOOK1: -> 195555
610856 BOOK2: -> 120863
102400 GEO: -> 44643
377109 NEWS: -> 86932
21504 OBJ1: -> 7744
246814 OBJ2: -> 48670
53161 PAPER1: -> 10905
82199 PAPER2: -> 16970
513216 PIC: -> 23229
39611 PROGC: -> 8652
71646 PROGL: -> 10172
49379 PROGP: -> 7272
93695 TRANS: -> 10909
3141622 -> 614209 (1.5641 bpc) in 710.08 sec (4.424 KB/sec)
Time 710.13 sec, memory 150320739 bytes
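
The summary lines can be checked with a little arithmetic. The sketch below assumes "bpc" means compressed bits per input byte and that "KB" means 1000 bytes, which is what the reported figures are consistent with; the function names are illustrative.

```python
def bits_per_char(original_bytes, compressed_bytes):
    """Compressed size in bits divided by the original size in bytes."""
    return 8.0 * compressed_bytes / original_bytes

def throughput_kb_per_sec(original_bytes, seconds):
    """Input bytes processed per second, in units of 1000 bytes."""
    return original_bytes / seconds / 1000.0

# (original size, compressed size, seconds) taken from the totals above
runs = {
    "-1": (3141622, 625924, 172.70),
    "-5": (3141622, 611684, 177.66),
    "-3": (3141622, 614209, 710.08),
}

for level, (orig, comp, secs) in runs.items():
    print(level,
          f"{bits_per_char(orig, comp):.4f} bpc",
          f"{throughput_kb_per_sec(orig, secs):.3f} KB/sec")
```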

-- Matt Mahoney

giorgi...@email.it

Dec 27, 2005, 4:58:12 AM
Those seem like really nice improvements over PAQ6, and with the upcoming
PAQ8 planning more models (including executable compression), the project
seems to be becoming even more interesting!

werner....@gmail.com

Dec 30, 2005, 12:02:56 AM
> I will let Werner test on the maximumcompression.com corpus but in my
> own tests it takes first place on ohs.doc (due to a large embedded
> jpeg, which Stuffit missed), and english.dic, and second place on a
> couple other files.


The maximum compression site has just been updated. Not only was PAQ7
added, but the previous #1 and #2 listed programs (WinRK and
PAsQDa) were also updated. Don't forget to have a look at the DOC test :)

On 'best overall compression program' WinRK is ranked 1st, PAQ7 2nd and
PAsQDa 3rd. On the 'real life' multiple files test PAQ7 doesn't show
its full potential yet, but this will change when the exe, wav, and txt
models are in place...

http://www.maximumcompression.com/
