CMM fast context mixing compressor


toffer

Apr 20, 2008, 9:23:00 AM


Hello!

After playing around with several different x86 transforms, I integrated the one with the best overall performance into cmm4. Here's a small new release containing a filter framework (most of the work) and an implementation of an executable filter.
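
For those unfamiliar with exe filters, the core idea is the classic E8/E9 call/jump transform: relative CALL/JMP operands are rewritten as absolute addresses, so repeated calls to the same target become identical byte strings for the model. A simplified sketch (not the exact cmm4 code):

#include <cstddef>
#include <cstdint>
#include <cstring>

// Forward transform: make the 32-bit relative operands of x86 CALL (0xE8)
// and JMP (0xE9) instructions absolute. The decoder applies the inverse.
void e8e9_forward(uint8_t* buf, size_t n) {
    for (size_t i = 0; i + 5 <= n; ++i) {
        if (buf[i] == 0xE8 || buf[i] == 0xE9) {
            int32_t rel;
            std::memcpy(&rel, buf + i + 1, 4);               // little-endian operand
            int32_t abs = rel + static_cast<int32_t>(i + 5); // add address of next insn
            std::memcpy(buf + i + 1, &abs, 4);
            i += 4;                                          // skip the rewritten operand
        }
    }
}

void e8e9_inverse(uint8_t* buf, size_t n) {
    for (size_t i = 0; i + 5 <= n; ++i) {
        if (buf[i] == 0xE8 || buf[i] == 0xE9) {
            int32_t abs;
            std::memcpy(&abs, buf + i + 1, 4);
            int32_t rel = abs - static_cast<int32_t>(i + 5); // subtract position again
            std::memcpy(buf + i + 1, &rel, 4);
            i += 4;
        }
    }
}

Real filters usually also range-check the operand to reduce false positives on non-code bytes.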

http://freenet-homepage.de/toffer_86/cmm4_01e_080420.7z

The improvement is quite obvious on SFC :) I might even reach the 10 MB "sonic barrier" :) after applying some tweaks and adding more filters.

I wonder why Matt didn't include a cmm4 test in his LTCB. The performance on enwik9 has improved significantly since 0.1c.

To Nania: if you add this one to MOC, please correct the version number and memory requirements.

Have fun!

LovePimple

Apr 20, 2008, 10:45:00 AM


Thanks Chris!

Mirror: Download

LovePimple

Apr 20, 2008, 11:44:00 AM


Quick test...

Test file: ENWIK8

Test machine: AMD Sempron 2400+, Windows XP SP2.

Setting: 76


Compressed size: 19.6 MB (20,569,034 bytes)

Ratio: 20569032/100000000 bytes (1.65 bpc)
Speed: 173 kB/s (5619.4 ns/byte)
Time: 561.94 s

LovePimple

Apr 20, 2008, 12:10:00 PM


Another quick test...

Setting: 46

A10.jpg > 829,832
AcroRd32.exe > 1,188,935
arc > 2,114
english.dic > 453,648
FlashMX.pdf > 3,651,244
FP.LOG > 446,317
MSO97.DLL > 1,597,197
ohs.doc > 750,711
rafale.bmp > 740,481
vcfiu.hlp > 514,204
world95.txt > 454,916

Total = 10,629,599 bytes

osmanturan

Apr 21, 2008, 5:58:00 AM


@toffer
Really good work!

I want to share some interesting findings with my neural mixer. Maybe you have already done tests like these, but I think you should know :)

Currently, I'm not using any SSE or APM stage after the mixing stage, just mixing orders 2-1-0 with a suffix-tree-like implementation (not hashed) for order-1 ROLZ literal coding. The ROLZ part uses flexible parsing, the same as QUAD. The whole compression algorithm is optimized for 4-byte-aligned binary files. I noticed that context selection for the neurons is one of the most important things in my momentum-term-based neural mixer (a sketch of such a mixer follows the list below). After some tests, I have really interesting results:
- Text compression is really, really bad! On the SFC test file FP.LOG, I get 200% worse compression compared to my old literal coder (simple order-1).
- On binary compression (especially ISO files) I get really good results which outperform 7-Zip Ultra, rzm 0.07e and RAR Best. For example, my compressor compressed a ~258 MB Intel C 10 ISO file ~20 MB better than the other compressors. And note that rzm and 7-Zip use optimal parsing, while my coder does not!
- Calgary corpus compression is not good enough. I think my coder suffers from the text files in the TAR version.
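
To make the setup concrete, here is a minimal sketch of a logistic mixer with a momentum term; the class name, learning rate and momentum coefficient are illustrative, not my exact implementation:

#include <cmath>
#include <cstddef>
#include <vector>

inline double stretch(double p) { return std::log(p / (1.0 - p)); }
inline double squash(double x)  { return 1.0 / (1.0 + std::exp(-x)); }

// Logistic mixing with momentum: each weight carries a velocity that
// smooths the gradient steps, at the cost of twice the per-neuron state.
struct MomentumMixer {
    std::vector<double> w, v;   // weights and velocities
    double lr, mom;             // learning rate and momentum coefficient

    MomentumMixer(std::size_t n, double lr_ = 0.02, double mom_ = 0.9)
        : w(n, 0.0), v(n, 0.0), lr(lr_), mom(mom_) {}

    // st[i] = stretch(p_i) for each model's bit probability p_i.
    double mix(const std::vector<double>& st) const {
        double dot = 0.0;
        for (std::size_t i = 0; i < w.size(); ++i) dot += w[i] * st[i];
        return squash(dot);
    }

    // After coding bit y (0 or 1) with mixed probability p.
    void update(const std::vector<double>& st, double p, int y) {
        double g = (y - p) * lr;              // gradient of the coding cost
        for (std::size_t i = 0; i < w.size(); ++i) {
            v[i] = mom * v[i] + g * st[i];    // momentum step
            w[i] += v[i];
        }
    }
};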

I'm really interested in an SSE/APM stage. Maybe the compression will be better. But I can't understand what's going on in SSE.

Once I have done some other tweaks, I would like to post exact results along with a release. I hope I can complete this task this week.

Off-topic: I have passed the master's/PhD English exam with 72.5 out of 100 points. 50-55 points are enough for most universities in my country.

Vacon

Apr 21, 2008, 6:03:00 AM


Hello everyone,

Quoting: osmanturan
Off-topic: I have passed the master's/PhD English exam with 72.5 out of 100 points


Congrats!

Best regards!

toffer

Apr 21, 2008, 8:50:00 AM


Thanks for testing! When I've finished more filters, I'll try to do auto-detection like Christian does, since it's far more elegant and maybe more efficient (data segmentation, works for TARs too).


@osmanturan

Congratulations on your English exam.

Before beginning a discussion: what exactly do you want to know (besides SSE)?

BTW: the main reason for me not to include a momentum term is the additional memory, which is heavily related to a speed decrease due to cache misses (my weight vector completely fits into a single cache line). I'm sure, however, that the compression improvement is notable. If some well-placed prefetch instructions can compensate for this, I'll give it a try. But at the moment I'm working on parameter optimization and on creating a state machine from a large dataset's empirical distribution.
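
To illustrate the memory argument (the sizes are only an example, not my exact layout): sixteen 32-bit weights fill exactly one 64-byte cache line, and a velocity per weight doubles that to two lines.

#include <cstdint>

struct alignas(64) Weights {              // 64 bytes: one cache line
    int32_t w[16];
};

struct alignas(64) WeightsWithMomentum {  // 128 bytes: two cache lines
    int32_t w[16];
    int32_t v[16];
};

// A prefetch issued shortly before the mixer touches the velocities can
// hide part of the extra miss (GCC/Clang builtin shown as an example).
inline void warm(const WeightsWithMomentum& m) {
    __builtin_prefetch(m.v);
}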

Nania Francesco Antonio

Apr 21, 2008, 12:01:00 PM


Thanks Chris! Hi :)!

osmanturan

Apr 21, 2008, 4:40:00 PM


@vacon: Thanks!
@toffer: Thanks!
My question is about the SSE implementation. How can I process the neural mixer output with SSE or APM? And what's the benefit of this step? I already know it improves compression, but how? My guess is something like this:

A single neural mixer is a fast learning stage which adapts to small local changes. But adaptation must be long-term on most data files, because small local changes do not reflect the characteristics of the whole data. So, in this sense, SSE enables long-term learning, and I think a second neural mixer with a low learning rate could do this already. If this is true, neural-mixer-based "long-term learning" could beat any previous implementation due to the nature of neural networks. One last guess: long-term learning only benefits large files; on small files it can hurt compression. Am I totally wrong?
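
As far as I understand it, an SSE/APM stage maps (small context, quantized input probability) to a refined probability through an interpolated table. A minimal sketch in the spirit of PAQ's APM; the knot count, range and update rate are my guesses, not any coder's exact values:

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

inline double stretch(double p) { return std::log(p / (1.0 - p)); }
inline double squash(double x)  { return 1.0 / (1.0 + std::exp(-x)); }

// Each context owns 33 knots spanning the stretched probability axis.
// refine() interpolates between the two nearest knots; update() nudges
// those knots toward the bit that was actually coded.
class APM {
    std::vector<double> tbl;    // tbl[ctx*33 + k]: probability at knot k
    int lastIdx = 0;            // remembered for the update step
    double lastFrac = 0.0;
    static constexpr double RATE = 0.02;
public:
    explicit APM(int nCtx) : tbl(static_cast<std::size_t>(nCtx) * 33) {
        for (int c = 0; c < nCtx; ++c)
            for (int k = 0; k < 33; ++k)       // start as the identity mapping
                tbl[c * 33 + k] = squash((k - 16) * 0.5);
    }
    double refine(double p, int ctx) {
        double x = std::clamp(stretch(p), -8.0, 8.0);
        double pos = (x + 8.0) / 0.5;          // knot coordinate, 0..32
        int k = std::min(static_cast<int>(pos), 31);
        lastFrac = pos - k;
        lastIdx = ctx * 33 + k;
        return tbl[lastIdx] * (1.0 - lastFrac) + tbl[lastIdx + 1] * lastFrac;
    }
    void update(int y) {                       // y = the coded bit, 0 or 1
        tbl[lastIdx]     += (y - tbl[lastIdx])     * (1.0 - lastFrac) * RATE;
        tbl[lastIdx + 1] += (y - tbl[lastIdx + 1]) * lastFrac * RATE;
    }
};

If this picture is right, it is exactly a slow, context-gated second opinion on the mixer output, which would match the long-term-learning intuition above.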

Also, I would like to share some 2D graphs based on two variables: learning rate and momentum-term coefficient. I think these graphs will help us a lot. I hope I will be able to find some spare time to generate them.
