RZM - a dull ROLZ compression engine


Christian

Apr 24, 2008, 3:55:00 PM
to encode_ru_f...@googlegroups.com


One more new compile. Actually, this includes a little bugfix, too: memory allocation for compression now really is ~258 MB. Additionally, compression and decompression are slightly faster than in 0.07e.

Download RZM 0.07h

Bulat Ziganshin

Apr 24, 2008, 4:24:00 PM
to encode_ru_f...@googlegroups.com


0.07e? ;)

Bulat Ziganshin

Apr 24, 2008, 4:25:00 PM
to encode_ru_f...@googlegroups.com


this new site prohibits downloaders from Russia

Christian

Apr 24, 2008, 4:32:00 PM
to encode_ru_f...@googlegroups.com


So, I'll upload all files to Rapidshare with their funny cats ... again.
But this has to wait until tomorrow - perhaps LovePimple can host the files before then.

Christian

Apr 24, 2008, 4:34:00 PM
to encode_ru_f...@googlegroups.com


Quoting: Bulat Ziganshin
this new site prohibits downloaders from Russia

Maybe you can try to use a proxy. Anyway, sorry for the inconvenience.

encode

Apr 24, 2008, 4:38:00 PM
to encode_ru_f...@googlegroups.com


Try out SendSpace:
http://www.sendspace.com/



Bulat Ziganshin

Apr 24, 2008, 5:16:00 PM
to encode_ru_f...@googlegroups.com


I didn't have any more luck with Rapidshare, because many thousands of users share the same IP due to our provider's NAT :)

Actually, I'm not that eager for others' programs, and LovePimple's page is enough for me. I noted it just to inform you that it may be inconvenient for other users.

Bulat Ziganshin

Apr 24, 2008, 5:24:00 PM
to encode_ru_f...@googlegroups.com



I successfully tried to upload & download a file here.

LovePimple

Apr 24, 2008, 5:42:00 PM
to encode_ru_f...@googlegroups.com


Thanks Chris! :)

Mirror: Download

Black_Fox

Apr 24, 2008, 7:27:00 PM
to encode_ru_f...@googlegroups.com


How about a googlepages site? Shelwien is using it and there have been no complaints so far. Additionally: no dogs, cats, timeouts or anything of the sort :)

Christian

Apr 25, 2008, 4:43:00 AM
to encode_ru_f...@googlegroups.com


RZM now has a homepage to circumvent the web hosting nuisance.

Thanks for the clue, Black_Fox. The latest problems finally made me create a small home page. Curiously, googlepages is very slow for me - although the home page is only 2 or 3 KB.

encode

Apr 26, 2008, 10:24:00 AM
to encode_ru_f...@googlegroups.com


By the way, Chris, can you explain how you perform string search? Do you use hashing or BT? And how do you deal with the ROLZ nature? As you posted above, you keep an offset table:
offsets[win[i-1]][...]
I'm just thinking about an improved ROLZ scheme. With LZPM, I use hash chains; to determine the match index, I keep the char count in the hash element along with the actual offset. If we add a phrase for each byte, I think we may simplify the search...
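A hash-chain match finder of the kind described here can be sketched roughly as follows. This is a minimal illustration, not LZPM's actual code; the hash function, chain depth and names are assumptions:

```python
MIN_MATCH = 4

def hash4(buf, i, bits=16):
    # Hash the 4 bytes starting at i down to `bits` bits (multiplicative hash;
    # the constant is Knuth's, the exact scheme is illustrative).
    h = buf[i] | (buf[i+1] << 8) | (buf[i+2] << 16) | (buf[i+3] << 24)
    return ((h * 2654435761) & 0xFFFFFFFF) >> (32 - bits)

def match_length(buf, a, b):
    # Length of the common prefix of the suffixes starting at a and b (a < b).
    n = 0
    while b + n < len(buf) and buf[a + n] == buf[b + n]:
        n += 1
    return n

def insert(buf, pos, head, prev):
    # Prepend pos to the chain of earlier positions sharing its 4-byte hash.
    h = hash4(buf, pos)
    prev[pos] = head.get(h, -1)
    head[h] = pos

def find_match(buf, pos, head, prev, max_depth=32):
    # Walk the chain of candidate positions, keeping the longest match found.
    best_len, best_pos = 0, -1
    cand = head.get(hash4(buf, pos), -1)
    depth = 0
    while cand >= 0 and depth < max_depth:
        l = match_length(buf, cand, pos)
        if l > best_len:
            best_len, best_pos = l, cand
        cand = prev[cand]
        depth += 1
    return best_len, best_pos
```

The `max_depth` cap is the usual speed/ratio trade-off: deeper chain walks find longer matches but cost time.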


Christian

Apr 26, 2008, 11:41:00 AM
to encode_ru_f...@googlegroups.com


Quoting: encode
By the way, Chris, can you explain how you perform string search? Do you use hashing or BT?

In RZM I use binary trees. Slug, of course, uses hashing for speed reasons.
I described the idea of string-insertion for binary trees in the tornado 0.4 topic. If you understand it, string search is easy. Since RZM uses optimal parsing, it needs to search at each position - therefore, you simply search while inserting the string for the current position.
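The search-while-inserting idea can be shown with a deliberately simplified binary tree over window suffixes. This is a sketch only: real implementations (such as the one described for tornado) restructure the tree during insertion instead of using a plain unbalanced BST:

```python
class Node:
    def __init__(self, pos):
        self.pos = pos       # window position whose suffix this node represents
        self.left = None
        self.right = None

def common_prefix(buf, a, b):
    # Length of the common prefix of the suffixes at positions a and b.
    n = 0
    while a + n < len(buf) and b + n < len(buf) and buf[a + n] == buf[b + n]:
        n += 1
    return n

def insert_and_search(root, buf, pos):
    # Insert the suffix at pos into a tree ordered lexicographically by suffix.
    # Every node on the insertion path is a match candidate, so the best match
    # falls out of the insertion itself - no separate search pass is needed.
    best_len, best_pos = 0, -1
    node = root
    while node is not None:
        l = common_prefix(buf, node.pos, pos)
        if l > best_len:
            best_len, best_pos = l, node.pos
        # Branch on the first differing byte (empty slice sorts first).
        if buf[node.pos + l:node.pos + l + 1] < buf[pos + l:pos + l + 1]:
            if node.right is None:
                node.right = Node(pos)
                break
            node = node.right
        else:
            if node.left is None:
                node.left = Node(pos)
                break
            node = node.left
    return best_len, best_pos
```

With optimal parsing, calling this once per position gives both the tree update and the candidate matches in a single walk.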

Quoting: encode
And how do you deal with ROLZ nature? As you posted above, you keep an offset table:
offsets[win[i-1]][...]

The size of each table is fixed at 64K entries (as hinted on Slug's homepage). RZM adds one offset at each step - additionally, it keeps some long-distance offsets for better handling of compressed data.
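In outline, such a per-context offset table might look like this. Only the 64K-entries-per-context figure comes from the post; the circular-buffer layout and all names are illustrative assumptions:

```python
class RolzTable:
    def __init__(self, table_size=1 << 16):
        # One circular offset list per 1-byte context (the previous byte),
        # 64K entries each by default, per the discussion above.
        self.size = table_size
        self.offsets = [[0] * table_size for _ in range(256)]
        self.heads = [0] * 256

    def add(self, prev_byte, pos):
        # Record the current window position under context prev_byte
        # (one offset added per step).
        t = self.heads[prev_byte]
        self.offsets[prev_byte][t % self.size] = pos
        self.heads[prev_byte] = t + 1

    def get(self, prev_byte, index):
        # index 0 = most recently added offset for this context; old entries
        # are overwritten once the circular buffer wraps.
        t = self.heads[prev_byte]
        return self.offsets[prev_byte][(t - 1 - index) % self.size]
```

The coder then transmits the small `index` into the context's table instead of a full window offset, which is the essence of ROLZ.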

Quoting: encode
If we add a phrase for each byte, I think we may simplify the search...

Is this related to hash chaining? If not, what exactly do you mean?

encode

Apr 26, 2008, 12:09:00 PM
to encode_ru_f...@googlegroups.com


Quoting: Christian
I described the idea of string-insertion for binary trees in the tornado 0.4 topic.

OK, will carefully read this topic. ;)

Quoting: Christian
Is this related to hash chaining?

Yes. I'm just thinking about how we can translate real offsets into match indexes. In other words, we may use pure LZ77 with MINMATCH=4 (as in BALZ v1.04), then drop the first byte of a match as a literal and encode a match length of three, plus an offset as the index. Generally speaking, LZPM does it exactly this way, but for the offset->index translation LZPM keeps byte counts - i.e. cnt[c] keeps the count of c's. Thus, match index = cnt[c]-node.cnt;
:)
Anyway, you may test the latest BALZ; it's no worse than LZPM - in most cases it's even better, especially on binary data. So now I'm thinking about further improving the LZ77 scheme, especially offset coding - maybe a better buffered-offsets technique, etc.
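The cnt[c] trick can be captured in a few lines. This is a sketch of the idea exactly as stated above (match index = cnt[c]-node.cnt); the class and field names are made up for illustration:

```python
class IndexTranslator:
    def __init__(self):
        self.cnt = [0] * 256   # running count of each byte value seen so far
        self.node_cnt = {}     # position -> snapshot of cnt[c] at insert time

    def insert(self, c, pos):
        # A new phrase starting with byte c is added at window position pos;
        # remember how many c's had been seen up to and including this one.
        self.cnt[c] += 1
        self.node_cnt[pos] = self.cnt[c]

    def index_of(self, c, pos):
        # Match index = how many c's arrived after this position was inserted.
        # 0 means pos is the most recent occurrence of c.
        return self.cnt[c] - self.node_cnt[pos]
```

So instead of storing explicit indexes, each node carries its count snapshot, and the index is recovered by one subtraction at encode time.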

Bulat Ziganshin

May 1, 2008, 3:41:00 AM
to encode_ru_f...@googlegroups.com


Christian, is RZM still limited to files <2 GB?

And how about adding a faster lazy-parsing compression mode?

Christian

May 1, 2008, 3:26:00 PM
to encode_ru_f...@googlegroups.com


Quoting: Bulat Ziganshin
Christian, is RZM still limited to files <2 GB?

Yes.

Quoting: Bulat Ziganshin
And how about adding a faster lazy-parsing compression mode?

Great idea, but sadly I have no time. And originally, RZM is about strong asymmetric compression.
