Is there another algorithm that, per the FAQ §9:78, "skews the first
order symbol distributions to make the data more compressible with simple
methods," or is BWT the only one that can be used practically for losless
compression of arbitrary data? I've seen info on lossless DCTs applied to
images/sound, can those be gainfuly applied to text compression?
Then he describes how he can save 1 bit for some files but absolutely
fails to explain why you need to make a complete ass of yourself
beforehand, instead of just applying the bit saving part right away.
So I shouldn't BWT before I run my arithmetic or huffman coder?
> ...
> So I shouldn't BWT before I run my arithmetic or huffman coder?
It depends on the data you are going to compress. If memory serves
me right Mark Nelson were explaining in his article why he were
doing RLE and MTF before BWT (at least in a german article on his
article).
If you are looking for an "universal" archiver/compressor BWT or
LZ77 before arithmetic or huffman coding should be a good choice.
If you are compressing audio data (wav etc) or images other coding
shemes might be a better choice.
R.Nausedat
That's not quite right. You want to do RLE before applying BWT as it helps
pathological sort behavior on strings with long repeated sequences. This can
improve performance quite a bit.
MTF is performed after BWT. The BWT output has a lot of order, but it takes
MTF to help expose that order in a way that is very compressible.
--
|
| Mark Nelson - ma...@ieee.org - http://MarkNelson.us
| The Ultimate Compression Resource - http://www.datacompression.info
| Sponsored by Xceedsoft.com - Serious About Components
|
"Rainer Nausedat" <RNau...@speedproject.de> wrote in message
news:c01m5l$5iq$04$1...@news.t-online.com...
Not if you have a decent sorting algorithm that doesn't barf on long
repeated sequences.
) MTF is performed after BWT. The BWT output has a lot of order, but it takes
) MTF to help expose that order in a way that is very compressible.
You should also do a RLE of the zeroes after the MTF, because it tends to
produces so many zeroes irt. other values that you take a big compression
hit because of rounding off in an arith coder.
SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
> Mark wrote:
> )> It depends on the data you are going to compress. If memory serves
> )> me right Mark Nelson were explaining in his article why he were
> )> doing RLE and MTF before BWT (at least in a german article on his
> )> article).
> )
> ) That's not quite right. You want to do RLE before applying BWT as it helps
> ) pathological sort behavior on strings with long repeated sequences. This can
> ) improve performance quite a bit.
>
> Not if you have a decent sorting algorithm that doesn't barf on long
> repeated sequences.
It also doesn't help the ababababababababababababababababababa... case.
(Which is not uncommon in situations such as 16-bit audio or graphics formats)
> ) MTF is performed after BWT. The BWT output has a lot of order, but it takes
> ) MTF to help expose that order in a way that is very compressible.
>
> You should also do a RLE of the zeroes after the MTF, because it tends to
> produces so many zeroes irt. other values that you take a big compression
> hit because of rounding off in an arith coder.
This sounds surprising, but entirely believable. Do you have any figures
for the amount of waste due to rounding in many-zero scenarios?
Phil
--
Unpatched IE vulnerability: XSS in Unparsable XML Files
Description: Cross-Site Scripting on any site hosting files that can
be misrendered in MSXML
Reference: http://sec.greymagic.com/adv/gm013-ie/
Exploit: http://sec.greymagic.com/adv/gm013-ie/
Phil wrote:
) This sounds surprising, but entirely believable. Do you have any figures
) for the amount of waste due to rounding in many-zero scenarios?
Well, no.
Just that one of the original papers mentioned something like this, I
vaguely recall trying it with and without this step, and there was a
difference, I wouldn't know how large it was.
Something to test/find out perhaps ?