I have a short blog post on Dobbs Code Talk re the Million Digit
Challenge:
http://dobbscodetalk.com/index.php?option=com_myblog&show=The-Million-Random-Digit-Challenge.html&Itemid=29
(or http://bit.ly/8i63hd)
While proof is impossible, I think it has been pretty well
demonstrated that this file is not going to succumb to any sort of
statistical analysis. The mathematicians at RAND did a great job,
leaving at most a few dozen bits of potential savings:
http://groups.google.com/group/comp.compression/browse_frm/thread/987e4ef26de2d7e8?tvc=1&q=matt+mahoney+million+group%3Acomp.compression
(or http://bit.ly/70iWFe)
In this blog post I include a few speculative ways in which this file
could be compressed.
For example: what if the million digit number was actually the nth
prime? And while the length of n will be pretty close to a million
digits, maybe n is a bit compressible? (Unfortunately we are a long
way from numbering primes up to a million digits.)
Or what if the million digit number could be described by some fairly
short polynomial? Factoring the million digit number might have some
interesting fallout.
It's pretty cool to imagine that the million digit number might
actually be an interesting number. What is harder to calculate is just
how astronomically unlikely that is. Most people don't have an
intuitive sense of that, hence the never-ending supply of fresh
believers.
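One way to put a number on "astronomically unlikely" is the usual counting argument: there are 2^n files of n bits but fewer than 2^(n-k+1) bit strings of length n-k or less, so no single decoder can shorten more than about one file in 2^(k-1) by k bits or more. A minimal sketch in Python (the function name is just for illustration):

```python
from fractions import Fraction

def max_fraction_compressible(bits_saved: int) -> Fraction:
    """Upper bound on the fraction of n-bit files that one fixed lossless
    scheme can shorten by at least `bits_saved` bits.

    There are 2**n inputs but only 2**(n-k+1) - 1 possible outputs of
    length <= n-k, so the fraction is below 2**(1-k), whatever n is.
    """
    return Fraction(2, 2 ** bits_saved)  # equals 2**(1-k)
```

For example, `max_fraction_compressible(8)` is 1/128 (saving one byte), and saving 100 bytes gives `max_fraction_compressible(800)`, which is 2^-799.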
- Mark - ma...@ieee.org
Mark
I don't think most files of one million bits can be
compressed. I also think that if a truly random source was used,
the file would be incompressible. But I don't trust
government contractors that feed at the public trough,
so I wonder whether the file is really random. They may have
taken shortcuts in its production. I think this file
should be looked at by many people. I don't think I will compress
it, but I would like, for example, to do a BWTS of the file
in binary and see whether that changes any of the statistics
before and after the transform. I would expect that, if the file
is random, they should stay about the same.
Anyway, I hope someone does compress it greatly this year.
I guess I don't trust how it was really created.
I trust you, not them.
David A. Scott
--
My Crypto code
http://bijective.dogma.net/crypto/scott19u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptographic
system is only as strong as its weakest link"
It's not prime. It's even. The last digits in the RAND list are 88,
the last byte in your file is 0x64. So I already know that the number
is divisible by four. Another test showed that it is also divisible by
seven.
Mark
Oops. Messed that one up. It's not divisible by 7, nor by 3, 5, 11,
or 13. But it is divisible by 2.
Mark
You do have to count your bits carefully. I still am bothered
by the claims of tape manufacturers on the storage capacity of tapes.
Ultrium 1 claims 200GB, and only in fine print says compressed.
Of course that 200GB only contains 100GB of information or it
wouldn't easily compress to 100GB on tape.
In the case of random digits, one could have a file of one million
ASCII characters representing decimal digits. Good compressors
should approach the 415,000 bytes at 3.32 bits per decimal digit.
Some would say that it was compressing random data.
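To make the 3.32 bits per digit figure concrete, here is a small Python sketch that counts exactly how many bytes are needed if the digit string is read as one big integer. The function name is made up for this example; it only uses Python's built-in big integers.

```python
def bytes_needed(num_digits: int) -> int:
    """Smallest byte count that can hold every decimal string of
    `num_digits` digits when the string is read as one big integer."""
    largest = 10 ** num_digits - 1          # e.g. 999...9
    return (largest.bit_length() + 7) // 8  # round bits up to whole bytes
```

`bytes_needed(1_000_000)` comes out at 415,242. The largest million-digit value needs 3,321,929 bits, one bit more than 415,241 bytes can hold, so any value below 2^3,321,928 fits in 415,241 bytes.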
Mostly, though, humans prefer files with redundant information.
Pretty pictures and pleasant audio tend to have a highly
ordered structure that is reasonably compressible.
> http://groups.google.com/group/comp.compression/browse_frm/thread/987e4ef26de2d7e8?tvc=1&q=matt+mahoney+million+group%3Acomp.compression
> (or http://bit.ly/70iWFe)
-- glen
Some time ago I tried to compress it with an experimental compressor, and it
seems incompressible for groups of up to 5 bytes.
I tested it a little with 6-byte groups, but on my machine I estimate I
would need about one year of computation time.
I think it is a good test, but I don't want to compress only this
specific million-digit file. I want to use it as a test, so I don't want to use
"tricks" to compress it; I think it is more interesting to find a general
compression method that is able to compress the million-digit test.
Denis.
Could someone with the ASCII version of this file, all 1000000
digits of it, run it through gzip to see how well it does?
-- glen
Just to tag in here...
Hi.. I'm back fiddling with writing data encoders and the
MillionDigit file.
Thanks Mark and thanks all the Data Compression people.
I approached this from the idea of changing the data to something
compressible.
I believe I see the essence of information transferred into different
encodings and none of the encodings seem to offer
enough compression to justify the encoding.
However, I feel some of my encoders are really clever.
So I hope I will be accepted as a poster in this forum. I would like
to be part of the group even though my education is limited on the
subject.
I'd like to be a part of the quest, so to speak yet my real skill is
imagination.
So Hello again.. Sorry for the lack of dedication to the forum. but I
am here now.
Ernst
The original is at:
http://www.rand.org/pubs/monograph_reports/MR1418/index.html
It's a bit formatted (for easy reading, there are fifty digits per
line, arranged in 10 groups of 5, plus line numbers) so you'll need to
cook it a bit if you just want a million byte file.
>> Could someone with the ASCII version of this file, all 1000000
>> digits of it, run it through gzip to see how well it does?
> The original is at:
> http://www.rand.org/pubs/monograph_reports/MR1418/index.html
The result seems to be 470429 bytes. About 13% worse than binary.
That is just the 1000000 ASCII characters, no blanks,
line numbers, or line terminators.
It downloads as a ZIP file at 663943 bytes.
> It's a bit formatted (for easy reading, there are fifty digits per
> line, arranged in 10 groups of 5, plus line numbers) so you'll need to
> cook it a bit if you just want a million byte file.
Well, first I took it a little too literally and downloaded
the above indicated file. The actual file is at:
http://www.rand.org/pubs/monograph_reports/2005/digits.txt.zip
thanks,
-- glen
The binary file is 415,241 bytes long. (Though I could argue that Mark
left off a leading zero, and it should be 415,242 bytes.)
Putting the 1,000,000 byte ASCII digits file through gzip -9 results in
470,429 bytes.
Running the same through zlib, generating a gzip file using the
Z_HUFFMAN_ONLY strategy gets it to 436,943 bytes. For cases like this,
that's better than letting zlib try to look for matches.
Since the deflate format needs to code 11 symbols (the ten digits and
the end-of-block code), you would expect five 3-bit codes and six 4-bit
codes. For equal frequencies, you would then expect about 437,500
bytes not including overhead. The frequencies aren't quite equal, so
taking that into account, you would expect 437,330 bytes not including
overhead. The zlib Huffman coder does a smidge better, even with the
overhead, by breaking the input into about 30 blocks and getting some
more frequency dispersion in the smaller blocks to take advantage of.
Mark
If you used a static bijective Huffman coder you would need only ten
predefined symbols. You would use six 3-bit symbols and four 4-bit
symbols, for 425,000 bytes. If you used bijective static
arithmetic coding with 10 predefined symbols you would need 415,241
or 415,242 bytes, which makes me wonder how Mark
converted it to binary. Did he use a special program to
change a decimal fraction to binary, or what? Knowing
exactly how he did it might lead to a way to compress
his result.
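Both Huffman figures quoted in this thread (about 437,500 bytes with deflate's extra end-of-block symbol, 425,000 bytes with just the ten digits) can be checked with a few lines of Python. This is only a sketch under stated assumptions: exactly equal digit frequencies (100,000 each), a single end-of-block symbol, and no header overhead.

```python
import heapq

def huffman_lengths(weights):
    """Return the Huffman code length for each weight (standard greedy merge)."""
    # Heap entries: (weight, unique tie-breaker, symbols in this subtree).
    heap = [(w, i, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    next_id = len(weights)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # every symbol below the new node gets one bit deeper
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, next_id, s1 + s2))
        next_id += 1
    return lengths

def coded_bytes(weights):
    """Total payload size in bytes when each symbol is coded with its Huffman length."""
    lengths = huffman_lengths(weights)
    return sum(w * l for w, l in zip(weights, lengths)) / 8

digits = [100_000] * 10                    # assumed: perfectly equal digit counts
ten_symbols = coded_bytes(digits)          # 425000.0
eleven_symbols = coded_bytes(digits + [1]) # 437500.5 (one end-of-block code)
```

The one rare extra symbol pushes a fifth digit from a 3-bit to a 4-bit code, which is where the 12,500-byte gap comes from.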
Probably it was done using the GNU multiple precision library "gmp",
after processing the text input to leave exactly the one million
ASCII digits and nothing else.
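For anyone who wants to reproduce that kind of conversion without GMP, here is a rough Python sketch. Python's built-in arbitrary-precision integers stand in for a bignum library, and both function names are invented for the example. Note that leading decimal zeros are lost in the integer, so the digit count has to be supplied as side information when converting back.

```python
import sys

def _allow_long_int_strings():
    # Python 3.11+ caps int <-> str conversions at 4300 digits by default;
    # 0 lifts the cap. This changes a process-wide setting.
    if hasattr(sys, "set_int_max_str_digits"):
        sys.set_int_max_str_digits(0)

def digits_to_binary(digits: str) -> bytes:
    """Read an ASCII decimal string as one big integer, packed big-endian."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected ASCII decimal digits only")
    _allow_long_int_strings()
    value = int(digits)
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, "big")

def binary_to_digits(data: bytes, num_digits: int) -> str:
    """Inverse of digits_to_binary; num_digits restores any leading zeros."""
    _allow_long_int_strings()
    return str(int.from_bytes(data, "big")).zfill(num_digits)
```

On a full million-digit input the `int()` call may take a noticeable amount of time, depending on the Python version.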
Mark Nelson issued the same challenge in 2004, about 5.5 years ago:
http://groups.google.com/group/comp.compression/browse_thread/thread/b7b1d6477fd9c00c/61c6e2954d8cd3ca?q=comp.compression+million+rand+challenge#61c6e2954d8cd3ca
--
Due to the small dispersion in the actual frequencies, I calculated
that an efficient arithmetic coder could do one byte better: 415,240
bytes. (Actually, 415,239 bytes and seven bits.)
Mark
If you know the exact frequency of each symbol, yes,
you could do it easily with that compressor. However,
it's when you try to make a general arithmetic coder for
any file over the set {0..9} that you get into trouble,
when you add in the count data or do it adaptively.
I have never seen the original file. But I am sure
I tested the binary with both arb2x and arb255, which I have used
over the years. For these two adaptive bijective
stationary compressors I sometimes use the Laplace method, where
the states have a starting count of 1, and sometimes
the KT method, which some think is hot, where the
starting values are 1/2. The KT method gave longer
files, if I remember correctly. I also tested the BWTS
of the file and got exactly the same lengths. Many
so-called arithmetic coders will get different lengths; a
pure arithmetic coder should get the same length regardless
of the permutation. If the starting values of the counters are
very large, you would get the lengths I calculated above.
There is likely some starting value that would be
optimal for this file. I'm not sure what it would be,
or whether it would really make the file smaller. If one
played the tuning-for-nonstationary game that Shelwien
plays, I guess you should be able to save more space.
In my mind, if the file does not compress with Shelwien's
tuning then it really might be random. Note that when you
use a coder tuned with Shelwien's stuff, the BWTS of the file would
likely increase in length if you try to compress it, since
the coder would no longer be a stationary arithmetic
coder.
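The permutation point can be illustrated with a toy model. The sketch below is not arb2x or arb255; it just computes the ideal code length of an adaptive order-0 estimator with a configurable starting count (1 for Laplace, 1/2 for KT), ignoring the few bits of termination overhead a real arithmetic coder adds. Because the total depends only on the final symbol counts, any reordering of the input (a BWTS included) costs the same.

```python
import math
import random

def adaptive_cost_bits(symbols, alphabet_size=10, start=1.0):
    """Ideal code length in bits for an adaptive order-0 model whose
    counters all begin at `start` (1.0 = Laplace, 0.5 = KT)."""
    counts = [start] * alphabet_size
    total = start * alphabet_size
    bits = 0.0
    for s in symbols:
        bits -= math.log2(counts[s] / total)  # cost of coding s right now
        counts[s] += 1                         # then update the model
        total += 1
    return bits

rng = random.Random(2010)                      # fixed seed, arbitrary choice
data = [rng.randrange(10) for _ in range(10_000)]
shuffled = data[:]
rng.shuffle(shuffled)                          # stand-in for any permutation
```

Comparing `adaptive_cost_bits(data, start=s)` with `adaptive_cost_bits(shuffled, start=s)` gives the same value up to floating-point rounding for either starting count.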
I think that means they did a completely lousy job. Calling them
'mathematicians' is an insult to mathematicians, they were just
intellectually lazy bodgers who did a totally half-arsed job.
Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1
>> While proof is impossible, I think it has been pretty well
>> demonstrated that this file is not going to succumb to any sort of
>> statistical analysis. The mathematicians at RAND did a great job,
>> leaving at most a few dozen bits of potential savings:
>
> I think that means they did a completely lousy job. Calling them
> 'mathematicians' is an insult to mathematicians, they were just
> intellectually lazy bodgers who did a totally half-arsed job.
Can you provide any evidence for that? IOW, do you know how hard it is
to come up with a string of digits that withstands so many statistical
analyses?
So long,
Thomas
I posted the code to this NG back in 2002 - I used the Java bignum
class. Gordon Cormack also posted some C code to do the same thing:
Hope this link works, Google Groups search function seems to be doing
better these days, but they make no promise of any type of permalink:
http://groups.google.com/group/comp.compression/browse_frm/thread/7cb284374eb99eb5
- Mark
Bijective coding plus small alphabet plus uniform frequencies = win.
If everyone here will concede the point, maybe we can then close the
topic for good!
Well, not likely, but it is pretty clear cut.
- Mark - ma...@ieee.org
Did I say that right "Bijective data sets?"
<- ... A <> B <> C <>D ... ->
That looks like fun..
Love the challenge I have spent the years learning about encoding
data.
I may have a bijective encoder.. Cross my fingers.
Again Great fun!
According to my calculations, if a number N
comprising a million decimal digits is the k'th
prime, P_k, then k itself is a number with
almost 999,994 digits. Thus to win the $100 prize
I'd have to squeeze the program
f(k)
{ print the k'th prime; }
into about 21 bits. Can this be done?
Of course, k *might* just be the j'th prime,
and j the i'th prime, and i the h'th prime....
One comment about the Million Digits intrigues me;
something like "RAND carefully constructed the digits
to pass certain statistical tests."
Suppose this means that all 3-digit patterns
occur equally often. Then, for example, our
"compressed file" could start with the first
998,000 digits after which we would know a *lot*
about the final 2000 digits and could encode
them well.
Any more info on how the Million Digits were
constructed? I suppose they were *decimal*
digits, so rendering them in binary would
blur some of those "statistical regularities."
Can we win the prize if we output just the
decimal digits, or would our "decompressor"
have to include a decimal-to-binary converter?
(Given P_k = N ~= 10^10^6, by the Prime Number
Theorem:
k ~= N/log N
~= 10^10^6 / 10^6.36
~= 10^999993.6
)
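The same estimate can be redone numerically. This is a back-of-the-envelope sketch using only the leading term of the Prime Number Theorem, k ~= N / ln N, with N taken as 10^D for a D-digit number; the function name is hypothetical.

```python
import math

def prime_index_estimate(num_digits: int):
    """For N ~= 10**num_digits, estimate log10 of the prime index k ~= N / ln N
    and the number of bits by which k is shorter than N."""
    ln_n = num_digits * math.log(10)           # natural log of N
    log10_k = num_digits - math.log10(ln_n)    # log10(N / ln N)
    bits_saved = math.log2(ln_n)               # log2(N) - log2(k)
    return log10_k, bits_saved

log10_k, bits_saved = prime_index_estimate(1_000_000)
```

With a million digits this gives log10(k) of about 999993.64 and a saving of about 21.1 bits, matching the 999,994-digit and 21-bit figures above.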
James Dow Allen
The evidence Mark posted several years ago.
The height of their logic was "shit, there's bias - erm, let's add them!".
I could trivially come up with one million digits with better
statistical properties (more precisely - fewer demonstrable
statistical weaknesses) than the RAND ones.
The RAND ones are an epic failure, people are just emotionally
attached to them as they're historical.
Had they done a "completely lousy job", the file would have been quite
a bit more compressible than the currently theorized couple of dozen
bytes.
Maybe you object more to their methodology than their results?
On Jan 9, 8:11 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
wrote:
> Mark Nelson <snorkel...@gmail.com> writes:
> > I have a short blog post on Dobbs Code Talk re the Million Digit
> > Challenge:
>
> > http://dobbscodetalk.com/index.php?option=com_myblog&show=The-Million...
> > (or http://bit.ly/8i63hd)
Nope, you would have to find some compressibility in k. That's where
the luck comes in.
>
> Any more info on how the Million Digits were
> constructed? I suppose they were *decimal*
> digits, so rendering them in binary would
> blur some of those "statistical regularities."
> Can we win the prize if we output just the
> decimal digits, or would our "decompressor"
> have to include a decimal-to-binary converter?
Yes, both a paper and a book can be located via the Google Oracle.
- Mark - ma...@ieee.org
I'm trying to find an encoding of million digit file that will
compress. So some might try that approach.
Maybe 'completely lousy' is an exaggeration, but if the target is
to be paedagogical, the attempt should be beyond criticism.
> Maybe you object more to their methodology than their results?
Both. They realised their source was biased, that was good, and
then they post-process in a way which can give a mathematically
modellable reduction in shortfall of entropy rate, which they
apparently considered good enough, despite the fact that it was
clearly still visible. So it wasn't good enough - bad - and they
were happy with that - double bad.
I still don't understand, which type of bias?
> I could trivially come up with one million digits with better
> statistical properties (more precisely - fewer demonstrable
> statistical weaknesses) than the RAND ones.
Which weaknesses? This file would be weak, for example, if
all digits came up exactly equally probable - this is also
unlikely to happen exactly.
So long,
Thomas
"Mark Nelson" <snork...@gmail.com> wrote in message
news:1b45d87a-0af0-4667...@a6g2000yqm.googlegroups.com...
> I have a short blog post on Dobbs Code Talk re the Million Digit
> Challenge:
>
> http://dobbscodetalk.com/index.php?option=com_myblog&show=The-Million-Random-Digit-Challenge.html&Itemid=29
> (or http://bit.ly/8i63hd)
>
> While proof is impossible, I think it has been pretty well
> demonstrated that this file is not going to succumb to any sort of
> statistical analysis. The mathematicians at RAND did a great job,
> leaving at most a few dozen bits of potential savings:
>
> http://groups.google.com/group/comp.compression/browse_frm/thread/987e4ef26de2d7e8?tvc=1&q=matt+mahoney+million+group%3Acomp.compression
> (or http://bit.ly/70iWFe)
>
> In this blog post I include a few speculative ways in which this file
> could be compressed.
>
> For example: what if the million digit number was actually the nth
> prime? And while the length of n will be pretty close to a million
> digits, maybe n is a bit compressible? (Unfortunately we are a long
> way from numbering primes up to a million digits.)
>
> Or what if the million digit number could be described by some fairly
> short polynomial? Factoring the million digit number might have some
> interesting fallout.
>
> It's pretty cool to imagine that the million digit number might
> actually be an interesting number. What is harder to calculate is just
> how astronomically unlikely that is. Most people don't have an
> intuitive sense of that, hence the never-ending supply of fresh
> believers.
>
> - Mark - ma...@ieee.org
Mark don't you think it will be embarrassing if someone comes along and does
compress the bin file a couple hundred bytes and you got all these postings
across the internet that says it can't be done?
>
> Mark don't you think it will be embarrassing if someone comes along and does
> compress the bin file a couple hundred bytes and you got all these postings
> across the internet that says it can't be done?
No, it would be really cool if somebody succeeded in doing something
that is widely viewed as impossible.
It would be enormously interesting. Who would be embarrassed by that?
I'd be out $100 though, so it's not like there is no downside.
- Mark
The same thought went thru my mind, why be embarrassed if the
challenge results in new insight about compression/source coding?
There is nothing more fun than finding out that you are wrong and
something you thought was impossible, has now become possible.
The weaknesses Mark posted several years ago. Or was it not Mark?
Which Mark, at that? Anyway. The source numbers were biased, RAND
made that known. They then removed some of that bias by summing
digits in a column, which means that the parity of an unknown digit
in a column can be worked out given knowledge of the parity of all
the other digits in the column. That's another weakness - they've
just given you 50 bits for free. Or should I say they've just taken
away 50 bits.
> For example, this file would be weak, for example,
> if all digits would come up exactly equally probable - this is also
> unlikely to happen exactly.
If they were exactly equally probable that wouldn't be a weakness;
if they were exactly equally distributed that would demonstrate a
weakness.
As I last mentioned, the limit of new frontiers is still to be ascertained ....
Download the compressed & decompressed http://random.org files:
https://www.box.net/shared/static/5q6n4dludd.zip
Download the .exe: https://www.box.net/shared/static/rm79sk26oy.zip
The attached files show the 1,048,576 bytes of Random2009-08-05.dat (a fully
random file from http://www.random.org/files/), accepted as being quite
random and not compressible by even 1 byte by any of the present best
state-of-the-art compressors, but, as long predicted and now proven,
successfully compressed by about 1 KB into the 1,047,887-byte
Index.dat.
The 1,047,887-byte Index.dat compressed file is successfully and losslessly
decompressed back into the exact same
Random2009-08-05.dat.
ranking.exe is also able to further compress WinZip's final compressed
file.
Looking for a serious, confidential (only needed for a short time
period), technically capable collaborator ... pls email ... I look
forward to extending together the frontier limits as we know them today : )
INSTRUCTIONS
===========
Attached is a DOS command-prompt exe; a 2nd, more versatile and powerful
version is up next [this version takes a few hours to compress 1MB
http://random.org files; the next version, with optimised rank/unrank,
will take only minutes].
Copy everything into the DOS default directory, bring up DOS, and type in:
ranking.exe -r 1 random2009-08-05.dat <return>
-r means compress, 1 means accept 1 input byte per symbol (up to a max of 3
bytes), and random2009-08-05.dat is the input file to be compressed (from
http://random.org; you may use any other file there).
This produces 1 single compressed file, Index.ind (you may want to
transfer this file to a different PC to 'separately' reconstruct).
To reconstruct, type in: ranking.exe -u reconstructed.dat
<return>
-u means reconstruct; reconstructed.dat is the name you give to the
reconstructed output file
(this uses Index.ind, the compressed file produced earlier, to
reconstruct). This produces reconstructed.dat, which should be exactly
the same as the original input file.
Both PCs need to have the .NET framework & MSVC (Microsoft Visual Studio)
installed.
If your compressed data file is only one (1) K smaller than the original
file, how do we know the missing 1K of data is not embedded in the
decompressor program?
Still, if I find the time I will try it out by compressing some
test files on my Dell (Windows XP) laptop, transferring the compressed
file to my Toshiba (Windows 7) laptop and decompressing there,
then testing the resulting output file by moving the hard drive to my
Compaq (Haiku) desktop.
Since only the Toshiba has network access, all file transfers are
sanitized by going through a USB flash drive. The USB drive is
formatted BFS, ensuring no hidden files can be passed around.
Give me over the weekend.
I have verified this claim and it is true:
Ranking.exe -r 1 Random2009-08-05.dat gives a 1,047,887-byte output
file in 1 hour and 47 minutes.
Ranking.exe -u output.dat gives the 1,048,576-byte input file back in 2
hours 17 minutes.
A compare between the original input file and the decompressed output file
gives OK.
It looks like you are the first one to deliver a compressor and
decompressor that can shave the size of a true random input file and
bring it back.
Good job!
Science can go back to the drawing board.
30 January 2010 the second law of thermodynamics is publicly cracked
by Steorn with their Orbo and now random data compression is publicly
cracked by LawCounsels with Ranking.exe :-)
http://www.youtube.com/watch?v=T4Q3Klq5dxM
http://www.youtube.com/watch?v=p7i7P63IByY
If you have a slow computer you can split the input file with HJSplit
into 400KB parts:
http://www.freebyte.com/hjsplit/
Ranking.exe -r 1 rand.dat gives a 409,548-byte output
file in 22 minutes.
Ranking.exe -u rand2.dat gives the 409,600-byte input file back in 28
minutes.
A compare between the original input file and the decompressed output file
gives OK.
It's not possible to compress the output file further.
sportman writes :
> Ranking.exe -r 1 rand.dat give a 409,548 bytes output
> file in 22 minutes.
> Ranking.exe -u rand2.dat give a 409,600 bytes input file back in 28
> minutes
> Compare between original input file and decompressed output file give
> Ok
> It's not possible to compress the output file further.
This is really only the very small tip of an iceberg sighted some time ago ...
things have moved on since, with more to come :)
did you think to encrypt the result and check if the result can be
compressed ?
it would be interesting to know...
> I have verified this claim and it true:
>
> Ranking.exe -r 1 Random2009-08-05.dat give a 1,047,887 bytes output
> file in 1 hour and 47 minutes.
> Ranking.exe -u output.dat give a 1,048,576 bytes input file back in 2
> hours 17 minutes
> Compare between original input file and decompressed output file give
> Ok
I don't have the time yet to do my own tests, but the one thing I
know, never, Never, NEVER use the data files supplied by the person
making the claim. Hopefully over the weekend I will create my own
version of random data for testing purposes.
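A sketch of what such an independent check might look like in Python. Assumptions: `os.urandom` stands in for "my own version of random data", and `zlib` stands in for the compressor under test (Ranking.exe itself is not scripted here). The SHA-256 digest is there so the original and the reconstruction can be compared across machines without copying the original along.

```python
import hashlib
import os
import zlib

def round_trip_report(data: bytes):
    """Compress and decompress `data`, returning
    (original size, packed size, digest of original, round trip exact?)."""
    packed = zlib.compress(data, 9)       # stand-in compressor, not Ranking.exe
    restored = zlib.decompress(packed)
    digest = hashlib.sha256(data).hexdigest()
    same = hashlib.sha256(restored).hexdigest() == digest
    return len(data), len(packed), digest, same

# Fresh test data generated locally, never supplied by the claimant.
original_size, packed_size, digest, same = round_trip_report(os.urandom(1 << 20))
```

On data like this the packed size should come out slightly larger than the original, since zlib adds its own framing and finds nothing to exploit.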
Sportsman, did you use his supplied data, if yes then strike one.
Sportsman, did you do the tests on a computer with active internet
connection, if yes then strike two.
Sportsman, did you do the decompression on the same machine that you
did the compression on, if yes then strike three.
What is the smallest program that can compress, then expand the
million random digit file by 100K bytes? 100K +n, where n is smallest
assembly language equivalent of cat. I think this code falls into that
category - tuned for a specific file, intentionally or not.
The strict definition of the challenge requires that a program smaller
than the size of the million digit file be able to recreate the file.
LawCounsels is nowhere close to that.
A looser definition of the challenge would require that the program
compress to a data file smaller than the million digit file, then
decompress correctly, and then be able to repeat the process a few more
times on the file encrypted using DES.
- Mark
I tried to recompress the output of the provided Random2009-08-05.dat
but Ranking.exe makes it a little bigger:
index.ind 1,047,887 bytes to 1,048,216 bytes
Took 1 hour and 47 minutes to compress.
I also tested two other random files from random.org and also
Ranking.exe made them a little bigger:
2006-03-11.bin 1,048,576 bytes to 1,048,903 bytes
2010-02-09.bin 1,048,576 bytes to 1,048,901 bytes
Both took also 1 hour and 47 minutes to compress.
Because it looks like something is different with the provided
Random2009-08-05.dat I compared that file with the original from
random.org 2009-08-05.bin.
I found 4010 differences, I give the first 25 of them:
44: 00 20
1A8: 00 20
22C: 00 20
31E: 00 20
3F8: 00 20
479: 00 20
4DC: 00 20
6BB: 00 20
6D6: 00 20
707: 00 20
816: 00 20
821: 00 20
A5B: 00 20
AFC: 00 20
AFE: 00 20
CD8: 00 20
D27: 00 20
F04: 00 20
F37: 00 20
1144: 00 20
1226: 00 20
1262: 00 20
135B: 00 20
139E: 00 20
14C8: 00 20
......
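A listing like the one above can be produced with a short byte-by-byte compare. This is a minimal sketch in Python, not the tool that was actually used; the function names are made up, and which column is the original and which is the supplied file is up to the caller.

```python
def byte_differences(a: bytes, b: bytes):
    """Return (offset, byte_in_a, byte_in_b) for every position where
    two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("files differ in length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

def format_differences(diffs, limit=25):
    """Format the first `limit` differences as 'OFFSET: AA BB' in hex."""
    return [f"{off:X}: {x:02X} {y:02X}" for off, x, y in diffs[:limit]]
```

Usage would be along the lines of reading both files with `open(path, "rb").read()`, calling `byte_differences` on the two results, and printing `len(diffs)` plus the formatted lines.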
So it looks like, while testing your software's decompression, you
overwrote your original file with a file that is not 100% random
anymore, and later used this file as input...
I'm afraid your compressor can't compress random files and you need to
go back to the drawing board...
Sportsman, thank you for doing further tests. And saving me the time/
trouble of doing the tests myself.
> Because it looks like something is different with the provided
> Random2009-08-05.dat I compared that file with the original from
> random.org 2009-08-05.bin.
An important step.
Thus why I said never use the author's supplied data file to do the
tests.
> So it looks like with testing your software decompressing you have
> overwrite your original file with a file who is not 100% random
> anymore. And later used this file as input...
Yes, a very good point! This is a simple mistake to make and once the
wrong file is being used for testing the author will in complete
honesty think that they are developing something great.
I also made a similar mistake in the past, I ran my 7-bit text
compressor (it stripped out the high bit first) on a 8 bit file, then
used the decompressed version to test the 8-bit compressor that I was
developing the following month - I was getting great compression
ratios until I discovered my error. :(
Earl Colby Pottinger (Compression fool)
If it was April Fools Day I would believe it was a joke.
But you should have thought something was up when he pointed
to the site with the Irish perpetual motion machine. It
may make money for the inventors if they can find enough
"investors". However that is most likely fake too.
Oh well it was a nice break from the horrors of the
real world which is either on the brink of a nuclear war
or freezing due to an ice age.
David A. Scott
Thanks.
Yes, when I dusted off this early version posted here, I thought the
http://random.org file was quite random, though not as random as Mark's,
and so was compressible .....
I forgot an unexpected 'flaw' in Windows Notepad: it converts both hex'00' &
hex'20' to the same hex value when saving random.org's file with Notepad,
making random.org's file not so random. I overlooked it since Notepad
displays the original .bin & the saved .bin exactly the same on screen,
with hex'00' & hex'20' looking identical.
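The size of that Notepad effect can be estimated. A rough sketch, assuming the original bytes are uniform and independent and that exactly one byte value (hex'00') gets rewritten as another (hex'20'): about 1 byte in 256 changes, and the entropy drops by 2/256 of a bit per byte.

```python
import math

def merged_byte_entropy(alphabet_size: int = 256) -> float:
    """Entropy in bits per byte of a uniform byte source after one value
    has been rewritten as another (so one value doubles, one vanishes)."""
    p = 1 / alphabet_size
    untouched = -(alphabet_size - 2) * p * math.log2(p)  # values left alone
    doubled = -(2 * p) * math.log2(2 * p)                # the merged value
    return untouched + doubled

file_size = 1_048_576
expected_changed = file_size / 256                            # 4096.0 bytes altered
saving_bytes = file_size * (8 - merged_byte_entropy()) / 8    # 1024.0 bytes
```

So an ideal coder could save about 1,024 bytes on such a file, which is the same ballpark as the 4010 differences found and the 689 bytes (1,048,576 minus 1,047,887) that were actually saved.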
However, with a quite random file such as a reasonably sized WinZip
compressed file, this early version of ranking.exe can often reduce the
compressed WinZip file even further. A final winzipped file is probably
quite random but not always completely random .... it's assumed WinZip
has the best state-of-the-art compression available (?). It costs 1 bit to
indicate whether a file is further compressible and was thus reduced
further with ranking.exe.
It's simple enough to verify with your own files: winzip them & then
further compress with ranking.exe.
Subsequent refined versions have progressively reduced the 'excess'
bytes when compressing true random files (like both http://random.org's
& Mark's) .... Mark has possession of one of the subsequent ranking.exe
versions, tested on a true .bin (not saved by Notepad), which expanded the
406KB random file by only some 12 bytes.
The very latest refined implementation of the related methods has at present
attained exactly 0 'excess' bytes (not with Notepad) ... progress in the
right direction. Remember this is not a straightforward transformation
which simply preserves the exact same # of bits as the original, like BWT etc.,
but is actually able to reduce many less-than-completely-random files ....
At present I await a new implementation of improved methods aimed at
compression gains (not just the exact same # of bits, as it stands at this
time), hopefully, or if not, at the very least able to reduce many more
sets of files even further than at present.
Please refer to http://www.maximumcompression.com/index.html to see what
the real state-of-the-art is.
> its simple enough confirm verify with own files, winzip them& then
> further compress with ranking.exe
>
>
> subsequent refined version/s have progressive reduced the 'excess'
> bytes when compressing true random files (like both http://random.org
> & Mark's) ....Mark has possession of one of subsequent ranking.exe
> version tested on true .bin (not saved by Notepad) which expanded the
> 406KB random file byonly some 12 bytes
>
> very latest refined related methods implementation at present has
> attained exact 0 'excess' bytes (not with Notepad)... progress in
> right direction remembers this is not straight forward transformations
> which simply preserves exact same # of bits as original like BWT etc
> but actual able reduces many less than complete random files ....
> present shortly awaits new improved methods implemention towards
> compressions gains (not just exact same # of bits as it stands at this
> time) hopeful , or if not at very least able reduce many more sets of
> files even further than at present
>
The only way of discovering the limits of the possible is to venture a
little way past them into the impossible.
- Arthur C. Clarke
I do not believe that you will succeed, but in the spirit of the above
quote, I support your attempt.
You could take the file, convert it to ASCII ones and zeroes,
then do a BWTS on it and convert it back to a packed binary
file, which would be the same size. See if you can compress that.
If you can, I would say the first file is not random. But it's a test
they are not likely to have done before they put
the file up.