
Mankind's centuries-old Kraft's Inequality / Pigeonhole hurdles


LawCo...@aol.com

unread,
Mar 31, 2012, 1:32:32 PM3/31/12
to

Mankind's centuries-old Kraft's Inequality / pigeonhole hurdles now come to this:

http://groups.google.com/group/comp.compression/browse_thread/thread/c3b4127eab5f7a0e?hl=en#

Can each of these many 'start'/'end' sequences be represented by a shorter encoding? Each consists of the ternary symbols a, b, c only. The sole condition here is that the total # of c's is ALWAYS exactly = the total # of b's + 2, and this occurs only at the end of a sequence; i.e., at any stage, IF the running total # of c's becomes = the running # of b's + 2, then the sequence ends. The total # of a's is unrestricted, usually around the same as the total # of b's, BUT this is irrelevant. Take care that the encoded bitstring MUST be self-delimiting, meaning unambiguous as to where the encoded bitstring ends (other bits may follow or be merged after it, and that boundary must be distinguishable from the encoded bitstring itself).

The entropy of all these many 'start'/'end' sequences together, with N symbols in total (total # a + total # b + total # c = N, i.e. N * 1.5 binary bits in the raw representation), is already OBVIOUSLY smaller than N * 1.5 binary bits.

REWARDS for the 1st person to put forth a practical solution, and REWARDS also for the best practical solution put forth.

Look Forward ,
LawCounsels

Thomas Richter

unread,
Apr 3, 2012, 5:32:39 AM4/3/12
to
As undefined as it can be. First of all, you say "the sequence ends if
#c = #b +2", but you do not say whether the number of c's before that
point is smaller or larger than the number of b's, that is, whether the
sequence ends if the number of c's drops to #b+2, or rises to #b+2.
Then, you say nothing about the statistics of the sequence. For all
practical purposes, pick an MQ coder and encode the symbols by binary
decisions like 0->a, 10->b, 11->c.
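
A rough sketch of that binarization idea, with a plain static prefix code
standing in for an actual MQ coder (the helper names are illustrative only):

    # Map each trit to binary decisions (0 -> a, 10 -> b, 11 -> c).  A real
    # MQ coder would feed each decision to an adaptive binary arithmetic
    # coder; this static prefix code is just the simplest stand-in.
    # Decoding stops on its own once #c == #b + 2, so the stream is
    # self-delimiting even if unrelated bits follow.
    CODE = {'a': '0', 'b': '10', 'c': '11'}

    def encode(seq):
        return ''.join(CODE[s] for s in seq)

    def decode_one(bits):
        out, nb, nc, i = [], 0, 0, 0
        while True:
            if bits[i] == '0':
                out.append('a'); i += 1
            elif bits[i + 1] == '0':
                out.append('b'); nb += 1; i += 2
            else:
                out.append('c'); nc += 1; i += 2
            if nc == nb + 2:
                return ''.join(out), i      # i = bits actually consumed

    print(decode_one(encode('abacacc') + '10110'))   # ('abacacc', 11)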


James Dow Allen

unread,
Apr 3, 2012, 12:07:04 PM4/3/12
to
On Apr 3, 4:32 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
> Am 31.03.2012 19:32, schrieb LawCouns...@aol.com:
> > REWARDS for 1st person put forth a practical solution , and REWARDS also for the best practical solution put forth
>
> As undefined as it can be. First of all, you say "the sequence ends if
> #c = #b +2", but you do not say whether the number of c's before that
> point is smaller or larger than the number of b's,...

I hardly noticed OP before you responded (especially given
the April 1 date!) but I think I can answer this.
#c = #b = 0 initially, so #c can never exceed #b + 1 before
termination (to exceed #b + 1, #c would first have to reach #b + 2,
at which point the sequence has already terminated).

I'll read OP if/when "REWARDS" is clarified. Are we talking
brownie points? Fame and Glory? Billions of Zimbabwe dollars?

April Fools!!

James

LawCo...@aol.com

unread,
Apr 4, 2012, 3:35:37 AM4/4/12
to
On Tuesday, April 3, 2012 9:32:39 AM UTC, Thomas Richter wrote:
> For all
> practical matters, pick a MQ coder, and encode the symbols by binary
> decisions like 0->a, 10->b, 11-> c.

Yes, this is a good start... it certainly satisfies the 'self-delimiting' criterion.

Every sequence starts with no symbols (nothing) and then progressively accumulates 'a', 'b' or 'c' symbols. The initial (simplifying) statistics are such that, among ALL of the sequences (one can always take exactly, e.g., 1,000 sequences, since the 'start' positions of the sequences are all ascertainable), the total # of 'a' is exactly = the total # of 'b', and the total # of 'c' is exactly = the total # of 'b'.

... most 'startling' of all, you will most certainly soon enough conclude, in exasperation, that there is simply no possible way for existing known compression techniques to do this! ... somewhat surprisingly, given that each of these sequences is definite 'mathematics', NOT RANDOM, endowed with clear, unambiguous structure and distribution bias!

You may then find that this further statistic may or may not help you toward a 'completely' new kind of compression method: the length of each sequence (the total # of symbols within it) follows mathematics derivable from the 1 : 1 : 2 distribution ratio of the 3 symbols.
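
A quick simulation sketch of those length statistics, assuming the
25% / 25% / 50% memoryless source spelled out later in the thread (the
names and figures in the code are illustrative):

    # With P(a)=P(b)=1/4 and P(c)=1/2, the running difference #c - #b
    # drifts upward by 1/2 - 1/4 = 1/4 per symbol, so hitting #c = #b + 2
    # takes about 2 / (1/4) = 8 symbols on average.
    import random

    def sequence_length(rng):
        n = nb = nc = 0
        while nc != nb + 2:
            s = rng.choices('abc', weights=[1, 1, 2])[0]
            n += 1
            if s == 'b':
                nb += 1
            elif s == 'c':
                nc += 1
        return n

    rng = random.Random(0)
    lengths = [sequence_length(rng) for _ in range(100_000)]
    print(sum(lengths) / len(lengths))   # close to 8.0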

YOU CAN GUARANTEE THIS WON'T BE SIMPLE !

Warm Regards,
LawCounsels

LawCo...@aol.com

unread,
Apr 4, 2012, 3:38:07 AM4/4/12
to
On Wednesday, April 4, 2012 7:35:37 AM UTC, LawCo...@aol.com wrote:

.... the initial (simplifying) statistics are such that, among ALL of the sequences (one can always take exactly, e.g., 1,000 sequences, since the 'start' positions of the sequences are all ascertainable), the total # of 'a' is exactly = the total # of 'b', and the total # of 'c' is exactly = the total # of 'b' + 2

LawCo...@aol.com

unread,
Apr 4, 2012, 7:06:52 AM4/4/12
to
On Wednesday, April 4, 2012 8:38:07 AM UTC+1, LawCo...@aol.com wrote:
> On Wednesday, April 4, 2012 7:35:37 AM UTC, LawCo...@aol.com wrote:
>
> .... the initial ( simplifying ) statstistics is such that among ALL the # of sequences ( one can always takes only exactly eg 1,000 etc sequences ... since the 'start' positions of each sequences all ascertainable) the total# 'a' exact = total # 'b' and total # 'c' exact = total # 'b' +2

.... the initial (simplifying) statistics are such that, among ALL of the sequences (one can always take exactly, e.g., 1,000 sequences, since the 'start' positions of the sequences are all ascertainable), the total # of 'a' is exactly = the total # of 'b', and the total # of 'c' is exactly = the total # of 'b' + 2 * (e.g.) 1,000

[ if exactly 1,000 sequences are taken, each sequence has 2 more 'c' than 'b' ]

LawCo...@aol.com

unread,
Apr 6, 2012, 4:24:03 AM4/6/12
to
There will be many who would be happy just to be part of, and contribute to, this historic 'breakthrough' [ if indeed it turns out to be one ]!

However, I am not averse to confirming the REWARDS:

. your choice of accepting a one-time, immediate US$1,000 payment on delivery of the software (C# preferred)

OR

. accepting a retainer (a standard-type agreement will be made available for you to decide on), whereby you are entitled to a 'minimum' revenue share of US$3 million in return for 'part'-time, competently conducted R&D development over a 3-year period


You may also opt to communicate your solution by private email 1st.

Cheers,
LawCounsels


Ernst

unread,
Apr 6, 2012, 10:48:52 PM4/6/12
to
On Mar 31, 10:32 am, LawCouns...@aol.com wrote:
> Mankind's centuries old Kraft's Inequality / Pigeonholes hurdles now comes to this :
>
> http://groups.google.com/group/comp.compression/browse_thread/thread/...
>
> can each of these many 'start'/ 'end' sequences  ( each consists of ternary
> symbols a b c only condition here is total# of c ALWAYS exact = total# b + 2  &
> this occurs only at EOSequence , ie at any stage IF progressive total#
> c becomes = progressive # b  + 2    then a sequence ends ....  total# a  unrestricted , but usually around same as total# b BUT this is irrelevant ) be better represented encoded 'shorter'.... take care the encoded bitstring MUST be self-delimiting meaning unambiguous as to where the encoded bitsstring ends ( subsequent follows / merged with other bits , needs able distinguish this boundary from the encoded bitstring itself )
>
> The entropy of all these many 'start'/'end' sequences together with
> total N symbols ( total # a + total # b + total # c = N , which is N * 1.5 binary bits )  already OBVIOUS smaller than N * 1.5 binary bits
>
> REWARDS for 1st person put forth a practical solution , and REWARDS also for the best practical solution put forth
>
> Look Forward ,
> LawCounsels

How interesting... I work with self-delimited strings.
As for encoding something long with something shorter? Not a chance. One-to-one, yes,
but not one-to-less-than-one, from all I know. Still, it's interesting
to actually see pattern matching happening.

That is the basis of the pattern-matching exercises I'm working on.

Mine is a dictionary effort, a fill-the-array-with-some-pattern
type of effort, be it a Pi sequence or something else.

If you are interested in what you describe, perhaps cyclic codes are
what you want to read about. Perhaps you will find an algebraic
system just to your liking.

Good luck!

LawCo...@aol.com

unread,
Apr 7, 2012, 5:28:42 AM4/7/12
to
This topic is now very different: it is 'SPECIFIC', clearly capable of investigation by scientific methods and of a clear conclusion. For the very 1ST TIME EVER it has become possible to 'frame' it this clearly, and also for the very 1ST TIME EVER 'random' bits have been transformed into a non-random, clearly 'structured' output sub-sequence with distribution bias:

. can the specified symbol sequences (definitely NOT random, showing distribution bias and clear structure) be encoded more economically? IF SO, HOW?

lawco...@gmail.com

unread,
Apr 7, 2012, 6:22:22 AM4/7/12
to
....... for the VERY 1ST TIME EVER, 'random' bits have been transformed into a non-random, clearly 'structured' output sub-sequence with distribution bias [ unlike Mark Nelson's AMillionRandomDigits, which is completely devoid of any discernible structure whatsoever ]

LawCo...@aol.com

unread,
Apr 8, 2012, 3:41:20 AM4/8/12
to
On Friday, April 6, 2012 9:24:03 AM UTC+1, LawCo...@aol.com wrote:

> there will be many happy just to be part of contribute to this history 'breakthrough' [ if indeed it turns out ] !
>
> however I am not averse to confirm REWARDS :
>
> . your choice whether to accept one-time immediate US$1,000 payment on delivery of the software ( prefers C# )
>
> OR
>
> . accept a retainer ( standard type agreement will be made available for yourself to decide ) whereby 'minimum' entitled to revenues share of US$3Million in return for 'part' time competent manner R&D development , over 3 years period
>
>
> you may also opt to communicate your solutions by private email 1st
>


NOTE: to win the REWARDS your solution needs to attain a net compression saving of 8 bits or more when compressing 100 such sequences, 80 bits or more when compressing 1,000 such sequences, 800 bits or more when compressing 10,000 such sequences... and so forth

James Dow Allen

unread,
Apr 8, 2012, 7:25:47 AM4/8/12
to
On Apr 8, 2:41 pm, LawCouns...@aol.com wrote:
> NOTE : to win the REWARDS your solution needs attain 8 bits Net compression savings or more

I enjoy compression puzzles and might investigate this one except ...

Skimming your posts I find I have no idea whatsoever
what problem you're posing, nor how the compression savings would
be measured. You do mention some ternary system with a
token termination condition, but any (terminated) sequence
of trits would be valid, just with different token boundaries.

At one point you imply a trit is 1.5 bits.
Wrong, it's 1.5849625 bits. Don't know if this
makes your puzzle easier or harder.

Before you waste time trying to tell us what your
actual requirement is, be aware that, if I solve it,
I will not divulge my solution until the REWARD is in escrow.

James

lawco...@gmail.com

unread,
Apr 9, 2012, 1:58:12 AM4/9/12
to
You choose, e.g., 1,000 such subsequences (each such subsequence terminates when # of c = # of b + 2). Yes, each such sequence can be of various lengths, as you mentioned (in # of symbols within), BUT you know the distribution of the sequence lengths: e.g., you can ALWAYS generate any # of such sequences (for testing your compression algorithm) from a source with probability of producing an 'a' symbol 25% of the time, a 'b' symbol 25% of the time, and a 'c' symbol 50% of the time.

Yes, a trit is 1.5849625 bits (as when an arithmetic coder is used)... but I was thinking that perhaps, using the combinatorial coefficient C(100; 50, 25, 25), this comes to an average of near 1.5 bits per trit? (ignoring the cost of recording the multiplicities)
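
A rough check of that combinatorial idea, using the 100-symbol block and
the 50/25/25 split quoted above (a sketch, not an exact coder):

    # log2 of the multinomial coefficient C(100; 50, 25, 25): the cost of
    # indexing one particular arrangement of 50 c's, 25 a's and 25 b's,
    # ignoring (as the post above does) the cost of transmitting the counts.
    from math import lgamma, log

    def log2_multinomial(n, *parts):
        assert sum(parts) == n
        return (lgamma(n + 1) - sum(lgamma(k + 1) for k in parts)) / log(2)

    bits = log2_multinomial(100, 50, 25, 25)
    print(bits, bits / 100)   # roughly 143 bits total, i.e. ~1.43 bits/trit

The gap below 1.5 bits/trit is roughly what describing the (50, 25, 25)
split itself would cost, which is why enumerative coding of this kind does
not beat the per-symbol entropy overall.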

You should provide a .exe that takes in any generated # of such sequences, encodes them smaller, then decodes back to the same # of such sequences [ it NEEDS ONLY BE SHOWN TO ATTAIN THIS ON AVERAGE, so it won't be 'faulted' on very rare, extreme generated input sequences ]

YES, the REWARDS WILL BE placed IN ESCROW on request, provided a time-expiring .exe 1st clearly shows a saving of 0.08 bits per sequence encoded


LawCounsels

James Dow Allen

unread,
Apr 9, 2012, 7:19:23 AM4/9/12
to
On Apr 9, 12:58 pm, lawcouns...@gmail.com wrote:
> a source with probability of producing an 'a' symbol 25% of time
> a 'b' symbol 25% of times a 'c' symbol 50% of times

Allow me to recommend the optimal Huffman code:
c - 0
a - 10
b - 11
This can be improved, though only slightly, using details
you've omitted from your summary.

This was so trivial, I'll discount it down to, say $950.

If this is unsatisfactory, I'll withdraw from the contest.
Even paid at minimum wage I'm afraid it would take significant
funds (payable in advance, please!) just to elicit a
proper problem statement from you.

I don't have PayPal. Contact me for instructions on how
to pay the $950. :-)

James

biject

unread,
Apr 9, 2012, 12:03:04 PM4/9/12
to
Let's see: c is .5 * 1 = .5, b is .25 * 2 = .5, a is .25 * 2 = .5;
that's .5 + .5 + .5 = 1.5 bits per symbol on average, while
if you encode each symbol with 1.5849625 bits you save about .0849625, which
is more than the .08. It appears you're in the money. I have a
hunch that there is still something missing, in which case I would
not count on the money yet.

First of all, does he want at least .08 bits saved in every case,
or just in the average case? If it's the average case you could be
on the right track. If it's every case, then since you can write only
whole numbers of bits, the .08 saving gets a little harder. It
would be nice to know, if the guy decides you haven't won, just what
he does want. I have read it several times and yet I do not think it's
clear enough to tackle without him saying "oh, I meant this and not
that".

Assuming he doesn't declare you the winner:
1) is the saving an average thing, or does each file have to be smaller?
2) how do you measure the saving: is it .08 off 1.5849625 per symbol,
or .08 less than 1.5?
3) not sure why you say the source has C = .5 while A and B = .25; the
fact is, even if the source were A = B = C = 1/3, for short files,
if you run the sources enough times and create 100 files each,
you could still get the same set of 100 files in both cases.
So your test setup is not valid. There is nothing magical about
your source. Except that if I know it's a fixed IID source from say 2 or
3 different models, then as you create more files you can, with increasing
probability, determine which one it most likely is. But you can't be
100% certain which one it is unless you examine an ever-increasing number
of files.


David A. Scott
--
My Crypto code
http://bijective.dogma.net/crypto/scott19u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptograhic
system is only as strong as its weakest link"

LawCo...@aol.com

unread,
Apr 10, 2012, 5:29:06 AM4/10/12
to
THE COMPLETE SPECIFICATIONS :
=============================

1. Generate a number, e.g. 1,000, of such sequences (each sequence composed of the ternary symbols 'a' 'b' 'c'; when # of 'c' = # of 'b' + 2, the sequence ENDS and the next sequence begins), using a source producing symbol 'a' 25% of the time, symbol 'b' 25% of the time, and symbol 'c' 50% of the time. Call the total # of symbols in these 1,000 sequences N. NOTE: among these e.g. 1,000 sequences, the # of 'a' is invariably near = the # of 'b', and the # of 'c' is invariably near = 2 * the # of 'b'; THUS the probability model here is 25% : 25% : 50%.

2. Compress these e.g. 1,000 generated sequences using your .exe; it must decode back to the same 1,000 sequences.

3. IF your compressed file's bit length =< 1.5 * N - ( 0.08 * N ), THEN YOU WIN THE REWARDS! I.e., if your .exe saves 0.08 bits per sequence 'on average', you have WON (it need not do so invariably, every time, on every conceivable file!). Note that the original set of sequences is here taken to be N * 1.5 bits long (as originally 'explicitly' stated: 1.5 * N bits long, NOT 1.5849625 * N bits long).

4. There are no restrictions on memory or storage requirements; you may even show your .exe working on a 'research network supercomputer cluster', BUT processing must complete within a day.
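
Read together with the Huffman code suggested earlier in the thread, the
specification can be exercised with a sketch like the following; the
layout, the names, and the use of the corrected 0.08 * #sequences
threshold from the later version of rule 3 are assumptions of the sketch,
not part of the specification itself:

    # Generate k sequences from the 25/25/50 iid source, encode them with
    # the static code c->0, a->10, b->11, and compare the total bit count
    # against the threshold 1.5*N - 0.08*k.
    import random

    CODE = {'a': '10', 'b': '11', 'c': '0'}

    def gen_sequence(rng):
        seq, nb, nc = [], 0, 0
        while nc != nb + 2:
            s = rng.choices('abc', weights=[1, 1, 2])[0]
            seq.append(s)
            nb += s == 'b'
            nc += s == 'c'
        return seq

    def trial(k, seed):
        rng = random.Random(seed)
        seqs = [gen_sequence(rng) for _ in range(k)]
        n = sum(len(s) for s in seqs)
        bits = sum(len(CODE[sym]) for s in seqs for sym in s)
        return bits, 1.5 * n - 0.08 * k

    for seed in range(5):
        print(trial(1000, seed))   # encoded size hovers right around 1.5*N,
                                   # not reliably 80 bits below it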

James Dow Allen

unread,
Apr 10, 2012, 8:43:45 AM4/10/12
to
On Apr 10, 4:29 pm, LawCouns...@aol.com wrote:
> THE COMPLETE SPECIFICATIONS :
> =============================
>
> 1. generates a number eg 1,000 of such sequences ( each sequence composed of ternary symbols 'a' 'b' 'c' , when # of 'c' = # of 'b' + 2 Then sequence ENDS  & next sequences begins ) using a source producing symbol 'a' 25% of times symbol 'b' 25% of times symbol 'c' 50% of times ) .... call the total # of symbols in these 1,000 sequences N . NOTE : among these eg 1,000 sequences the # of 'a' is invariable near = the # of 'b'  & the # of 'c' is invariable near = 2 * the # of 'b'  THUS the probability model here is 25% : 25% : 50%

I'm not clear on what "invariable near =" means. I think you specify
that the trit is from a random memoryless source. (Anyway, a different
interpretation would have a smallish effect.)

> 2. compresses these eg 1,000 generated sequences using your .exe , & must decode back to the same 1,000 sequences

Does the decompressor know, in advance, the exact number of bits in
the sequence? (Even if it does, the compression savings will be tiny
when amortized over 1000 strings.)

> 3. IF you compressed file bitslength  =<  1.5 * N   - ( 0.08 * N )  THEN YOU WIN THE REWARDS !

Starting with a source of exactly 1.500 bits/token of info,
we wait for the 1000th terminal, then compress it to
1.420 bits/token. Right? Good luck! :-)

I'll leave my bet on Shannon, Kraft, and the pigeons.

James Dow Allen

LawCo...@aol.com

unread,
Apr 10, 2012, 8:58:26 AM4/10/12
to
THE COMPLETE SPECIFICATIONS :
=============================

1. Generate a number, e.g. 1,000, of such sequences (each sequence composed of the ternary symbols 'a' 'b' 'c'; when # of 'c' = # of 'b' + 2, the sequence ENDS and the next sequence begins), using a source producing symbol 'a' 25% of the time, symbol 'b' 25% of the time, and symbol 'c' 50% of the time. Call the total # of symbols in these 1,000 sequences N. NOTE: among these e.g. 1,000 sequences, the # of 'a' is invariably near = the # of 'b', and the # of 'c' is invariably near = 2 * the # of 'b'; THUS the probability model here is 25% : 25% : 50%.

2. Compress these e.g. 1,000 generated sequences using your .exe; it must decode back to the same 1,000 sequences.

3. IF your compressed file's bit length =< 1.5 * N - ( 0.08 * # OF SEQUENCES ENCODED ), THEN YOU WIN THE REWARDS! I.e., if your .exe saves 0.08 bits per sequence 'on average', you have WON (it need not do so invariably, every time, on every conceivable file!). Note that the original set of sequences is here taken to be N * 1.5 bits long (as originally 'explicitly' stated: 1.5 * N bits long, NOT 1.5849625 * N bits long).

4. There are no restrictions on memory or storage requirements; you may even show your .exe working on a 'research network supercomputer cluster', BUT processing must complete within a day.

LawCounsels

NB: should anyone win, I certainly recommend opting for the US$3M (this minimum of US$3M is GUARANTEED BY YOUR WINNING SOLUTION ITSELF!)

lawco...@gmail.com

unread,
Apr 10, 2012, 9:39:50 AM4/10/12
to
On Tuesday, April 10, 2012 1:43:45 PM UTC+1, James Dow Allen wrote:
> On Apr 10, 4:29 pm, LawCouns...@aol.com wrote:
> > THE COMPLETE SPECIFICATIONS :
> > =============================
> >
> > 1. generates a number eg 1,000 of such sequences ( each sequence composed of ternary symbols 'a' 'b' 'c' , when # of 'c' = # of 'b' + 2 Then sequence ENDS  & next sequences begins ) using a source producing symbol 'a' 25% of times symbol 'b' 25% of times symbol 'c' 50% of times ) .... call the total # of symbols in these 1,000 sequences N . NOTE : among these eg 1,000 sequences the # of 'a' is invariable near = the # of 'b'  & the # of 'c' is invariable near = 2 * the # of 'b'  THUS the probability model here is 25% : 25% : 50%
>
> I'm not clear on what "invariable near =" means. I think you specify
> that the trit is from a random memoryless source. (Anyway, a
> different
> intepretation would have smallish effect.)

If the source produces symbol 'a' 25% of the time, symbol 'b' 25% of the time and symbol 'c' 50% of the time, THEN after e.g. 1,000 sequences are generated (with a total of N symbols within these 1,000 sequences), one can say with a very high confidence level that the # of 'a' will be around N/4, the # of 'b' around N/4 and the # of 'c' around N/2.

>
> > 2. compresses these eg 1,000 generated sequences using your .exe , & must decode back to the same 1,000 sequences
>
> Does the decompressor know, in advance, the exact number of bits in
> the
> sequence? (Even if it does, the compression savings will be tiny,
> when
> amortized over 1000 strings.)

You can ALWAYS choose to compress ONLY a particular fixed # of sequences, e.g. 1,000 or 10,000 etc., so the decompressor knows in advance the EXACT # of sequences compressed..... but it can ONLY guess at the total # of bits, albeit quite accurately (since this is not known in advance).
>
> > 3. IF you compressed file bitslength  =<  1.5 * N   - ( 0.08 * N )  THEN YOU WIN THE REWARDS !
>
> Starting with a source of exactly 1.500 bits/token of info,
> we wait for the 1000th terminal, then compress it to
> 1.420 bits/token. Right? Good luck! :-)
>
> I'll leave my bet on Shannon, Kraft, and the pigeons.

SO WOULD I, were these 1,000 sequences 'random' and structureless.... but here the structure is that, within each sequence, the # of 'c' is EXACTLY = the # of 'b' + 2.

IN FACT... every sequence MUST END with a 'c', and immediately before this 'c' ONLY an 'a' OR a 'c' can occur [ NEVER a 'b'! ].... If you look more carefully, there are many more restrictions on the possible permutations / ordered arrangements of the 'a's, 'b's and 'c's within a sequence; e.g., a sequence of more than 2 symbols can't begin with 'c' 'c' (it would already have ended), and its last 2 symbols can't be 'b' 'c'.... and so forth.... IT IS THESE THAT MAY MAKE YOUR SOUGHT-FOR 1.420 bits/token possibly achievable (?)



>
> James Dow Allen

lawco...@gmail.com

unread,
Apr 10, 2012, 9:51:40 AM4/10/12
to
> > Starting with a source of exactly 1.500 bits/token of info,
> > we wait for the 1000th terminal, then compress it to
> > 1.420 bits/token. Right? Good luck! :-)
> >
> > I'll leave my bet on Shannon, Kraft, and the pigeons.


SO WOULD I, were these 1,000 sequences 'random' and structureless.... but here the structure is that, within each sequence, the # of 'c' is EXACTLY = the # of 'b' + 2.

IN FACT... every sequence MUST END with a 'c', and IF the # of symbols within the sequence is > 2, THEN immediately before this final 'c' ONLY an 'a' OR a 'c' can occur [ NEVER a 'b'! ].... If you look more carefully, there are many more restrictions on the possible permutations / ordered arrangements of the 'a's, 'b's and 'c's within a sequence; e.g., a sequence of more than 2 symbols can't begin with 'c' 'c' (it would already have ended), and its last 2 symbols can't be 'b' 'c'.... and so forth.... IT IS THESE THAT MAY MAKE YOUR SOUGHT-FOR 1.420 bits/token possibly achievable (?)
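
Those structural claims can be checked exhaustively for short lengths; a
small verification sketch (assuming the #c = #b + 2 termination rule as
stated):

    # A 'complete' sequence hits #c == #b + 2 exactly at its last symbol
    # and never before.  Enumerate all complete sequences up to length 7
    # and check two of the claims: the last symbol is always 'c', and (for
    # length > 2) the symbol before it is never 'b'.
    from itertools import product

    def is_complete(seq):
        nb = nc = 0
        for i, s in enumerate(seq):
            nb += s == 'b'
            nc += s == 'c'
            if nc == nb + 2:
                return i == len(seq) - 1
        return False

    complete = [s for n in range(1, 8)
                for s in map(''.join, product('abc', repeat=n))
                if is_complete(s)]
    assert all(s[-1] == 'c' for s in complete)
    assert all(s[-2] != 'b' for s in complete if len(s) > 2)
    print(len(complete), complete[:5])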

lawco...@gmail.com

unread,
Apr 10, 2012, 10:29:46 AM4/10/12
to
Originally Posted by JamesB
This makes no sense!

If the data is truly randomly generated with p(A)=.25, p(B)=.25 and p(C)=.5 then on average the entropy will be 1.5 bits per symbol. It's been proven that you cannot go lower than this. (And a simple Huffman tree of C=0, A=10, B=11 will work fine.) So you're just wasting time.

If the data is NOT randomly generated, then it needs to be explained better. Are you saying that no randomly generated string of symbols can ever have more than 2 more c than b symbols (yet we still have p(C) = 2*p(B))? However you just slice the string of symbols at that point and keep going? If so it is completely identical to the initial random case, but split into segments. That cannot possibly improve your compression as on average it's the exact same data.

The only practical way I see is to find a weakness or prediction of the random number generator, but then it's not truly random - only pseudo-random - and it's all a fake and irrelevant.

[ ABOVE is from http://encode.ru/threads/1520-US-3-Million-Data-compression-Prize?p=29054#post29054 ]

PROFOUND ! Thanks

I would refer you to Entropy Definition : Request for Comments

https://groups.google.com/forum/#!to...on/w7QSfqtfeg4


These sequences have already been encoded; the compression saves a minimum net 0.063 bits per sequence!!!

The 1st to reach a net compression saving of 0.08 bits per sequence WINS US$3M+

LawCounsels

biject

unread,
Apr 10, 2012, 10:26:40 AM4/10/12
to
> > My Compression codehttp://bijective.dogma.net/
> > **TO EMAIL ME drop the roman "five" **
> > Disclaimer:I am in no way responsible for any of the statements
> >  made in the above text. For all I know I might be drugged.
> > As a famous person once said "any cryptograhic
> > system is only as strong as its weakest link"
>
> THE COMPLETE SPECIFICATIONS :
> =============================
>
> 1. generates a number eg 1,000 of such sequences ( each sequence composed of ternary symbols 'a' 'b' 'c' , when # of 'c' = # of 'b' + 2 Then sequence ENDS  & next sequences begins ) using a source producing symbol 'a' 25% of times symbol 'b' 25% of times symbol 'c' 50% of times ) .... call the total # of symbols in these 1,000 sequences N . NOTE : among these eg 1,000 sequences the # of 'a' is invariable near = the # of 'b'  & the # of 'c' is invariable near = 2 * the # of 'b'  THUS the probability model here is 25% : 25% : 50%
>
> 2. compresses these eg 1,000 generated sequences using your .exe , & must decode back to the same 1,000 sequences
>
> 3. IF you compressed file bitslength  =<  1.5 * N   - ( 0.08 * N )  THEN YOU WIN THE REWARDS !   ie if your .exe saves 'on average' 0.08 bit each sequences you WON ( needs not be invariable every time on every conceivable file ! ) , but note the original # of sequences is here taken to be of bitslength N * 1.5 bits long   ( as originally 'explicit' stated to be 1.5 * N bits long , NOT 1.5849625 * N bits long )
>
> 4. there is no restrictions on memory storage requirements , you may even show your .exe works on 'research network supercomputer cluster' , BUT processing must complete within a day

Well, that partially clears it up. It looks like you want to compress,
say, 1000 of these sequences to a single file of bits or bytes; I'm not
sure which you're looking for. Is it OK to output to a file of ASCII 1s
and 0s, so that the total number of bytes represents the actual number
of bits, or what? You have not cleared that up.

Second, you keep talking about a generator that would have a probability
model of 25% : 25% : 50%. You seem to ignore the fact that, even for a
thousand sequences, a model with say 30% : 30% : 40% could generate
exactly the same 1000 sequences. What I am saying here is that James
could write code that wins based on his random set of 1000 sequences,
and you could keep generating so-called random sets of sequences until,
in time, you come up with a set that causes his code to fail. To be fair
you should add another constraint, for example: say that in each set of
1000 sequences the 25% 25% 50% proportions should be bound to some
range, such as (25 +/- .01)% (25 +/- .01)% (50 +/- .01)%, or some
definable fixed ranges for A, B and C. Just saying it is generated by
some probability model is not enough to pin it down. Or do you really
want to pin it down so that a person could determine on their own
whether they won or not?

Please clear this up if you want anyone to look at it seriously.

LawCounsels

unread,
Apr 10, 2012, 11:10:55 AM4/10/12
to
On Apr 10, 3:26 pm, biject <biject.b...@gmail.com> wrote:
>   Well that partial clears it up.  It looks like you want to compress
> say 1000 of these sequences to a single file of bits or bytes not sure
> which your looking for is it ok to output to a files of ascii 1 and 0 then
> the total number of bytes would represent the actual number of bits or what. You
> have not cleared that up.

YES, THIS IS ACCEPTABLE AND ALLOWED
>
> To be fair you should add another constraint
> example.
> say in each 1000 so called sequences the 25% 25% 50% should be bound
> to some range such as (25 +/- .01)% (25 +/- .01)% (50 +/- .01)%  or somes
> definable fixed ranges of A B C. Just to say its generated by some probability
> model is not enough to pin it down.  Or do you really want to pin it down so
> a person could determine on there on if they won or not.
>
>  Please clear this up if you want anyone to look at it seriously.

YES, AGREED..... the .exe must succeed on 4 out of 5 such sets of 1,000
sequences as you describe above, 'randomly' generated
by someone in the forum (Mark / Thomas / James / yourself).... in
addition, it also needs to satisfy the published RULES

>  David A. Scott

James Dow Allen

unread,
Apr 10, 2012, 12:03:39 PM4/10/12
to
Does it seem strange that I knew I was going to dislike
New Google Groups? (though I didn't know *what* I'd dislike
about it.) Almost every change made in some systems seems
for the worse not better.

In the example below, I click on lawcounsel's link only
to get some "overview" page with no content.

(Old Google Groups is amusing too, of course. I'm now
looking at a window with no less than five scroll-bars,
3 of which I had to manipulate just to get this far!)

On Apr 10, 9:29 pm, lawcouns...@gmail.com wrote:
> I would refer you to Entropy Definition : Request for Comments
> https://groups.google.com/forum/#!to...on/w7QSfqtfeg4
> These sequences have already been encoded compression saves minimum Net 0.063 bits per sequence !!!
> 1st to reach 0.08 bits Net compression savings per sequence WINS US$3M+

Three comments:
1. Your earlier post had
<= 1.5 * N - ( 0.08 * N )
Are we now to understand that the first N here is number of trits,
and the second number of sequences?
2. I didn't read about the "saves minimum Net 0.063 bits per
sequence."
(As I said the link doesn't work.) By "minimum" do you mean
"actual in one experiment"? I'm guessing fluke.
3. It may seem paradoxical that no compression savings are available
despite the c=b+2 constraint. But consider a simpler related problem:
Flip a coin and stop on the first Heads. Compress the result.
H occurs with prob. 1/2
TH occurs with prob 1/4
TTH with prob 1/8, etc.
No way to compress despite the constraint.
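
A quick numerical check of this analogy (a sketch): the expected
description length of the coin-flip outcomes already equals their entropy,
so the stopping rule leaves nothing to compress.

    # Outcomes H, TH, TTH, ... have probability 2^-k for length k.
    # Writing each outcome with its own k bits costs, on average, exactly
    # the entropy of the outcome distribution, so no code can do better.
    from math import log2

    probs = [2.0 ** -k for k in range(1, 60)]     # P(length = k)
    expected_len = sum(k * p for k, p in enumerate(probs, start=1))
    entropy = -sum(p * log2(p) for p in probs)
    print(expected_len, entropy)                  # both ~2.0 bits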

James

LawCounsels

unread,
Apr 10, 2012, 12:56:11 PM4/10/12
to
On Apr 10, 5:03 pm, James Dow Allen <jdallen2...@yahoo.com> wrote:
> Does it seem strange that I knew I was going to dislike
> New Google Groups? (though I didn't know *what* I'd dislike
> about it.)  Almost every change made in some systems seems
> for the worse not better.
>
> In the example below, I click on lawcounsel's link only
> to get some "overview" page with no content.
>
> (Old Google Groups is amusing too, of course.  I'm now
> looking at a window with no less than five scroll-bars,
> 3 of which I had to manipulate just to get this far!)
>
> On Apr 10, 9:29 pm, lawcouns...@gmail.com wrote:
>
> > I would refer you to Entropy Definition : Request for Comments
> >https://groups.google.com/forum/#!to...on/w7QSfqtfeg4
> > These sequences have already been encoded compression saves minimum Net 0.063 bits per sequence !!!
> > 1st to reach 0.08 bits Net compression savings per sequence WINS US$3M+
>
> Three comments:
> 1.  Your earlier post had
>    <= 1.5 * N   - ( 0.08 * N )
> Are we now to understand that the first N here is number of trits,
> and the second number of sequences?

YES... since corrected to read <= 1.5 * N - ( 0.08 * # OF
SEQUENCES )


> 2.  I didn't read about the "saves minimum Net 0.063 bits per
> sequence."
> (As I said the link doesn't work.)  By "minimum" do you mean
> "actual in one experiment"?  I'm guessing fluke.

FOR "ANY" ONE SUCH SEQUENCE [ mathematics 'guaranteed' minimum 0.063
bits savings ]

.... ALSO VERIFIED OVER LARGE # OF SUCH SEQUENCES

> 3.  It may seem paradoxical that no compression savings are available
> despite the c=b+2 constraint.  But consider a simpler related problem:
>    Flip a coin and stop on the first Heads.  Compress the result.
> H occurs with prob. 1/2
> TH occurs with prob 1/4
> TTH with prob 1/8, etc.
> No way to compress despite the constraint.

perhaps the particular constraint here is superficial... reducible to a
'fundamental' equivalent of the common 'fair coin' toss type here

> James

James Dow Allen

unread,
Apr 10, 2012, 1:34:11 PM4/10/12
to
On Apr 10, 11:56 pm, LawCounsels <lawcouns...@gmail.com> wrote:
> FOR "ANY" ONE SUCH SEQUENCE [ mathematics 'guaranteed' minimum 0.063
> bits savings ]

Congratulations! It sounds like you've disproved
Kraft's Inequality. Have you published or is
it secret? And why the REWARD ($3 million or $1000?)
to extend the 0.063 win to 0.080?
Once you have a perpetual compressor, can't you just
run several copies in series to get as much compression
as you want?

James


lawco...@gmail.com

unread,
Apr 10, 2012, 1:54:16 PM4/10/12
to
Because the 'smaller' compressed output of 1,000 such sequences is NOT in
the same sequence format any more.....

TO BE REPEATABLE it 1st needs the bit saving improved to an average of 0.08 bits per sequence
.... possible improvement schemes are plentiful, promising!

biject

unread,
Apr 10, 2012, 2:02:01 PM4/10/12
to
Look, it's hard to measure the savings for each file when it's all
compressed to one file. Since you allow bits to be thought of as ASCII
ones and zeros, would it be allowed to compress each sequence to a
separate file? That way it's easier to count excess or saved bits. In
other words, could one compress aaacaaabcac to say
"01100", which is 5 bytes of ASCII that you are allowing to count as 5
bits for that file?

Thomas Richter

unread,
Apr 10, 2012, 8:47:00 PM4/10/12
to
On 10.04.2012 11:29, LawCo...@aol.com wrote:

> THE COMPLETE SPECIFICATIONS :
> =============================
>
> 1. generates a number eg 1,000 of such sequences ( each sequence composed of ternary symbols 'a' 'b' 'c' , when # of 'c' = # of 'b' + 2 Then sequence ENDS& next sequences begins ) using a source producing symbol 'a' 25% of times symbol 'b' 25% of times symbol 'c' 50% of times ) .... call the total # of symbols in these 1,000 sequences N . NOTE : among these eg 1,000 sequences the # of 'a' is invariable near = the # of 'b'& the # of 'c' is invariable near = 2 * the # of 'b' THUS the probability model here is 25% : 25% : 50%

Wait, you now define the probabilities of the symbols. What else do we
know? Memoryless iid for example?

>
> 2. compresses these eg 1,000 generated sequences using your .exe ,& must decode back to the same 1,000 sequences

Once you compress multiple sequences, the only thing that you win by the
"ending condition" on the b/c count is that you do not need to include an
EOF symbol or side information that tells you where each sequence
terminates. You can simply concatenate the sequences and the decoder
still knows where to truncate the parts. That also means that if you look
at this scheme as a stream compressor, and you know nothing else about
the probabilities of the symbols, you simply cannot compress. Otherwise,
the compression gain of the ending condition only allows you to
represent the sequence size a tiny bit more effectively than without
that condition, i.e. the overhead of having to keep the file size in the
filing system could be considered smaller. However, this is a
diminishing return, i.e. the overhead *per symbol* obviously becomes
zero for longer sequences, or more sequences.

Thus, there's nothing to gain as far as I see.

> 3. IF you compressed file bitslength =< 1.5 * N - ( 0.08 * N ) THEN YOU WIN THE REWARDS ! ie if your .exe saves 'on average' 0.08 bit each sequences you WON ( needs not be invariable every time on every conceivable file ! ) , but note the original # of sequences is here taken to be of bitslength N * 1.5 bits long ( as originally 'explicit' stated to be 1.5 * N bits long , NOT 1.5849625 * N bits long )

Wait a second, how do you count "bitslength"? The problem here is that,
by allowing *finite* sequences, you can always push some side
information into the filing system. Is the output required to be *a
single file*?

LawCounsels

unread,
Apr 11, 2012, 5:43:13 AM4/11/12
to
The solution immediately becomes 'trivial', already solved, should multiple
files be allowed.... I think this was Patrick Craig's
attempted solution to AMillionRandomDigits





LawCounsels

unread,
Apr 11, 2012, 5:56:29 AM4/11/12
to
On Apr 11, 1:47 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
> On 10.04.2012 11:29, LawCouns...@aol.com wrote:
>
> > THE COMPLETE SPECIFICATIONS :
> > =============================
>
> > 1. generates a number eg 1,000 of such sequences ( each sequence composed of ternary symbols 'a' 'b' 'c' , when # of 'c' = # of 'b' + 2 Then sequence ENDS&  next sequences begins ) using a source producing symbol 'a' 25% of times symbol 'b' 25% of times symbol 'c' 50% of times ) .... call the total # of symbols in these 1,000 sequences N . NOTE : among these eg 1,000 sequences the # of 'a' is invariable near = the # of 'b'&  the # of 'c' is invariable near = 2 * the # of 'b'  THUS the probability model here is 25% : 25% : 50%
>
> Wait, you now define the probabilities of the symbols. What else do we
> know? Memoryless iid for example?


Yes, memoryless iid


> > 2. compresses these eg 1,000 generated sequences using your .exe ,&  must decode back to the same 1,000 sequences
>
> Once you compress multiple sequences, the only thing that you win by the
> "ending condition" on the b/c count is that you do not need to include a
> EOF symbol or a side information that tells you where the sequence
> terminates. You can simply concatenate the sequences and the decoder
> also knows where to truncate the parts. That also means that if you look
> as this scheme as a steam compressor, and you know nothing else about
> the probabilities of the symbols, you simply cannot compress. Otherwise,
> the compression gain of the ending condition allows you only to
> represent the sequence size a tiny bit more effectively than without
> that condition, i.e. the overhead of having to keep the file size in the
> filing system could be considered smaller. However, this is a
> diminishing return, i.e. the overhead *per symbol* becomes obviously
> zero for longer sequences, or more sequences.
>
> Thus, there's nothing to gain as far as I see.


Yes, not from just knowing that 1,000 sequences were compressed


> > 3. IF you compressed file bitslength  =<   1.5 * N   - ( 0.08 * # OF SEQUENCES )  THEN YOU WIN THE REWARDS !   ie if your .exe saves 'on average' 0.08 bit each sequences you WON ( needs not be invariable every time on every conceivable file ! ) , but note the original # of sequences is here taken to be of bitslength N * 1.5 bits long   ( as originally 'explicit' stated to be 1.5 * N bits long , NOT 1.5849625 * N bits long )
>
> Wait a second, how do you count "bitslength"? The problem here is that,
> by allowing *finite* sequences, you can always push some side
> information into the filing system. Is the output required to be *a
> single file*?


I don't think this side-information 'gain' ever amounts to enough to need
concern here

Yes, it must be in a single file (multiple files are NOT allowed, for
obvious reasons)

Should the 1,000 generated input sequences contain N symbols, THEN the
input bit length is taken to be = N * 1.5 bits long, against which your
compression bit savings are measured

Thomas Richter

unread,
Apr 11, 2012, 10:26:57 AM4/11/12
to
On 11.04.2012 11:56, LawCounsels wrote:

>>> 1. generates a number eg 1,000 of such sequences ( each sequence composed of ternary symbols 'a' 'b' 'c' , when # of 'c' = # of 'b' + 2 Then sequence ENDS& next sequences begins ) using a source producing symbol 'a' 25% of times symbol 'b' 25% of times symbol 'c' 50% of times ) .... call the total # of symbols in these 1,000 sequences N . NOTE : among these eg 1,000 sequences the # of 'a' is invariable near = the # of 'b'& the # of 'c' is invariable near = 2 * the # of 'b' THUS the probability model here is 25% : 25% : 50%
>>
>> Wait, you now define the probabilities of the symbols. What else do we
>> know? Memoryless iid for example?
>
>
> Yes, memoryless iid

Well, but then what's the point? If the probabilities are as given, and
the sequences are iid, and you want to compress *many* of these
sequences (as in: the number goes to infinity), and everything is considered
to be one file, then there is nothing to gain. The entropy is, as one
computes, 1.5 bits/sample, and this is the lower limit. The only thing
the "truncation condition" provides you is that you are able to extract
the sub-sequences from the common output, i.e. it provides you an EOF
condition. That you cannot compress below 1.5 bits/sample is the Shannon
result of lossless source coding.

Greetings,
Thomas

James Dow Allen

unread,
Apr 11, 2012, 10:36:05 AM4/11/12
to
On Apr 11, 9:26 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
> That you cannot compress below 1.5bits/sample is the Shannon
> result of lossless channel coding.

Yes, he already knows this much. What you seem to overlook is
that he's refuted, both experimentally and theoretically,
Shannon's theory, the pigeonhole principle, and even
Kraft's Inequality. Naturally he's keeping details of his
method secret; wouldn't you?

I repeat my offer to OP. If you supply an .exe which achieves
the guaranteed reduction you describe, I will supply an .exe
which invokes that .exe and wins the $1,000,000 prize for
MillionDigits.
Can we split the $1,000,000 fifty-fifty?

> Greetings,
>         Thomas

Hello yourself,
James

LawCo...@aol.com

unread,
Apr 11, 2012, 1:29:15 PM4/11/12
to
YES... 50/50 will do nicely

.... give this a few more days, in case anyone gets to a 0.08-bit saving per sequence
1st

I will be in touch by private email re signing

ALSO, let me know if any companies / compression research institutes / prominent scientist acquaintances of yours would be happy to collaborate and co-develop this for the markets

Cheers,
LawCounsels

James Dow Allen

unread,
Apr 11, 2012, 4:16:17 PM4/11/12
to
On Apr 11, 12:54 am, lawcouns...@gmail.com wrote:

> because the 'smaller' compressed 1,000 such sequences is NOT in
> same such sequences format again any more .....

It is straightforward to convert a uniform random string of
bits into a string of your (a,b,c) format with the expected
entropy increased only by a small constant.
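
One way to make that conversion concrete (a sketch of the direction
described; the particular code assignments are arbitrary choices, not
something from the thread):

    # Read a uniform random bit string through the code c->0, a->10, b->11;
    # under uniform bits this draws c with probability 1/2 and a, b with
    # probability 1/4 each, and the #c == #b + 2 rule decides where each
    # (a,b,c) sequence ends.  The unfinished tail is returned separately
    # here; a real conversion would spend a few padding bits to finish it,
    # which is the "small constant" overhead mentioned above.
    def bits_to_sequences(bits):
        seqs, cur, nb, nc, i = [], [], 0, 0, 0
        while i + 1 < len(bits) or (i < len(bits) and bits[i] == '0'):
            if bits[i] == '0':
                cur.append('c'); nc += 1; i += 1
            elif bits[i + 1] == '0':
                cur.append('a'); i += 2
            else:
                cur.append('b'); nb += 1; i += 2
            if nc == nb + 2:
                seqs.append(''.join(cur))
                cur, nb, nc = [], 0, 0
        return seqs, ''.join(cur)   # complete sequences + unfinished tail

    print(bits_to_sequences('0011011000101'))   # (['cc', 'bcbccc'], 'a')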

James

Thomas Richter

unread,
Apr 11, 2012, 8:27:12 PM4/11/12
to
On 11.04.2012 16:36, James Dow Allen wrote:
> On Apr 11, 9:26 pm, Thomas Richter<t...@math.tu-berlin.de> wrote:
>> That you cannot compress below 1.5bits/sample is the Shannon
>> result of lossless channel coding.
>
> Yes, he already knows this much. What you seem to overlook is
> that he's refuted, both experimentally and theoretically,
> Shannon's theory, the pigeonhole principle, and even the
> Kraft's Inequality. Naturally he's keeping details of his
> method secret; wouldn't you?

I wouldn't claim nonsense in the first place, actually. (-: Initially, when
I saw the problem my reaction was that there is potentially a chance for
a very small improvement, because the number of possible combinations for
a string of a given size is a tiny bit smaller than the number of
combinations of all files. There are, for example, fewer than 3^3 = 27
possible three-letter strings with the given constraint, so you can
compress them better because not all combinations are possible. The
string "CCC" is, for example, not possible.

However, the way the problem is stated right now, namely to compress a
large number of strings of the same type simultaneously into one common
file, kills this advantage completely, because then all you know is just
where to separate the strings again, and the advantage goes to zero as
the total string size goes to infinity.

So yes, there remains a possibility for an incredibly small improvement
that tends to zero as N->infinity. How to take any advantage of it I do
not see right now, i.e. without requiring infinite precision in an encoder.

Greetings,
Thomas


LawCounsels

unread,
Apr 12, 2012, 6:08:59 AM4/12/12
to
On Apr 12, 1:27 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
> On 11.04.2012 16:36, James Dow Allen wrote:
>
> > On Apr 11, 9:26 pm, Thomas Richter<t...@math.tu-berlin.de>  wrote:
> >> That you cannot compress below 1.5bits/sample is the Shannon
> >> result of lossless channel coding.
>
> > Yes, he already knows this much.  What you seem to overlook is
> > that he's refuted, both experimentally and theoretically,
> > Shannon's theory, the pigeonhole principle, and even the
> > Kraft's Inequality.  Naturally he's keeping details of his
> > method secret; wouldn't you?
>
> I wouldn't claim nonsense in first place, actually. (-: Initially, when
> I saw the problem my reaction was that there is potentially a chance for
> a very small improvement because the number of possible combinations for
> a string of given size is a tiny bit smaller than the number of
> combinations of all files. There are for example less than 3^3 = 27
> possible three letter strings with the given constraint, so you can
> compress them better because not all combinations are possible. The
> string "CCC" is, for example, not possible.

YOU'RE ON THE RIGHT TRACK HERE!

OBSERVANT

> However, the way the problem is stated right now, namely compress a
> large number of strings of the same type simultaneously into one common
> file kills this advantage completely because then all you know is just
> where to separate strings again, and the advantage is going to zero for
> the total string size going to infinity.

YES THE PROBLEM NEEDED TO & WAS 'SIMPLIFIED' ... now that some
understandings in place alreasdy I can now reveal the 'final' complete
details :

. each sequence's distance to the start position of the next sequence
( ie from start of
present sequence TO start of the next sequence ) IS ALWAYS FIXED [ ie
ascertainable
# of bits ! ] .... FOR SIMPLICITY AT THIS STAGE can just assume this
to be constant fixed
513 bits throughout !!!

HINT: HOWEVER, BUT STILL, YES, the present exactly-1,000-sequence set CAN
INDEED BE COMPRESSED SMALLER

(notwithstanding that this is said to be 'proven' to be 1.5 bits/symbol
entropy!!! BUT, as James said, I am not at liberty to publicise this
complete 'new' method at this time... INDEED it has been experimentally
tested and mathematically proved)

> So yes, there remains a possibility for an incredibly small improvement
> that tends to zero as N->infinity. How to take any advantage of it I do
> not see right now, i.e. without requiring infinite precision in an encoder.
>
> Greetings,
>         Thomas

LawCounsels

LawCounsels

unread,
Apr 12, 2012, 6:10:28 AM4/12/12
to
I have posted the reasons WHY this 1st needs to wait for the 0.08-bit
saving per sequence to be attained

LawCounsels

LawCounsels

unread,
Apr 12, 2012, 6:34:54 AM4/12/12
to
[ should the present sequence be just the 2 symbols "C C" (4 bits
long), THEN it will be followed
by 513 - 4 = 509 bits of filler 'don't-care' bits ]

LawCounsels

unread,
Apr 12, 2012, 6:57:27 AM4/12/12
to
>HINT : HOWEVER BUT STILL YES , the present exact 1,000 sequence CAN
>INDEED BE COMPRESSED SMALLER

>( not withstanding this said to be 'proven' to be 1,5 bits / symbol
>entropy !!! BUT as James said I could not be at liberty to publicise this
>complete 'new' method at this time ... INDEED they
>were experimentally tested proven & mathematics proved )

Would the companies / compression research institutes you are acquainted
with be interested in acquiring this IP? A .exe is ready to show.

I suppose a lot of the usual paperwork needs to be exchanged and signed 1st
before this .exe is sent over,
and terms put in place between us

LawCounsels

biject

unread,
Apr 12, 2012, 11:38:05 AM4/12/12
to
Actually, the distance from the start of each sequence to the next
sequence is not fixed. You can bijectively map any sequence of the
symbols A B C to a file that meets his ending criterion.

Since you're putting strings together, take a sequence of 2000 symbols
from the IID source he mentioned. It will be 1.5 bits per symbol.
Now, to map that bijectively to a string that ends in his type of ending
is child's play. He seemed to be aware of the Craig trap
when I asked to map to several strings. But the fact is you always
save some bits per mapping, since the last symbol is FREE; often more than
one symbol is free, so you will always save at least 1 bit, since the last
C is never needed.

example if N = 1
use A for ACC
use B for BCCC
use C for CC

use CC for CCCC when N=2

Now, it is known that you always save some bits for any
such sequence, regardless of the number of sequences allowed in his
large compressed file. In fact you always save bits if the
number of sequences is unconstrained.

Now, if you really want to play the game and fix N to say 1000
such sequences, you play the combination game, which is obviously
the game he wishes to play. You find the exact number of combinations
needed and then assign the string 0 to the first combination, 1 to the
second, 00 to the third, and so on. Big deal. I don't wish to code this
for him, for at least 2 reasons: he most likely has already done it (it's
not that hard), and secondly I doubt I would ever see a dime from him.
Why would anyone give money for this? It makes no sense to me.

LawCounsels

unread,
Apr 13, 2012, 5:34:46 AM4/13/12
to
>  My Crypto codehttp://bijective.dogma.net/crypto/scott19u.ziphttp://www.jim.com/jamesd/Kong/scott19u.zipold version
> My Compression codehttp://bijective.dogma.net/
> **TO EMAIL ME drop the roman "five" **
> Disclaimer:I am in no way responsible for any of the statements
>  made in the above text. For all I know I might be drugged.
> As a famous person once said "any cryptograhic
> system is only as strong as its weakest link"

Without a .exe there is no way to know whether your solution attains a
0.08-bit saving per sequence

Should it attain the above, your solution itself already guarantees and
underwrites the prize

Thomas Richter

unread,
Apr 13, 2012, 7:48:38 AM4/13/12
to

> YES THE PROBLEM NEEDED TO& WAS 'SIMPLIFIED' ... now that some
> understandings in place alreasdy I can now reveal the 'final' complete
> details :

So, in other words, you keep changing the problem?

> . each sequence's distance to the start position of the next sequence
> ( ie from start of present sequence TO start of the next sequence ) IS
> ALWAYS FIXED
> [ ie ascertainable # of bits ! ] .... FOR SIMPLICITY AT THIS STAGE can
> just assume this
> to be constant fixed 513 bits throughout !!!

Still, makes no sense. Do you mean 513 *symbols*? Or do you mean that a
sequence has to terminate after 513 output bits have been produced? Or
"N" bits? Or "N" symbols?

Look, your problem description is lousy, at least.

> [ should the present sequence be just 2 symbols " C C " ( 4 bits
> long ) THEN will be followed
> by 513 -4 = 509 bits of filler 'dont-care' bits ]

Huh? I don't need "filler bits" if I know that the sequence terminates.
There are surely better encodings for a finitely sized sequence provided
I know that it must terminate after N symbols - or bits?

Sorry, no go.

James Dow Allen

unread,
Apr 13, 2012, 8:15:40 AM4/13/12
to
On Apr 13, 6:48 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
> So, in other words, you keep changing the problem?

And this was AFTER he posted his FINAL & COMPLETE SPECIFICATIONS.
In capital letters, no less.

I think OP needs to hire someone at journeyman's wages to
figure out what his problem is, but I don't think OP can
afford 100's of dollars. Unless someone can make change
for a million-dollar bill.

BTW, has OP acknowledged yet that he started this thread
on the day called Poisson d'Avril?

James

lawco...@gmail.com

unread,
Apr 13, 2012, 9:52:35 AM4/13/12
to
On Friday, April 13, 2012 12:48:38 PM UTC+1, Thomas Richter wrote:
> > YES THE PROBLEM NEEDED TO& WAS 'SIMPLIFIED' ... now that some
> > understandings in place alreasdy I can now reveal the 'final' complete
> > details :
>
> So, in other words, you keep changing the problem?

NOT REALLY KEPT CHANGING... it is necessarily best to start with the main, basic problem description 1st, THEN, once the audience has shown overall understanding, to fill in the few fine details left out of the 'simplification' (to help get the correct bearings 1st)... IN FACT, IT WAS 'EXPLICITLY' WRITTEN "...SIMPLIFIED...", which must mean something.

THE MAIN PROBLEM, AS "SIMPLIFIED" AND DESCRIBED, was a real problem in its own right, with most people already thinking it practically intractable: not possible to represent with less than 1.5 bits/symbol in practice [ no practical existing compressor can do this ]

> > . each sequence's distance to the start position of the next sequence
> > ( ie from start of present sequence TO start of the next sequence ) IS
> > ALWAYS FIXED
> > [ ie ascertainable # of bits ! ] .... FOR SIMPLICITY AT THIS STAGE can
> > just assume this
> > to be constant fixed 513 bits throughout !!!
>
> Still, makes no sense. Do you mean 513 *symbols*? Or do you mean that a
> sequence has to terminate after 513 output bits have been produced? Or
> "N" bits? Or "N" symbols?

has to terminate after 513 output BITS

> Look, your problem description is lousy, at least.

It was already 'explicitly' stated from the very beginning that the problem description was 'simplified', to help get a basic understanding 1st... the subsequent fine details (once a basic grasp of the problem is gained) do not alter things tremendously.

... it would overload everyone to start with the full monty

> > [ should the present sequence be just 2 symbols " C C " ( 4 bits
> > long ) THEN will be followed
> > by 513 -4 = 509 bits of filler 'dont-care' bits ]
>
> Huh? I don't need "filler bits" if I know that the sequence terminates.
> There are surely better encodings for a finitely sized sequence provided
> I know that it must terminate after N symbols - or bits?
>
> Sorry, no go.


Yes... you don't need them... these so-called "filler bits" are really just another part of the compressed data which need not concern us

lawco...@gmail.com

unread,
Apr 13, 2012, 9:57:17 AM4/13/12
to
On Friday, April 13, 2012 1:15:40 PM UTC+1, James Dow Allen wrote:
> On Apr 13, 6:48 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
> > So, in other words, you keep changing the problem?
>
> And this was AFTER he posted his FINAL & COMPLETE SPECIFICATIONS.
> In capital letters, no less.

it was also then 'explicitly' stated to be "SIMPLIFIED"

> I think OP needs to hire someone at journeyman's wages to
> figure out what is problem is, but I don't think OP can
> afford 100's of dollars. Unless someone can make change
> for a Million-dollar bill.

Well, this takes 2 stages to reach, but I think it did the job more effectively, easing the audience in 1st with the main "simplified" problem description.

> BTW, has OP acknowledged yet that he started this thread
> on the day called Poisson d'Avril?

NOT if someone comes up with a fine-tuned, improved bit saving per sequence
of 0.08, up from the 0.063 attained at present


lawco...@gmail.com

unread,
Apr 13, 2012, 10:03:32 AM4/13/12
to
On Friday, April 13, 2012 2:57:17 PM UTC+1, lawco...@gmail.com wrote:
> On Friday, April 13, 2012 1:15:40 PM UTC+1, James Dow Allen wrote:
> > On Apr 13, 6:48 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
> > > So, in other words, you keep changing the problem?
> >
> > And this was AFTER he posted his FINAL & COMPLETE SPECIFICATIONS.
> > In capital letters, no less.
>
> was also then 'explicit' stated to be 'SIMPLIFIED"

IF SOMEONE COMES UP WITH A 0.08-BIT SAVING PER SEQUENCE ON THIS POSTED
"SIMPLIFIED" FINAL & COMPLETE SPECIFICATION ALL THE SAME (without any of the subsequently filled-in fine details), THEY WIN THE REWARD..... in this sense it was Final & Complete enough to base winning the rewards on

Thomas Richter

unread,
Apr 13, 2012, 6:54:04 PM4/13/12
to
On 13.04.2012 15:52, lawco...@gmail.com wrote:
> On Friday, April 13, 2012 12:48:38 PM UTC+1, Thomas Richter wrote:
>>> YES THE PROBLEM NEEDED TO& WAS 'SIMPLIFIED' ... now that some
>>> understandings in place alreasdy I can now reveal the 'final' complete
>>> details :
>>
>> So, in other words, you keep changing the problem?
>
> NOT REALLY KEEP CHANGING ... necessarily best starts with the main basic problem description 1st THEN once audience showed overall understandings then to fill in the few fine details left out from 'simplification' ( to help get correct bearings 1st ) ... IN FACT , IT WAS 'EXPLICIT' WRITTEN "...SIMPLIFIED..." , this must means something

It did change the problem completely. It seems "a minor detail" for you,
but unless the problem specification is complete, there is no solution.
Just to give you the difference, compressing an infinite number of
sequences to a common stream is a *different* problem than compressing a
set of finitely sized streams. Actually, for the reasons indicated above.

BTW, "upper case" doesn't make you sound more credible.

>> Huh? I don't need "filler bits" if I know that the sequence terminates.
>> There are surely better encodings for a finitely sized sequence provided
>> I know that it must terminate after N symbols - or bits?
>>
>> Sorry, no go.
>
>
> yes ... you don't need them ... these so-called "filler bits" are really just another part of the compressed data which need not concern us

No, again - a problem change! Yes, it does make a difference whether the
stream terminates at this specific point or whether "other data" follows
and you need to include truncation or termination information of some sort.

Just to give you an idea why this is important: Consider an arithmetic
coder to generate the output. By convention, we can truncate the file if
we are sure that the remaining sequence generated by the AC coder is an
infinite repetition of zeros (or ones, for that matter). A decoder
could, if running into an EOF, pull "by convention" just zeros or ones.
This would allow a clever encoder to truncate the output stream by all
terminating zero bits (or one-bits, depending on the convention) and
hence make it shorter.
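
[ A toy sketch of that convention in Python, assuming the coded output is given as a '0'/'1' string and the decoder knows how many bits it must consume; this shows only the truncation idea, not a real MQ/arithmetic coder, and the function names are illustrative: ]

def truncate_trailing_zeros(bits):
    """Encoder side: drop the run of trailing '0' bits from the coded stream."""
    return bits.rstrip('0')

def read_bit(stored, i):
    """Decoder side: past the physical end of the stream, pull '0' bits 'by convention'."""
    return stored[i] if i < len(stored) else '0'

full = '1011000101' + '00000'                 # the last five bits are droppable under the convention
stored = truncate_trailing_zeros(full)        # '1011000101'
recovered = ''.join(read_bit(stored, i) for i in range(len(full)))
assert recovered == full                      # the decoder reconstructs the full stream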

If this sounds rather academic to you, then I should probably say that
exactly this type of trick is played in some existing compression
applications - of course using the "inherent" side information that the
compressed stream has a finite size, and that this size can be obtained
from other sources as side information. This trick allows you to
compress a little bit better. Before you ask, it is prior art and a
known trick, so no money to collect for this.

So, no, your problem IS NOT SPECIFIED FULLY. (Does upper case help to
understand this better? I doubt it, but let's try).

What would be necessary to know is, for example, whether these "other bits"
need to be decoded as well, probably by using the same arithmetic
decoder. If so, a simple "bit counter" to define the size of the decoder
input (rather than the decoder output) does not make too much sense in
the first place. Or whether the input stream is "self-truncating", i.e.
whether there is any other side information I could use to understand that
decoding has reached an end.

To give you a glimpse of an idea what AC coders do: There are typically
two lengths one could understand as the "size" of an AC coded stream:

a) the minimal number of input bits required to decode the signal up to
a point A and then stop ("truncation") - and -

b) the minimal number of input bits required to decode the signal up to
a point, and then to correctly continue decoding any bits beyond.

Tricks like the one mentioned in the paragraph above should make clear that in
general the "length a)" is smaller than the "length b)", in the MQ coder
typically by two to three *bytes*. This is all due to the internal
buffering and delay.

Unless you understand such technicalities, or at least are able to
define the problem really completely, there is absolutely nothing that
can be said.

Greetings,
Thomas




LawCounsels

unread,
Apr 16, 2012, 5:34:29 AM4/16/12
to
On Apr 10, 10:29 am, LawCouns...@aol.com wrote:
> On Monday, 9 April 2012 17:03:04 UTC+1, biject  wrote:
> > On Apr 9, 5:19 am, James Dow Allen <jdallen2...@yahoo.com> wrote:
> > > On Apr 9, 12:58 pm, lawcouns...@gmail.com wrote:
>
> > > > a source with probability of producing an 'a' symbol 25% of the time,
> > > > a 'b' symbol 25% of the time, a 'c' symbol 50% of the time
>
> > > Allow me to recommend the optimal Huffman code:
> > >    c - 0
> > >    a - 10
> > >    b - 11
> > > This can be improved, though only slightly, using details
> > > you've omitted from your summary.
>
> > > This was so trivial, I'll discount it down to, say $950.
>
> > > If this is unsatisfactory, I'll withdraw from the contest.
> > > Even paid at minimum wage I'm afraid it would take significant
> > > funds (payable in advance, please!) just to elicit a
> > > proper problem statement from you.
>
> > > I don't have PayPal.  Contact me for instructions on how
> > > to pay the $950.  :-)
>
> > > James
>
> >   Let's see: c is .5 * 1 = .5,  b = .25*2 = .5,  a = .25*2 = .5 ;
> > that's .5 + .5 + .5 = 1.5 bits per symbol on average, while
> > if you encode each symbol with 1.5849625 bits you save about .0849625, which
> > is more than the .08. It appears you're in the money. I have a
> > hunch that there still is something missing, in which case I would
> > not count on the money yet.
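
[ For what it's worth, a quick Python sketch checking the arithmetic quoted above, assuming the Huffman code mentioned earlier in the thread ( c -> 0, a -> 10, b -> 11 ) and the 25% / 25% / 50% model; the 1.5849625 figure is log2(3): ]

from math import log2

p        = {'a': 0.25, 'b': 0.25, 'c': 0.50}
code_len = {'a': 2,    'b': 2,    'c': 1}           # c -> 0, a -> 10, b -> 11

avg_code = sum(p[s] * code_len[s] for s in p)        # 0.5*1 + 0.25*2 + 0.25*2 = 1.5 bits/symbol
entropy  = -sum(p[s] * log2(p[s]) for s in p)        # per-symbol entropy of the 25/25/50 model: also 1.5
naive    = log2(3)                                   # 1.5849625... bits/symbol
print(avg_code, entropy, naive, naive - avg_code)    # difference is about 0.0849625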
>
> >  First of all, does he want at least .08 bits saved in every case
> > or just in the average case? If it's the average case you could be
> > on the right track. If it's every case, then since you write only
> > whole numbers of bits the .08 savings gets a little harder. It
> > would be nice, if the guy decides you haven't won, to know just what
> > he does want. I have read it several times and yet I do not think it's
> > clear enough to tackle without him saying oh I meant this and not
> > that.
>
> > Assuming he doesn't declare you the winner:
> > 1) is the savings an average thing or does each file have to be less?
> > 2) how do you measure the savings : is it .08 from 1.5849625 per
> > symbol
> > or is it .08 less than 1.5?
> > 3) not sure why you say source C = .5 while A and B = .25 ; the
> > fact is even if the source is A = B = C = 1/3, for short files,
> > if you run the sources enough times and created 100 files each
> > you still could get the same set of 100 files for both cases.
> > So your test setup is not valid. There is nothing magical about
> > your source. Except if I know it's a fixed IID source from say 2 or
> > 3 different models, then as you create more files you can, with increasing
> > probability, determine which one it most likely is. But you can't be
> > 100% certain which one it is unless you do an ever increasing number
> > of files.
>
> >  David A. Scott
> > --
> >  My Crypto code
> > http://bijective.dogma.net/crypto/scott19u.zip
> > http://www.jim.com/jamesd/Kong/scott19u.zip (old version)
> > My Compression code http://bijective.dogma.net/
> > **TO EMAIL ME drop the roman "five" **
> > Disclaimer: I am in no way responsible for any of the statements
> >  made in the above text. For all I know I might be drugged.
> > As a famous person once said "any cryptographic
> > system is only as strong as its weakest link"
>
> THE COMPLETE SPECIFICATIONS :
> =============================
>
> 1. generate a number, e.g. 1,000, of such sequences ( each sequence composed of ternary symbols 'a' 'b' 'c' ; when # of 'c' = # of 'b' + 2 the sequence ENDS & the next sequence begins ) using a source producing symbol 'a' 25% of the time, symbol 'b' 25% of the time, symbol 'c' 50% of the time .... call the total # of symbols in these 1,000 sequences N . NOTE : among these e.g. 1,000 sequences the # of 'a' is invariably near = the # of 'b' & the # of 'c' is invariably near = 2 * the # of 'b' , THUS the probability model here is 25% : 25% : 50%
>
> 2. compress these e.g. 1,000 generated sequences using your .exe , which must decode back to the same 1,000 sequences
>
> 3. IF your compressed file's bit length  =<  1.5 * N   - ( 0.08 * N )  THEN YOU WIN THE REWARDS !   ie if your .exe saves 'on average' 0.08 bits per sequence you WIN ( it need not be invariably so every time on every conceivable file ! ) , but note the original sequences are here taken to be N * 1.5 bits long in total ( as originally explicitly stated to be 1.5 * N bits long , NOT 1.5849625 * N bits long )
>
> 4. there are no restrictions on memory or storage requirements ; you may even show your .exe working on a 'research network supercomputer cluster' , BUT processing must complete within a day
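
[ A minimal Python sketch of the setup described in points 1-3 above, assuming a fair pseudo-random source for the 25% / 25% / 50% probabilities; the names are illustrative and the actual compressor is the contestant's to supply: ]

import random

def make_sequence(rng):
    """Emit 'a'/'b'/'c' with probabilities 0.25/0.25/0.5 until #c == #b + 2."""
    seq, nb, nc = [], 0, 0
    while nc != nb + 2:
        r = rng.random()
        if r < 0.25:
            seq.append('a')
        elif r < 0.5:
            seq.append('b'); nb += 1
        else:
            seq.append('c'); nc += 1
    return seq

rng = random.Random(2012)
sequences = [make_sequence(rng) for _ in range(1000)]
N = sum(len(s) for s in sequences)

baseline = 1.5 * N                            # the stated baseline of 1.5 bits per symbol
target   = baseline - 0.08 * len(sequences)   # 0.08 bits saved per sequence, per point 3 as read
print(N, baseline, target)

The compressed output of a candidate .exe would then simply be compared against the target figure printed above.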

=============================================================
UPDATE ANNOUNCEMENT :
US$3M DATA COMPRESSION PRIZE :
=============================================================

I AM NOW MADE AWARE OF A COUPLE OF SOLUTIONS PUT FORTH THAT ARE CERTAIN
TO EASILY FAR EXCEED THE REQUIRED 0.08-BIT COMPRESSION SAVINGS PER
SEQUENCE ....

I WILL NOW NEED TO TAKE SOME TIME TO TEST, DEVELOP AND CONFIRM THE SOLUTIONS

KEEP YOUR SOLUTIONS COMING .....

Warm Regards,
LawCounsels

LawCounsels

unread,
Apr 16, 2012, 5:48:46 AM4/16/12
to
On Apr 13, 11:54 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
YES, TRUE in the above MQ case.

However, a solver here would already be failing badly, really 'desperate', if he ever needed to resort to the possible but far, far too small potential savings from this AC truncation, which in all cases diminish to insignificance as N -> arbitrarily large size [ & only if AC is the solution method depended on ... ] .... it would be better here not to be needlessly distracted by non-essential details

Thomas Richter

unread,
Apr 16, 2012, 7:08:45 AM4/16/12
to
On 16.04.2012 11:48, LawCounsels wrote:

> YES, TRUE in the above MQ case.
>
> However, a solver here would already be failing badly, really 'desperate', if he ever needed to resort to the possible but far, far too small potential savings from this AC truncation, which in all cases diminish to insignificance as N -> arbitrarily large size [ & only if AC is the solution method depended on ... ] .... it would be better here not to be needlessly distracted by non-essential details

No, its savings are of the same magnitude as the savings you get from
requiring that the sequence is 513 bits long. N is not infinity here,
unless you changed the problem again. If N goes to infinity, there are
no savings from the truncation condition, nor from the c = b + 2
condition. This condition only provides "split points", but these do not
matter for an infinite number of sequences.

So, now what? Will you provide a better problem description - the current
description is not sufficient.




lawco...@gmail.com

unread,
Apr 16, 2012, 7:40:46 AM4/16/12
to
Solutions to the Data Compression REWARDS as already stated ( without any of the subsequent further fine details ) do not require any of the possible but far, far too small potential AC truncation savings, even if a limit of 513 bits on each sequence length is now further imposed, in order to attain net 0.08 bits or more of compression savings

.... the couple of 'certain' solutions put forth so far, now to be tested and developed for confirmation, have not made use of AC at all

I think the Data Compression REWARDS as stated are best and simplest just left intact as is

[ however , I should also leave it open to anyone attempting a winning solution, as of right if he so chooses, to assume each sequence is at most 513 bits long ... if this helps his solution strategy ]

James Dow Allen

unread,
Apr 16, 2012, 11:11:11 AM4/16/12
to
On Apr 16, 6:08 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
> > depended on ... ] .... would be better here not to be needless
> > distracted by non-essential details
>
> No, its savings are of the same magnitude as the savings you get from
> requiring that the sequence is 513 bits long.

I don't think this is strictly true.

My understanding of the problem may be incomplete(!) but
I think you're comparing a *per-file* truncation cost of,
let's guess, about 5 bits *on average*, with a *per-token*
savings from the 513-bit limit.

Let's guess that savings is one nonillionth of a bit.
(Unless the a-b-c constraints have changed while I wasn't
looking, one nonillionth of a bit may be a severe
overestimate but let's go on....)

While one nonillionth of a bit is small, it's a *per-token*
savings and after a nonillion tokens, will add up to
a bit. After several *decillion* tokens, these will add to
several thousand bits, while the truncation cost will
remain only about 5 bits, on average.

Does this help?

James

Thomas Richter

unread,
Apr 16, 2012, 1:31:47 PM4/16/12
to
Not really, and I don't think this is the case. As far as I understand
the problem now, the problem is that *each* of the individual
subsections of the file has to end at 513 bits (output bits). Why that
makes sense I do not know, but let it be as it is: It means that the
number of necessary truncations is linear in the file size, and so is
the number of truncations due to the condition of c = b + 2. IOW, I
believe these two effects are of the same magnitude. All that depends of
course on how this "513 bits" constraint comes into play, which I do not
understand fully.

Greetings,
Thomas


James Dow Allen

unread,
Apr 16, 2012, 5:27:46 PM4/16/12
to
On Apr 17, 12:31 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
> As far as I understand
> the problem now, the problem is that *each* of the individual
> subsections of the file have to end at 513 bits (output bits). Why that
> makes sense I do not know, ... All that depends of
> course on how this "513 bits" constraint comes into play, which I do not
> understand fully.

I apologize; it looks like I missed the latest memo.

Do we still get the $1 MILLION REWARD for satisfying the
FINAL and COMPLETE specification, with this latest frill just
an extra-credit problem for the $3 MILLION bonus?

Is there a YouTube link?

Best ever,
James

LawCounsels

unread,
Apr 24, 2012, 6:46:23 AM4/24/12
to
Hi :

initial tests confirmed the 8.0 bits savings per sequence
target was easily exceeded .... am making this into an .exe now

can any of you help lead/arrange R&D at Intel / Google / NASA / CERN
and the like ?

BTW : have any of you got close to 8.0 as yet ? pls tell

Warm Regards,
LawCounsels

Jim Leonard

unread,
Apr 24, 2012, 10:27:46 AM4/24/12
to
On Apr 24, 5:46 am, LawCounsels <lawcouns...@gmail.com> wrote:
> initial tests confirmed easy exceeded 8.0 bits savings per sequence
> target .... am making this into .exe now

Congratulations. Can you reverse the target back into the original
source?

> can any of you help lead arrange R&Ds at Intel/ Google / NASA / CERN
> the likes ?

Sure, as soon as you have a working decompressor.

> BTW : has any of you get close to 8.0 as yet ? pls tell

Depends on the data. For example, an 8-bit RLE scheme will achieve
253:1.

Fibonacci Code

unread,
Apr 28, 2012, 12:55:13 PM4/28/12
to
Years ago, I had already achieved over 8.0 bits per sequence for random
ternary sequences (random generator) with n = 128


Cheers

Raymond.




Fibonacci Code

unread,
Apr 30, 2012, 10:36:33 AM4/30/12
to
On Apr 11, 12:56 am, LawCounsels <lawcouns...@gmail.com> wrote:
> On Apr 10, 5:03 pm, James Dow Allen <jdallen2...@yahoo.com> wrote:
>
> > Does it seem strange that I knew I was going to dislike
> > New Google Groups? (though I didn't know *what* I'd dislike
> > about it.)  Almost every change made in some systems seems
> > for the worse not better.
>
> > In the example below, I click on lawcounsel's link only
> > to get some "overview" page with no content.
>
> > (Old Google Groups is amusing too, of course.  I'm now
> > looking at a window with no less than five scroll-bars,
> > 3 of which I had to manipulate just to get this far!)
>
> > On Apr 10, 9:29 pm, lawcouns...@gmail.com wrote:
>
> > > I would refer you to Entropy Definition : Request for Comments
> > >https://groups.google.com/forum/#!to...on/w7QSfqtfeg4
> > > These sequences have already been encoded ; compression saves a minimum net 0.063 bits per sequence !!!
> > > 1st to reach 0.08 bits Net compression savings per sequence WINS US$3M+
>
> > Three comments:
> > 1.  Your earlier post had
> >    <= 1.5 * N   - ( 0.08 * N )
> > Are we now to understand that the first N here is number of trits,
> > and the second number of sequences?
>
> YES ... since corrected to read  <= 1.5 * N   - ( 0.08 * # OF
> SEQUENCES )
>
> > 2.  I didn't read about the "saves minimum Net 0.063 bits per
> > sequence."
> > (As I said the link doesn't work.)  By "minimum" do you mean
> > "actual in one experiment"?  I'm guessing fluke.
>
> FOR "ANY" ONE SUCH SEQUENCE [ a mathematically 'guaranteed' minimum of 0.063
> bits savings ]
>
>  .... ALSO VERIFIED OVER LARGE # OF SUCH SEQUENCES
>
> > 3.  It may seem paradoxical that no compression savings are available
> > despite the c=b+2 constraint.  But consider a simpler related problem:
> >    Flip a coin and stop on the first Heads.  Compress the result.
> > H occurs with prob. 1/2
> > TH occurs with prob 1/4
> > TTH with prob 1/8, etc.
> > No way to compress despite the constraint.
>
> perhaps the particular constraint here is superficial ... reducible to a
> 'fundamental' equivalent of the common 'fair coin' tosses type here
>
> > James

Any very high entropy binary file, when chopped to ternary trits, will
have a distribution close to 50 percent 0, 25 percent 10, 25 percent
11.

Where the reverse Huffman mapping applies: 0-A, 10-B, 11-C

So this is what Law means by turning a random file into biased probabilities.

He is now trying to ask this room to compress the reverse-Huffman
probabilities,

an attempt to create recursive compression


0101100010101111000001111
0
10
11
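
[ A rough Python sketch of the 'chopping' being described, assuming the input is a uniformly random bit string; the helper name is illustrative: ]

import random

def chop(bits):
    """Greedily parse a bit string into the tokens '0', '10', '11'."""
    tokens, i = [], 0
    while i < len(bits):
        if bits[i] == '0':
            tokens.append('0'); i += 1
        elif i + 1 < len(bits):
            tokens.append('1' + bits[i + 1]); i += 2
        else:
            break            # a lone trailing '1' cannot form a complete token
    return tokens

rng = random.Random(0)
bits = ''.join(rng.choice('01') for _ in range(1000000))
tokens = chop(bits)
for t in ('0', '10', '11'):
    print(t, round(tokens.count(t) / len(tokens), 3))   # close to 0.5, 0.25, 0.25 for this source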

Well, 3 million USD would be nothing for this kind of feat ... well, luckily, until now

no one has found the solution yet, me included !

It won't work, dear; I did work on the same bias 7 years ago.

I do have a few close-to-jackpot algorithms, but still I abandoned
them; the reason is simple:

50% 25% 25% is not precise and not constant for all the files you try
to resolve.

The differences themselves need entropy (bits) to describe, and thus wipe out the savings.


Regards,

Fibonacci



lawco...@gmail.com

unread,
Apr 30, 2012, 11:34:57 AM4/30/12
to
On Monday, April 30, 2012 3:36:33 PM UTC+1, Fibonacci Code wrote:
>
> Any very high entropy binary file when chopped to ternary trits will
> have close distribution of 50 percent 0, 25 percent 10 , 25 percent
> 11.

am awfully busy at the moment with hardly any time to spare .... can you kindly first help by quickly, clearly and simply explaining how arbitrarily choosing any # of bits from a random binary file will ALWAYS give 50% '0', 25% '10', 25% '11' ?

... to proceed then : )


BTW : are you or anyone here fluent with C# & leading-edge combinatorics ( 'multinomial' enumerative lexicographic rank index etc , or can quickly pick it up ) ... am looking for a few more high-calibre 'confidential' co-researcher developers on this unprecedented, immensely rewarding profit-share basis .

email your short details privately to LawCounsels at aol dot com

Thomas Richter

unread,
Apr 30, 2012, 12:53:39 PM4/30/12
to
Am 30.04.2012 17:34, schrieb lawco...@gmail.com:
> On Monday, April 30, 2012 3:36:33 PM UTC+1, Fibonacci Code wrote:
>>
>> Any very high entropy binary file when chopped to ternary trits will
>> have close distribution of 50 percent 0, 25 percent 10 , 25 percent
>> 11.
>
> am awfully busy at the moment with hardly any time to spare .... can you kindly first help by quickly, clearly and simply explaining how arbitrarily choosing any # of bits from a random binary file will ALWAYS give 50% '0', 25% '10', 25% '11' ?

If the file is a perfect Bernoulli source, then indeed the probabilities
are exactly that: p(0) = 1/2 = p(1). Since we assume the source to be
iid, symbol probabilities are independent, hence p(x_n x_{n-1} ) =
p(x_n) * p(x_{n-1}) and hence p(00) = p(10) = p(01) = p(11) = 1/2 * 1/2
= 1/4.

This is a five-second argument. Note that "random" may mean a lot of
things, but Bernoulli source is probably a good interpretation (as in:
what an ideal coin throw experiment should generate, for a fair coin).

> BTW : are you or anyone here fluent with C# & leading-edge combinatorics ( 'multinomial' enumerative lexicographic rank index etc , or can quickly pick it up ) ... am looking for a few more high-calibre 'confidential' co-researcher developers on this unprecedented, immensely rewarding profit-share basis .

No thanks - not interested in doing business with you.

Greetings,
Thomas

Fibonacci Code

unread,
Apr 30, 2012, 12:58:38 PM4/30/12
to
Not always. As I mentioned,

50% 25% 25% is not precise and not constant for all the files you try
to resolve.

It is very likely for a high-entropy file, e.g. one that is densely packed.

So it won't work.


Cheers,

Fibonac

lawco...@gmail.com

unread,
Apr 30, 2012, 1:18:45 PM4/30/12
to
Agreed ! it certainly will not work, as you correctly said, on any very high entropy binary file chopped to ternary trits ( since it will very likely
have a distribution close to 50 percent 0, 25 percent 10 , 25 percent
11 ).....

IMPORTANT TO NOTE : your very same high entropy binary file chopped to ternary trits above IS COMPLETELY DIFFERENT from the 'constraint' present in the US$3M data compression sequences , NAMELY THAT EACH SEQUENCE WHICH IS COMPRESSIBLE ENDS WITH EXACTLY 2 MORE 'C's THAN THE # OF 'B's within that sequence [ this is completely different from , and absent in , merely having 50% '0' 25% '10' 25% '11' probabilities ]



> Cheers,
>
> Fibonac
