Jul 10, 2005, 11:28:17 AM7/10/05

to

Hi,

Can you explain in layman terms why the MT (e.g. MT19937) isn't suitable for

encryption purposes?

TIA


Steven

(I'm not a cryptologist, nor a mathematician, so please type slowly)

Jul 10, 2005, 1:37:43 PM7/10/05

to

On Sun, 10 Jul 2005 15:28:17 GMT, "steven"

<steven_s...@telenet.be> wrote:

>Hi,

>Can you explain in layman terms why the MT (e.g. MT19937) isn't suitable for

>encryption purposes?

Because it is a linear non-secure pseudorandom number generator.

Makoto Matsumoto et al have submitted a cryptographically improved

version of MT to the ECRYPT Stream Cipher Project:

http://www.ecrypt.eu.org/stream

http://www.ecrypt.eu.org/stream/ciphers/cryptmtfubuki/cryptmtsource.zip

http://www.ecrypt.eu.org/stream/ciphers/cryptmtfubuki/cryptmtfubuki.pdf

Hope that helps

Wolfgang

--

In order to e-mail me a reply to this message, you will have

to remove PLEASE.REMOVE from the address shown in the header

or get it from http://home.netsurf.de/wolfgang.ehrhardt

(Free AES, CRC, Hash, and HMAC source for Pascal/Delphi)

Jul 10, 2005, 1:55:04 PM7/10/05

to

"Wolfgang Ehrhardt" <Wolfgang.Ehrhar...@munich.netsurf.de>

wrote in message news:42d15ca9...@news.individual.net...

> On Sun, 10 Jul 2005 15:28:17 GMT, "steven"

> <steven_s...@telenet.be> wrote:

>

> >Hi,

> >Can you explain in layman terms why the MT (e.g. MT19937) isn't suitable for

> >encryption purposes?

>

> Because it is a linear non-secure pseudorandom number generator.

> Makoto Matsumoto et al have submitted a cryptographically improved

> version of MT to the ECRYPT Stream Cipher Project:

>

> http://www.ecrypt.eu.org/stream

> http://www.ecrypt.eu.org/stream/ciphers/cryptmtfubuki/cryptmtsource.zip

> http://www.ecrypt.eu.org/stream/ciphers/cryptmtfubuki/cryptmtfubuki.pdf

>

> Hope that helps

>

Not really :-(

Maybe I should have posed my question differently: Why is MT non-secure

(while e.g. BBS is secure)?

Thanks for replying, however.

Steven

Jul 10, 2005, 2:01:25 PM7/10/05

to

"steven" <steven_s...@telenet.be> writes:

> Maybe I should have posed my question differently: Why is MT non-secure

> (while e.g. BBS is secure)?


Um, why are oranges orange (while e.g. lemons are yellow)?

MT was not designed as a cryptographic generator. There's a

straightforward way (described in the MT paper) to reconstruct the

internal state. Remember where it says something like it's

equidistributed in 623 dimensions? That means that if you start

counting in 624 or more dimensions, all bets are off. That's

different from a cryptographic generator, which can't be distinguished

from uniform no matter how many dimensions you use.

Jul 10, 2005, 3:41:55 PM7/10/05

to

In article <7xwtnyz...@ruckus.brouhaha.com>,

There can be no "cryptographic generator" in that sense

which uses a computer program. This includes BBS, which

to be sure has more dimensions, and is not linear.

Some of the seed programs for the Mersenne generator have

been too simple. Has anyone been able to crack it given

random seeding for all of the input, which would make it

more than 19,000 seed bits?

--

This address is for information only. I do not claim that these views

are those of the Statistics Department or of Purdue University.

Herman Rubin, Department of Statistics, Purdue University

hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

Jul 10, 2005, 3:53:46 PM7/10/05

to

hru...@odds.stat.purdue.edu (Herman Rubin) writes:

> >counting in 624 or more dimensions, all bets are off. That's

> >different from a cryptographic generator, which can't be distinguished

> >from uniform no matter how many dimensions you use.

>

> There can be no "cryptographic generator" in that sense

> which uses a computer program. This includes BBS, which

> to be sure has more dimensions, and is not linear.


Well, I mean in a feasible amount of time on a real computer. That

means there's no P-time distinguishing algorithm. Whether BBS fits

that description is an open problem.

> Some of the seed programs for the Mersenne generator have

> been too simple. Has anyone been able to crack it given

> random seeding for all of the input, which would make it

> more than 19,000 seed bits?

I believe the MT paper says how to do that.

Jul 10, 2005, 4:42:23 PM7/10/05

to

Herman Rubin wrote:

>Some of the seed programs for the Mersenne generator have

>been too simple. Has anyone been able to crack it given

>random seeding for all of the input, which would make it

>more than 19,000 seed bits?


Yes, I think so. I don't recall the details, but if I remember

correctly it was reasonably straightforward.

The cryptanalyst needs more than 19,000 bits of output in this case,

of course.

Jul 11, 2005, 11:56:41 PM7/11/05

to

Herman Rubin wrote:

> Some of the seed programs for the Mersenne generator have

> been too simple. Has anyone been able to crack it given

> random seeding for all of the input, which would make it

> more than 19,000 seed bits?


Cracking it is trivial even in that case, provided you have just over

19,000 output bits.

John Savard

Jul 12, 2005, 12:01:20 AM7/12/05

to

steven wrote:

> Can you explain in layman terms why the MT (e.g. MT19937) isn't suitable for

> encryption purposes?


> (I'm not a cryptologist, nor a mathematician, so please type slowly)

As has been noted, it's a linear shift register.

But that doesn't really answer the question. Why are linear shift

registers insecure?

It is because there is a simple relationship between the successive

32-bit words of pseudorandom data that the Mersenne Twister produces.

The relationship will allow those words to be in a sequence that does

not exactly repeat itself for a very long time, but it only takes a

short time to determine all the information you need to duplicate the

entire sequence.

The successive words of data give the entire internal state; in a

secure keystream generator, the internal state is nearly impossible to

derive from the outputs.

John Savard
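John's point can be made concrete. The sketch below is not from the thread; it is an editorial example in Python, whose random module happens to use MT19937, and the helper names are mine. Inverting the output tempering of 624 consecutive 32-bit outputs recovers the full internal state, after which every later output can be predicted:

```python
import random

def invert_right(y, shift):
    # Invert y ^= y >> shift: the known high bits propagate downward.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def invert_left(y, shift, mask):
    # Invert y ^= (y << shift) & mask: the known low bits propagate upward.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & 0xFFFFFFFF

def untemper(y):
    # Undo MT19937's output tempering, recovering one raw state word.
    y = invert_right(y, 18)
    y = invert_left(y, 15, 0xEFC60000)
    y = invert_left(y, 7, 0x9D2C5680)
    return invert_right(y, 11)

victim = random.Random(2005)
observed = [victim.getrandbits(32) for _ in range(624)]

# Rebuild the 624-word state from the observed outputs and load it
# into a fresh generator (CPython state tuple: version, state+index, gauss).
clone = random.Random()
clone.setstate((3, tuple([untemper(v) for v in observed] + [624]), None))

# The clone now tracks the victim exactly.
assert clone.getrandbits(32) == victim.getrandbits(32)
```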

Jul 12, 2005, 4:25:53 AM7/12/05

to

<jsa...@ecn.ab.ca> wrote in message

news:1121140493.0...@g47g2000cwa.googlegroups.com...

Now that's a clear answer. Thanks John.

Silly question: what if the feedback polynomial would change after each

shift?

Steven

Jul 13, 2005, 4:06:41 PM7/13/05

to

steven wrote:

> Silly question: what if the feedback polynomial would change after each

> shift?


Answer: that depends on how complicated the rule is that makes it

change.

John Savard

http://www.quadibloc.com/crypto/intro.htm

Jul 13, 2005, 4:40:48 AM7/13/05

to

What does "trivial" mean in this case?

Cristiano

Jul 13, 2005, 7:04:22 PM7/13/05

to

"steven" <steven_s...@telenet.be> wrote in message

news:l8bAe.141612$NN6.7...@phobos.telenet-ops.be...

I have never had much interest in cryptography,

but some of the work I have done on producing

and testing randomness seems to be of interest to people

in that field. I have seen frequent mention of the

Mersenne Twister, which does very well in tests of

randomness such as in the Diehard battery. A good part

of its appeal seems to be that it has a very long period.

Far less complicated methods can be used to get much

longer periods and as good or better results in tests of

randomness, with simpler programs, and---at least

to me---more attractive theory.

Here is an example in C, with period some 10^33000

times as long as that of the Mersenne Twister:

static unsigned long Q[4096], c = 362436;

unsigned long CMWC4096(void) {         /* assumes 32-bit unsigned long */
    unsigned long long t, a = 18782LL, b = 4294967295LL;
    static unsigned long i = 4095;
    unsigned long r = b - 1;           /* outputs lie in 0..2^32-2 */
    i = (i + 1) & 4095;                /* cyclic index into the lag table */
    t = a * Q[i] + c;                  /* 64-bit multiply-with-carry step */
    c = (t >> 32);                     /* new carry */
    t = (t & b) + c;                   /* reduce mod b = 2^32 - 1 */
    if (t > r) { c++; t = t - b; }
    return (Q[i] = r - t);             /* "complementary" output */
}

The generator produces, in reverse order, the base

b=2^32-1 "digits" in the expansion of 1/p, where

p=18782b^4096+1 and the period of this CMWC RNG is

p-1>10^39460.

It requires 4096 seeds 0 <= x_i < 2^32-1 (the Q array)

and an initial seed 0 <= c < 18782.

For any such seed set, the RNG produces, in reverse

order, the base-b digits of the expansion of k/p

for some 0<k<p.

Because b is a primitive root of p, any two such

expansions are just circular rotations of one another.

An initial seed c < 18782 and the 4096 elements

in the static array Q[4096] should be assigned

values before calls to the function CMWC4096(),

otherwise the first few thousand returned values

will be zeros. (That is consistent with the view

that the choice of seeds merely chooses a random

starting point in a huge circle of over 10^39460

base-(2^32-1) digits, i.e. all possible 32-bit integers

except for FFFFFFFF.)

Failing to initialize the Q[4096] array (set by

default to 0's) merely provides a starting point

at a long string of zeros, which should be occasionally

encountered in any random string.

Is it possible to predict forthcoming values,

given a string of previous values?

Yes, but you need more than 4097 previous values to do it,

(better than the Mersenne Twister, which needs < 650).

But given, say, 2000 previous values, you only know

that you are somewhere in the immense loop of over

10^39460 "digits" of the expansion of 1/p, and that

particular string will appear over 2^2096 times in the

expansion.

So you would need a lot of computer time to find just

where it appears, and thus be able to predict coming values.

(I have, thanks to the help of colleagues in Hong Kong,

a java version of CMWC4096, should readers ask for it.)

An answer to your question might be framed in terms

similar to that above:

Most random number generators produce a sequence of numbers

which can be likened to the loop of digits in the expansion of 1/p above,

(but with far smaller loops). Choosing random seed values (the random

part of the RNG) puts you somewhere in that loop, from where successive

numbers follow. If you have a certain number of observed values you may

be able to find their place in that loop, and thus figure out the rules of

the RNG, or, knowing its type, find its parameters or rules of production.

But if the RNG's loop is very very long, you may have to work hard to

predict future values. The Mersenne Twister requires a pretty long

observed sequence, but far short of those based on the expansions

of 1/p, for which current abilities to find primes can be expected to

provide expansions that will require some 250000 32-bit numbers

in order to find your place in the loop.

George Marsaglia

Jul 13, 2005, 7:21:39 PM7/13/05

to

"George Marsaglia" <g...@stat.fsu.edu> wrote in message

news:1-qdnU7JG-N...@comcast.com...

>

>

> Here is an example in C, with period some 10^33000

> times as long as that of the Mersenne Twister:

>

> static unsigned long Q[4096],c=362436;

> unsigned long CMWC4096(void){

> unsigned long long t, a=18782LL,b=4294967295LL;

> static unsigned long i=4095;

> unsigned long x,r=b-1;

> i=(i+1)&4095;

> t=a*Q[i]+c;

> c=(t>>32); t=(t&b)+c;

> if(t>r) {c++; t=t-b;}

> return(Q[i]=r-t); }

>

Thank you!

Jul 14, 2005, 2:32:12 PM7/14/05

to

George Marsaglia <g...@stat.fsu.edu> wrote:

> "steven" <steven_s...@telenet.be> wrote in message

> news:l8bAe.141612$NN6.7...@phobos.telenet-ops.be...

> > Hi,

> > Can you explain in layman terms why the MT (e.g. MT19937) isn't suitable

> > for

> > encryption purposes?

> > TIA

> >

> > Steven

> > (I'm not a cryptologist, nor a mathematician, so please type slowly)

> >

> I have never had much interest in cryptography,

> but some of the work I have done on producing

> and testing randomness seems to be of interest to people

> in that field. I have seen frequent mention of the

> Mersenne Twister, which does very well in tests of

> randomness such as in the Diehard battery. A good part

> of its appeal seems to be that it has a very long period.

> Far less complicated methods can be used to get much

> longer periods and as good or better results in tests of

> randomness, with simpler programs, and---at least

> to me---more attractive theory.

> Here is an example in C, with period some 10^33000

> times as long as that of the Mersenne Twister:

Do you have proof of such statements? Can you provide us with

a paper please?

JLC

Jul 14, 2005, 5:06:56 PM7/14/05

to

George Marsaglia wrote:

> Here is an example in C, with period some 10^33000

> times as long as that of the Mersenne Twister:

That could be interesting, but there is the proof that MT has a very good

k-distribution, for example it is:

(623,32)-distributed

(1246,16)-distributed

(2492,8)-distributed

(9968,2)-distributed

which could be useful in some application.

Please, could you post some value for your generator?

> static unsigned long Q[4096],c=362436;

You say that your generator requires an initial seed 0 <= c < 18782.

Can that c be changed to something like c = RND % 18782?

Thank you

Cristiano

Jul 14, 2005, 6:40:07 PM7/14/05

to

"Jean-Luc Cooke" <jlc...@engsoc.org> wrote in message

news:db6b3c$n90$1...@driftwood.ccs.carleton.ca...

Let b=2^32-1, p=18782*b^4096+1 and k=p-1.

It may take many hours to verify, (and many more

hours for the original search to find p),

but b^k=1 mod p and

b^(k/2) != 1 mod p and b^(k/9391) != 1 mod p

for 2 and 9391, the two prime divisors of p-1,

It follows that p is a prime for which b is a primitive root.

The period of the base-b expansion of 1/p is p-1;

more generally, the period of the base-b expansion of

k/p is the order of b mod p. (see, for example, Marsaglia

and Zaman, `` A new class of random number generators'', (1991)

Annals of Applied Probability} V 1, No. 3, 462--480, which merely

states what seems to have long been known in number theory.

Thus the base-b expansion of 1/p, for b=2^32-1,

has period 18782*b^4096 = 10^39460.877333...,

and this period is more than 10^33000 times as long as

that of the Mersenne Twister.

The C function CMWC4096 is a

complementary-multiply-with-carry RNG, which generates

a sequence of pairs x_n, c_n by means of the recursions

x_n = (b-1) - ((a x_{n-r} + c_{n-1}) \bmod b),

c_n = \lfloor (a x_{n-r} + c_{n-1})/b \rfloor.

(TeX notation).

You will have to verify that the C function CMWC4096

does this for b=2^32-1.
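That claim can be checked on a scaled-down model. The parameters below are my own toy choices, not Marsaglia's: b=10, a=7, r=1, so p = a*b+1 = 71 is prime, and the multiplicative order of 10 mod 71 is 35. Walking the CMWC state (x, c) shows that its period equals that order, just as the theory above says:

```python
b, a = 10, 7
p = a * b + 1   # 71, prime; the toy analogue of 18782*b^4096 + 1

def step(x, c):
    # One CMWC step: t = a*x + c, then x_n = (b-1) - (t mod b), carry = t div b.
    t = a * x + c
    return (b - 1) - (t % b), t // b

# Multiplicative order of b modulo p (brute force is fine for tiny p).
order = next(k for k in range(1, p) if pow(b, k, p) == 1)

# Walk the state until it returns to the starting point.
x0, c0 = 3, 2
x, c = step(x0, c0)
period = 1
while (x, c) != (x0, c0):
    x, c = step(x, c)
    period += 1

print(order, period)   # prints "35 35": the period is the order of b mod p
```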

A description of methods for producing, in reverse order,

the base-b digits of a rational k/p is in the above paper for

add-with-carry, and its extension to multiply-with-carry

is described, for example, in

http://tbf.coe.wayne.edu/jmasm/vol2_no1.pdf.

You might also want to read the postscript file

mwc1.ps in

The Marsaglia Random Number CDROM

including

The Diehard Battery of Tests of Randomness

Jul 14, 2005, 6:41:58 PM7/14/05

to

"Cristiano" <cristi...@NSquipo.it> wrote in message

news:QtABe.162196$75.69...@news4.tin.it...

Yes, ANY c in 0 <= c < 18782 will do, and ANY set of 32-bit values

in 0 <= x < 2^32-1 may be used as seeds in the Q array.

I don't know what (623,32), etc. means, but those who do should

be able to run values to see whether the 10^39460 base-b "digits"

in the expansion of 1/p seem to have those properties. As with pi, e,

and expansions of most irrationals, the expansions of rationals k/p

for large primes p are expected to pass extensive tests of randomness,

in the sense that they may be reasonably considered the output of

a sequence of iid variates taking values in [0, b) with equal

probabilities.

Extensive testing using Diehard and its tough-test supplements

confirms this.

A paper verifying this for pi, e, sqrt(2) and various k/p is forthcoming.

George Marsaglia

Jul 15, 2005, 8:19:35 PM7/15/05

to

George Marsaglia wrote:

> An initial seed c < 18782 and the 4096 elements

> in the static array Q[4096] should be assigned

> values before calls to the function CMWC4096(),

> otherwise the first few thousand returned values

> will be zeros. (That is consistent with the view

> that the choice of seeds merely chooses a random

> starting point in a huge circle of over 10^39460

> base-(2^32-1) digits, (all possible 32-bit integers

> except for FFFFFFFF)).

> Failing to initialize the Q[4096] array (set by

> default to 0's) merely provides a starting point

> at a long string of zeros, which should be occasionally

> encountered in any random string.)


I tested your generator using Q[i]=0 for 0 <= i <= 4095 and it fails the test

because I get many 0's followed by many 0xfffffffe's.

The test also fails when I initialize Q with some sort of counter or

clock.

But if I discard the first 4096*10000 values or more, the generator seems to

pass the test.

Is there any method to initialize Q in the right way, or is the best thing to

always discard the first few thousand returned values?

Thank you

Cristiano

Jul 16, 2005, 4:57:03 AM7/16/05

to

That is good, but the k-distribution is a "stronger" distribution.

This is what I mean with the "(n,b)-distributed" notation:

1) consider an n-dimensional hypercube;

2) take n consecutive b-bit numbers formed taking the b msb;

3) these n numbers are the coordinate of a point in the hypercube;

then, for the whole period, you will see exactly the same number of points

in each cell of the hypercube.

Cristiano
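The definition can be illustrated with a deliberately tiny editorial example (my own, not from the thread): a 4-bit counter taken over its full period is (1,1)-distributed, yet fails (2,1)-distribution badly, because consecutive top bits are strongly correlated:

```python
from collections import Counter

period = list(range(16))            # full period of a 4-bit counter
bits = [v >> 3 for v in period]     # b = 1: keep only the msb of each value

ones = Counter(bits)                # (1,1) cells: each should be hit equally
pairs = Counter((bits[i], bits[(i + 1) % 16]) for i in range(16))  # (2,1) cells

print(dict(ones))    # {0: 8, 1: 8} -- perfectly (1,1)-distributed
print(dict(pairs))   # (0,0) and (1,1) occur 7 times each; (0,1) and (1,0) once
```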

Jul 16, 2005, 6:06:47 AM7/16/05

to

"Cristiano" <cristi...@NSquipo.it> wrote in message

Choosing an initial carry c in 0 <= c < 18782 and seeds 0 <= Q[i] < b=2^32-1

merely places you at some unique random point in the immense cycle of

over 10^39460 digits in the base-b expansion of 1/p for

the prime p=18782*b^4096+1. There are exactly 18782*b^4096

ways to choose that initial random point, and---just as, for example,

in the expansion of pi we must expect rare runs of many 0's---we should

expect that the expansion of 1/p should have rare sets of many 0's.

Choosing a certain initial seed set puts you at one of those points. There are

others that yield the sequence 1,2,3,...,9999999999999 and others that will

give, in sequence, the birthdates and current weights of the entire

readership of all Google newsgroups, in alphabetical order. But the

rarity with which we get such output---or any particular output---is

consistent with our mathematical model of randomness.

George Marsaglia

Jul 16, 2005, 8:42:25 AM7/16/05

to

I asked for the right seeding method because I *always* get bad results with

this seeding:

Q[i]= RDTSC() % 0xffffffff

the RDTSC() function reads the time-stamp counter (a 64-bit model specific

register which is incremented every clock cycle) of my AthlonXP 3000+

processor.

I get Q's like these:

D749D65 D749E1B D749E8C D749EFD ... D7BCD19 D7BCD89 D7BCDF9 D7BCE69

or

18B090EA 18B091A0 18B09211 18B09282 ... 18B7ECC0 18B7ED30 18B7EDA0 18B7EE10

we have many fixed high order bits.

I can seed your generator using many different sequences and

*systematically* it will fail the test.

If I use Q[i]= RDTSC() * RDTSC() % 0xffffffff

which generates Q's like these:

8C84B61F 6287F9C5 D18A895D 408DCFD8 ... 34C63518 AF686DC2 2A0B5D4F A4AF03BD

I *always* get good results.

Why is the latter method "better" than the former?

Cristiano
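One standard remedy (an editorial sketch, not a suggestion from the thread) is to push each raw timestamp through a strong integer mixing function, so that nearly-identical inputs such as the RDTSC values quoted above still produce well-spread seeds. The constants below are the splitmix64 finalizer; the sample timestamps are made up:

```python
def mix64(z):
    # splitmix64 finalizer: a bijective avalanche mixer on 64-bit integers.
    z = (z + 0x9E3779B97F4A7C15) & 0xFFFFFFFFFFFFFFFF
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & 0xFFFFFFFFFFFFFFFF
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & 0xFFFFFFFFFFFFFFFF
    return z ^ (z >> 31)

# Nearly-identical fake timestamps come out thoroughly scrambled; reducing
# mod 2^32 - 1 keeps every seed below the excluded value 0xFFFFFFFF.
Q = [mix64(0x0D749D65 + 0x70 * i) % 0xFFFFFFFF for i in range(4096)]
```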

Jul 16, 2005, 1:54:53 PM7/16/05

to

In the simplest possible terms there is a difference between something

producing a random-looking distribution of bytes and producing a stream

of bytes that is indistinguishable from a random source.


The MT is an algorithm that passes the first test but not the second.

It is easy to distinguish MT from a random source and this fact alone

is considered a break of the MT. Fulfilling the indistinguishability

criteria is necessary and sufficient for producing a secure

stream-cipher.

Simon.

Jul 16, 2005, 3:00:24 PM7/16/05

to

Simon Johnson wrote:

> In the simplest possible terms there is a difference between something

> producing a random-looking distribution of bytes and producing a stream

> of bytes that is indistinguishable from a random source.


To me, "random looking" seems = "indistinguishable from a random source".

What do you mean?

> The MT is an algorithm that passes the first test but not the second.

> It is easy to distinguish MT from a random source and this fact alone

> is considered a break of the MT.

No general purpose test can distinguish the MT sequences from a random

source.

> Fufilling the indistinguishability

> criteria is neccessary and sufficient for producing a secure

> stream-cipher.

It's necessary but *not* sufficient; a crypto (P)RNG must also be

unpredictable.

Cristiano

Jul 16, 2005, 3:13:48 PM7/16/05

to

"Cristiano" <cristi...@NSquipo.it> writes:

> To me, "random looking" seems = "indistinguishable from a random source".

> What do you mean?


"indistinguishable" means you have no practical way to tell which is which.

> No general purpose test can distinguish the MT sequences from a random

> source.

I don't know what you mean by "general purpose test", but

"indistinguishable" means you can't tell the difference using ANY test, not

just general purpose tests. If, after studying the MT algorithm and

source code for a year or two, you can come up with an MT-specific

test based on the intimate details of how MT works, that distinguishes

an MT stream from a random stream, MT is distinguishable.

That said, MT is getting a bad rap here; it was designed to be very fast

and to have reasonable statistical properties for physical simulations

and stuff like that. It was never intended to be a stream cipher and

its designers specifically advised people that it wasn't suitable as one.

> It's necessary but *not* sufficient; a crypto (P)RNG must be also

> unpredictable.

If it were predictable, you obviously could distinguish it from a

random source, so indistinguishability is sufficient.

Jul 16, 2005, 5:09:21 PM7/16/05

to

"Simon Johnson" <simon....@gmail.com> writes:

Well, I am not sure that the second is actually a criterion at all. How,

given simply the stream do you decide if it has been satisfied? If tomorrow

I break say RC4 or AES, do you then say that the test is no longer

satisfied but today it is? I.e., this would seem to be not a test but a

hope.

>Simon.

Jul 16, 2005, 5:25:26 PM7/16/05

to

In article <roYBe.167453$75.71...@news4.tin.it>,

I would always initialize any pseudo-random generator with

the output of at least a fair physical generator.

Jul 16, 2005, 5:30:24 PM7/16/05

to

Unruh <unruh...@physics.ubc.ca> writes:

> >It is easy to distinguish MT from a random source and this fact alone

> >is considered a break of the MT. Fulfilling the indistinguishability

> >criteria is necessary and sufficient for producing a secure

> >stream-cipher.

>

> Well, I am not sure that the second is actually a criterion at all. How,

> given simply the stream do you decide if it has been satisfied?


You're at the point where maybe you should look at a formal mathematical

treatment. Bellare and Rogaway's lecture notes are a good place to start:

http://www-cse.ucsd.edu/users/mihir/cse207/classnotes.html

> If tomorrow I break say RC4 or AES, do you then say that the test is

> no longer satisfied but today it is? Ie, this would seem to be not

> a test but a hope.

That's the best anyone can do with our current state of knowledge.

It's conjectured that AES can't be distinguished from a random

permutation, where "conjecture" means "unsolved math problem".

Sometimes an unsolved problem gets solved and then it's unsolved

yesterday and solved today.

So, three things can happen with AES:

1) Someone proves AES is distinguishable from random. That's

considered a break of the cipher even if it doesn't lead to practical

security compromises in fielded applications. It's happened with many

other ciphers (including RC4 and more recently SHA1) and might happen

with AES. That's bad for users but not that big a deal in a

theoretical sense. Nobody really expects this (especially for a very

fast and definite distinguisher) but it might happen.

2) Someone proves AES is indistinguishable. That's a HUGE result

because it means the P vs. NP problem is solved. Nobody expects this

and it will be a big surprise if it happens anytime soon.

3) Nobody proves anything. We keep on thinking and hoping, without

definite proof, that AES is strong. This is the most likely way

things will continue for a while.

Jul 16, 2005, 5:35:06 PM7/16/05

to

"Cristiano" <cristi...@NSquipo.it> writes:

>Simon Johnson wrote:

>> In the simplest possible terms there is a difference between something

>> producing a random looking distrubtion of bytes and producing a stream

>> of bytes that is indistinguishable from a random source.

>To me, "random looking" seems = "indistinguishable from a random source".

>What do you mean?

>> The MT is an algorithm that passes the first test but not the second.

>> It is easy to distinguish MT from a random source and this fact alone

>> is considered a break of the MT.

>No general purpose test can distinguish the MT sequences from a random

>source.

Sure. If it can be broken then it can be distinguished from a random

source. It has patterns which a random source would not have. Note that ALL

PRNGs are different from random sources. They can be predicted, which a

random source cannot.

>> Fufilling the indistinguishability

>> criteria is neccessary and sufficient for producing a secure

>> stream-cipher.

>It's necessary but *not* sufficient; a crypto (P)RNG must be also

>unpredictable.

A random source cannot be predicted. And no PRNG is unpredictable. Just

give me the algorithm and the (short) key and I can predict it forever.

Now, can I figure out how to predict it just given the output alone? I do

not know but I do know that there are very very strong patterns in the

output based on the fact that it is generated by a PRNG.

>Cristiano

Jul 16, 2005, 6:04:10 PM7/16/05

to

Unruh <unruh...@physics.ubc.ca> writes:

> A random source cannot be predicted. And no PRNG is unpredictable. Just

> give me the algorithm and the (short) key and I can predict it forever.


When we talk about PRNG's as stream ciphers, the idea is that you're

given the algorithm but not the key. The distinguishability criterion

means you get two streams of data. One comes from the (known) PRNG

algorithm with an unknown key, and the other is random, and you're

asked which is which. If you can find a way to tell the difference

a significant amount of the time, you've broken the cipher. Again,

see Bellare and Rogaway's notes for formal definitions.
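The game itself is easy to sketch in code. Everything below is an editorial toy (the "PRNG", the names, and the trial counts are mine, and a seeded Python random stands in for the challenger's coin): the challenger hands the adversary either PRNG output or random bytes, and the advantage is the gap between the two detection rates:

```python
import random

coin = random.Random(0)   # seeded stand-in randomness, for reproducibility

def weak_prng(seed, n):
    # A deliberately terrible keystream: consecutive bytes differ by exactly 1.
    return bytes((seed + i) % 256 for i in range(n))

def distinguisher(stream):
    # Exploits that structure; flags true randomness only ~1/256 of the time.
    return (stream[1] - stream[0]) % 256 == 1

trials = 1000
hit_prng = sum(distinguisher(weak_prng(coin.randrange(256), 16))
               for _ in range(trials))
hit_rand = sum(distinguisher(coin.randbytes(16)) for _ in range(trials))

advantage = (hit_prng - hit_rand) / trials
print(advantage)   # close to 1: this "PRNG" is trivially distinguishable
```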

Jul 16, 2005, 8:04:27 PM7/16/05

to

> http://www-cse.ucsd.edu/users/mihir/cse207/classnotes.html

It is distinguishable from random. Given a tiny bit of knowledge, the key,

all output is known. No random generator has that property.

I simply need to search through a 2^128 key space (a tiny amount

compared to all strings of length a googol) and I WILL discover the key and

will discover that AES is completely predictable.

>2) Someone proves AES is indistinguishable. That's a HUGE result

>because it means the P vs. NP problem is solved. Nobody expects this

>and it will be a big surprise if it happens anytime soon.

No, it has absolutely nothing to do with P and NP.

>3) Nobody proves anything. We keep on thinking and hoping, without

>definite proof, that AES is strong. This is the most likely way

>things will continue for a while.

The strength of AES has nothing to do with either of the above.

Jul 16, 2005, 8:06:33 PM7/16/05

to

Unruh <unruh...@physics.ubc.ca> writes:

> It is distinguishable from random. Given a tiny bit of knowledge, the key,

> all output is known. No random generator has that property.

> I simply need to search through a 2^128 key space (a tiny amount

> compared to all strings of length a googol) and I WILL discover the key and

> will discover that AES is completely predictable.


If we're talking about these terms as used in cryptography, the

criterion is polynomial-time tests in the key size. A 2^128 search is

considered infeasible. "Polynomial time" is where P vs. NP comes in.

Jul 16, 2005, 8:06:27 PM7/16/05

to

Paul Rubin <http://phr...@NOSPAM.invalid> writes:

Easy. Search the key space. It is small. (It may take a while, but since

when did math care about human lifetimes?) Find the key. Show which one is

derived from the PRNG.

Jul 16, 2005, 8:16:34 PM7/16/05

to

Unruh wrote:

>Paul Rubin <http://phr...@NOSPAM.invalid> writes:

>>You're at the point where maybe you should look at a formal mathematical

>>treatment. Bellare and Rogaway's lecture notes are a good place to start:

>

>> http://www-cse.ucsd.edu/users/mihir/cse207/classnotes.html

>


>It is distiguishable from random. Given a tiny bit of knowledge, the key,

>all output is known.

>all output is known.

Argh. You really need to read Bellare and Rogaway. They make it

all formal. In particular, the usual notion of indistinguishability

for cryptography is *computational* indistinguishability: no computationally

bounded attacker should be able to distinguish. (e.g., no attacker using

fewer than 2^80 clock cycles should be able to distinguish with

advantage better than 2^-32.)

It's all spelled out there. You really would benefit from reading it.

Right now, all I can say is that your comments make you appear to be a

little unaware of some of the basic notions of theoretical cryptography.

Those comments are missing key points, points which have already been

addressed in the basic literature.

Jul 16, 2005, 10:28:43 PM7/16/05

to

Paul Rubin <>:

> Unruh <unruh...@physics.ubc.ca> writes:

You are, of course, right that the 2^128 search can be considered

infeasible so that the brute-force attack does not make AES

distinguishable from a random permutation.

However, this is not really about P and NP. The term "polynomial

time" only has a very fuzzy meaning here; this is not the strict

notion from complexity theory. There is no definition of AES for

arbitrarily long keys, so it is meaningless to talk about asymptotics:

there is no security parameter that can be increased to make the

adversary's task harder. While the 2^128 step search takes a very

long time indeed, it still is a constant-time attack.

Jul 16, 2005, 10:44:38 PM7/16/05

to

Bodo Moeller <bmoe...@acm.org> writes:

> However, this is not really about P and NP. The term "polynomial

> time" only has a very fuzzy meaning here; this is not the strict

> notion from complexity theory. There is no definition of AES for

> arbitrarily long keys, so it is meaningless to talk about asymptotics:

> there is no security parameter that can be increased to make the

> adversary's task harder. While the 2^128 step search takes a very

> long time indeed, it still is a constant-time attack.

Fair enough. However, AES reduces to (say) a SAT problem with at

worst a few thousand terms, a relatively small problem. We (perhaps

without justification) use "SAT is in P" to mean "SAT is tractable"

even though the P-time algorithm for recognizing SAT could have a

horrendously large exponent.

If someone proves AES has no better distinguisher than brute force, it

means that the corresponding SAT problem also has no faster solution

than searching the AES keyspace, which goes against what we normally

pretend it means for there to be a P-time SAT solver.

Obviously there is a gap in this logic, as you point out: there could

be a P-time SAT solver which is still so slow that using it to break

AES is slower than using traditional brute force. In this flavor of

complexity theory we just handwave that issue by only caring about the

asymptotics and pretending all the constants are small. So there is a

certain amount of hope involved about the constants, but the relevance

of P vs. NP comes from the existence of that SAT reduction, which is a

definite thing.

Jul 16, 2005, 11:49:39 PM7/16/05

to

Bodo Moeller <bmoe...@acm.org> wrote in news:dbcfor$fqf$1...@lnx107.hrz.tu-

darmstadt.de:

> While the 2^128 step search talks a very

> long time indeed, it still is a constant-time attack.

>

It's a lot quicker when we get it to shut up.

J

--

__________________________________________

http://www.impeach-bush-now.org

Joe Peschel

D.O.E. SysWorks

http://members.aol.com/jpeschel/index.htm

__________________________________________

Jul 17, 2005, 2:00:38 AM7/17/05

to

"Paul Rubin" <http://phr...@NOSPAM.invalid> wrote in message

news:7xirzas...@ruckus.brouhaha.com...

> Sometimes an unsolved problem gets solved and then it's unsolved

> yesterday and solved today.

:-)

Jul 17, 2005, 4:24:35 AM7/17/05

to

Paul Rubin wrote:

> "Cristiano" <cristi...@NSquipo.it> writes:

>> To me, "random looking" seems = "indistinguishable from a random

>> source". What do you mean?

>

> "indistinguishable" means you have no practical way to tell which is

> which.

>

>> No general purpose test can distinguish the MT sequences from a

>> random source.

>

> I don't know what you mean by "general purpose test", but

> "indistinguishable" means you can't tell the difference using ANY test, not

> just general purpose tests.

When I talk about generators, I always have in mind the general purpose

tests (RaBiGeTe, TESTU01, NIST, DH).

"Random looking" = "indistinguishable" = sequence which looks like random

because pass all the general purpose tests.

But with the "new" definition of "indistinguishable" ("new" for me)

obviously I totally agree with Simon Johnson.

> If, after studying the MT algorithm and

> source code for a year or two, you can come up with an MT-specific

> test based on the intimate details of how MT works, that distinguishes

> an MT stream from a random stream, MT is distinguishable.

I guess that it is always possible to come up with a specific test for *any*

non-crypto PRNG.

Cristiano
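The MT-specific test Paul Rubin describes is in fact well known: MT19937's output tempering is invertible, so 624 consecutive 32-bit outputs reveal the entire internal state, after which every future output is predicted exactly. A sketch against Python's `random` module (which uses MT19937); the fixed-point "unshift" loops are a standard reconstruction, not code from Matsumoto's paper:

```python
import random

M32 = 0xFFFFFFFF

def unshift_right(y, shift):
    # Invert y ^= y >> shift by fixed-point iteration (high bits settle first).
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x & M32

def unshift_left_mask(y, shift, mask):
    # Invert y ^= (y << shift) & mask (low bits settle first).
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & M32

def untemper(y):
    # Undo MT19937's four tempering steps in reverse order.
    y = unshift_right(y, 18)
    y = unshift_left_mask(y, 15, 0xEFC60000)
    y = unshift_left_mask(y, 7, 0x9D2C5680)
    y = unshift_right(y, 11)
    return y

victim = random.Random(20050710)
outputs = [victim.getrandbits(32) for _ in range(624)]

# Rebuild the 624-word state and load it into a clone; index 624 tells
# CPython's generator to re-twist before producing the next word.
clone = random.Random()
clone.setstate((3, tuple(untemper(y) for y in outputs) + (624,), None))
assert all(clone.getrandbits(32) == victim.getrandbits(32) for _ in range(1000))
```

This is exactly why MT is fine for simulation but unusable as a keystream generator: the "key" (state) falls out of a short stretch of output with no search at all.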

Jul 17, 2005, 4:28:38 AM7/17/05

to

Sebastian Gottschalk wrote:

> Cristiano wrote:

>

>> No general purpose test can distinguish the MT sequences from a

>> random source.

>

> Just take a dimension higher than the highest dimension of

> equidistribution and you'll find something non-equidistributed.

No GP test can "take" the 624th dimension of something; you need to write an ad hoc test.

Cristiano

Jul 17, 2005, 4:31:34 AM7/17/05

to

"Cristiano" <cristi...@NSquipo.it> writes:

> I guess that it is always possible to come up with a specific test

> for *any* non-crypto PRNG.

Yes, that's why writing crypto PRNG's is more demanding than writing

non-crypto PRNG's. Crypto means designing against an intelligent

adversary who knows what algorithms you're using and is trying to

beat you any way he can.

Jul 17, 2005, 4:43:12 AM7/17/05

to

Unruh wrote:

>

> A random source cannot be predicted. And no PRNG is unpredictable.

> Just give me the algorithm and the (short) key and I can predict it

> forever.

Eh! Eh! If I use a CPRNG I will keep the key secret!

You'll try to break the generator without any knowledge of the key.

> Now, can I figure out how to predict it just given the output alone?

> I do not know but I do know that there are very very strong patterns

> in the output based on the fact that it is generated by a PRNG.

Are you sure? I'd say that the patterns are very very weak because it's very

hard to write a test which is able to find a pattern.

Even if you test many Gbits of a CPRNG using the most effective tests, you

don't see any pattern.

Cristiano

Jul 17, 2005, 6:51:38 AM7/17/05

to

"Cristiano" <cristi...@NSquipo.it> wrote:

>George Marsaglia wrote:

>> "Cristiano" <cristi...@NSquipo.it> wrote in message

>> news:QtABe.162196$75.69...@news4.tin.it...

..... <snip> .....

>>> That could be interesting, but there is the proof that MT has a very

>>> good k-distribution, for example it is:

>>> (623,32)-distributed

>>> (1246,16)-distributed

>>> (2492,8)-distributed

>>> (9968,2)-distributed

>>> which could be useful in some application.

>>> Please, could you post some value for your generator?

What are you on about? The (623, 32) distribution implies the others

automatically. This is true whenever we compare larger with smaller word sizes

in a perfectly packed code, and binary is of course a perfectly packed code.

Secondly, you should not use this notation. There is a standard definition of

(m,k) distribution which goes back at least to the 1st edition of Knuth's

famous book, The Art of Computer Programming (1969), and it has nothing in

common with your notation.

..... <snip> .....

>That is good, but the k-distribution is a "stronger" distribution.

Nonsense.

In the book mentioned above, Knuth put a lot of emphasis on the idea that

infinite-distribution of a random variate was a strong criterion for

"randomness". This seems to have led to a widespread belief that high orders

of k-distribution are enough by themselves to say that a sequence of integers

is "as close as we can get" to random. It is easy to show that this is wrong.

(Note that the various theorems proved by Knuth applied to continuous variates

on [0,1), not to integers).

First of all, consider a pure multiplicative linear-congruential generator (MLCG)

with a modulus P. It is trivial to find the parameters such that the output

is at least k-distributed for 32 bit integers. This is true for any member of

the family of LCGs with P = 2^(32*(k+m)) + 1. In your case, the minimum

desired k = 623, and with no effort at all (my freeware primality testing

program shows run times of < 0.2 sec) we find that the smallest member of the

family is m = 2, the next member is m=10, etc. We have an infinite family of

LCGs, every one of which is exactly distributed for 32 bit integers and every

one has k > 623 -- but every one is very non-random because of the failings of

LCGs as a class. LCGs have values which cluster on hyperplanes, they fail the

spectral test and other tests, their parameters can be discovered from only a

few values from the sequence, etc etc. They are very NON-random and useless

for cryptographic purposes: no matter how high the k-distribution (and this

can be extended to infinity).
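The "parameters can be discovered from only a few values" claim is easy to demonstrate when the modulus is known and prime: three consecutive outputs of an LCG pin down a and c by modular arithmetic. A sketch with illustrative constants:

```python
# x_{n+1} = (a*x_n + c) mod m; recover a and c from three consecutive
# outputs, assuming m is known and (x1 - x0) is invertible mod m.
m = 2**31 - 1                  # a prime modulus, so inverses exist
a_true, c_true = 48271, 11     # illustrative parameters
x = [123456789]
for _ in range(3):
    x.append((a_true * x[-1] + c_true) % m)

# x2 - x1 = a*(x1 - x0) mod m, hence:
a = (x[2] - x[1]) * pow(x[1] - x[0], -1, m) % m
c = (x[1] - a * x[0]) % m
assert (a, c) == (a_true, c_true)

# With a and c recovered, every future value is predicted exactly,
# no matter how high the generator's k-distribution is.
assert (a * x[3] + c) % m == (a_true * x[3] + c_true) % m
```

For a composite or unknown modulus the recovery takes a few more outputs and some gcd work, but the conclusion is the same: the structure leaks through.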

Secondly, I'm sure it's not just LCGs. LCGs fail a variety of tests due to the

high degree of pattern in the sequence, which is in turn due to the simple

generator function. I assume that we can see the same scenario of

(detectable non-randomness despite good distribution properties) with other

PRNGs of simple structure: can someone else comment?

Thirdly, there is a more important way in which LCGs are non-random. If the

sequence values are integer (or more generally, quantised to q levels), a true

random sequence should show some runs, with the probability of consecutive

repeats being 1/q, 1/q^2, etc. Generators that have outputs that come close to

this distribution are behaving much more like true random numbers than ones

that don't. In this respect, full-period LCGs are the worst - their

probability of a consecutive repeat of the same number is always exactly 0,

regardless of generator period, output sequence length, and distribution order.
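The no-consecutive-repeats property is easy to check: a full-period LCG mod q can never emit the same value twice in a row (that would be a fixed point of the map), while an ideal q-valued source repeats with probability 1/q. A small sketch with illustrative constants:

```python
import random

q = 256
a, c = 21, 11          # Hull-Dobell conditions hold mod 256 => full period
x, prev, repeats = 0, None, 0
for _ in range(q):
    x = (a * x + c) % q
    repeats += (x == prev)
    prev = x
assert repeats == 0    # a repeat would force x == (a*x + c) % q, a fixed point

# An ideal q-valued source repeats consecutively at rate ~1/q.
random.seed(1)
seq = [random.randrange(q) for _ in range(100_000)]
rate = sum(seq[i] == seq[i - 1] for i in range(1, len(seq))) / (len(seq) - 1)
assert 0.002 < rate < 0.006   # close to 1/q ~ 0.0039
```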

The upshot is that the k-distribution is much weaker evidence of randomness

than the behaviour Prof. Marsaglia mentioned.

ross

Jul 17, 2005, 10:10:08 AM7/17/05

to

"Joe Peschel" <jpes...@no.spam.org> wrote in message

news:Xns9695E7F0BEEFE...@216.168.3.44...

> Bodo Moeller <bmoe...@acm.org> wrote in

> news:dbcfor$fqf$1...@lnx107.hrz.tu-

> darmstadt.de:

>

>> While the 2^128 step search talks a very

>> long time indeed, it still is a constant-time attack.

>>

>

> It's a lot quicker when we get it to shut up.

>

All you have to do is compile it with debugging turned off!

Jul 17, 2005, 10:55:48 AM7/17/05

to

ro...@nospam.com wrote:

> "Cristiano" <cristi...@NSquipo.it> wrote:

>> George Marsaglia wrote:

>>> "Cristiano" <cristi...@NSquipo.it> wrote in message

>>> news:QtABe.162196$75.69...@news4.tin.it...

>

> ..... <snip> .....

>

>>>> That could be interesting, but there is the proof that MT has a

>>>> very good k-distribution, for example it is:

>>>> (623,32)-distributed

>>>> (1246,16)-distributed

>>>> (2492,8)-distributed

>>>> (9968,2)-distributed

>>>> which could be useful in some application.

>>>> Please, could you post some value for your generator?

>

> What are you on about? The (623, 32) distribution implies the others

> automatically. This is true whenever we compare larger with smaller

> word sizes in a perfectly packed code, and binary is of course a

> perfectly packed code.

I know that.

At first I thought of posting the entire table of the

k-distribution of MT as in the MT paper; in the end I posted just a few values. What's

the problem?

> Secondly, you should not use this notation. There is a standard

> definition of (m,k) distribution which goes back at least to the 1st

> edition of Knuth's famous book, The Art of Computer Programming

> (1969), and it has nothing in common with your notation.

M. Matsumoto, in his paper, uses the correct terminology: "k-distribution

to v-bit accuracy", but for simplicity I used that "strange" notation.

>> That is good, but the k-distribution is a "stronger" distribution.

>

> Nonsense.

> In the book mentioned above, Knuth put a lot of emphasis on the idea

> that infinite-distribution of a random variate was a strong criterion

> for "randomness". This seems to have led to a widespread belief

> that high orders of k-distribution are enough by themselves to say

> that a sequence of integers is "as close as we can get" to random.

> It is easy to show that this is wrong. (Note that the various

> theorems proved by Knuth applied to continuous variates on [0,1), not

> to integers).

Nonsense? Try to understand why I wrote "stronger" distribution in

quotation marks.

Prof. Marsaglia said:

"[...] the expansions of rationals k/p for large primes p are expected to

pass extensive tests of randomness, in the sense that they may be reasonably

considered the output of a sequence of iid variates taking values in [0,b-1)

with equal probabilities."

which should mean that in the whole period you see all the values from 0 to

b-2 the same number of times.

I said "stronger" because all the good generators output numbers which are

uniformly distributed, but there could be some generator that has a low

order k-distribution.

> The upshot is that the k-distribution is much weaker evidence of

> randomness than the behaviour Prof. Marsaglia mentioned.

Your main problem is that you think I said "high order k-distribution =

randomness", but I've never said that; the only thing I said is that an high

order k-distribution could be useful in some application.

My only means of seeing whether a sequence is random looking is to test it.

So, back to my question: can you tell me, please, the order of the

k-distribution of Prof. Marsaglia's generator?

Cristiano

Jul 17, 2005, 11:24:27 AM7/17/05

to

Why? Many times it's perfectly good to initialize a big state with a simple

generator or with a counter or with a high-resolution timer (like the P4

TSC).

Anyway, here I'm trying to understand why CMWC4096 fails systematically when

I use some particular method.

As I said, extensive testing showed that with the seeding method: Q[i]=

RDTSC() % 0xffffffff, CMWC4096 always fails, which means that there is an

incredibly large number of invalid sequences.

Cristiano

Jul 17, 2005, 1:03:35 PM7/17/05

to

"Cristiano" <cristi...@NSquipo.it> writes:

Yes, I will. It is called the algorithm which is itself a test, with an

unknown element, the key. Since the key is far far shorter than the stream

I have, that key with the algorithm defines a pattern.

Now I agree that it is not easy to find that pattern. It takes perhaps

something like 2^128 attempts, but once I have found it the pattern is

completely predictable and deterministic, and can be extended to an

arbitrarily large piece of output.

Or to use the Chaitin/Kolmogorov definition of randomness, the output is

entirely non-random. The length of the shortest program (the algorithm plus

the key) is far far shorter than the output.

And no matter which sample I take, this is always true for that PRNG. I.e.,

even the average length of the program required to output a large number of

samples is far shorter than the output (which is not true of a true RNG)

>Cristiano
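In the Chaitin/Kolmogorov sense the point is immediate: a few bytes of key plus the fixed algorithm regenerate the whole stream, so the shortest description of megabytes of PRNG output is tiny. A sketch with Python's MT19937-based `random` module (the key value is arbitrary):

```python
import random

key = 0xDEADBEEF   # an arbitrary 4-byte "key"

# Two independent runs of the same short program produce the identical
# megabyte of output: the stream's shortest description is roughly
# "the algorithm plus these 4 bytes", far shorter than the stream itself.
a = random.Random(key).getrandbits(8 * 10**6)
b = random.Random(key).getrandbits(8 * 10**6)
assert a == b
```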

Jul 17, 2005, 2:06:48 PM7/17/05

to

Unruh <unruh...@physics.ubc.ca> writes:

> Or to use the Chaitin/Kolmogorov definition of randomness,

In cryptography, we don't much care about that. We care about

computational tractability.

Jul 17, 2005, 2:37:11 PM7/17/05

to

Althought I substantially agree, I don't have any experience in testing a

generator using the generator itself.

You say that "it takes perhaps something like 2^128 attempts" (and this is

what I call "it is very hard to write a test"), but do you know if it is

just an academic question?

Cristiano

Jul 17, 2005, 6:07:16 PM7/17/05

to

Paul Rubin <http://phr...@NOSPAM.invalid> writes:

And where is the conflict?

Jul 17, 2005, 10:04:22 PM7/17/05

to

Unruh <unruh...@physics.ubc.ca> writes:

> >> Or to use the Chaitin/Kolmogorov definition of randomness,

>

> >In cryptography, we much don't care about that. We care about

> >computational tractability.

>

> And where is the conflict?

Tractability is about what you can do on real computers using a

practical amount of resources. The Kolmogorov/Chaitin complexity of

CPRNG sequence is irrelevant if you don't know the key and have no

practical way to find it.
