Wolfram's rule 30 as a hash function

Rade

unread,

Aug 18, 2010, 8:17:20 PM8/18/10

to

Excuse my ignorance if following proposal was already discussed. The
idea is to use Wolfram's rule 30 as a hash function.

Let length of constant c be a desired length of a hash. Constant c can
be arbitrary chosen. For example if 128 bits hash is required the
constant c may easily be 128 zeros. The string s for hashing is then
concatenated to the constant c to form a starting row r for rule 30; r
= c + s. The row r is then evolved twice row length. For example if c
= 128 bits in length and s = 128 bits in length then evolution is
performed 512 times (column length is 512).

Now the part (length of c) of last row serves as a hash. From above
example the first 128 bits of 512th row is the hash h of the string s.

Comments please.

Rade

Joseph Ashwood

unread,

Aug 21, 2010, 9:03:30 PM8/21/10

to

"Rade" <rade.v...@gmail.com> wrote in message
news:89f8bb91-1751-4da7...@n19g2000prf.googlegroups.com...

> Excuse my ignorance if following proposal was already discussed. The
> idea is to use Wolfram's rule 30 as a hash function.

> Comments please.

Benefits
Simplicity.
Understandability.
Easily proven behaviors.

Problems
Not even close to random.
Simple algebraic structure.
Current S-box function far from random (look at the left most cell
versus new cell, 75% of the time the output is simply toggled).
S-box too small.
Loses entropy quickly.
Fixed input length.
SSSSSSSLLLLLLLOOOOOOOWWWWWWW!!!!!!!
Extremely ordered result.
Highly reversible process (only the lost entropy can't be reversed).

The S-box would have to be changed to something differentially and linearly
strong, but that's just S-box selection, it's a known process. The problems
appear to be too large for the function to work as a cryptographic hash
function.
Joe

Scott Fluhrer

unread,

Aug 22, 2010, 8:30:50 PM8/22/10

to

"Joseph Ashwood" <ash...@msn.com> wrote in message
news:oN_bo.1740$8A2....@newsfe22.iad...

> "Rade" <rade.v...@gmail.com> wrote in message
> news:89f8bb91-1751-4da7...@n19g2000prf.googlegroups.com...
>> Excuse my ignorance if following proposal was already discussed. The
>> idea is to use Wolfram's rule 30 as a hash function.
>
>> Comments please.
>
> Benefits
> Simplicity.
> Understandability.
> Easily proven behaviors.
>
> Problems
> Not even close to random.
> Simple algebraic structure.
> Current S-box function far from random (look at the left most cell
> versus new cell, 75% of the time the output is simply toggled).
> S-box too small.

However, it makes up for the weak sbox by using it a *lot* of times.

> Loses entropy quickly.
That is actually false. It's easy to show that for any two distinct inputs,
the internal state will still differ after c/2 generations. Hence, no
entropy at all will be lost in the first 64 generations (if c=128 as given
by the example by the OP). Entropy might be lost after that point, but it's
not at all clear that it'd be significant.

> Fixed input length.
Nope, the construction is well defined for any input size (for a 1209 bit
input, you use s = 1209).

> SSSSSSSLLLLLLLOOOOOOOWWWWWWW!!!!!!!
That's the real rub. Not only is it slow, it actually takes time quadratic
in the length of the input (!). You really don't want to hash a megabyte
message with this algorithm.

> Extremely ordered result.
Huh?

> Highly reversible process (only the lost entropy can't be reversed).

Huh? Yes, it is feasible to compute possible possible previous generations,
but it's not at all clear how that can be used to compute collisions.

>
> The S-box would have to be changed to something differentially and
> linearly strong, but that's just S-box selection, it's a known process.
> The problems appear to be too large for the function to work as a
> cryptographic hash function.

Real problems (in addition to the slowness that Joe mentioned, which is a
real killer):

- Having the constant c be an all zero bit string doesn't work; consider an
input which consists of all zero bits. In that case, the initial state will
consist of all zeros, and because of how rule 30 works, the final state will
consist of all zeros, and so the hash output will consist of all zeros.
Hence, two strings of all zero bits of differing lengths will hash to the
same value, thus we have a collision.

This can obviously be addressed by selecting c more intellegently; we want
to select a value where the attacker cannot cause the entire hash state to
fall into a simple pattern. Two obvious ways of doing this is selecting a
pattern of 0's with a 1 in the middle (and so you get the rule 30 tree
pattern in the initial states), or just setting it to an arbitrary pattern
(such as the initial bits of the binary expansion of pi).

- This hash function cannot be computed 'online'; that is, computed via one
pass through the message being hashed, using only a bounded pool of memory.
In practice, this is a nice feature, and one which most hash functions (all
Merkle Damgaard hash functions, for example) possess.

--
poncho

unruh

unread,

Aug 23, 2010, 1:09:20 AM8/23/10

to

YOu admit this hash function is extremely slow and unwieldy. You admit
that it is useless for anything but veryshort messages. Yet you continue
discussing it. Why?

Mok-Kong Shen

unread,

Aug 23, 2010, 8:13:11 AM8/23/10

to

Rade wrote:
> .....The idea is to use Wolfram's rule 30 as a hash function.[snip]

A rather OT question: What is special about rule 30 among all
of Wolfram's rules? I am ignorant but I heard once a lecturer
claiming that rule 137 has some distinguished properties
(I barely understood the point due to my very poor knowledge).

M. K. Shen

Pubkeybreaker

unread,

Aug 23, 2010, 9:10:32 AM8/23/10

to

On Aug 18, 8:17 pm, Rade <rade.vucko...@gmail.com> wrote:
> Excuse my ignorance if following proposal was already discussed. The
> idea is to use Wolfram's rule 30 as a hash function.

Excuse *MY* ignorance. What is Wolfram's "rule 30"???

Noob

unread,

Aug 23, 2010, 10:40:13 AM8/23/10

to

Pubkeybreaker wrote:

> Excuse *MY* ignorance. What is Wolfram's "rule 30"???

I don't know, but Google might ;-þ

http://www.google.com/search?q=wolfram+rule+30

Rade

unread,

Aug 23, 2010, 9:39:03 PM8/23/10

to

On Aug 23, 10:30 am, "Scott Fluhrer" <sfluh...@ix.netcom.com> wrote:
> "Joseph Ashwood" <ashw...@msn.com> wrote in message
>
> news:oN_bo.1740$8A2....@newsfe22.iad...
>
> > "Rade" <rade.vucko...@gmail.com> wrote in message

It seems that 2 main objections to the proposed hash function can be
seen from the posts.

1st is the randomness / repetitive nature of rule 30
2nd the efficiency

1st: It is known that rule 30 is behind RNG in Wolfram's Mathematica,
(which may means existence of some "randomness quality') on the other
hand there is some rule 30 initial states which will produce
repetitive strings. The point where is the constant c all zeros is
valid and may be mended as suggested. The logic behind c = all zeros
was: even the input produces some 'non random' repetitive output it
might be still difficult to find collision (well excluding above
mentioned case).

2nd: As someone mentioned the rule 30 is simple and considerable
amount of research was already done which makes present discussion way
easier. The one attribute of the rule 30 is very important : It is
claimed (if the rule 30 is considered as a mapping) that to get output
the quickest way is to run particular state / configuration. That fact
also may means that exhaustive search is very much in the play in
dealing with the almost all aspects of the rule 30. The
http://arxiv.org/abs/1007.4660 paper tries to explain why the rule 30
with simple algorithm behind and with minimum entropy produces
significant complexity. The paper also address ROM (random oracle
methodology) in the way that some assumptions of ROM negative result
may be incorrect (which go in favour to the rule 30). The hypothesis
of the paper also questions use of standard S-box modelling (fixed
structure) in the case of the rule 30. Back to the efficiency, if
inner working of the rule 30 are understand and be indicative of some
algorithmic hashing value it is easy to use 32, 64 ....cells to
address efficiency issue. Also some automata may have fixed amount of
iterations to achieve the same effect as the rule 30 avoiding
quadratic performance dependence.

Rade

Kristian Gjųsteen

unread,

Aug 24, 2010, 2:43:55 AM8/24/10

to

Rade <rade.v...@gmail.com> wrote:
>[...] The
>http://arxiv.org/abs/1007.4660 paper [...]

>The paper also address ROM (random oracle
>methodology) in the way that some assumptions of ROM negative result
>may be incorrect (which go in favour to the rule 30).

Unfortunately, you do not understand the separation results for the
random oracle model.

You need to study more.

--
kg

Rade

unread,

Aug 26, 2010, 9:06:32 PM8/26/10

to

On Aug 24, 4:43 pm, Kristian Gjøsteen <kristiag+n...@math.ntnu.no>
wrote:
> Rade <rade.vucko...@gmail.com> wrote:
> >[...] The
> >http://arxiv.org/abs/1007.4660paper [...]

> >The paper also address ROM (random oracle
> >methodology) in the way that some assumptions of ROM negative result
> >may be incorrect (which go in favour to the rule 30).
>
> Unfortunately, you do not understand the separation results for the
> random oracle model.
>
> You need to study more.
>
> --
> kg

Anyhow, the question remains: Is the function description of the above
hash proposal using the rule 30 is shorter than input?

Quote form the ROM paper eprint.iacr.org/1998/011.pdf:

"Informal Theorem 1.1 There exist no correlation intractable function
ensembles.

Restricted correlation intractability. The proof of the above negative
result relies on the fact that the description of the function is
shorter than the input used in the attack."

Rade

Kristian Gjųsteen

unread,

Aug 27, 2010, 3:34:54 AM8/27/10

to

Rade <rade.v...@gmail.com> wrote:
>On Aug 24, 4:43 pm, Kristian Gjųsteen <kristiag+n...@math.ntnu.no>

>wrote:
>> Rade <rade.vucko...@gmail.com> wrote:
>> >[...] The
>> >http://arxiv.org/abs/1007.4660paper [...]
>> >The paper also address ROM (random oracle
>> >methodology) in the way that some assumptions of ROM negative result
>> >may be incorrect (which go in favour to the rule 30).
>>
>> Unfortunately, you do not understand the separation results for the
>> random oracle model.
>>
>> You need to study more.
>

>Anyhow, the question remains: Is the function description of the above
>hash proposal using the rule 30 is shorter than input?

Why is this question interesting? Don't quote, explain.

(Even if a proof relies on X, something without X is not automatically
interesting.)

>Quote form the ROM paper eprint.iacr.org/1998/011.pdf:
>
>"Informal Theorem 1.1 There exist no correlation intractable function
>ensembles.
>
>Restricted correlation intractability. The proof of the above negative
>result relies on the fact that the description of the function is
>shorter than the input used in the attack."

--
kg

Rade

unread,

Aug 30, 2010, 12:50:23 AM8/30/10

to

On Aug 27, 5:34 pm, Kristian Gjøsteen <kristiag+n...@math.ntnu.no>
wrote:
> Rade <rade.vucko...@gmail.com> wrote:
> >On Aug 24, 4:43 pm, Kristian Gjøsteen <kristiag+n...@math.ntnu.no>

It can be argued that a full description of the proposed hash function
is more complex than function's input. The argument relies on McCabe
complexity of the proposal. A step for one bit in the row to the next
row involves one branching meaning that path complexity of the
proposal is at least 2 ^ 2 ( c + s). That also mean that the proposal
might be correlation intractable.

Rade

Kristian Gjųsteen

unread,

Aug 30, 2010, 2:56:19 AM8/30/10

to

Rade <rade.v...@gmail.com> wrote:

>> Rade <rade.vucko...@gmail.com> wrote:
>> >Quote form the ROM paper eprint.iacr.org/1998/011.pdf:
>>
>> >"Informal Theorem 1.1 There exist no correlation intractable function
>> >ensembles.
>>
>> >Restricted correlation intractability. The proof of the above negative
>> >result relies on the fact that the description of the function is
>> >shorter than the input used in the attack."
>

>It can be argued that a full description of the proposed hash function
>is more complex than function's input. The argument relies on McCabe
>complexity of the proposal. A step for one bit in the row to the next
>row involves one branching meaning that path complexity of the
>proposal is at least 2 ^ 2 ( c + s).

Never mind this at the moment.

> That also mean that the proposal
>might be correlation intractable.

Why?

--
kg

Rade

unread,

Aug 31, 2010, 9:18:26 PM8/31/10

to

On Aug 30, 4:56 pm, Kristian Gjøsteen <kristiag+n...@math.ntnu.no>
wrote:

The long answer might be found at http://arxiv.org/abs/1007.4660

In short the answer goes like this: the paradigm of the function had
the meaning of finding some pattern in some data and encapsulating
that data in the form of the formula or the graph. Traditionally there
is a kind a formula like y = f ( x ) ( for example y = 2 x ). Someone
just plugs input x and there is output y . Therefore multiplying by 2
is function description and every input is treated by the same
transformation recipe ( * 2 ).

There is fundamental difference in the case of the rule 30. There is
no single transformation recipe for mapping input to output. You kind
a have definition of how to do transformation but the composition of
the particular function will depend on the particular path of the
algorithm which is dependant on particular input. That means that rule
30 does define process to the point that process is deterministic, but
only with the input description ( how every input relates to the
particular transformation ) the whole process is fully described.

For the case of y = 2 x , correlation of x is obvious, but for the
rule 30 every x will go through different transformation ( composite
function where order of bit flipping is important ) effectively ruling
out any correlation.

Rade

Kristian Gjųsteen

unread,

Sep 1, 2010, 4:33:23 AM9/1/10

to

Rade <rade.v...@gmail.com> wrote:
>There is fundamental difference in the case of the rule 30. There is
>no single transformation recipe for mapping input to output.

Of course there is. You have a computer program that computes the
transformation. The computer program is the single recipe.

>[...] effectively ruling
>out any correlation.

No. You haven't argued convincingly for that, I'm afraid.

--
kg

Rade

unread,

Sep 1, 2010, 7:57:36 PM9/1/10

to

On Sep 1, 6:33 pm, Kristian Gjøsteen <kristiag+n...@math.ntnu.no>
wrote:

As it was mentioned in previous post: You kind a have definition of

how to do transformation but the composition of the particular
function will depend on the particular path of the algorithm which is
dependant on particular input.

It may be easier to explain the point by 3n+1 problem. The 3n + 1
algorithm is simple and indeed define whole process. But every input
will be processed by different formula. For example for input x = 5
formula is y = ( ( 3 * x ) + 1 ) \ 2 \ 2 \ 2 \ 2 and for input x = 8
formula is y = x \ 2 \ 2 \ 2 and so on.

Therefore to fully describe 3n+1 process, inputs with corresponding
formula should be included.

Correlation example: The numbers are drawn from the hat 7, 4, 5,,
9 ... ; after the numbers are paired in this fashion ( 7 , 4 ) ( 5 ,
9 ) ... . To describe whole process all pairs should be listed because
of the hat thing (meaning representation of pairs can not be
compressed, meaning no correlation). Practically if someone has to
list every input to describe some process, inputs are not correlated.

Rade

Kristian Gjųsteen

unread,

Sep 2, 2010, 6:04:50 AM9/2/10

to

Rade <rade.v...@gmail.com> wrote:
>It may be easier to explain the point by 3n+1 problem. The 3n + 1
>algorithm is simple and indeed define whole process. But every input
>will be processed by different formula. For example for input x = 5
>formula is y = ( ( 3 * x ) + 1 ) \ 2 \ 2 \ 2 \ 2 and for input x = 8
>formula is y = x \ 2 \ 2 \ 2 and so on.

I think you are confusing formula and function.

There's a well-defined function, there's a simple algorithm for computing
the function, but there's no simple formula that describes the function.
So what?

By the way, there is a (complicated) formula for n iterations of Rule
30 on l input bits.

--
kg

JT

unread,

Sep 5, 2010, 5:11:41 PM9/5/10

to

On 1 Sep, 10:33, Kristian Gjøsteen <kristiag+n...@math.ntnu.no> wrote:

Oh yes he did

JT