RC4 ?

An0nYm0Us UsEr

unread,

Sep 13, 1994, 5:30:36 PM9/13/94

to

SUBJECT: RC4 Source Code

I've tested this. It is compatible with the RC4 object module
that comes in the various RSA toolkits.

/* rc4.h */
typedef struct rc4_key
{
unsigned char state[256];
unsigned char x;
unsigned char y;
} rc4_key;
void prepare_key(unsigned char *key_data_ptr,int key_data_len,
rc4_key *key);
void rc4(unsigned char *buffer_ptr,int buffer_len,rc4_key * key);

/*rc4.c */
#include "rc4.h"
static void swap_byte(unsigned char *a, unsigned char *b);
void prepare_key(unsigned char *key_data_ptr, int key_data_len,
rc4_key *key)
{
unsigned char swapByte;
unsigned char index1;
unsigned char index2;
unsigned char* state;
short counter;

state = &key->state[0];
for(counter = 0; counter < 256; counter++)
state[counter] = counter;
key->x = 0;
key->y = 0;
index1 = 0;
index2 = 0;
for(counter = 0; counter < 256; counter++)
{
index2 = (key_data_ptr[index1] + state[counter] +
index2) % 256;
swap_byte(&state[counter], &state[index2]);

index1 = (index1 + 1) % key_data_len;
}
}

void rc4(unsigned char *buffer_ptr, int buffer_len, rc4_key
*key)
{
unsigned char x;
unsigned char y;
unsigned char* state;
unsigned char xorIndex;
short counter;

x = key->x;
y = key->y;

state = &key->state[0];
for(counter = 0; counter < buffer_len; counter ++)
{
x = (x + 1) % 256;
y = (state[x] + y) % 256;
swap_byte(&state[x], &state[y]);

xorIndex = state[x] + (state[y]) % 256;

buffer_ptr[counter] ^= state[xorIndex];
}
key->x = x;
key->y = y;
}

static void swap_byte(unsigned char *a, unsigned char *b)
{
unsigned char swapByte;

swapByte = *a;
*a = *b;
*b = swapByte;
}

Marcus Leech

unread,

Sep 15, 1994, 1:32:29 PM9/15/94

to

In article <359qjg$55v$1...@mhadg.production.compuserve.com>,
Bob Jenkins <7451...@CompuServe.COM> wrote:
>
>Assuming the posted code is correct, RC4 works like this:
>1 A keybuffer is initialized to a permutation of 0..(2^ALPHA)-1
>2 The user key is used to shuffle this permutation
>3 The encryption algorithm itself methodically swaps each keybuffer
> term with another keybuffer term (chosen randomly), and reports a
> third randomly chosen keybuffer term.
>4 This third term is XORed with the user's message.
>
>So RC4 is just a random number generator, where the results are that
>third randomly chosen keybuffer term.
>

Does this not make it quite susceptible to known-plaintext? The ALLEGED RC4
random "generator" is a 256-byte state-vector with deterministic
transition rules. Since the transition to the next state is *not*
data dependent, then perhaps a known-plaintext attack with
Epsilon*256-bytes of known plaintext could reveal enough about the
state of the generator to determine initial state (the key).
--
Marcus Leech |Any opinions expressed are mine. |+1 613 763 9145
VE3MDL | and not those of my employer |+1 613 567 5484
mle...@bnr.ca | |

Bob Jenkins

unread,

Sep 15, 1994, 11:51:50 AM9/15/94

to

Assuming the posted code is correct, RC4 works like this:
1 A keybuffer is initialized to a permutation of 0..(2^ALPHA)-1
2 The user key is used to shuffle this permutation
3 The encryption algorithm itself methodically swaps each keybuffer
term with another keybuffer term (chosen randomly), and reports a
third randomly chosen keybuffer term.
4 This third term is XORed with the user's message.

So RC4 is just a random number generator, where the results are that
third randomly chosen keybuffer term.

Some useful properties of RC4:
* It's fast. It produces one random value in 19 instructions.
* The method of munging the keybuffer is reversible, so the RNG is a
permutation of states. The possible states are all possible
permutations of the keybuffer cross all possible values of x and y.
Therefore the expected cycle length is ((2^ALPHA)!*(2^(2*ALPHA)))/2,
or about 2^1687 values when ALPHA=8.
* Each value 0..(2^ALPHA)-1 appears exactly once in the key buffer, so
it would be hard for some values to appear more often than others.
* If you assume values chosen pseudorandomly from the array are
independent, then all results are entirely independent. So an
attack can't make that assumption.

A not-so-useful property:
RC4 is biased by about c/n^2, n being the size of the internal state.
ALPHA point of failure bias expected cycle length
2 2^7 2^4 2^6
3 2^18 2^9 2^19
4 2^24 2^12 2^51
5 2^28 2^14 2^126
6 ? 2^32 ? ? 2^16 ? 2^307
7 ? 2^36 ? ? 2^18 ? 2^725
8 ? 2^40 ? ? 2^20 ? 2^1687
If your application can't stand a bias of 1 part per million, then
don't use RC4. Gaps of length 0 (i.e. a a) are too likely and
gaps of 1 (i.e. a b a) are too unlikely.

#define ALPHA 2 (2^22 values)
gap: expect 15 get 172443.0856
1.040738 0.952832 0.833954 0.871353 0.902145 1.722928 ...

#define ALPHA 3 (2^23 values)
gap: expect 31 get 2134.8148
1.018901 0.956675 1.009969 0.994801 1.022663 1.008202 ...

#define ALPHA 4 (2^27 values)
gap: expect 63 get 377.7582
1.002988 0.993389 1.000811 1.001269 1.002527 0.999880 ...

#define ALPHA 5 (2^29 values)
gap: expect 127 get 159.7498
1.000763 0.998885 0.999738 1.001230 1.000181 1.000470 ...

- Bob Jenkins

Hal

unread,

Sep 15, 1994, 5:33:06 PM9/15/94

to

Bob Jenkins <7451...@CompuServe.COM> writes:
>A not-so-useful property:
>RC4 is biased by about c/n^2, n being the size of the internal state.
> ALPHA point of failure bias expected cycle length
> 2 2^7 2^4 2^6
> 3 2^18 2^9 2^19
> 4 2^24 2^12 2^51
> 5 2^28 2^14 2^126
> 6 ? 2^32 ? ? 2^16 ? 2^307
> 7 ? 2^36 ? ? 2^18 ? 2^725
> 8 ? 2^40 ? ? 2^20 ? 2^1687
>If your application can't stand a bias of 1 part per million, then
>don't use RC4. Gaps of length 0 (i.e. a a) are too likely and
>gaps of 1 (i.e. a b a) are too unlikely.

That's very interesting. How did you derive this result? Was it an
empirical measurement as from your tables which follow or was it based
on a calculation? Do you have any theory as to why this PRNG would
have this gap property?

>#define ALPHA 2 (2^22 values)
>gap: expect 15 get 172443.0856
> 1.040738 0.952832 0.833954 0.871353 0.902145 1.722928 ...

Could you explain these numbers? I gather that ALPHA is the log base 2
of the table size. So ALPHA of 2 is a four-entry table, a permutation
of 0..3, along with two variables which themselves range from 0 to 3.
There would be 4*4*24 possible states, or 384. I'm not sure how this
relates to the numbers you show here, or to the table entry above for
ALPHA=2.

Thanks -
Hal Finney

Hal

unread,

Sep 16, 1994, 11:48:56 AM9/16/94

to

Another thing I had noticed about the RC4 listing is a curiosity in the
key setup. Using Bob Jenkins' notation, the algorithm is:

for (i=0; i<SIZE; ++i)
m[i] = i;
for (i=0; i<SIZE; ++i) {
x = m[i];
a = (x+a+k[i])&MASK;
y = m[a];
m[i]=y; m[a]=x;
}

This generates a random permutation of 0..MASK based on the key vector k.
But the interesting thing is that this is very similar to the naive and
incorrect algorithm for generating a random permutation:

for (i=0; i<SIZE; ++i)
m[i] = i;
for (i=0; i<SIZE; ++i) {
x = m[i];
a = random(MASK+1); /* Generate a random number from 0..MASK */
y = m[a];
m[i]=y; m[a]=x;
}

The correct algorithm uses "a = random(i+1)". So this suggests that the
permutation created by the RC4 key setup is somewhat biased.

There may be good cryptographic reasons for the setup that is used. It
is also slightly faster since no divides are needed. One of the results
though is that it may mix up the state array m a little less than would
be ideal. In particular, m[0] will be equal to k[0] about 1/e of the
time, because it will get swapped just the one time. It's not clear how
to exploit this, though.

Hal

Bob Jenkins

unread,

Sep 16, 1994, 1:38:10 PM9/16/94

to

Oops .. I was measuring the number of passes to convergence, not
the number of results. Corrected table, measured in results:

SIZE convergence estimated cycle length (#states/2)
2^1 2^2 2^1
2^2 2^3..2^7 2^8
2^3 2^8..2^16 2^20
2^4 > 2^26 2^51

Jim -- the attack worked like this: assume you've got the correct
internal state. Do the appropriate switch. Check the result to make
sure you're right. When you find out you're wrong, change something
so that it looks like you were right. The critical thing is that,
once it does converge, you never correct anything again so it stays
converged.

The correcting step can map two original states into one. Eventually
all states will converge to one or more roots. One of the roots is
the true internal state, therefore the method might work. Envision it
as a zillion streams (the initial states) that flow into rivers. The
attack succeeds when your stream joins the stream which is the true
internal state. Why would that take sqrt(#states) steps? Dunno.

Another data point: I checked the cycle lengths for various seeds
when SIZE=2^3. In passes, these lengths were 119437, 40265, 6625,
5533, 3629, 1203, and 7. Keybuffer[i]=(i*5+1)%8, y=0, was on a
cycle of length 7. Results = #passes*SIZE. The internal state
can't repeat in less than a pass because it takes one pass for the
counter to repeat.

Cycles of 119437 passes were seen about 3/4 of the time; I suspect
there were two of them. I expected a cycle length of about 161280
because there are 322560 possible states at the end of passes.

- Bob Jenkins

Bob Jenkins

unread,

Sep 16, 1994, 12:03:29 AM9/16/94

to

My estimate of c/n^2 for bias was entirely empirical, and the
numbers I gave for the gap test were copied from my test outputs.
The numbers below "gap" were the number of gaps seen of length
0,1,2,3,4,5,... adjusted for their expected frequency. The 2^22
next to ALPHA=2 is how many values I actually examined; I did
overkill to try to discern the patterns from the random fluxes.

Right, ALPHA=3 means an array of 8 terms with 3 bits per term.
Right, cycle length of 2^6 was a mistake for ALPHA=2. Meant 2^8.

I have another data point: for ALPHA=6 the gap test fails after
seeing 2^33 results. (a a) is too likely, (a b a) too unlikely.
Don't know why. I'll revise my estimate to 2^43 for ALPHA=8.

Just because a sequence of results from an RNG can be obtained
doesn't mean the internal state of the RNG can be deduced easily.
For example, for the quadratic residue generator (BBS), deducing
the internal state (even given arbitrarily many uncorrupted
results) is as hard as factoring. To break a stream cypher, you
need to find an attack to deduce the RNG's internal state!

So here's an attack on rc4 that's faster than testing all possible
internal states, but still not practical: the guess-and-generate
attack. If rc4 is

for (i=0; i<SIZE; ++i)

{
x = m[i];
a = (x+a)&MASK;

y = m[a];
m[i]=y; m[a]=x;

r[i] = m[((x+y)&MASK)];
}

then the attack is to place an arbitrary permutation in g,
make an initial guess of c, guess i correctly, then run

for (i=0; i<SIZE; ++i)
{

x = g[i];
c = (x+c)&MASK;
y = g[c];
g[i]=y; g[c]=x;
for (j=0; j<SIZE; ++j) if (g[j]==r[i]) break;
z = (x+y)&MASK;
g[j]=g[z]; g[z]=r[i];
}

in parallel (the attack using rc4's r's). Eventually g and c will
converge to m and a. Tests reveal "eventually" to be:

SIZE convergence estimated cycle length (#states/2)

2^1 2^1 2^1
2^2 2^1..2^5 2^8
2^3 2^5..2^13 2^20
2^4 > 2^22 2^51

For SIZE=2^8 I imagine this attack, as is, should take 2^900 steps
or so. It seems to succeed after examining about the square root
of all possible states. I'm not very clear on why this attack
works; it probably has something to do with the birthday paradox.

- Bob Jenkins

Hal

unread,

Sep 16, 1994, 5:38:36 PM9/16/94

to

Bob Jenkins <7451...@CompuServe.COM> writes:

>My estimate of c/n^2 for bias was entirely empirical, and the
>numbers I gave for the gap test were copied from my test outputs.
>The numbers below "gap" were the number of gaps seen of length
>0,1,2,3,4,5,... adjusted for their expected frequency. The 2^22
>next to ALPHA=2 is how many values I actually examined; I did
>overkill to try to discern the patterns from the random fluxes.

Thanks. What abou the "expect" and "get" values? Were they some kind
of chi-square or similar statistics? Could you use them to come up with
a confidence level for distinguishing these outputs from random numbers?

> for (i=0; i<SIZE; ++i)
> {
> x = g[i];
> c = (x+c)&MASK;
> y = g[c];
> g[i]=y; g[c]=x;
> for (j=0; j<SIZE; ++j) if (g[j]==r[i]) break;
> z = (x+y)&MASK;
> g[j]=g[z]; g[z]=r[i];
> }

This is an interesting idea. It seems like the problem, as with many
cryptographic attacks, is that it is hard to tell when you are getting
close to the right permutation. Your swaps seem as likely to destroy
a good value as a bad one, unfortunately.

I had thought of a variation in which the state array was initialized
to UNDEFINED values, then you start filling in values which are consistent
with the output data stream. Where you have multiple choices you would
have to try them sequentially. Eventually you will hit an inconsistency
where no assignment of values works given what has been done before, and
at that point you backtrack and change one of the earlier-assigned
values. I suspect, though, that this is not going to have a better
running time than your approach, intractably long for the 2**8 case.

Hal

Michael Paul Johnson

unread,

Sep 16, 1994, 8:18:13 PM9/16/94

to

mle...@bnr.ca (Marcus Leech) writes:
>Does this not make it quite susceptible to known-plaintext? The ALLEGED RC4
> random "generator" is a 256-byte state-vector with deterministic
> transition rules. Since the transition to the next state is *not*
> data dependent, then perhaps a known-plaintext attack with
> Epsilon*256-bytes of known plaintext could reveal enough about the
> state of the generator to determine initial state (the key).

Yes. Quite. Never re-use a session key with this cipher.

-----BEGIN PGP SIGNED MESSAGE-----

A few initial observations on the alledged RC-4 code that was posted here.
This may or may not be valid. Look at the code, yourself, if you like, from
the sci.crypt posting, or at ripem.msu.edu.

1. It is a pure stream cipher, with no dependencies on the data. This means
that it is subject to a trivial known plain text attack if you ever re-use a
key. In practical terms, this means that the only thing you should ever use
a re-usable key for is to encrypt a random (or strong pseudorandom) key at
the beginning of the message. You should then switch to the new session key
for the remainder of the file. This results in some data expansion, but only
by the length of the session key. This is not likely to be a problem in most
applications.

2. I am suprised at (1) how simple the cipher is, and (2) how closely it
resembles another proprietary (but unclassified) cipher that I have seen. The
biggest difference between alledged RC-4 and brand X is that brand X includes
some data dependencies. Brand X is stronger against a plain text attack, and
can be used for data authentication. The plain text attack part is not a big
deal if you heed the caution in observation 1, above. Brand X performs 3
swaps per byte emitted instead of 1, and uses different index manipulation,
but it also uses a 256 byte array and very similar principles of operation.
The 3 swaps are not needed to avoid short stream cycles, but they do make the
authentication properties work.

3. The key expansion of alledged RC-4 and brand X work a little differently.
The alledged RC-4 version is faster than brand X, but is more likely to
generate the same state from a different user key. The key AAAAA is the same
as the key AAAAAAA. The key setup in alledged RC-4 makes no attempt to
differentiate between different sized keys, but it does in brand X.

4. The approximate probability of two user keys of the same length producing
the same permutation of the state array increases with the length of the key
from something rather small (about 10^-20) at a normal key size of 16 bytes
(128 bits), to about half for the maximum key size supported (256 bytes).
This is interesting, but not a great help in a brute force attack, since this
effect doesn't become significant when using keys small enough for brute
force attacks to be practical. It is also not clear to me how this reduction
in effective key size is of much value to a cryptanalyst.

5. The algorithm, though simple, is complex enough to avoid short cycles (as
far as I could tell).

6. The worst case for similar keys with slight differences yielding similar
outputs is, of course, the case where the same state permutation is created.
The next worst case seems to be for long keys that differ in the last byte.
Because of this, and because of the effects noted in (4), above, this cipher
is probably not worth using key lengths longer than about 32 bytes. Longer
keys, like pass phrases, might be stronger against partial leaks of data when
processed through a good hash function (like SHA or MD5) than they would be
if used directly. This would help make similar keys with slight differences
(i. e. one capitalization different at the end of the key) generate
drastically different initial states. Hashing the key before passing it to
the alledged RC-4 key expansion function is very important if you plan to
intentionally cripple this otherwise good cipher to a key size of 40 bits.

8. Two obvious brute force attacks are apparent: the processing intensive
and the storage intensive. The processing intensive attack computes internal
keys (state array permutations) each time. The storage intensive attack
precomputes all of the state arrays in advance for the key length of
interest. It is speculated that 40 bit (5 byte) keys for this algorithm are
within easy reach of NSA machinery for one or both of these, given the
current state of the US export regulations. 16 byte keys are better, unless
you want to export a product embedding this algorithm.

9. The algorithm appears to be carefully constructed to balance the output
statistics.

10. Other than what is mentioned above, I don't yet see an obvious attack
better than brute force. The most important part is to heed the warning not
to re-use any session key, thus making a known plain text attack trivial.
If you do that, then this algorithm may be suitable for use where speed is
paramount.

___________________________________________________________
| |
|\ /| | | Michael Paul Johnson Colorado Catacombs BBS 303-772-1062 |
| \/ |o| | PO Box 1151, Longmont CO 80502-1151 USA Jesus is alive! |
| | | / _ | m...@csn.org aka m...@netcom.com m.p.j...@ieee.org |
| |||/ /_\ | ftp://ftp.csn.net/mpj/README.MPJ CIS: 71331,2332 |
| |||\ ( | ftp://ftp.netcom.com/pub/mpj/README.MPJ -. --- ----- ....|
| ||| \ \_/ | PGPprint=F2 5E A1 C1 A6 CF EF 71 12 1F 91 92 6A ED AE A9 |
|___________________________________________________________|

-----BEGIN PGP SIGNATURE-----
Version: 2.7

iQCVAgUBLnozufX0zg8FAL9FAQHtJQQA6n9AkCsMl/AO5lW2cjKvFp/dTNyKNs++
LRsUI1s3iq2JTh/Ib2IQ0SuIxUjBosvpLnQN7qNagU2631G3eJOplY45ASJfoZhP
OvD3g1K/iHGV9m9Q302dCqJGCJtH1ai8//MiRgpCjL2Kac/KaKrdYL0rB0Nwd61i
tvmSVQpCc9w=
=BC7S
-----END PGP SIGNATURE-----

Hal

unread,

Sep 18, 1994, 12:31:58 PM9/18/94

to

I thought last night I had found a possible cyclic behavior in the
alleged RC4 algorithm, but it turns out it can't happen.

If we ever entered the top of the loop with y==x+1 and state[y]==1, there
would be a problem. x would be incremented so state[x]==1. Then y
gets state[x] added to it so it becomes x+1 again. Then we swap state[x]
and state[y] so state[y] is 1 again. This relationship therefore becomes
invariant.

However, for the same reason, it can't happen. Entering the loop,
state[y] always holds what y was just incremented by in the previous
iteration. If that is a 1 that means y was incremented by 1 the previous
time, which means it entered the previous loop as y==x+1, and similarly
state[y] would have had to be 1 on that previous iteration. So unless
the algorithm starts in this state it can't fall into it. Since the
algorithm starts with y==x and not y==x+1 it can't start in that state.
Hence it never happens.

It is interesting that it is the pre-increment of x in the loop which
makes sure it can't start in this state. If x were incremented at the
bottom of the loop rather than the top then if state[0] were 1 at
startup this problem would occur. So this behavior was probably known
to the designers and guarded against.

So, about the only thing this tells you is that when state[x]==1 in the
calculation, y is never x+1. Maybe this could provide some subtle
information to help a cryptanalyst but it doesn't seem very useful.

I wonder if there are any other particular relations between x, y, and
the state array which might produce unusual patterns of behavior.

Hal Finney

Hal

unread,

Sep 22, 1994, 1:54:38 PM9/22/94

to

Bob Jenkins <7451...@CompuServe.COM> writes:

>I counted how many times each value of s[x] (m[x], keybuffer[x])
>appeared when a gap of length 0 occurred. As you said, the value
>0 should appear not very often at all. I did it for (SIZE=2^4).

>nju: expect 15 get 66427.0738
> 7827 61405 61564 61211 61806 61061 61113 61086
> 60688 61720 61362 61525 61728 61798 61420 118734

>This means 0 appeared 7827 times, 1 appeared 61405 times, ... and
>SIZE-1 appeared 118734 times. There's your compensating value.
>If you get a gap of 0, then there's a 1/SIZE bias in favor of m[i]
>having been -1.

Very interesting result! So s[x] = 255 is twice as likely to produce
repetitions as other values. That is probably more useful than my
result that s[x] = 0 cannot produce reps.

I can explain this now that Bob has discovered it. Suppose on one
iteration we have:

i s[i]

x 255
y b

Then we output s[b+255]. We swap them, increment x, then add s[x] to
y. But s[x] will be 255 again due to the swap. So y decrements
and we have:

i s[i]

y b
x 255

and once again we output s[b+255].

Generally, when s[x] = 255, random values of y will result in a
repeated value 1/256 of the time. Plus, from this, every time y = x+1
we get a guaranteed repetition. So combining these gives an average
of 2 repetitions every 256 times s[x] = 255.

Hal