Say you have a black box which performs RC4 encryption using a key
whose value is unknown, but whose bit-length is known - say 56, or
whatever.
Say you can encrypt any amount of known plaintext (not chosen
plaintext) with that black box, & get the ciphertext; and you can do
this with different known plaintexts, however many times you wish to.
Is there any way, in this scenario, to determine the key by any method
faster than a brute-force search?
I suspect not - but being a crypto bimbo, could someone confirm?
Thanks,
TC
>Say you have a black box which performs RC4 encryption using a key
>whose value is unknown, but whose bit-length is known - say 56, or
>whatever.
>
>Say you can encrypt any amount of known plaintext (not chosen
>plaintext) with that black box, & get the ciphertext; and you can do
>this with different known plaintexts, however many times you wish to.
>
>Is there any way, in this scenario, to determine the key by any method
>faster than a brute-force search?
XOR the plaintext against its ciphertext
counterpart to get the RC4 key stream.
You can use that stream to decrypt any
ciphertext message until the key is
changed.
J
__________________________________________
Joe Peschel
D.O.E. SysWorks
http://members.aol.com/jpeschel/index.htm
__________________________________________
Joe, I know that already. But I want to recovery the key itself, so I
can use the key for subsequent encryption/decryption, without having
to carry around an arbitrarily large saved keystream.
Possible?
TC
>jpes...@aol.commune.net (Joe Peschel) wrote in message
>news:<20021002000018...@mb-mq.aol.com>...
>> ke...@aussiemail.com.au (TC) writes:
>>
>> >Say you have a black box which performs RC4 encryption using a key
>> >whose value is unknown, but whose bit-length is known - say 56, or
>> >whatever.
>> >
>> >Say you can encrypt any amount of known plaintext (not chosen
>> >plaintext) with that black box, & get the ciphertext; and you can do
>> >this with different known plaintexts, however many times you wish to.
>> >
>> >Is there any way, in this scenario, to determine the key by any method
>> >faster than a brute-force search?
>>
>> XOR the plaintext against its ciphertext
>> counterpart to get the RC4 key stream.
>> You can use that stream to decrypt any
>> ciphertext message until the key is
>> changed.
>
>Joe, I know that already. But I want to recovery the key itself, so I
>can use the key for subsequent encryption/decryption, without having
>to carry around an arbitrarily large saved keystream.
The best attack I can think of to recover
the original RC4 key is to mount a
dictionary attack. David Wagner has done
quite a bit work with RC4; he might know
if there is a way to derive the key from
the key stream.
Maybe he'll tell us.
I am curious why you would want to continue
to use the same RC4 key, though.
Indeed! The first thing I learned about how to use RC4 safely is to
never use the same key twice..!
Because the product that I am (ahem) "investigating" does precisely
that!
It encrypts data in fixed-size pages. Each page gets a different key,
but all of these keys are identical from one run to the next. That is,
on run #1, page #1 is encrypted with key A, page #2 is encrypted with
key B, page #3 is encrypted with key C, & so on. A, B and C are
related in some deterministic manner. On run #2, pages #1, #2 & #3 get
the same three keys A-C.
I want to try & get the actual key (A) corresponding to the fixed
keystream for page #1. If I can do that, I'll get key (B) for page #2,
and key C for page #3, then try to determine the relationship (f)
between those 3 keys. If I can find that relationship, I can decrypt
any arbitrary page 'n' by using key f(A,n).
Cheers,
TC
That scheme is broken, there are several key-recovery attacks on that scheme
that are faster than exhaustive key search.
Previously you mentioned that the scheme used the RC4 stream cipher, the key
size was 64 bits and that the plaintext was known to you. With those
assumptions, I can think of three efficient key-recovery attacks on the
scheme you describe above, all of which are likely to be feasible for even
a weak adversary: a key-collision attack, a time/memory/data tradeoff
attack, and the Fluhrer/Mantin/Shamir attack.
ATTACK: Key-collision attack
The first attack is a key-collision attack [1]. This attack requires
approximately 2^40 work, 2^24 memory and 2^24 ciphertext "pages". It does
not require the adversary to have any knowledge of the relationships between
the different keys used to encrypt the pages.
1) Run the scheme once, producing 2^24 known plaintext/ciphertext
pairs. Each pair corresponds to a distinct "page" in the scheme.
2) Use the known plaintext to compute the keystream from each of
the 2^24 plaintext/ciphertext "page" pairs. Store a prefix of
each of the 2^24 keystream samples in memory (more than 64 bits).
3) Choose a random key K' and compute C = RC4(K).
4) Check whether C appears in memory. If so, the key K is
expected to be the key which encrypted the page that corresponds
to C.
5) If no match is found, return to step 3 and keep searching.
The first match is expected to appear after about 2^40 cycles.
ATTACK: Time/Memory/Data tradeoff with KSW sampling
The second attack is a time/memory/data tradeoff attack exploiting the low
sampling resistance of RC4. A time/memory/data tradeoff from [2] on 64-bit
RC4 using KSW sampling and the RC4 invariance weakness from [3] requires a
one-time precomputation of about 2^46 work followed by an attack using 2^24
work, 2^32 memory and 2^20 ciphertext/plaintext "page" pairs to recover a
key. It does not require the adversary to have any knowledge of the
relationships between the different keys used to encrypt the pages.
Note that the precomputation is independent of the actual keys, the
adversary only needs to do it once. Once the precomputation is complete, it
is relatively inexpensive to recover a key. One can even imagine a more
capable adversary performing the precompuation and making the results
public, thereby enabling even weak adversaries.
ATTACK: Fluhrer/Mantin/Shamir attack
The third attack is the attack by Fluhrer, Mantin, and Shamir on WEP like
systems. Their attack exploits the weaknesses in the RC4 key schedule. You
didn't mention the details of the deterministic relationship between the
keys, but assuming that the relationship is appropriate, this attack should
be feasible. I estimate this attack would require a single "run" producing
approximately 2^23 "pages" and very little work. See the original paper [3]
for the details.
REFERENCES
[1] Biham, "How to Forge DES Encrypted Messages in 2^28 Steps", Technion
Computer Science Department Technical Report CS0884, 1996.
[2] Biryukov and Shamir, "Cryptanalytic Time/Memory/Data Tradeoffs for
Stream Ciphers", ASIACRYPT 2000, LNCS 1976, 2000, pp. 1-13.
<http://www.wisdom.weizmann.ac.il/~albi/ac00tradeoff.ps>
[3] Fluhrer, Mantin and Shamir, "Weaknesses in the Key Scheduling
Algorithm of RC4", Eighth Annual Workshop on Selected Areas in
Cryptography, SAC 2001, 2001.
<http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/Rc4_ksa.ps>
-Richard
Richard
Within a day of reading your post, I had coded-up attack #1 (key
collision attack), run it overnight, & determined 3 of the relevant
RC4 keys - on a 133Mhz pentium running everything (including RC4) in
interpreted Microsoft Visual Basic For Applications (VBA)!
I estimate that an exhaustive key search would take 3000 hours on my
PC, so your approach was 3 orders of magnitude faster (7 hours versus
3*3000). I must say I am astonished that guessing keys at random,
could produce such a great result. I would not have thought of that
approach, in a gazillion years.
In case anyone's interested, here are the details.
First, the key length. I did not actually say that it was 56 bits. I
said that it was "e.g." 56 bits. This was an obfuscation technique,
designed to hide my embarassment about the actual length: 32 bits! I
suppose you guys could brute-force a 32-bit keyspace in a quadillionth
of a second - but I can't.
Regarding known plaintexts, I said that I could generate any # of
these, together with their corresponding ciphertexts. In fact, a
technical problem made this difficult. I eventually settled for 3600
known plaintext/ciphertext pairs, each of 2Kb in length. That is, 3600
known RC4 keystreams, each of 2Kb in length. Generating these streams
took less than an hour of coding & testing.
Regarding the keystream "prefixes", I decided to start by storing only
the leading 2 bytes, so I could bang them into numeric variables & use
numeric comparions (which are 10x faster that string comparisons on my
PC - even for 2-byte strings).
The numbers above are way below the numbers in your description of
attack #1. However, I had to start somewhere!
Thus we have:
For n = 1 to 2600:
set list(n) = first 2 bytes of keystream(n).
Do forever:
create a random 32-bit key;
generate the first 2 bytes of keystream for that key;
check the list for those bytes;
if found at entry n:
regenerate full 2Kb of keystream(n);
generate full 2Kb keystream for key;
if the two are identical, WE HAVE A KEY!
if not, we have a false positive.
When I ran this up, it got 180 hits in 5000 trials - all false
positives. This seemed a bit high, so I increased the keystream prefix
length from 2 bytes, to 3. I then got only 6 hits in 50,000 trials
(again, all false positives). This seemed fair, so I "let it rip", and
went to bed.
When I woke up next morning, the code had done 4 million trials in 7
hours, and found 3 matching keys!
Two of those keys were very similar. I wrote some code to iterate a
particular byte of one of them, through all the 256 values. Voila: 198
extra matching keys! A short time later I determined the relationship
between the page number, and the encryption key. Page #1 was always
encrypted with the same base key. Subsequent pages were encrypted
using a different key determined in a simple manner from the base key,
and the page number. Using this information I then wrote a small
procedure which replicated the encryption & decryption feature of the
"black box" product.
In retrospect, it all seems so simple that I can't help feeling I must
have missed something! However, "the proof of the pudding is in the
eating." My procedure successfully decrypts files encrypted by the
"black box" product.
In summary of my implementation of the key collision attack:
Given 2600 known 2Kb keystreams, and a 133Mhz machine running an
interpreted language, it finds matching RC4 keys at a rate of about {4
million trials / 7 hours} per key; whereas an exhaustive key search is
estimated to take 3000 hours (worst case) per key (on that PC).
Richard, thanks very much for your help.
TC
>Richard Parker <ric...@electrophobia.com> wrote in message news:<B9C41429.4037F%ric...@electrophobia.com>...
<snip informative text>
>
>In summary of my implementation of the key collision attack:
>Given 2600 known 2Kb keystreams, and a 133Mhz machine running an
>interpreted language, it finds matching RC4 keys at a rate of about {4
>million trials / 7 hours} per key; whereas an exhaustive key search is
>estimated to take 3000 hours (worst case) per key (on that PC).
>
>Richard, thanks very much for your help.
>
>TC
Thanks for the play-by-play on your implementation of Richard's help,
an excellent dissemination of info.
Good show.
Thanks Yama.
You may be interested to hear what happened from there.
The black box product in question, comes in three versions. I easily
cracked v1, using the methods outlined so far. I then cracked v2,
using the same methods. (v2 added some obfuscation to the key
generation step, but it was still quite simple to figure out, once the
Key Collision Attack gave me some keys.)
v3, however, defeated all attacks. I ran tens of millions of Key
Collision trials, & got nothing. I generated new keystreams, & tried
again (several times). I found out about the byte #2 distinguisher for
RC4, and checked for that in my saved keystreams. The v1 & v2
keystreams seemed to show the bias (ie. byte #2 is zero 1/128 of the
time, instead of 1/256 of the time), but the v3 keystreams did not -
so I thought, maybe they've gone to RC4(DROP)? So I re-ran all the
trials (on my trusty 133Mhz Pentium running Visual Basic for
Applications!) using RC4(DROP-256, 512 & 768) on every key.
Still no go! So I re-ran it all again, this time generating the first
8 bytes of keystream for each random key, and checking for that
keystream *anwhere* in each 2Kb saved keystream: ie. not just starting
at byte 1, 257, 513 or 769.
*Still* no go!!
Then, gazing wearily at some of my code, I noticed:
----------------------------------------------
' generate random 32-bit key.
sKey = Chr$(Int(Rnd*255)) & Chr$(Int(Rnd*255)) & _
Chr$(Int(Rnd*255)) & Chr$(Int(Rnd*255))
----------------------------------------------
OUCH!! (Do you see it?)
Cheers,
TC
If Rnd is uniform in 0.0 .. 1.0, and Int() truncates, don't you want
to use 256 rather than 255?
--Mike Amling
Precisely! In VBA, Rnd() returns a random real value between 0 and 1 -
including 0, but *excluding* 1. So none of my tests so far had
processed any keys with &hFF bytes. I'm now re-running everything on
random keys like FFxxxxxx, xxFFxxxx, xxxxFFxx and xxxxxxFF.
Cheers,
TC
Yama, if you read this, here's the final installment.
v3 succumbed to the blowtorch last night. Thus, my mission on earth is
now complete, & I can re-ascend to the heavens. But before I go,
here's another snafu with Rnd()!
I run the following code in a loop to generate random 32-bit keys.
Depending on options selected, the loop executes up to 16 million
times in 8 hours on my slow PC. I can only run it for 8 hours at a
time, because I'm not prepared to leave it running during the day.
sKey = Chr$(Int(Rnd*256)) & Chr$(Int(Rnd*256)) & _
Chr$(Int(Rnd*256)) & Chr$(Int(Rnd*256))
What is wrong with this happy picture?
... leave ...
... space ...
... for ...
... earthlings ...
... to ...
... consider ...
... their ...
... answers ...
Oops. I wonder what the repeat count of VBA Rnd() might be?
2^24! (and I don't mean factorial)
... must keep temper ... must keep temper ...
I call Rnd() 4 times per random key, so the keys repeat after 2^22 (4
million) values. I could have run the loop 160 *gazillion* times, & it
still would only have tested 4 million keys. I don't think 4 million
is a very good proportion of a 32-bit keyspace!
Any readers using VBA (or similar BASIC variants), can easily confirm
their own repeat counts, in a few seconds, with something like the
following:
dim s1 as string, s2 as string, x as long
for x=1 to 1000: s1 = s1 & " " & int(rnd*256): next
for x=1001 to 2^24: rnd: next
for x=1 to 1000: s2 = s2 & " " & int(rnd*256): next
debug.print s1=s2
Bye all,
TC