89 views

Skip to first unread message

Dec 29, 2008, 9:25:02 PM12/29/08

to

Howdy,

I've recently come across a broken Blowfish implementation.

A correct Blowfish implementation has the Feistel/round-function:

F(a,b,c,d) = ((S_1[a] + S_2[b]) ^ S_3[c]) + S_4[d]

where a,b,c,d are 8 bits and S_1 to S_4 are the four S-boxes.

The broken Blowfish implementation has, instead:

F(a,b,c,d) = ((S_1[a] + S_1[a]) ^ S_1[a]) + S_1[a]

So only one of the four S-boxes / 8 bit values is actually used

during encryption/decryption, and the range of F is considerably

reduced.

How weak is this broken Blowfish implementation? I can't

wrap my head around the key schedule so my attempt at

cryptoanalysis has failed. If anyone here can see some

obvious weaknesses I'd like to here about them.

Björn

Dec 29, 2008, 10:22:04 PM12/29/08

to

Björn Edström wrote:

>I've recently come across a broken Blowfish implementation.

>A correct Blowfish implementation has the Feistel/round-function:

>

>F(a,b,c,d) = ((S_1[a] + S_2[b]) ^ S_3[c]) + S_4[d]

>

>where a,b,c,d are 8 bits and S_1 to S_4 are the four S-boxes.

>

>The broken Blowfish implementation has, instead:

>

>F(a,b,c,d) = ((S_1[a] + S_1[a]) ^ S_1[a]) + S_1[a]

>I've recently come across a broken Blowfish implementation.

>A correct Blowfish implementation has the Feistel/round-function:

>

>F(a,b,c,d) = ((S_1[a] + S_2[b]) ^ S_3[c]) + S_4[d]

>

>where a,b,c,d are 8 bits and S_1 to S_4 are the four S-boxes.

>

>The broken Blowfish implementation has, instead:

>

>F(a,b,c,d) = ((S_1[a] + S_1[a]) ^ S_1[a]) + S_1[a]

For what it's worth (which is not very much), I don't immediately see

any plausible way to exploit this on the full Blowfish, though I haven't

spent any serious amount of time thinking about this. 16 rounds, with

key-dependent S-boxes, forgives many sins.

The original Blowfish paper writes:

I used four different S-boxes instead of one S-box primarily to avoid

symmetries when different bytes of the input are equal, or when the

32-bit input to function F is a bytewise permutation of another 32-bit

input. I could have used one S-box and made each of the four different

outputs a non-trivial permutation of the single output, but the four

S-box design is faster, easier to program, and seems more secure.

[...]

If the four indexes chose values out of the same S-box, a more complex

combining function would be required to eliminate symmetries. I

considered using a more complex combining function in Blowfish (using

modular multiplications, rotations, etc.), but chose not to because

the added complication seemed unnecessary.

http://www.schneier.com/paper-blowfish-fse.html

This certainly seems like good design, but I don't immediately see any

attack that would be worth worrying about in practice.

If I controlled the broken code and were in charge of defense, I'd

probably schedule a fix to this code at some convenient future version

of the software, to be safe -- but I wouldn't issue an emergency patch.

If I were in charge of attacking such a system, I suppose I'd spend more

time thinking about possible cryptanalytic attacks, but my first instinct

is that I'm not too optimistic this will lead to a practical attack.

If I were attacking the system, I'd probably look for other kinds of

implementation bugs (see, e.g., http://www.schneier.com/blowfish-bug.txt)

as well as other ways to defeat the system security without breaking

the crypto head-on.

Dec 30, 2008, 10:46:18 AM12/30/08

to

In article <gjc44s$11n8$1...@agate.berkeley.edu>,

David Wagner <daw-...@taverner.cs.berkeley.edu> wrote:

>Björn Edström wrote:

>>I've recently come across a broken Blowfish implementation.

>>A correct Blowfish implementation has the Feistel/round-function:

>>

>>F(a,b,c,d) = ((S_1[a] + S_2[b]) ^ S_3[c]) + S_4[d]

>>

>>where a,b,c,d are 8 bits and S_1 to S_4 are the four S-boxes.

>>

>>The broken Blowfish implementation has, instead:

>>

>>F(a,b,c,d) = ((S_1[a] + S_1[a]) ^ S_1[a]) + S_1[a]

David Wagner <daw-...@taverner.cs.berkeley.edu> wrote:

>Björn Edström wrote:

>>I've recently come across a broken Blowfish implementation.

>>A correct Blowfish implementation has the Feistel/round-function:

>>

>>F(a,b,c,d) = ((S_1[a] + S_2[b]) ^ S_3[c]) + S_4[d]

>>

>>where a,b,c,d are 8 bits and S_1 to S_4 are the four S-boxes.

>>

>>The broken Blowfish implementation has, instead:

>>

>>F(a,b,c,d) = ((S_1[a] + S_1[a]) ^ S_1[a]) + S_1[a]

Note that these not only use the same s-box, but

the same byte input!

>For what it's worth (which is not very much), I don't immediately see

>any plausible way to exploit this on the full Blowfish, though I haven't

>spent any serious amount of time thinking about this. 16 rounds, with

>key-dependent S-boxes, forgives many sins.

Seems to me that it has horrible differential

cryptanalysis properties; changes to any of the

bytes other than "a" propagate straight through.

Also, I have a feeling that F, as described, has a

very limited range. (But I haven't checked.)

Greg.

--

Greg Rose

232B EC8F 44C6 C853 D68F E107 E6BF CD2F 1081 A37C

Qualcomm Australia: http://www.qualcomm.com.au

Dec 30, 2008, 6:45:12 PM12/30/08

to

Björn Edström wrote:

> The broken Blowfish implementation has, instead:

>

> F(a,b,c,d) = ((S_1[a] + S_1[a]) ^ S_1[a]) + S_1[a]

> The broken Blowfish implementation has, instead:

>

> F(a,b,c,d) = ((S_1[a] + S_1[a]) ^ S_1[a]) + S_1[a]

Greg Rose wrote:

> Note that these not only use the same s-box, but

> the same byte input!

Eek. This is terrible! Thanks, Greg. I don't know how I

missed that; I guess I need to read more carefully.

So this looks a severe weakness with serious implications.

Observations: The only non-linearity here comes from applying the S-box

to the a-part of the left side and the right side. The rest is entirely

linear. That means that the ciphertext depends non-linearly on 16 bits

of the plaintext and linearly on 48 bits of the plaintext. Not good.

Also, the F-function does indeed have a reduced range: the low 2 bits

of the output of the F-function will always be zero, so 2 bits of each

side of the plaintext are not affected by any nonlinearity.

Attack 1: The low 2 bits of the left part of the ciphertext will be the

low 2 bits of the left part of the plaintext, xor a fixed 2-bit value

(this fixed 2-bit value is determined solely by the key). Same for

the low 2 bits of the right part of the text. So if you have one known

plaintext, you can derive these 4 bits of key material, and then every

subsequent ciphertext leaks 4 bits of information about the plaintext.

Depending upon how the application uses this cipher, this alone could

potentially be devastating. There are variants of this attack that may

apply even if you do not have any known plaintext but if you have a good

statistical model of those 4 bits of the plaintext.

Attack 2: Suppose that we fix the a-part of the left and right side

of the plaintext (16 bits fixed), and let the remaining 48 bits of

the plaintext vary. Then the a-parts of the left and right side of the

ciphertext will stay the same, and the remaining 48 bits of the ciphertext

will be just the xor of the 48 plaintext bits against some fixed value

that depends only upon the key and the 16 fixed bits of plaintext.

Suppose we see a ciphertext C, whose decryption P is unknown. Suppose we

can find some other known plaintext/ciphertext pair (P',C') such that

the a-parts of C match the a-parts of C' (a 16-bit condition). Then we

can conclude that the a-parts of P match the a-parts of P', so we learn

16 bits of information about the decryption of C. Also, the remaining

48 bits just involve an xor with a constant that is the same for both

(P,C) and (P',C'), so we can recover the remaining 48 bits of P trivially

as well (it's just the 48 bits C xor P' xor C'). So if we can find a

known text-pair whose ciphertext matches C in 16 bit positions, then

we can decrypt C. It follows that if we have a collection of around

2^16 known text pairs, then we'll be able to decrypt most ciphertext we

subsequently see. If we have only a fraction of that many known texts, we

can decrypt a corresponding fraction of the subsequent ciphertexts we see.

Attack 3: Suppose we see a lot more than 2^16 ciphertexts, and we have

some statistical model on the corresponding plaintexts. Then we may be

able to decrypt all of the traffic. For instance, the 16-bit a-parts

of each ciphertext are encrypted under a simple substitution cipher,

so we may be able to use frequency analysis to recover some information

about that. Likewise, we can partition the ciphertext into buckets,

where the ciphertexts in each bucket all have the same 16-bit a-parts.

Then within a bucket, the other 48 bits are "encrypted" simply by xor-ing

with a fixed constant (the same constant for all ciphertexts in a bucket,

but a different constant in each bucket). If the plaintexts follow some

statistical distribution, say, that of English, and if we have enough

ciphertexts in each bucket, it may be able to recover cleartext.

Summary: If I haven't made any mistakes in the analysis above, this

cipher looks broken. If I were a defender in charge of software that

used this cipher, I'd do an urgent investigation to examine the impact:

the impact could be significant and might well warrant an emergency patch,

depending upon the circumstances.

Dec 30, 2008, 9:10:28 PM12/30/08

to

David Wagner wrote:

> Björn Edström wrote:

>> The broken Blowfish implementation has, instead:

>>

>> F(a,b,c,d) = ((S_1[a] + S_1[a]) ^ S_1[a]) + S_1[a]

>

> Greg Rose wrote:

>> Note that these not only use the same s-box, but

>> the same byte input!

>

> Eek. This is terrible! Thanks, Greg. I don't know how I

> missed that; I guess I need to read more carefully.

>

> So this looks a severe weakness with serious implications.

>

> Björn Edström wrote:

>> The broken Blowfish implementation has, instead:

>>

>> F(a,b,c,d) = ((S_1[a] + S_1[a]) ^ S_1[a]) + S_1[a]

>

> Greg Rose wrote:

>> Note that these not only use the same s-box, but

>> the same byte input!

>

> Eek. This is terrible! Thanks, Greg. I don't know how I

> missed that; I guess I need to read more carefully.

>

> So this looks a severe weakness with serious implications.

>

Thanks for the analysis. I have informed the software

vendor of the problem.

Björn

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu