The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Multidimensional linear cryptanalysis
 There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic. There was an error processing your request. Please try again. Standard view   View as tree
 1 message

From:
To:
Cc:
Followup To:
Subject:
 Validation: For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon.

More options Sep 2 2011, 10:28 am
Newsgroups: sci.crypt.research
From: jdm <james.d.mclaugh...@gmail.com>
Date: 2 Sep 2011 14:28:19 GMT
Local: Fri, Sep 2 2011 10:28 am
Subject: Multidimensional linear cryptanalysis

After a somewhat frustrating time the last few days trying to get my
head round the workings of multidimensional linear cryptanalysis, I
thought I'd post some of my notes here - partly to help out anyone
else interested in this, partly to see if anyone could fill in the
gaps/point out any errors.

(In the below, "TPS" stands for "Target partial subkey", and refers to
the set of key bits the attacker hopes to obtain.)
============================================================
Multidimensional linear cryptanalysis builds on existing work with
multiple linear approximations. Given a set of m linearly independent
approximations, it is able to exploit not only these approximations
but also the 2^{m}-m-1 approximations spanned by them. (Note that I am
not sure how many of these 2^{m}-1 approximations need to have
significant bias; although I do know that the m "base approximations"
do not have to have significant - or even any - bias as long as a
sufficient number of the spanned approximations do.)

For (1 <= i <= m), let u_i denote the input bitmask of the ith base
approximation, and let w_i denote the corresponding output bitmask.
Let U be a matrix with m rows such that the ith row of U is u_i, and
let the matrix W be constructed from the w_i in the same way. Let v_i
denote the bitmask of involved key bits of the ith base approximation,
and let V be the m-row matrix with v_i as its ith row.

<139> describes a multidimensional version of Matsui's Algorithm 1,
recovering m key bits. I am not fully sure of the details, but the
gist of the attack seems to be as follows:

For each known plaintext/ciphertext pair, compute the vector (U \cdot
P) \oplus (W \cdot C).

Increment the m-bit counter corresponding to this value.

Let p denote the probability distribution (which we do not necessarily
know) of (U \cdot P) \oplus (W \cdot C) \oplus (V \cdot K). We may
have to estimate p using Corollary 1 in "Multidimensional Linear
Cryptanalysis of Reduced Round Serpent", using the biases (derived/
estimated by xoring the base approximations and using the Piling-Up
Lemma? Estimated using the biases which occurred during the attack?)
of all 2^{m}-1 approximations.

//"Statistical Tests for Key Recovery Using Multidimensional Extension
of Matsui's Algorithm 1" (Section 4):
The values in the counters allow us to compute q, an estimated
probability distribution for (U \cdot P) \oplus (W \cdot C). I think
this is just done based on Pr_q(i) being estimated as the proportion
of the time value i occurred in the attack.

For each m-bit TPS candidate k, a probability distribution p^k for (U
\cdot P) \oplus (W \cdot C) is estimated (it is stated that every p^k
is a permutation of the distribution p). I am not sure, but I think
this is done by using Corollary 1 of "Multidimensional Linear
Cryptanalysis of Reduced Round Serpent" to estimate the distribution
of (U \cdot P) \oplus (W \cdot C) \oplus (V \cdot k). (and then
perhaps using our knowledge of the (V \cdot k) values to derive from
that an estimated p.d. for (U \cdot P) \oplus (W \cdot C)).

In <139>, the closer the distributions q and p^k are (according to the
Kullback-Leibler distance), the higher key k is ranked. I believe that
the Kullback-Leibler distance is related to the log-likelihood ratio,
that this cannot be applied in all cases, and that in these situations

Section 5.1 of "A New Technique for Multidimensional Linear
Cryptanalysis with Applications on Reduced Round Serpent" suggests an
alternative method. Instead of accepting the k value such that q and
p^k are closest as the most likely candidate, the preferred candidate
is obtained by taking k such that q and p^k differ the most, and then
flipping all its bits. A statistical argument as to why this was the
better method was presented.

"Multidimensional Extension of Matsui's Algorithm 2" and "A New
Technique for Multidimensional Linear Cryptanalysis with Applications
on Reduced Round Serpent" describe the multidimensional version of
Algorithm 2.

for (each plaintext/ciphertext pair)
/*
or optimise using counters for the number of times each value of the
involved text bits appears.
*/
{
for (each TPS candidate c)
{
Partially encrypt/decrypt the involved bits
Compute the vector [U \cdot (exposed input)] \oplus [W \cdot
(exposed output)]

Add 1 to the counter for (that vector with that TPS.)
}

}

What happens next varies. In "Linear Cryptanalysis of Reduced-Round
PRESENT", for each TPS candidate c, the probability distribution of
the [U \cdot (exposed input)] \oplus [W \cdot (exposed output)]
counters is estimated, and the l_{2}-distance between this and the
uniform distribution is computed. The higher the distance, the more
likely c is to be the correct key. It appears that this was a chi-
squared statistic, and that the method described below could not be
applied to PRESENT, as "the distribution of linear approximations in
PRESENT" was claimed to be too key-dependent. I do not understand how
that could have been the case for PRESENT - a very typical SPN - but
not Serpent.

In "Multidimensional Extension of Matsui's Algorithm 2" and "A New
Technique for Multidimensional Linear Cryptanalysis with Applications
on Reduced Round Serpent", something akin to Algorithm 1 is carried
out on the partially decrypted/encrypted data for each TPS candidate
c. The candidate k such that q and p^k as described earlier (except
with exposed approximation input and output instead of plaintext and
ciphertext) differ the most (according to the Kullback-Leibler
distance - or presumably some suitable chi-squared statistic for
PRESENT) is obtained, and this difference is the "score" for c. The
higher the score of TPS candidate c, the more likely it is to be
correct.

"Linear Cryptanalysis of Reduced-Round PRESENT" used nine sets of
approximations. Each of these approximations was of dimension eight,
used the input bits of one particular S-box, and the output bits of
another S-box in a later round. By letting the first four base
approximations be (S-box input bit i = 0) and the remaining four be (S-
box output bit j = 0), the vector [U \cdot (exposed input)] \oplus [W
\cdot (exposed output)] was made equal to the concatenation ((exposed
input)||(exposed output)), saving the time that would otherwise have
been involved in ANDing with bitmasks and calculating parities.

(It also results in an attack that bears so little resemblance to the
previous versions of linear cryptanalysis that -  well, it's linear
cryptanalysis Jim, but not as we know it!)

Sources:
[1] "Linear Cryptanalysis of Reduced-Round PRESENT",
[2] "Multidimensional Linear Cryptanalysis of Reduced Round Serpent",
[3] "Linear Cryptanalysis of Reduced-Round PRESENT",
[4] "Multidimensional Extension of Matsui's Algorithm 2",
[5] "A New Technique for Multidimensional Linear Cryptanalysis with
Applications on Reduced Round Serpent",
[6] "Statistical Tests for Key Recovery Using Multidimensional
Extension of Matsui's Algorithm 1".