Dear PALOMA team, dear all,
As just shown at the KpqC workshop, we can mount a fast CCA2 attack on the
PALOMA KEM as described in the PALOMA specification (chapter 3 of the
submission document). The attack also works against the latest reference
software (the software obtained by starting from the submission package and
applying the patch described by the PALOMA authors in their email sent Tue,
23 Apr 2024 02:10:37 -0700).
The attack recovers the session key (shared secret) given a ciphertext, a
public key, and a decapsulation oracle for other ciphertexts. The attack takes
1 query to RO_G and O(n) queries to the decapsulation oracle and to RO_H. The
decapsulation oracle is used only for comparison to guessed session keys, so
this attack can also be used as a reaction attack using observations of whether
a server successfully responds to data that the attacker encrypted under those
guessed keys.
We implemented this attack, calling subroutines from the reference software. Our
experiments worked for all 100 KATs for PALOMA-128, taking around 30 seconds for
each session-key recovery on one core of a 3GHz Intel Skylake. To reproduce the
experiments, make a new directory, save the three attached files in it, and run
"sh analyze-paloma-2.sh". This will download the PALOMA software, apply the
patch from the PALOMA team from the 23 April email, and re-run the experiments.
Tweaking the script to attack PALOMA-256 takes under 7 minutes per session-key
recovery. Note that the PALOMA software requires several GB of disk space for
the KATs.
Together with Alex Pellegrini, we had previously found and implemented a
reaction attack against the software in the original submission package,
exploiting a deviation of the software from the specification. That attack did
not work against the specified version of PALOMA. The software deviation was
then fixed by the patch mentioned above.
For the specified PALOMA KEM, we also have a new partial-key-recovery attack
which in one call to RO_G and at most n calls to the decapsulation oracle and to
RO_H can determine whether 0 is one of the support elements in L and, if so, which
column it belongs to in the public matrix. We implemented this and found that
this causes the reference software to enter an infinite loop, which is another
deviation from the specified KEM. The infinite loop allows an easy timing attack
obtaining the same information.
Our new attacks are possible for two reasons:
* The extended Patterson decoder has predictable outputs under certain inputs
and in particular often outputs the 0 vector. (This was also exploited in the
previous reaction attack.)
* The use of r does not sufficiently limit what is a valid ciphertext. (This is
a new result exploited in the new attacks.)
The first part manifests itself in two different forms.
First of all, the Extended Patterson decoder inherently fails to decode a
weight-1 error if that error is at the support position alpha = 0. In that case
the syndrome polynomial s(x) corresponds to 1/x modulo g so that \tilde{s}(x)
= 1 + (x/s) mod g gives 0. In that case g1 = 1, g2 = g, g12 = 1 and the output
of SolveKeyEqn is (1,0) leading to sigma = g^2, which by construction does not
have any roots in L and thus SolveKeyEqn returns the 0 vector.
This issue is specific to the Extended Patterson decoder and g reducible and
does not appear in the regular Patterson decoder. It can be caught as a special
case by checking that the input to SolveKeyEqn has (0,1) as the first two
components and returning (0,1) in that case, but care must be taken not to cause
side-channel attacks. This might also not be necessary as t=1 is not an option
for any of the parameter sets. In any case, the implementation needs to be
fixed to avoid infinite loops. (The implementation is computing g12 by calling
gf_poly_mul, which multiplies g1 by g2 modulo g, obtaining 0 instead of the
correct g. The implementation is then dividing g by 0, which loops forever.)
Secondly, random inputs to the Extended Patterson decoder often return the 0
vector. This is because they find a random polynomial of degree less than t
which will often not have roots in L. A simple model says that this occurs
with probability (1-1/q)^n, which is 62%, 51%, 45% for PALOMA-128, 192, 256.
Our experiments are close to this model.
For both of these failure cases, the attacks use the following observations:
The 0 vector is mapped to itself by any permutation of e^* = \hat{e} and we
can run RO_G in the forward direction on e^* = 0 to obtain the matching
\hat{r} = RO_G(00..0). Hence, we can produce a ciphertext which is often valid
by using this \hat{r} with any syndrome vector s of length n-k and we can
compute the KEM key \kappa = RO_H((00....0),\hat{r},s) that this would produce
if the decapsulation indeed returns the 0 vector. This means, we compare the
output of the decapsulation oracle with this \kappa to see if the decoding
step indeed returned 0. This machinery can then be used in the two attacks.
(1) Identifying if 0 \in L and finding its position in the public matrix
This attack iterates s over the n different columns of the public-key matrix
\hat{H}. The first part of the ciphertext is \hat{r} obtained from RO_G on
input the 0 vector, the second part is the column h_i.
If the output of the decapsulation oracle matches RO_H((00...0),\hat{r},h_i) we
have found the location of alpha = 0. If we do not encounter this for any of
the n columns then we know that 0 is not in L.
(2) OW-CCA2 and IND-CCA2 attack
Given a challenge ciphertext (r,s) this attack prepares related ciphertexts and
candidate KEM keys as follows. Iterating over all columns h_i we compute
challenge ciphertext (\hat{r},s+h_i), where \hat{r} is as above, and compare
the output of the decapsulation oracle to RO_H((00...0),\hat{r},s+h_i). If this
matches we learn that the decoding step has returned the 0 vector. Apart from
the issue exploited under (1) the decoder works correctly on syndromes coming
from error vectors of weight up to t. Since s is a valid syndrome, thus
corresponding to a weight-t error, having the decoding algorithm output the 0
vector means that s+h_i corresponds to an error of weight larger than t.
Hence, the position i in \hat{e} was not set.
If the outputs do not match we do not a priori know if position i was set and
had been flipped to 0 by the addition s + h_i, thus creating a syndrome matching
an error vector of weight t-1, which decodes correctly, of if the random error
locator polynomial happened to have a root in L.
We can get clarity about which of the two cases we are in by taking
combinations of two positions. If position j is known to be 0 in \hat{e}, then
performing the above with s + h_j + h_i is guaranteed to give non-zero output
if position i was set, so that the two bit flips lead to a weight-t error while
it has a significant probability of producing the 0 vector otherwise. Varying
over different known positions j quickly gives clarity. Furthermore, we know
that \hat{e} that led to s has weight t, so we can stop once we have identified
n-t positions as 0 positions.
This way of modifying s to obtain \hat{e} matches our attack from April with
Alex Pellegrini. The April attack used crashes in the code as an oracle. The
new attack relies on the CCA2 decapsulation oracle, which adds the complication
that the new attack needs to provide otherwise valid ciphertexts, achieved as
explained above.
To finish the OW-CCA2 or IND-CCA2 attack after recovering \hat{e} we apply
PermInv, for which we use r from the original ciphertext, to get e^*, and
obtain \kappa = RO_H(e^*,r,s). This concludes the OW-CCA2 attack, which is what
we implemented. For an IND-CCA2 attack, simply compare this \kappa to the
candidate k^* provided in the challenge.
The attacks as described are stopped by checking that \hat{e} as returned by
Decrypt(sk; \hat{s}) has weight t. This check is mentioned in PKE_1 which is
used in the security proof for the IND_CCA security in chapter 5 but not
present in the PKE or KEM in chapter 3 and not present in the implementation
provided. It is not clear if that alone suffices to achieve CCA2 security. At
least a proof that if \hat{e} returned by the extended Patterson decoder has
weight t then \hat{s} must be valid seems necessary here; note that this has to
be specific to the decoder used.
For the decoder used in Classic McEliece, there is a proof of this rigidity,
but the software still performs re-encryption to make sure that nothing goes
wrong. See Section 8 of
https://cr.yp.to/papers.html#goppadecoding (and
Theorems 5.1.3 and 7.1).
Inside the PALOMA security analysis (chapter 5), Alg 21 and 22 perform the
weight check and re-encryption. However, these two checks do not appear in the
KEM specification in chapter 3 and are not implemented in the C code provided
with the submission package. A system using these two checks should be able to
use the standard FO transform and achieve CCA2 security if the RSD problem is
hard. Including these checks will slow down the decapsulation procedure and may
require including the public key in the secret key. For ways to avoid this
inclusion, see Classic McEliece.
This work benefited from discussions with Jolijn Cottaar, Kathrin Hövelmanns,
Alex Pellegrini, and Silvia Ritsch.
All the best
Dan and Tanja