The question of how to safely handle multi-target attacks
* isn't specific to code-based encryption and
* isn't new,
contrary to what's suggested in the messages dated 14 Jul 2021 19:59:47
+0000 and 14 Jul 2021 21:32:40 +0000.
I'm changing the subject line accordingly, and looking at how the topic
is covered in NIST's call for proposals and in three submissions,
including two lattice submissions. For further reading on the same topic
see, e.g., https://blog.cr.yp.to/20151120-batchattacks.html and the
pqc-forum email dated 15 Dec 2018 02:45:56 -0000.
1. Relevant criteria established in the call for proposals:
These standards are used in a wide variety of Internet protocols,
such as TLS, SSH, IKE, IPsec, and DNSSEC. Schemes will be evaluated
by the security they provide in these applications, and in additional
applications that may be brought up by NIST or the public during the
evaluation process. Claimed applications will be evaluated for their
practical importance if this evaluation is necessary for deciding
which algorithms to standardize. ...
Security Definition for Encryption/Key-Establishment. NIST intends to
standardize one or more schemes that enable "semantically secure"
encryption or key encapsulation with respect to adaptive chosen
ciphertext attack, for general use. This property is generally
denoted IND-CCA2 security in academic literature.
The above security definition should be taken as a statement of what
NIST will consider to be a relevant attack. Submitted KEM and
encryption schemes will be evaluated based on how well they appear to
provide this property ...
Additional Security Properties. While the previously listed security
definitions cover many of the attack scenarios that will be used in
the evaluation of the submitted algorithms, there are several other
properties that would be desirable: ... A third desirable property is
resistance to multi-key attacks. Ideally an attacker should not gain
an advantage by attacking multiple keys at once, whether the
attacker's goal is to compromise a single key pair, or to compromise
a large number of keys.
Consequences for multi-target attacks:
* Regarding "security definition": The core IND-CCA2 requirement, by
definition, doesn't consider multi-key or multi-ciphertext attacks.
This has the advantage of simplicity, but a T-target attack could
gain a factor T in success probability.
* Regarding "additional security properties": An extreme form of
protection against multi-key attacks, namely zero "advantage", is
labeled as "desirable". So a scheme with security 2^300 against
single-key attacks and 2^250 against multi-key attacks is worse
than a scheme with security 2^100 against single-key attacks and
2^100 against multi-key attacks? Absurd. If the goal is to compare
schemes according to their security against multi-key attacks, then
this should be stated quantitatively, without a sharp cutoff at the
extreme of zero advantage.

Neither of these covers multi-ciphertext attacks.
There is, however, a catch-all statement about schemes being evaluated
for the security that they provide in applications. To the extent that
multi-target attacks affect applications, they're covered here.
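The factor-T loss described above is just a union bound. A small Monte
Carlo sketch (toy probabilities, nothing scheme-specific) illustrates
how a T-target attacker's success probability can grow by up to a
factor T:

```python
import random

random.seed(2021)

# Toy model of the factor-T loss: each attack attempt independently
# breaks a fixed single target with probability p.  Against T targets,
# an attempt succeeds if it breaks ANY of them, so its success
# probability is 1 - (1-p)^T, which is at most T*p (union bound).
p = 1e-3
T = 100
trials = 50_000

single = sum(random.random() < p for _ in range(trials)) / trials
multi = sum(any(random.random() < p for _ in range(T))
            for _ in range(trials)) / trials

print(f"single-target success rate ~ {single:.4f}")
print(f"{T}-target success rate    ~ {multi:.4f}")
print(f"union bound T*p            = {T * p:.4f}")
```

With these toy numbers the 100-target rate lands near 1 - 0.999^100,
roughly 95 times the single-target rate, close to the worst-case
factor T.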
2. Let's look more closely at the rationale for the application
catch-all.
There's no clear security contract in the literature between hash
functions and typical applications of hash functions. NIST's original
attempts to specify the SHA-3 security goals ranged from incoherent
(e.g., "indistinguishable from a random oracle") to inadequate (e.g.,
"collision resistance"), and drew well-deserved criticism.
The application catch-all comes from my email "Rewriting the statement
of security criteria" to hash-...@nist.gov, dated 26 Mar 2007 21:24:04
-0400. In a nutshell, a hash function that seems collision-resistant
will be thrown away if it's shown to break clear security goals of,
e.g., an RSA-signature standard. This solved the security-specification
problem for SHA-3.
The situation for KEMs is different. The literature already has a
simple, clearly defined contract, IND-CCA2, between KEMs and the main
applications of KEMs. What's the advantage of a more complicated list of
security goals, including a requirement to look at various applications
as part of the evaluation process? The _disadvantages_ are clear:
* Post-quantum cryptanalysts are overloaded, and the overall attack
picture isn't even close to stable. See generally
. Spreading cryptanalytic
effort across more security goals is increasing systemic risks.
* The application catch-all is open to abuse. It's an easy exercise
to write down a protocol that needs something more than IND-CCA2,
to _claim_ that the protocol is important, and to skip basic
questions such as whether there's a generic conversion from any
IND-CCA2 KEM to a KEM doing what the protocol needs.
For multi-key attacks and multi-ciphertext attacks, there's no question
regarding the real-world importance of applications allowing such
attacks, but this still doesn't mean it's a good idea to bake these
complications into the contract. Let's look at how multi-target attacks
are handled in three submissions.
3. Kyber submission, under "Multi-target attacks":
Our security analysis makes no formal claims about security bounds in
the multi-target setting. However, in the design of Kyber we made two
decisions that aim at improving security against attackers targeting
many users:
* We adopt the "against-all-authority" approach of re-generating the
matrix A for each public key from NewHope. This protects against
an attacker attempting to break many keys at the cost of breaking
one key.
* In the CCA transform (see Alg. 8) we hash the public key into the
pre-key Kbar and the coins r. Making the coins r dependent on the
public key protects against precomputation attacks that attempt to
break one out of many keys. For details, see Subsection 5.5.
[5.5 under "Multitarget attacks using failures":] Despite the limited
gain, an attacker could consider using Grover's algorithm to
precompute values of m that produce r and e_1 with large norm and
then use this precomputed set of values of m against many users. This
multi-target attack is prevented by hashing the public key pk into
the random coins r and thereby into r and e_1 (line 3 of Alg. 8).
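The mechanism quoted above can be sketched in a few lines. This is a
hedged illustration of the idea only, with invented names, not Kyber's
actual algorithm or parameter choices:

```python
import hashlib

# Illustrative sketch: derive the pre-key and the encryption coins r
# from a hash of BOTH the message and the public key.  A precomputed
# "weak" message m then yields different coins under every key, so it
# cannot be reused across many keys.  Names and hash choices here are
# assumptions for illustration, not the Kyber specification.

def derive_prekey_and_coins(pk: bytes, m: bytes) -> tuple:
    h_pk = hashlib.sha3_256(pk).digest()         # hash of the public key
    kbar_r = hashlib.sha3_512(m + h_pk).digest()
    return kbar_r[:32], kbar_r[32:]              # (pre-key, coins r)

m = b"\x00" * 32
_, coins_a = derive_prekey_and_coins(b"public key A", m)
_, coins_b = derive_prekey_and_coins(b"public key B", m)
assert coins_a != coins_b   # same m, different keys -> different coins
```

The point of the construction is visible in the final assertion: the
same guessed message produces unrelated coins under different keys.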
There are various comments here that sound like they're promising some
sort of multi-target protection---"improving security", "protects
against an attacker attempting to break many keys", "protects against
precomputation attacks that attempt to break one out of many keys"---but
there's (1) no quantitative claim of a Kyber multi-key security level,
(2) no comment on multi-ciphertext attacks, and (3) no citation of any
algorithmic papers analyzing and optimizing multi-target attacks.
Given recent batch lattice attacks such as
it seems likely that security drops with the number of targets. If
someone finds that T-target attacks are T^(1/2) or T^(3/4) or T times
better than single-target attacks against this submission, is this
contradicting any security claims in the submission? Nope. NIST should
be saying something like "There isn't a quantified multi-target security
claim, so we're presuming a factor T security loss for T targets and
disregarding all text to the contrary."
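To make these hypothetical gains concrete: if a T-target attack is
T^e times cheaper than a single-target attack, the remaining security
is a one-line subtraction. Illustrative numbers only, not a claim
about any submission:

```python
import math

# If a T-target attack gains a factor T^e over a single-target attack,
# then b bits of claimed single-target security leave roughly
# b - e*log2(T) bits against T targets.

def effective_bits(b: float, T: int, exponent: float) -> float:
    # exponent 0.5, 0.75, 1.0 model gains of T^(1/2), T^(3/4), T
    return b - exponent * math.log2(T)

b, T = 128, 2**64
for e in (0.5, 0.75, 1.0):
    print(f"gain T^{e}: {effective_bits(b, T, e):.0f} bits remain")
```

For a claimed 128-bit level and 2^64 targets, the three scalings leave
96, 80, and 64 bits respectively, which is why the difference between
sqrt(T) and T matters so much in this discussion.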
For comparison, the message I'm replying to describes a well-known
sqrt(T) multi-target ISD attack as if this were a "concern" specifically
for "Code-Based Cryptosystems". In fact, this is yet another example of
the security of code-based cryptography being much more thoroughly
understood than the security of lattice-based cryptography.
4. If a submission is updated to cite the multi-target theorems from
then is it magically immune to multi-target attacks? Nope. The theorems,
at best, relate multi-target IND-CCA security to a multi-target
"n-IND-CPA" security notion for the underlying PKE; this begs the
question of how much "n-IND-CPA" security is provided.
There isn't a stable picture of security levels for single-target search
attacks: cryptanalysts are still finding better and better attacks. We
have only limited evidence regarding the extra complications of
multi-target IND attacks---and this limited evidence suggests that
there's security degradation. More to the point, if a submission doesn't
have a clear claim to the contrary, NIST should be presuming the full T
factor loss for T-target attacks.
As a side note, the above PDF compares hashing full keys (as in, e.g.,
Kyber and NTRU Prime), hashing only a prefix (as in Three Bears), and
not hashing the key (as in Classic McEliece). It recommends prefix
hashing as having the same multi-target security theorems as full
hashing and being faster. It says that prefix hashing is slightly slower
than hashing nothing and says "Open Question: any other disadvantages
for prefix hashing?" This question is already answered in, e.g., the
Kyber submission, which states a "depend on the entire transcript" goal
that requires full hashing. _If_ a protocol wants full hashing then
baking prefix hashing into a KEM makes the system unnecessarily complex.
These are just a few examples of the many complications that appear when
one allows the security goals to drift.
Meanwhile exactly one safe way has been identified to thoroughly handle
multi-target attacks; see below. This _doesn't_ rely on key hashing.
5. This brings me to another submission, NTRU Prime, which writes the
following under "Multi-target attacks":
We recommend handling multi-target attacks by aiming for a very high
single-target security level, and then relying on the fact that
T-target attacks gain at most a factor T. This approach is simple
and effective, and is not much more expensive than merely stopping
single-target attacks.

A different approach is to try to eliminate the gain from T-target
attacks---or, if this is not possible, to at least reduce the gain
below a factor T. This approach complicates the entire attack
surface. For every single-target attack, the reviewer is forced to
ask how much more effective multi-target attacks can be.
Occasionally the literature convincingly answers this question, but
usually it does not.
It is _conceivable_ that our hashing of the public key (see above)
makes multi-target attacks more difficult than they otherwise would
be. For example, consider an attacker who guesses many inputs,
computes the corresponding 256-bit confirmations, and scans many
ciphertexts, searching for these confirmations. The public key is
hashed into the confirmation, so this attack is limited to
ciphertexts for a particular key, limiting the total number of
targets. [footnote 7: The attack still benefits from the number of
ciphertexts encrypted to one key. For comparison, encrypting each
guessed input and comparing to all the ciphertexts has the same
benefit, but the encryption is somewhat more expensive than hashing.
The communication costs in either version of the attack can be
reduced; see .] On the other hand, this attack was already so
expensive as to be (1) uninteresting and (2) unlikely to be the most
effective multi-target attack against the complete system.
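The scanning attack quoted above is easy to model in a toy example.
This is an illustrative sketch, not the NTRU Prime wire format; all
names and sizes are assumptions:

```python
import hashlib

# Toy model of the confirmation-scanning attack: the attacker
# precomputes confirmations for guessed inputs and scans observed
# ciphertexts for matches.  Because the public key is hashed into the
# confirmation, each precomputed table applies to only one key.

def confirmation(pk: bytes, m: bytes) -> bytes:
    return hashlib.sha3_256(pk + m).digest()

# Table precomputed against key A, covering guessed inputs 0..255 ...
table_for_a = {confirmation(b"pk A", bytes([g])) for g in range(256)}

# ... matches confirmations under key A, but not under key B,
# even for the same guessed input.
assert confirmation(b"pk A", b"\x07") in table_for_a
assert confirmation(b"pk B", b"\x07") not in table_for_a
```

The two assertions capture the limitation described in the quote: the
table still benefits from many ciphertexts under one key, but it
cannot be amortized across keys.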
Unlike Kyber, NTRU Prime has no decryption failures, so "multitarget
attacks using failures" aren't possible. As in the case of Kyber, keys
are hashed, and there isn't a shared generator---but the documentation
points out that key hashing doesn't help against multi-ciphertext
attacks. There's a clear and quantified recommendation that protects
users against multi-key _and_ multi-ciphertext attacks. The submission
recommends dimension 761 and avoids specifying bleeding-edge dimensions.
6. As a third example, the Classic McEliece submission similarly
recommends a high security level, relying on the worst-case T factor:
"For example, with our recommended 6688/6960/8192 parameter sets, one
ciphertext is expected to be secure against an attacker without the
resources to find an AES-256 key, and 2^64 ciphertexts are expected to
all be secure against an attacker without the resources to find an
AES-192 key."
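The worst-case bookkeeping behind this style of statement is a
subtraction (illustrative sketch of the reasoning, not a quote from
the submission):

```python
# A T-target attack gains at most a factor T, so 2^b single-target
# security leaves at least 2^(b - log2(T)) per target in the worst
# case.  With b = 256 and T = 2^64 ciphertexts, the floor is 2^192,
# i.e. roughly AES-192 territory.
single_target_bits = 256
log2_targets = 64                  # T = 2^64 ciphertexts
floor_bits = single_target_bits - log2_targets
print(floor_bits)                  # -> 192
```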
7. Procedurally, why do we have NIST pointing to multi-ciphertext
attacks and asking "How should NIST deal with this?" as if this were a
new question?

Some submissions already consider multi-ciphertext attacks. Some
submissions already say how to safely deal with multi-key _and_
multi-ciphertext attacks by (in the words of one of the submissions
quoted above) "aiming for a very high single-target security level, and
then relying on the fact that T-target attacks gain at most a factor T".
This is safe and has the virtue of not increasing cryptanalytic load.
Is there some dispute about the safety of this answer? Is there a
submission claiming safety for an alternative answer? Where? If NIST
sees some dispute or question to resolve, shouldn't it be starting by
reporting what has been said before and explaining what the question is,
rather than by acting as if it had suddenly discovered a new question?
If there _isn't_ actually a dispute, shouldn't NIST be adopting this
answer, crediting submissions accordingly (the call for proposals says
that NIST will consider "the quality of the analysis" in submissions),
figuring out the extent to which applications need multi-target security
("schemes will be evaluated by the security they provide in these
applications"), and, if there's a potential security problem,
eliminating bleeding-edge parameter sets?
Alternatively, recognizing the fact that cryptanalysts are overloaded
and that approximately 0% of the NISTPQC security analysis is handling
multi-target attacks, NIST could propose a change in the criteria to say
that multi-target attacks will be disregarded. This doesn't change the
fact that the bleeding-edge parameter sets are dangerous; it's just
setting reasonable priorities for NISTPQC.