
Jul 14, 2021, 3:59:52 PM7/14/21

to pqc-forum

Hi all,

In looking into the questions raised by Kirk Fleming regarding the concrete security of several of Classic McEliece's parameter sets, we ran into some issues that may concern all three of the code-based cryptosystems currently under consideration in the 3rd round (Classic McEliece, BIKE, and HQC), and we would like to ask for feedback from the community. My colleagues have suggested that we try out Crypto Stack Exchange for question-based exchanges, so these questions (1 and 2, with slight reformatting) will appear there too shortly. Feel free to answer in either place.

1) It seems that an attacker can use a decoding one out of many strategy as described in https://eprint.iacr.org/2011/367.pdf to decrypt one out of a list of N ciphertexts, for a cost of approximately sqrt(N) times less than the cost of attacking a single ciphertext. While I don't think the security definition given in the NIST CFP explicitly covers this scenario, it does seem like if you're worried about 2^64 decryption queries from a CCA adversary, it may be reasonable to be worried about 2^64 target ciphertexts. How should NIST deal with this?
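For concreteness, here is how a sqrt(N) multi-target gain translates into security-level exponents (a rough sketch; the baseline exponent is illustrative, not a claim about any particular parameter set):

```python
def multi_target_cost(log2_single_target, log2_num_targets):
    # DOOM-style decoding-one-out-of-many: attacking one out of N
    # ciphertexts costs roughly sqrt(N) times less than attacking a
    # single ciphertext, i.e. the cost exponent drops by log2(N)/2.
    return log2_single_target - log2_num_targets / 2

# Illustrative baseline of 2^256 single-target cost, 2^64 ciphertexts:
print(multi_target_cost(256, 64))  # -> 224.0
```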

(Note: the above paper is cited in the BIKE and HQC specs, but only in the context of assessing security loss due to the use of quasi-cyclic matrices, not in the context of assessing security loss due to multiple ciphertexts. The Classic McEliece submission does cite the paper in the context of a multi-ciphertext attack.)

2) A paper cited by members of the Classic McEliece team in their third-round workshop presentation and on this forum, https://www.mdpi.com/1999-4893/12/10/209/pdf, gives the concrete number of bit operations required to perform the MMT information-set-decoding attack as significantly less than for any of the other ISD variants considered, including the BJMM algorithm, which is widely considered an improvement over MMT. For example, for Classic McEliece's 8192128 parameter set, the paper gives the complexity of MMT as 2^249.74, as opposed to 2^299.89 for BJMM. Is this analysis correct? If so, it seems this could threaten not just some of the McEliece parameters, but also some of the parameters of the other code-based schemes. If not, what's a good source for accurate numbers?

(The paper's claimed 2^87.95 memory complexity for MMT, as compared to 2^79.97 for BJMM, and 2^67.64 for Stern does not seem large enough to make up for the discrepancy on any plausible model for the cost of memory vs bit operations. MMT is claimed similarly cheaper for other parameter sets and schemes.)
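To check the parenthetical numerically: even under a cost model that charges sqrt(memory) per operation (one plausible model, assumed here purely for illustration), the paper's claimed MMT numbers remain far below its BJMM numbers:

```python
def cost_sqrt_mem_model(log2_bitops, log2_memory):
    # Total cost exponent ~= bit operations * sqrt(memory),
    # i.e. exponents add: log2_bitops + log2_memory / 2.
    return log2_bitops + log2_memory / 2

# Exponents as claimed in the cited paper for the 8192128 parameter set:
mmt  = cost_sqrt_mem_model(249.74, 87.95)   # ~293.7
bjmm = cost_sqrt_mem_model(299.89, 79.97)   # ~339.9
print(mmt, bjmm)  # MMT still ~2^46 cheaper even after the memory charge
```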

In any event, our intent is that if the parameter sets of the code-based schemes are found to have inaccurate security estimates at such time as we intend to standardize one or more of them, we will try to assign accurate security strength categories to the parameter sets we are given, and only standardize those parameter sets that appear to meet at least category 1 in whatever security models we deem most relevant.

Thanks,

Ray Perlner (NISTPQC)

Jul 14, 2021, 5:32:45 PM7/14/21

to pqc-forum, Perlner, Ray A. (Fed)

Here are the links to crypto stack exchange:

1:

2:

--

You received this message because you are subscribed to the Google Groups "pqc-forum" group.

To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.

To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/DM6PR09MB48551334723993B099E7D5C69C139%40DM6PR09MB4855.namprd09.prod.outlook.com.


Jul 16, 2021, 4:19:31 PM7/16/21

to pqc-forum

The question of how to safely handle multi-target attacks

* isn't specific to code-based encryption and

* isn't new,

contrary to what's suggested in the messages dated 14 Jul 2021 19:59:47

+0000 and 14 Jul 2021 21:32:40 +0000.

I'm changing the subject line accordingly, and looking at how the topic

is covered in NIST's call for proposals and in three submissions,

including two lattice submissions. For further reading on the same topic

see, e.g., https://blog.cr.yp.to/20151120-batchattacks.html and my

pqc-forum email dated 15 Dec 2018 02:45:56 -0000.

1. Relevant criteria established in the call for proposals:

These standards are used in a wide variety of Internet protocols,

such as TLS, SSH, IKE, IPsec, and DNSSEC. Schemes will be evaluated

by the security they provide in these applications, and in additional

applications that may be brought up by NIST or the public during the

evaluation process. Claimed applications will be evaluated for their

practical importance if this evaluation is necessary for deciding

which algorithms to standardize. ...

Security Definition for Encryption/Key-Establishment. NIST intends to

standardize one or more schemes that enable "semantically secure"

encryption or key encapsulation with respect to adaptive chosen

ciphertext attack, for general use. This property is generally

denoted IND-CCA2 security in academic literature.

The above security definition should be taken as a statement of what

NIST will consider to be a relevant attack. Submitted KEM and

encryption schemes will be evaluated based on how well they appear to

provide this property ...

Additional Security Properties. While the previously listed security

definitions cover many of the attack scenarios that will be used in

the evaluation of the submitted algorithms, there are several other

properties that would be desirable: ... A third desirable property is

resistance to multi-key attacks. Ideally an attacker should not gain

an advantage by attacking multiple keys at once, whether the

attacker's goal is to compromise a single key pair, or to compromise

a large number of keys.

Consequences for multi-target attacks:

* Regarding "security definition": The core IND-CCA2 requirement, by

definition, doesn't consider multi-key or multi-ciphertext attacks.

This has the advantage of simplicity, but a T-target attack could

gain a factor T in success probability.

* Regarding "additional security properties": An extreme form of

protection against multi-key attacks, namely zero "advantage", is

labeled as "desirable". So a scheme with security 2^300 against

single-key attacks and 2^250 against multi-key attacks is worse

than a scheme with security 2^100 against single-key attacks and

2^100 against multi-key attacks? Absurd. If the goal is to compare

schemes according to their security against multi-key attacks, then

this should be stated quantitatively, without a sharp cutoff at the

zero-loss boundary.

Neither of these covers multi-ciphertext attacks.

There is, however, a catch-all statement about schemes being evaluated

for the security that they provide in applications. To the extent that

multi-target attacks affect applications, they're covered here.

2. Let's look more closely at the rationale for the application

catch-all.

There's no clear security contract in the literature between hash

functions and typical applications of hash functions. NIST's original

attempts to specify the SHA-3 security goals ranged from incoherent

(e.g., "indistinguishable from a random oracle") to inadequate (e.g.,

"collision resistance"), and drew well-deserved criticism.

The application catch-all comes from my email "Rewriting the statement

of security criteria" to hash-...@nist.gov dated 26 Mar 2007 21:24:04

-0400. In a nutshell, a hash function that seems collision-resistant

will be thrown away if it's shown to break clear security goals of,

e.g., an RSA-signature standard. This solved the security-specification

problem for SHA-3.

The situation for KEMs is different. The literature already has a

simple, clearly defined contract, IND-CCA2, between KEMs and the main

applications of KEMs. What's the advantage of a more complicated list of

security goals, including a requirement to look at various applications

as part of the evaluation process? The _disadvantages_ are clear:

* Post-quantum cryptanalysts are overloaded, and the overall attack

picture isn't even close to stable. See generally

https://cr.yp.to/talks.html#2021.06.08. Spreading cryptanalytic

effort across more security goals is increasing systemic risks.

* The application catch-all is open to abuse. It's an easy exercise

to write down a protocol that needs something more than IND-CCA2,

to _claim_ that the protocol is important, and to skip basic

questions such as whether there's a generic conversion from any

IND-CCA2 KEM to a KEM doing what the protocol needs.

For multi-key attacks and multi-ciphertext attacks, there's no question

regarding the real-world importance of applications allowing such

attacks, but this still doesn't mean it's a good idea to bake these

complications into the contract. Let's look at how multi-target attacks

are handled in three submissions.

3. Kyber submission, under "Multi-target attacks":

Our security analysis makes no formal claims about security bounds in

the multi-target setting. However, in the design of Kyber we made two

decisions that aim at improving security against attackers targeting

multiple users:

* We adopt the "against-all-authority" approach of re-generating the

matrix A for each public key from NewHope [10]. This protects against

an attacker attempting to break many keys at the cost of breaking one

key.

* In the CCA transform (see Alg. 8) we hash the public key into the

pre-key Kbar and the coins r. Making the coins r dependent of the

public key protects against precomputation attacks that attempt to

break one out of many keys. For details, see Subsection 5.5.

[5.5 under "Multitarget attacks using failures":] Despite the limited

gain, an attacker could consider using Grover's algorithm to

precompute values of m that produce r and e_1 with large norm and

then use this precomputed set of values of m against many users. This

multi-target attack is prevented by hashing the public key pk into

the random coins r and thereby into r and e_1 (line 3 of Alg. 8).

There are various comments here that sound like they're promising some

sort of multi-target protection---"improving security", "protects

against an attacker attempting to break many keys", "protects against

precomputation attacks that attempt to break one out of many keys"---but

there's (1) no quantitative claim of a Kyber multi-key security level,

(2) no comment on multi-ciphertext attacks, and (3) no citation of any

algorithmic papers analyzing and optimizing multi-target attacks.
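For reference, the pk-hashing mechanism that the quoted text describes can be sketched generically as follows (the KDF choice and byte layouts are illustrative placeholders, not Kyber's actual algorithm):

```python
import hashlib

def derive_coins(m: bytes, pk: bytes) -> bytes:
    # Hashing the public key into the encapsulation coins ties each
    # precomputed message m to one specific key, which is the
    # one-out-of-many-keys precomputation defense the quoted text
    # describes. SHAKE-256 is an illustrative choice of KDF.
    return hashlib.shake_256(m + pk).digest(32)

# Without pk in the hash, the same m would yield the same coins for
# every key, so one precomputed table would apply to all users.
coins_a = derive_coins(b"message", b"public-key-A")
coins_b = derive_coins(b"message", b"public-key-B")
assert coins_a != coins_b
```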

Given recent batch lattice attacks such as

https://eprint.iacr.org/2020/120

it seems likely that security drops with the number of targets. If

someone finds that T-target attacks are T^(1/2) or T^(3/4) or T times

better than single-target attacks against this submission, is this

contradicting any security claims in the submission? Nope. NIST should

be saying something like "There isn't a quantified multi-target security

claim, so we're presuming a factor T security loss for T targets and

disregarding all text to the contrary."

For comparison, the message I'm replying to describes a well-known

sqrt(T) multi-target ISD attack as if this were a "concern" specifically

for "Code-Based Cryptosystems". In fact, this is yet another example of

the security of code-based cryptography being much more thoroughly

understood than the security of lattice-based cryptography.

4. If a submission is updated to cite the multi-target theorems from

https://csrc.nist.gov/CSRC/media/Presentations/faster-kyber-and-saber-via-a-generic-fujisaki-okam/images-media/session-8-duman-faster-kyber-and-saber.pdf

then is it magically immune to multi-target attacks? Nope. The theorems,

at best, relate multi-target IND-CCA security to a multi-target

"n-IND-CPA" security notion for the underlying PKE; this begs the

question of how much "n-IND-CPA" security is provided.

There isn't a stable picture of security levels for single-target search

attacks: cryptanalysts are still finding better and better attacks. We

have only limited evidence regarding the extra complications of

multi-target IND attacks---and this limited evidence suggests that

there's security degradation. More to the point, if a submission doesn't

have a clear claim to the contrary, NIST should be presuming the full T

factor loss for T-target attacks.

As a side note, the above PDF compares hashing full keys (as in, e.g.,

Kyber and NTRU Prime), hashing only a prefix (as in Three Bears), and

not hashing the key (as in Classic McEliece). It recommends prefix

hashing as having the same multi-target security theorems as full

hashing and being faster. It says that prefix hashing is slightly slower

than hashing nothing and says "Open Question: any other disadvantages

for prefix hashing?" This question is already answered in, e.g., the

Kyber submission, which states a "depend on the entire transcript" goal

that requires full hashing. _If_ a protocol wants full hashing then

baking prefix hashing into a KEM makes the system unnecessarily complex.

These are just a few examples of the many complications that appear when

one allows the security goals to drift.

Meanwhile exactly one safe way has been identified to thoroughly handle

multi-target attacks; see below. This _doesn't_ rely on key hashing.

5. This brings me to another submission, NTRU Prime, which writes the

following under "Multi-target attacks":

We recommend handling multi-target attacks by aiming for a very high

single-target security level, and then relying on the fact that

T-target attacks gain at most a factor T. This approach is simple

and effective, and is not much more expensive than merely stopping

single-target attacks.

A different approach is to try to eliminate the gain from T-target

attacks---or, if this is not possible, to at least reduce the gain

below a factor T. This approach complicates the entire attack

surface. For every single-target attack, the reviewer is forced to

ask how much more effective multi-target attacks can be.

Occasionally the literature convincingly answers this question, but

usually it does not.

It is _conceivable_ that our hashing of the public key (see above)

makes multi-target attacks more difficult than they otherwise would

be. For example, consider an attacker who guesses many inputs,

computes the corresponding 256-bit confirmations, and scans many

ciphertexts, searching for these confirmations. The public key is

hashed into the confirmation, so this attack is limited to

ciphertexts for a particular key, limiting the total number of

targets. [footnote 7: The attack still benefits from the number of

ciphertexts encrypted to one key. For comparison, encrypting each

guessed input and comparing to all the ciphertexts has the same

benefit, but the encryption is somewhat more expensive than hashing.

The communication costs in either version of the attack can be

reduced; see [105].] On the other hand, this attack was already so

expensive as to be (1) uninteresting and (2) unlikely to be the most

effective multi-target attack against the complete system.

Unlike in Kyber, "multitarget attacks using failures" aren't possible,

since there are no decryption failures. As in the case of Kyber, keys

are hashed, and there isn't a shared generator---but the documentation

points out that key hashing doesn't help against multi-ciphertext

attacks. There's a clear and quantified recommendation that protects

users against multi-key _and_ multi-ciphertext attacks. The submission

recommends dimension 761 and avoids specifying bleeding-edge dimensions.

6. As a third example, the Classic McEliece submission similarly

recommends a high security level, relying on the worst-case T factor:

"For example, with our recommended 6688/6960/8192 parameter sets, one

ciphertext is expected to be secure against an attacker without the

resources to find an AES-256 key, and 2^64 ciphertexts are expected to

all be secure against an attacker without the resources to find an

AES-192 key."
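The worst-case accounting behind the quoted claim is just subtraction of exponents:

```python
# T-target attacks gain at most a factor T, so the security exponent
# drops by at most log2(T). Exponents here are illustrative effort
# levels, not precise cost claims for any parameter set.
single_target = 256   # roughly AES-256-level effort
log2_targets = 64     # 2^64 ciphertexts
multi_target_floor = single_target - log2_targets
print(multi_target_floor)  # -> 192, roughly AES-192-level effort
```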

7. Procedurally, why do we have NIST pointing to multi-ciphertext

attacks and asking "How should NIST deal with this?" as if this were a

new question?

Some submissions already consider multi-ciphertext attacks. Some

submissions already say how to safely deal with multi-key _and_

multi-ciphertext attacks by (in the words of one of the submissions

quoted above) "aiming for a very high single-target security level, and

then relying on the fact that T-target attacks gain at most a factor T".

This is safe and has the virtue of not increasing cryptanalytic load.

Is there some dispute about the safety of this answer? Is there a

submission claiming safety for an alternative answer? Where? If NIST

sees some dispute or question to resolve, shouldn't it be starting by

reporting what has been said before and explaining what the question is,

rather than by acting as if it had suddenly discovered a new question?

If there _isn't_ actually a dispute, shouldn't NIST be adopting this

answer, crediting submissions accordingly (the call for proposals says

that NIST will consider "the quality of the analysis" in submissions),

figuring out the extent to which applications need multi-target security

("schemes will be evaluated by the security they provide in these

applications"), and, if there's a potential security problem,

eliminating bleeding-edge parameter sets?

Alternatively, recognizing the fact that cryptanalysts are overloaded

and that approximately 0% of the NISTPQC security analysis is handling

multi-target attacks, NIST could propose a change in the criteria to say

that multi-target attacks will be disregarded. This doesn't change the

fact that the bleeding-edge parameter sets are dangerous; it's just

setting reasonable priorities for NISTPQC.

---Dan


Jul 18, 2021, 9:55:39 AM7/18/21

to ray.p...@nist.gov, pqc-forum, Perlner, Ray A. (Fed)

Ray Perlner asked:

> It seems that an attacker can use a decoding one out of many

> strategy as described in https://eprint.iacr.org/2011/367.pdf

> to decrypt one out of a list of N ciphertexts, for a cost of

> approximately sqrt(N) times less than the cost of attacking a

> single ciphertext. While I don't think the security definition

> given in the NIST CFP explicitly covers this scenario, it does

> seem like if you're worried about 2^64 decryption queries from

> a CCA adversary, it may be reasonable to be worried about 2^64

> target ciphertexts. How should NIST deal with this?

The DOOM attack isn't the most efficient approach if you have

multiple ciphertexts encapsulated under the same public key

for the Category 5 parameter sets.

Most of the KEMs generate the randomness used in encapsulation

from a seed and the rest allow this as an implementation option.

(Kyber, Saber, BIKE, HQC and NTRU LPrime specify 256 bit seeds.

SIKE specifies 128, 192 or 256 bit seeds. NTRU and Streamlined

NTRU Prime allow 256 bit seeds as an option. Classic McEliece

cites McTiny which cites Classic McEliece so who knows what

they're actually recommending.)

You should expect to recover the 256 bit seed for one of the N

ciphertexts after 2^256/N trial encapsulations. For schemes that

have confirmation values (Classic McEliece, HQC and NTRUPrime)

you don't even need to compute the full encapsulation, just the

confirmation value. This beats the sqrt(N) improvement you'd get

from DOOM for the Category 5 parameter sets. I also don't think it's

something to worry about (except maybe for SIKE's smaller seeds).
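
The generic tradeoff described here can be sketched with toy sizes (a hypothetical 20-bit seed and a truncated SHA-256 hash standing in for a real 256-bit seed and confirmation value; this illustrates the multi-target principle only, not any actual KEM):

```python
import hashlib
from random import Random

SEED_BITS = 20    # toy stand-in for a 256-bit encapsulation seed
N = 64            # number of target ciphertexts

rng = Random(1)   # fixed for reproducibility
targets = [rng.getrandbits(SEED_BITS) for _ in range(N)]

def confirm(seed: int) -> bytes:
    """Toy 'confirmation value': a cheap public function of the seed alone."""
    return hashlib.sha256(seed.to_bytes(4, "big")).digest()[:8]

# The attacker indexes the N observed confirmation values in a table, then
# pays one cheap hash per guess to test against all N targets at once, so
# the expected work drops from about 2^SEED_BITS to about 2^SEED_BITS / N.
table = {confirm(s) for s in targets}
trials = 0
for guess in range(2**SEED_BITS):
    trials += 1
    if confirm(guess) in table:
        break

print(f"recovered a seed after {trials} trials "
      f"(single-target expectation: about {2**(SEED_BITS - 1)})")
```

With a sequential scan the first hit lands on the smallest target value, so on average the attacker does about 2^SEED_BITS/N trials instead of 2^(SEED_BITS-1).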

I'd be more worried about reusing the same seed in encapsulation

with two different public keys. Classic McEliece, BIKE and HQC

don't include the public key when deriving the randomness. This

leads to a trivial break for BIKE and HQC, and a much easier ISD

problem for Classic McEliece. Confirmation values also mean it's

trivial to spot when a seed has been reused in Classic McEliece

and HQC.

Ray Perlner also asked:

> A paper cited by members of the Classic McEliece team in their

> third-round workshop presentation and on this forum:

> https://www.mdpi.com/1999-4893/12/10/209/pdf gives the concrete

> number of bit operations required to perform the MMT information

> set decoding attack as significantly less than for any of the

> other ISD variants considered, including the BJMM algorithm

> which is widely considered to be an improvement over MMT. e.g.

> for classic McEliece's 8192128 parameter set, the paper gives

> the complexity of MMT as 2^249.74 as opposed to 2^299.89 for

> BJMM. Is this analysis correct? If so, it seems like this

> could threaten not just some of the McEliece parameters, but

> also some of the parameters of the other code-based schemes.

> If not, what's a good source for accurate numbers?

There's already a good answer to this on crypto stack exchange.

The LEDAcrypt team's code didn't include MMT so I never fully

trusted the estimates in the paper. I always thought they looked

lower than I'd reasonably expect for the parameters.

Kirk

Jul 18, 2021, 10:35:52 AM7/18/21

to Kirk Fleming, ray.p...@nist.gov, pqc-forum

I wrote:

> Most of the KEMs generate the randomness used in encapsulation

> from a seed and the rest allow this as an implementation option.

> (Kyber, Saber, BIKE, HQC and NTRU LPrime specify 256 bit seeds.

> SIKE specifies 128, 192 or 256 bit seeds. NTRU and Streamlined

> NTRU Prime allow 256 bit seeds as an option. Classic McEliece

> cites McTiny which cites Classic McEliece so who knows what

> they're actually recommending.)

I forgot FrodoKEM. That also specifies 128, 192 or 256 bit seeds.

Kirk

Jul 18, 2021, 12:15:09 PM7/18/21

to pqc-forum

On 7/18/21 3:55 PM, Kirk Fleming wrote:

> Most of the KEMs generate the randomness used in encapsulation

> from a seed and the rest allow this as an implementation option.

> (Kyber, Saber, BIKE, HQC and NTRU LPrime specify 256 bit seeds.

> SIKE specifies 128, 192 or 256 bit seeds. NTRU and Streamlined

> NTRU Prime allow 256 bit seeds as an option. Classic McEliece

> cites McTiny which cites Classic McEliece so who knows what

> they're actually recommending.)

The Classic McEliece submission unambiguously refers to McTiny as an

application (not as a source) for seeded encapsulation: "See [McTiny]

for an application" (page 31 of the submission document).

Section 4.4 in the Classic McEliece submission (round 3) describes the

generation of random objects from a 32 byte (i.e., 256 bit) seed.

Best regards

Ruben

Jul 18, 2021, 3:29:40 PM7/18/21

to Ruben Niederhagen, pqc-forum

I wrote:

>> Most of the KEMs generate the randomness used in encapsulation

>> from a seed and the rest allow this as an implementation option.

>> (Kyber, Saber, BIKE, HQC and NTRU LPrime specify 256 bit seeds.

>> SIKE specifies 128, 192 or 256 bit seeds. NTRU and Streamlined

>> NTRU Prime allow 256 bit seeds as an option. Classic McEliece

>> cites McTiny which cites Classic McEliece so who knows what

>> they're actually recommending.)

Ruben replied:

> The Classic McEliece submission unambiguously refers to McTiny as an

> application (not as a source) for seeded encapsulation: "See [McTiny]

> for an application" (page 31 of the submission document).

> Section 4.4 in the Classic McEliece submission (round 3) describes the

> generation of random objects from a 32 byte (i.e., 256 bit) seed.

Thanks for clarifying. Section 4.4 describes the generation of random

objects from a 256 bit seed for seeded key generation. The submission

doesn't explicitly say that a 256 bit seed is also used for generating

the fixed-weight vector in seeded encapsulation.

Kirk

Jul 19, 2021, 12:22:25 AM7/19/21

to pqc-...@list.nist.gov

The round-3 Classic McEliece specification (Section 2 of the submission)

fully defines, for the Classic McEliece KEM (Section 2.4), every detail

of how every random object is generated, starting from uniform random

bits. See, in particular,

* KeyGen (Section 2.4.3), where Step 1 generates "a uniform random

l-bit string delta" and then there's a full specification of the

SeededKeyGen computations using this string of random bits, and

* FixedWeight (Section 2.4.4), where Step 1 generates "sigma_1 t

uniform random bits" and then there's a full specification of the

computations using this string of random bits.

Having everything fully defined down to the level of uniform random bits

is useful for security analysis and cryptographic module validation, as

explained in the round-3 Classic McEliece rationale (Section 4 of the

submission) under "Generation of random objects" (Section 4.4).

The sigma_1 parameter is defined as 16 (Section 2.5.1), and the t

parameter ranges from 64 through 128 depending on the parameter set

(see Section 3 for the list of parameter sets), so the number of bits

used as input in FixedWeight is between 16*64 = 1024 and 16*128 = 2048.

(Because of rejection sampling, more bits can be used overall.)

Since FixedWeight generates a weight-t length-n bit string, the number

of possible outputs of FixedWeight is limited to n choose t, which is

between 2^456.3 and 2^946.4 depending on the parameter set.
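
As a quick sanity check of these endpoints (a sketch using only Python's standard library; the (n, t) pairs are the round-3 parameter sets from Section 3 of the submission):

```python
from math import comb, log2

# (n, t) for the five round-3 Classic McEliece parameter sets (Section 3)
param_sets = {
    "348864":  (3488, 64),
    "460896":  (4608, 96),
    "6688128": (6688, 128),
    "6960119": (6960, 119),
    "8192128": (8192, 128),
}

for name, (n, t) in param_sets.items():
    bits_in = 16 * t             # sigma_1 * t uniform random bits into FixedWeight
    bits_out = log2(comb(n, t))  # log2 of the (n choose t) possible weight-t outputs
    print(f"{name}: {bits_in} input bits, about 2^{bits_out:.1f} possible e vectors")
```

The smallest and largest output counts reproduce the 2^456.3 and 2^946.4 endpoints quoted above, and the input-bit counts reproduce the 1024 and 2048 figures.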

The analysis of known attacks (Section 8 of the submission) mentions

regarding the output e that "The confirmation allows attackers to check

possibilities for e by checking their hashes. However, this is much

less efficient than ISD."

Finally, regarding multi-target attacks in general, the submission uses

the only available mechanism that's guaranteed to work: take very high

security levels and rely on the fact that T-target attacks gain at most

a factor T. See Section 8.3 of the submission, and, for further review,

my email dated 16 Jul 2021 22:19:16 +0200.

Why am I repeating what the submission already says? Answer: this seems

necessary given some recent pqc-forum comments.

First, the comment "Confirmation values also mean it's trivial to spot

when a seed has been reused in Classic McEliece" is---modulo the

question of how many e values there are---simply repeating an attack

stated in the submission (see "checking their hashes" quote above). This

isn't acknowledged, so readers are left thinking, incorrectly, that this

attack is omitted from the submission. (The multi-target comments

similarly don't acknowledge what the submission says.) Note that the

official evaluation criteria include "the quality of analysis" in

submissions.

Second, regarding the number of e values, the same pqc-forum comment

suggests that Classic McEliece generates e from a 256-bit seed. This is

wrong; it's contradicted by the Classic McEliece specification. See

above regarding "sigma_1 t uniform random bits".

Of course there are other 256-bit targets all over the place, in this

and other submissions: the session key is 256 bits; the key seed is 256

bits; people often use PRNGs that start from a 256-bit seed; etc. NIST

should simply declare that this is fine and that we all have much more

important things to do than worry about attackers successfully guessing

256-bit secrets---even with multi-target attacks, even with quantum

attacks. However, given that NIST _hasn't_ declared this and that

well-known T-target 2^256/T attacks are being used to snipe at some

submissions, it seems necessary to correct misinformation regarding the

exact amount of randomness used in Classic McEliece ciphertexts.

Third, the same commenter makes a series of remarks making readers think

that the spec _doesn't_ spell out how randomness is used ("who knows

what they're actually recommending"; "doesn't explicitly say" something

else; misrepresenting a citation as a clarification; etc.). In fact, the

spec spells out exactly how randomness is used.

---Dan (speaking for myself)

Jul 19, 2021, 9:39:59 AM7/19/21

to kpfl...@mail.com, ray.p...@nist.gov, pqc-...@list.nist.gov

Hi,

On Sun, 2021-07-18 at 15:55 +0200, Kirk Fleming wrote:

You should expect to recover the 256 bit seed for one of the N

ciphertexts after 2^256/N trial encapsulations.

Can you please clarify whether you mean 2^(256/N) or (2^256)/N here?

-derek

--Derek Atkins

Chief Technology Officer

Veridify Security - *Securing the Internet of Things®*

Office: 203.227.3151 x1343

Direct: 617.623.3745

Mobile: 617.290.5355

Email: DAt...@Veridify.com

Jul 19, 2021, 10:32:20 AM7/19/21

to Kirk Fleming, pqc-forum

Hi Kirk.

You wrote

>Most of the KEMs generate the randomness used in encapsulation

>from a seed and the rest allow this as an implementation option.

>(Kyber, Saber, BIKE, HQC and NTRU LPrime specify 256 bit seeds.

>SIKE specifies 128, 192 or 256 bit seeds. NTRU and Streamlined

>NTRU Prime allow 256 bit seeds as an option. Classic McEliece

>cites McTiny which cites Classic McEliece so who knows what

>they're actually recommending.)

>>(Later email): I forgot FrodoKEM. That also specifies 128, 192 or 256 bit seeds.

>You should expect to recover the 256 bit seed for one of the N

>ciphertexts after 2^256/N trial encapsulations. For schemes that

>have confirmation values (Classic McEliece, HQC and NTRUPrime)

>you don't even need to compute the full encapsulation, just the

>confirmation value. This beats the sqrt(N) improvement you'd get

>from DOOM for the Category 5 parameter sets. I also don't think it's

>something to worry about (except maybe for SIKE's smaller seeds).

>I'd be more worried about reusing the same seed in encapsulation

>with two different public keys. Classic McEliece, BIKE and HQC

>don't include the public key when deriving the randomness. This

>leads to a trivial break for BIKE and HQC, and a much easier ISD

>problem for Classic McEliece. Confirmation values also mean it's

>trivial to spot when a seed has been reused in Classic McEliece

>and HQC.

Good point about multitarget attacks on seeds, and about the misuse risk from reusing a key when the PK isn't used in deriving randomness. It seems like these issues could be fixed at fairly minimal cost by (a) incorporating a public salt (or a longer seed, if the PKE can handle it) and (b) including the PK in the computation for randomness. (Presumably some extra work needs to be done to check that the security proof still works.) But I'm not sure what you would do to counteract the DOOM attack other than increasing the parameters. I agree that this stuff is all more concerning at category 1 than category 5 (and I suspect the rest of the team agrees with me, though I haven't explicitly checked), since category 1 security is closer to a security level that might become practical to attack in the foreseeable future. Do you think it's worth at least encouraging teams to patch the multitarget issues that are minimal cost to patch?

Ray Perlner

Jul 19, 2021, 7:20:29 PM7/19/21

to pqc-forum

Please stop changing the subject line. It makes it harder to

follow the discussion.

> [T]he comment "Confirmation values also mean it's trivial to

> spot when a seed has been reused in Classic McEliece" is

> - modulo the question of how many e values there are - simply

> repeating an attack stated in the submission (see "checking

> their hashes" quote above). This isn't acknowledged, so

> readers are left thinking, incorrectly, that this attack is

> omitted from the submission.

No, this is not the attack stated in the submission. The comment

refers to encapsulation reusing the same e with two different

public keys. The clue is in the word "reused".

> [R]egarding the number of e values, the same pqc-forum comment

> suggests that Classic McEliece generates e from a 256-bit seed.

> This is wrong; it's contradicted by the Classic McEliece

> specification.

Ruben confirmed that section 4.4 describes encapsulation where e

is generated from a 256 bit seed.

> [W]ell-known T-target 2^256/T attacks are being used to snipe

> at some submissions.

No, this is not sniping. My point was that a sqrt(T) improvement

from DOOM isn't interesting if there's already a generic T-target

attack. I also don't think a generic T-target attack is something

to worry about. The clue is in the phrase "I also don't think it's

something to worry about".

Kirk

Jul 19, 2021, 8:31:17 PM7/19/21

to Perlner, Ray A. (Fed), pqc-forum

Ray wrote:

> Good point about multitarget attacks on seeds, and

> about the misuse risk from reusing a key when the

> PK isn't used in deriving randomness. It seems like

> these issues could be fixed at fairly minimal cost

> by a) incorporating a public salt, (or a longer seed

> if the PKE can handle it.) AND b) including the PK

> in the computation for randomness. (Presumably some

> extra work needs to be done to check the security

> proof still works.)

Including the public key when deriving the randomness

for encapsulation is easy. Increasing the length of

the seed would be very difficult for some schemes.

> I'm not sure what you would do to counteract the

> DOOM attack other than increasing the parameters.

> I agree that this stuff is all more concerning at

> category 1 than category 5 (and I suspect the rest

> of the team agrees with me, though I haven't

> explicitly checked), since category 1 security is

> closer to a security level that might become

> practical to attack in the foreseeable future.

DOOM with N ciphertexts gives at most a sqrt(N)

improvement. The actual improvement will be less

than that. It's possible some of the Category 1

parameters already have enough of a security margin

to cope with a reasonable number of ciphertexts.
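
As a rough illustration of the bound (the 143-bit single-target figure below is a hypothetical Category 1 cost chosen for illustration, not a number from any submission):

```python
from math import log2

def doom_bound_bits(single_target_bits: float, n_ciphertexts: int) -> float:
    """Best case for the attacker under DOOM's at-most-sqrt(N) improvement:
    the attack cost in bits drops by at most 0.5 * log2(N)."""
    return single_target_bits - 0.5 * log2(n_ciphertexts)

# A hypothetical parameter set costing 2^143 single-target, attacked with
# 2^64 collected ciphertexts, loses at most 32 bits of security.
print(doom_bound_bits(143, 2**64))  # 111.0
```

The actual improvement will be smaller than this bound, which is the point made above: a modest security margin can already absorb a realistic number of ciphertexts.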

> Do you think it's worth at least encouraging teams

> to patch the multitarget issues that are minimal

> cost to patch?

That's your call but I wouldn't encourage teams to

make any major changes at this stage.

Kirk

Jul 20, 2021, 2:45:24 AM7/20/21

to pqc-forum

On 7/20/21 1:20 AM, Kirk Fleming wrote:

> Ruben confirmed that section 4.4 describes encapsulation where e

> is generated from a 256 bit seed.

I'd like to refer the interested reader to my post earlier in this

thread for my exact statement and its context.

Also:

On 7/18/21 9:29 PM, Kirk Fleming wrote:

> [...] Section 4.4 describes the generation of random

> objects from a 256 bit seed for seeded key generation. The submission

> doesn't explicitly say that a 256 bit seed is also used for

> generating the fixed-weight vector in seeded encapsulation. Kirk

Best regards

Ruben

Jul 20, 2021, 3:24:42 PM7/20/21

to Kirk Fleming, pqc-forum

Kirk wrote:

>Including the public key when deriving the randomness

>for encapsulation is easy. Increasing the length of

>the seed would be very difficult for some schemes.

What about defending against multitarget attacks by combining the seed with a public salt, included as part of the ciphertext? Is there some problem with that strategy?

Ray

**From:** Kirk Fleming <kpfl...@mail.com>

**Sent:** Monday, July 19, 2021 8:31 PM

**To:** Perlner, Ray A. (Fed) <ray.p...@nist.gov>

**Cc:** pqc-forum <pqc-...@list.nist.gov>

**Subject:** Re: [pqc-forum] Security strength categories for Code Based Crypto (And trying out Crypto Stack Exchange)

Jul 22, 2021, 5:11:46 AM7/22/21

to Perlner, Ray A. (Fed), pqc-forum

Dear all,

In the NIST PQC CFP, L1 security was defined as being at least as secure in the presence of quantum attacks as AES-128 (and similarly for the other levels). Of course, as already noted, a quantum attacker also has access to classical computers, so this implies that an L1 system should also be at least as secure as AES-128 in the presence of classical attacks.

The CFP covered any sort of attack, as it said "Any attack that breaks the relevant security definition must require computational resources comparable to or greater than those required for key search on a block cipher with a 128-bit key (e.g. AES128)". One may thus indeed consider multi-target attacks.

The issue I have is that if there are T AES-128 targets (e.g. in TLS communications, the mainstream application considered until now), the multi-target attack consisting of guessing a key for a known plaintext (the usual setting in HTTP/TLS and most other applications of TLS), encrypting it and checking through a hash table whether it matches one of the T targets, has an expected cost of 2**128/T iterations.

So an attack of cost 2**128/T against a NIST PQC KEM is not relevant, as it is no better than the multi-target attack against AES-128 in this case.

Not only would the suggested new requirement change the CFP from "being as secure" to "being significantly more secure" than AES, but in the mainstream application it makes no sense: if the multi-target attack on the KEM has a cost of 2**128, then the attacker can instead mount the multi-target attack on the subsequent AES-128 encryption that follows the KEM, at a cost of 2**128/T.

One may consider ciphertext-only multi-target attacks, but this is not mainstream, especially given the applications expected for the NIST candidates, and my personal opinion is that comparing the security of an asymmetric primitive with that of a symmetric primitive is not fair unless an encryption oracle is given to the attacker. If such an oracle is given, then AES-128 is again broken in 2**128/T.

Nicolas Aragon, speaking for myself, not as a member of a given team.

Jul 22, 2021, 9:18:11 AM7/22/21

to Nicolas Aragon, pqc-forum

Hi Nicolas,

You say: “The issue I have is that if there are T AES128 targets (e.g. in a TLS communication which is the mainstream application considered until now) the multi-target attack consisting of guessing a key for a known plaintext (which in HTTP/TLS and most of the other applications of TLS it is the usual setting) encrypting it and checking through a hash table if it is one of the T targets, has an expected cost of 2**128/T iterations.”

This is false. The mode of operation of AES used by TLS 1.3 was designed with protection against multi-target attacks in mind. See the discussion here for example:
https://crypto.stackexchange.com/questions/50025/why-does-tls-1-3-use-random-looking-nonces-for-aead , or the paper,
https://eprint.iacr.org/2018/993 , linked in that discussion which gives detailed analysis of the security of the mode. I believe the modes used in earlier versions of TLS, such as AES-CBC with explicit IV also
provide significant protection against multi-target attacks, although I’m not finding a good paper to cite for that one at present.

Best,

Ray

**From:** Nicolas Aragon <nicolas...@etu.unilim.fr>

**Sent:** Thursday, July 22, 2021 5:11 AM

**To:** Perlner, Ray A. (Fed) <ray.p...@nist.gov>

**Cc:** pqc-forum <pqc-...@list.nist.gov>

**Subject:** Re: [pqc-forum] Security strength categories for Code Based Crypto (And trying out Crypto Stack Exchange)

Jul 22, 2021, 5:08:14 PM7/22/21

to pqc-...@list.nist.gov

'Perlner, Ray A. (Fed)' via pqc-forum writes:

> This is false. The mode of operation of AES used by TLS 1.3 was designed with

> protection against multi-target attacks in mind.

Clarification requests: Can you please state quantitatively what you
mean by "protection", and explain why you believe that, for TLS 1.3,

the paper you cited contradicts the statement that "T AES 128 targets"

in "a TLS communication" allow a (2^128)/T attack?

---Dan

Jul 22, 2021, 6:06:46 PM7/22/21

to D. J. Bernstein, pqc-forum

Hi Dan.

From the introduction of https://eprint.iacr.org/2018/993.pdf

"We show, with precise tight bounds, that nonce randomization increases the mu security of AE,

and apply this insight to GCM-based AE, confirming an intuition initially put forward in the design

of TLS 1.3. We also show that already in-place nonce selection strategies in TLS 1.2 effectively

improve mu security."

This flatly contradicts the statement

""T AES 128 targets" in "a TLS communication" allow a (2^128)/T attack?""

since the latter statement would imply the attack cost is the same as it would be without nonce randomization.

You also request a quantitative statement of how much protection TLS 1.3 was designed to have against multi-target attacks. It is well known that the TLS designers did not do a rigorous analysis during the design process, so such a statement cannot be made. But the paper I cited gives a very precise analysis of the level of multi-user security the TLS 1.3 mode of operation actually provably achieves (in the ideal cipher model). It is the most up-to-date source I am aware of. If you have a newer one, please provide it.

Ray

-----Original Message-----

From: pqc-...@list.nist.gov <pqc-...@list.nist.gov> On Behalf Of D. J. Bernstein

Sent: Thursday, July 22, 2021 5:08 PM

To: pqc-forum <pqc-...@list.nist.gov>

Subject: Re: [pqc-forum] Security strength categories for Code Based Crypto (And trying out Crypto Stack Exchange)

--

You received this message because you are subscribed to the Google Groups "pqc-forum" group.

To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.

To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/20210722210757.38044.qmail%40cr.yp.to.

Jul 22, 2021, 7:39:59 PM7/22/21

to pqc-...@list.nist.gov

'Perlner, Ray A. (Fed)' via pqc-forum writes:

> This flatly contradicts the statement

Please clarify. Are you claiming specifically that "increases the mu

security" is contradicting the statement that "T AES 128 targets" in "a

TLS communication" allow a (2^128)/T attack? How do you conclude this?

> You also request a quantitative statement of how much protection TLS

> 1.3 was designed to have against multi target attacks.

No: the request, quoted above, was for you to state "quantitatively what you mean by 'protection'".

If there isn't a clear definition then, to reduce the risk of error and

comply with "NIST will perform a thorough analysis of the submitted

algorithms in a manner that is open and transparent to the public", the

word should be eliminated from the discussion.

---Dan

Sep 9, 2021, 3:23:11 AM9/9/21

to pqc-...@list.nist.gov

We would like to respond here to the above question, reported also on
[stackexchange], to which Dr. Kirshanova provided a first answer.

Summarizing for anyone who did not follow the stackexchange thread,
Dr. Kirshanova points out that:

1 - "could not reproduce the exact bit complexities from the mentioned

paper [1], the authors did not provide the source code. I'm posting

my estimators for MMT and BJMM attacks here."

2 - The estimates in our work ([paper]) consider only a three-level

BJMM algorithm, as described in the original work, which is optimal

in the dense-error regime. For Classic McEliece parameters, the

optimal depth is 2, and the claim is backed by a Sage script

[Sage_Code] estimating the running times for the MMT, BJMM-2 and

BJMM-3 algorithms. It is then claimed that "The conclusion that

BJMM algorithm is worse than MMT is incorrect because MMT is a

special case of BJMM."

R1: Concerning the first point, we note that the code computing the
paper's results was made public together with the paper [code], and
appears as reference [26] therein.

R2: Concerning the second point, we note that our approach to finding
the optimal parameters for the finite-regime estimation of the ISD
variants we considered involves an exhaustive search over their full
range.

This is relevant because the provided Sage script [Sage_Code] sets the
l_2 parameter of the MMT algorithm to its asymptotically optimal value,
unlike our tool [code].

While investigating point 2 further, we revisited the estimation
formulas reported in [paper] and found that the iteration count for the
MMT algorithm was inaccurate. Replacing it with the correct value,
however, yields only an approximately 2^6 increase in the required
computational effort for the largest Classic McEliece parameter set
analyzed in [stackexchange], and is thus not enough to close the ~2^30
gap between our estimates of BJMM-3 and MMT. We are sorry for this
issue; the current version of [code] and the following results employ
the correct formulas.

To provide a more direct comparison between the results of [sage_code]
and of our tool, we also added the exhaustive search for the optimal
value of l_2 to the Sage code (modified source attached as
MMT_estimation_sage.py, printout of the results as results_sage.txt,
and, for convenience, plots in plots.pdf).

Picking for the sake of brevity the [3488,2720,64] parameter set, we

obtain the following:

BJMM, Depth 3 cost: 132.92 (time) 30.87 (mem)

BJMM, Depth 2 cost: 127.12 (time) 65.49 (mem)

MMT, asymp. opt. value for l_2: 133.61 (time) 61.41 (mem)

MMT, exhaustive search for opt l_2: 103.51 (time) 45.31 (mem)

which is consistent with the trends in our estimates (i.e., an
optimized value of l_2 allows the MMT algorithm to be faster in the
finite regime being considered).
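As an independent sanity check on the order of magnitude of such figures, the classical Prange information-set-decoding estimate (far simpler than MMT or BJMM, so a baseline rather than a competitive attack cost) can be computed directly. A minimal sketch for the [3488, 2720, 64] parameter set, using the standard Prange formula rather than anything from [paper] or [sage_code]:

```python
# Minimal Prange ISD estimator (a rough baseline only -- NOT the MMT or
# BJMM formulas discussed above).  Expected iterations are
# binom(n, w) / binom(n - k, w): the inverse of the probability that a
# random information set is error-free.  Each iteration is dominated by
# Gaussian elimination, roughly n * (n - k)^2 bit operations.
from math import comb, log2

def prange_log2_cost(n: int, k: int, w: int) -> float:
    log2_iters = log2(comb(n, w)) - log2(comb(n - k, w))
    log2_gauss = log2(n) + 2 * log2(n - k)
    return log2_iters + log2_gauss

n, k, w = 3488, 2720, 64  # Classic McEliece mceliece348864
print(f"Prange estimate: 2^{prange_log2_cost(n, k, w):.2f} bit ops")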

We realize that this appears to contradict the fact that the BJMM
technique is an improvement over MMT; however, we note that BJMM is
proven to be an improvement in the asymptotic regime, and for dense
error vectors.

To investigate this point a little further, we provide a comparison of
the estimates produced by our tool and by the Sage script on two other
sets of parameters (data plotted in plots.pdf):

-- In Figure 2 we consider a code with the same length and dimension as
the aforementioned one and evaluate the ISD complexities while
increasing the error weight up to the GV bound. In the results, both
[sage_code] and [code] report that the l_2-optimized MMT is estimated
to be faster.

-- In Figure 3 we consider a code with rate 1/2 (closer to the
worst-case rate for ISD algorithms than the ~0.79 rate of the Classic
McEliece parameters) and vary the weight of the error to be found from
half of the GV bound up to the GV bound itself. In this case, our
estimates show that the BJMM-3 algorithm does indeed become faster than
MMT in the regime where the error vector has the highest weight.

A final note concerns memory access cost models: we considered each
memory access to cost log_2(size_of_memory) bit operations. The
rationale behind this is that the circuit depth of the address decoder
for a memory with size_of_memory cells cannot be shorter than
log_2(size_of_memory). We therefore included the cost of this hidden
computation, which is performed whenever something must be accessed
from an indexable memory, as a means to provide a tighter lower bound
on the required computation.

We acknowledge that the cost of actually fetching the data from the
memory device also includes the cost of transferring the information,
which is reasonably estimated as sqrt(size_of_memory) for planar memory
devices or cbrt(size_of_memory) for 3D memory (assuming the latter can
be realized and scaled without thermodynamic issues). The estimated
computation-equivalent cost of such a data transfer can thus be added
on top of the current estimates provided by our tool.
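For concreteness, here is a small sketch (my own illustration of the models described above, applied to the l_2-optimized MMT figures quoted earlier) of how each access-cost model shifts a (time, memory) estimate given in bits:

```python
# Fold a memory-access cost model into a bit-operation estimate.
# time_bits / mem_bits are log2 of the operation count and of the
# memory size M.  If every operation pays one access of cost f(M),
# the log2 cost becomes time_bits + log2(f(M)):
#   logarithmic model: f(M) = log2(M)  ->  + log2(mem_bits)
#   planar (2D) model: f(M) = sqrt(M)  ->  + mem_bits / 2
#   3D model:          f(M) = cbrt(M)  ->  + mem_bits / 3
from math import log2

def with_memory_cost(time_bits: float, mem_bits: float, model: str) -> float:
    penalty = {
        "log": log2(mem_bits),
        "sqrt": mem_bits / 2,
        "cbrt": mem_bits / 3,
    }[model]
    return time_bits + penalty

# MMT on [3488, 2720, 64] with exhaustively optimized l_2, from the
# figures above: 2^103.51 time, 2^45.31 memory.
for model in ("log", "sqrt", "cbrt"):
    print(model, round(with_memory_cost(103.51, 45.31, model), 2))
```

Note how mild the logarithmic model is compared to the sqrt model: the same attack moves only a few bits under the former but over twenty bits under the latter, which is why the choice of model matters when comparing memory-hungry ISD variants.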

Kind regards,

Alessandro, on behalf of the authors of [paper]

[stackexchange]

https://crypto.stackexchange.com/questions/92074/number-of-bit-operations-required-for-information-set-decoding-attacks-on-code-b

[paper] https://www.mdpi.com/1999-4893/12/10/209/pdf

[code] https://github.com/LEDAcrypt/LEDAtools

[sage_code] https://github.com/ElenaKirshanova/decoingEstimator

>> In any event, our intent is that if the parameters of the code-base

>> schemes are found to have inaccurate security estimates at such time

>> as we intend to standardize one or more of them, we will try to give

>> accurate security strength categories for the parameter sets we are

>> given, and only standardize those parameter sets that appear to at

>> least meet category 1 in whatever security models we deem most relevant.

>>

>> Thanks,

>>

>> Ray Perlner (NISTPQC)

>

> --

> You received this message because you are subscribed to the Google
> Groups "pqc-forum" group.

> To unsubscribe from this group and stop receiving emails from it, send
> an email to pqc-forum+...@list.nist.gov.

> To view this discussion on the web visit
> https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/DM6PR09MB48551334723993B099E7D5C69C139%40DM6PR09MB4855.namprd09.prod.outlook.com
