Announcement of NTRU-HRSS-KEM and NTRUEncrypt merger

John Schanck

Nov 29, 2018, 7:05:23 PM
to pqc-...@list.nist.gov
Dear pqc-forum,

I am pleased to announce that the NTRU-HRSS-KEM and NTRUEncrypt teams
plan to merge their submissions.

In short, the merged submission will include the variant of
NTRU-HRSS-KEM that was proposed by Saito, Xagawa, and Yamakawa in [1].
(The changes are limited to the CCA conversion). It will also include a
variant of NTRUEncrypt that mirrors this construction as closely as
possible. The ntruhrss701 parameter set will be included. The NTRU-443
and NTRU-743 parameter sets will be replaced with parameter sets that
provide perfect correctness.

Here is a more detailed sketch of our planned merger.

* Design decisions that follow NTRU-HRSS-KEM:

- The parameter sets that we propose will all provide perfect
correctness.

- We will include parameter sets that are compatible with sampling
f, g, r, and m from the uniform distribution.

- We will eliminate the need for invertibility checks during key
generation.

- We will ensure that public keys and ciphertexts map to zero under
the "evaluate at 1" homomorphism.

* Design decisions that follow NTRUEncrypt:

- We will include the option of sampling f, g, r, and m from sets of
fixed-type polynomials (i.e. with a prescribed number of +1, -1,
and 0 coefficients). See the illustrative sketch after this list.

* Features of NTRU-HRSS-KEM that will _not_ be included:

- The length-preserving plaintext confirmation hash.
(An artifact of the old CCA conversion.)

- The technique of sampling r from the hash of m, and the prescribed
sampling routine that was needed for the CCA conversion.

* Features of NTRUEncrypt that will _not_ be included:

- Parameters with decryption failures.

- The direct construction of a CCA PKE using the NAEP transformation.

- The ss-ntru-pke and ss-ntru-kem constructions.

- Private keys of the form 1+pF.
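
For illustration only, here is a rough Python sketch of "fixed-type"
sampling and of the "evaluate at 1" property mentioned above. The
parameters (701 coefficients, 200 coefficients each of +1 and -1) are
placeholders, not the merged submission's choices, and this is not the
reference code:

    import random

    def sample_fixed_type(n, num_plus, num_minus):
        # Ternary polynomial of degree < n with a prescribed number of +1
        # and -1 coefficients; all remaining coefficients are 0.
        # (A real implementation would use a cryptographically secure,
        # constant-time shuffle; random.shuffle is for illustration only.)
        coeffs = [1] * num_plus + [-1] * num_minus \
               + [0] * (n - num_plus - num_minus)
        random.shuffle(coeffs)
        return coeffs

    def evaluate_at_one(poly):
        # The "evaluate at 1" homomorphism is just the coefficient sum.
        return sum(poly)

    # Equal numbers of +1 and -1 coefficients give a polynomial that maps
    # to zero under evaluation at 1.
    f = sample_fixed_type(701, 200, 200)
    assert evaluate_at_one(f) == 0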


Best,
John (on behalf of the NTRU-HRSS-KEM and NTRUEncrypt teams)


[1] https://eprint.iacr.org/2017/1005


D. J. Bernstein

Dec 1, 2018, 7:33:39 AM
to pqc-...@list.nist.gov
In case people are interested, here's a brief comparison of (1) the NTRU
Prime submission and (2) the merged NTRU Encrypt/NTRU HRSS proposal. I'm
speaking for myself here.


1. The main point of the NTRU Prime submission is to reduce the attack
surface for small lattice-based systems. Here are five aspects of this:

Encrypt HRSS merged Prime feature
------- ---- ------ ----- ----------------------------------------------
     no  yes    yes   yes avoid decryption failures
     no  yes    yes   yes irreducible polynomial
     no   no     no   yes avoid nontrivial number-field automorphisms
     no   no     no   yes prime modulus q
     no   no     no   yes the ring is a field

HRSS already eliminated decryption failures, and already eliminated the
x-1 factor from traditional NTRU (by multiplying various polynomials by
x-1). The merged proposal preserves these features. However, the table
shows important differences in the polynomial choice (cyclotomic vs.
x^n-x-1), and in various aspects of the choice of q. See below.


2. There are further differences in options where every choice has
security concerns---there isn't a choice that clearly simplifies attack
analysis:

Encrypt HRSS merged Prime feature
------- ---- ------ ----- ----------------------------------------------
    yes   no    yes   yes options with fixed-weight multipliers
     no  yes    yes    no options with uniform multipliers
------- ---- ------ ----- ----------------------------------------------
    yes  yes    yes    no options with random errors/"RLWE" (Noisy NTRU)
     no   no     no   yes options with rounding/"RLWR" (Rounded NTRU)
------- ---- ------ ----- ----------------------------------------------
    yes  yes    yes   yes "NTRU" options (Quotient NTRU)
     no   no     no   yes "LPR"/"RLWE"/"RLWR" options (Product NTRU)

Of course submissions offering two options always argue that both
options are of interest, and submissions picking one option will argue
that this option is better. We've already seen arguments that

* fixed-weight multipliers produce smaller sizes,
* uniform multipliers produce better speeds,
* rounding produces smaller sizes,
* Quotient NTRU has better encapsulation and decapsulation speeds,
* Quotient NTRU has smaller ciphertext sizes,
* Product NTRU has better key-generation speeds, and
* Product NTRU has smaller key sizes.

I'm not saying that these choices don't _affect_ security analysis; I'm
saying that it isn't clear which choice _simplifies_ security analysis.
The bottom of this message runs through an example.


3. The biggest point of debate is the choice of cyclotomic vs. x^n-x-1.
The NTRU Prime submission

* points to various attacks exploiting automorphisms---
* the quantum break of Gentry's original STOC 2009 FHE system
for cyclotomics,
* the non-quantum break of an analogous multiquadratic system,
and
* the quantum attacks on cyclotomic Ideal-SVP that pushed the broken
approximation factors halfway from exponential to polynomial
---as evidence that the security impact of automorphisms has not
been adequately studied;

* uses the idea (proposed before all of these attacks!) to switch
from a cyclotomic field to a number field with a large Galois
group, so that automorphism computations are hard; and

* shows that this switch is compatible with excellent performance.

There are several other lattice submissions that can protect against the
possibility of cyclotomics being broken, but NTRU Prime is by far the
most efficient. Here are representative examples of public-key size +
ciphertext size in non-cyclotomic options in lattice submissions:

* Streamlined NTRU Prime 4591^761: 1218 bytes + 1047 bytes.
* LOTUS 128: 658944 bytes + 1144 bytes.
* Titanium CCA lite: 14720 bytes + 3008 bytes.
* Round2 n1 l1: 3455 bytes + 4837 bytes.
* Frodo 640: 9616 bytes + 9736 bytes.
* EMBLEM II.c: 10016 bytes + 14792 bytes.
* Lizard N663: 1390592 bytes + 10896 bytes.
* Odd Manhattan 128: 1626240 bytes + 180224 bytes.

NTRU Prime is the only lattice submission that eliminates cyclotomic
structure while still fitting each public key and each ciphertext into
the guaranteed IPv6 MTU, 1280 bytes.

(I'm not saying that these other submissions are pointless. Titanium
argues that it can protect against cyclotomics being broken _and_ the
NTRU Prime ring being somehow broken, as long as there's some other
polynomial that isn't broken. The other submissions listed above argue
that they can protect against _all_ polynomials being broken, as long as
unstructured lattices aren't broken. Another protection option is stated
in the NTRU Prime paper: "one has to ask whether using the increased
costs in other ways, such as larger lattice dimensions, would produce a
larger security benefit".)

In the opposite direction, the obvious way to argue for submissions
using cyclotomics is that the cyclotomic structure of traditional NTRU
hasn't enabled any known attacks, beyond some lost bits of security from
the n-fold symmetry. Perhaps all these submissions worrying about the
cyclotomic structure are being unnecessarily paranoid. To be clear: I'm
not saying that any submissions are known to be breakable via the
cyclotomic structure.


4. Regarding the choice of modulus, there are also various attack ideas,
although they haven't killed any targets as high-profile as Gentry's
STOC 2009 FHE system for cyclotomics:

* 2004 Silverman--Smart--Vercauteren proposed an idea for exploiting
reductions mod powers of 2, with the traditional NTRU choice of the
modulus q as a power of 2.

* A series of papers starting with 2014 Eisenträger--Hallgren--Lauter
proposed an idea for exploiting factors of the polynomial modulo q,
and gave some examples where the idea works.

As far as I know, the following statement has been continuously true
since 2014: accurately describing the cases of Ring-LWE that are known
to be broken requires inspecting not just

* the error distribution,
* the size of q, and
* the size of the polynomial,

but also

* the factors of the polynomial modulo q.

Eliminating those factors limits the flexibility of these attacks,
simplifying attack analysis.
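
To make "the factors of the polynomial modulo q" concrete, here is a
small sympy sketch. The toy values (degree 12, p=13, q=7) are for
illustration only and are not anyone's proposed parameters:

    from sympy import symbols, factor_list, Poly

    x = symbols('x')
    q = 7  # toy modulus, not a real parameter

    # x^n - 1 always has the factor x - 1 mod q (and typically splits further):
    print(factor_list(x**12 - 1, modulus=q))

    # An NTRU Prime-style parameter search keeps only pairs (p, q) for which
    # x^p - x - 1 stays irreducible mod q, so no such factors exist:
    print(Poly(x**13 - x - 1, x, modulus=q).is_irreducible)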

In the opposite direction, again the obvious way to argue for HRSS etc.
is to say that so far nobody has been able to use these factorizations
to break traditional NTRU (or newer proposals such as New Hope). Of
course there are cases where polynomial factors have been used to break
real PKE proposals (the x-1 factor in Giophantus; the x-1 factor in
Round5; I wouldn't be surprised if the x^504+1 example in R.EMBLEM is
similarly breakable), but these are glaring examples of polynomials
being reducible for all q. So far the possibility of an irreducible
polynomial having a q-dependent weakness doesn't seem to have caught any
real proposals.


5. I think the speed of multiplication in NTRU HRSS is noteworthy, and
is a major reason for NTRU HRSS being at the top of the benchmark list
for both encapsulation and decapsulation:

* Encapsulation: 36828 ntruhrss701, 45468 sntrup4591761, 72632
kyber512, etc.

* Decapsulation: 68460 ntruhrss701, 72288 kyber512, 94744
sntrup4591761, etc.

These are cycle counts on the standard Haswell benchmarking platform for
submissions aiming for security against chosen-ciphertext attacks. I
wouldn't be surprised if the merged proposal stays at the top (SXY has
slightly more work in decapsulation but there's now less hashing). On
the other hand, this speed requires a power-of-2 modulus.


6. To summarize, within small lattice-based systems, NTRU Prime and
NTRU Encrypt/NTRU HRSS have different and apparently incompatible goals.
All of the design choices in NTRU Prime are prioritizing the simplicity
of security analysis, even if this means sacrificing some speed.


7. Beyond this basic difference in goals, there's a discrepancy in CCA
conversion details, which I think can be easily resolved:

* I think everyone should be using "implicit rejection" as in
Persichetti/HHK/SXY, where the KEM failure oracle is eliminated in
favor of pseudorandom outputs.

* Beyond this, I think there's value in a 32-byte Dent hash. Section
17 of

https://cr.yp.to/papers/tightkem-20180528.pdf

has some arguments for this, and here's an additional argument
(which I learned from Lange): checking a hash right after
decryption seems to make it easier to stop side-channel attacks
against reencryption.

I don't think the much longer hash in HRSS is useful. This longer hash
isn't in the merged proposal.
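
To make these two ingredients concrete, here is a schematic Python sketch
of encapsulation and decapsulation with a 32-byte plaintext confirmation
hash and implicit rejection. This is not any submission's specified
conversion; pke_encrypt/pke_decrypt stand in for a deterministic PKE, and
the domain-separation labels are placeholders:

    import hashlib, os

    def H(*parts):
        h = hashlib.sha3_256()
        for p in parts:
            h.update(p)
        return h.digest()

    def encaps(pk, pke_encrypt):
        m = os.urandom(32)
        c1 = pke_encrypt(pk, m)   # deterministic PKE encryption of m
        c2 = H(b"confirm", m)     # 32-byte plaintext confirmation (Dent hash)
        K = H(b"key", m, c1, c2)  # session key
        return (c1, c2), K

    def decaps(sk, pk, s, ct, pke_encrypt, pke_decrypt):
        c1, c2 = ct
        m = pke_decrypt(sk, c1)
        if m is not None and pke_encrypt(pk, m) == c1 and H(b"confirm", m) == c2:
            return H(b"key", m, c1, c2)
        # Implicit rejection: no failure oracle; invalid ciphertexts yield a
        # pseudorandom key derived from a secret value s and the ciphertext.
        return H(b"reject", s, c1, c2)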


8. Finally, here's the promised example of how two options can require
two different security analyses with different complications.

Here are two different approaches to obtaining tight ROM CCA2 proofs:

* Quotient NTRU: Streamlined NTRU Prime produces a deterministic PKE
as a natural side effect of rounding. This enables proofs from
OW-CPA. The merged proposal also does a little more work than HRSS
(as suggested by SXY) to produce a deterministic PKE.

* Product NTRU: NTRU LPRime uses the "T" transform to obtain a
deterministic PKE, so tight ROM CCA2 proofs have to
* assume IND-CPA rather than just OW-CPA or
* make a more complicated assumption of OW-CPA after "T".
As far as I know, all LPR-type submissions (and non-ring
submissions such as Frodo) have similar issues.

People often decompose the PKE-breaking problem, whether it's OW-CPA or
IND-CPA, into a key problem and a ciphertext problem:

* For Quotient NTRU, breaking the OW-CPA problem requires (1)
distinguishing the key from random or (2) solving the random-key
version of the OW-CPA problem.

* For Product NTRU, breaking the IND-CPA problem requires (1)
distinguishing the key from random or (2) solving the random-key
version of the IND-CPA problem.

These decompositions aren't two-way---e.g., a key distinguisher doesn't
necessarily break the original OW-CPA problem---but _if_ we've studied
the individual problems _and_ we think that those problems are hard then
these decompositions tell us that the PKEs are secure.

A closer look at the problem details shows several different problems
that aren't easy to analyze. Comparing these problems shows potential
security losses for Quotient NTRU, and potential security losses for
Product NTRU:

* The Quotient NTRU key problem is homogeneous where the Product NTRU
key problem is not. Maybe this makes the Quotient NTRU key problem
easier to break. There are known attacks on variants with extreme
parameters.

Stehlé and Steinfeld proposed parameters with a proof of hardness
of the Quotient NTRU key problem, but this requires large keys.

* The Product NTRU ciphertext problem gives the attacker more
information than the Quotient NTRU ciphertext problem. Maybe this
makes the Product NTRU ciphertext problem easier to break. Again
there are known attacks on variants with extreme parameters (many
more "samples").

There are theorems relating both of these problems (and the Product
NTRU key problem) to certain lattice problems. However, even under
the questionable assumption that those problems are hard, the
theorems are so loose that they say nothing about small-key
parameters. Security analysis has to consider attacks against the
problems with the real parameters---otherwise NIST could end up
standardizing a weak system that nobody actually tried breaking!

* The Product NTRU ciphertext problem is an IND problem, as noted
above, while the Quotient NTRU ciphertext problem is an OW problem.
Maybe this makes the Product NTRU ciphertext problem easier to
break, as in some other IND-vs-OW situations. Again there are no
tight theorems.

The bottom line is that the attack analyses have complications both
ways. As noted in the NTRU Prime submission, it isn't at all clear which
way is more secure.

Of course, getting rid of unnecessary options is the most important way
to simplify security analysis, but at this point it's really hard to
guess whether further study of attacks will end up with a better picture
for Quotient NTRU or for Product NTRU. More research is required.

For comparison, consider, e.g., eliminating decryption failures. This is
a pure win from the perspective of the simplicity of attack analysis.
Decryption failures raise complicated questions of (1) how often the
attacker can trigger these failures, and (2) how this affects security.
Decryption failures don't create any compensating simplifications in the
attack analysis. Decryption failures have some performance benefit, but
that's a different issue.

---Dan

EL HASSANE LAAJI

Dec 1, 2018, 12:55:26 PM
to pqc-...@list.nist.gov


---------- Forwarded message ----------
From: EL HASSANE LAAJI <e.l...@ump.ac.ma>
Date: Saturday, December 1, 2018
Subject: [pqc-forum] Announcement of NTRU-HRSS-KEM and NTRUEncrypt merger
To: John Schanck <jsch...@uwaterloo.ca>


Hello,
I'm pleased to inform you that I increased the speed performance of
NTRUEncrypt1024 by more than 500, with my own multiplication algorithm
rather than using the NTT algorithm.
If you want to merge your release, I would be pleased to collaborate with you.
Best Regards.
Hassan from Mohammed First University Morocco.

Mike Hamburg

Dec 2, 2018, 2:31:27 AM
to D. J. Bernstein, pqc-...@list.nist.gov
Changing the title because this response isn’t about NTRU HRSS.

> On Dec 1, 2018, at 4:33 AM, D. J. Bernstein <d...@cr.yp.to> wrote:
>
> There are several other lattice submissions that can protect against the
> possibility of cyclotomics being broken, but NTRU Prime is by far the
> most efficient. Here are representative examples of public-key size +
> ciphertext size in non-cyclotomic options in lattice submissions:
>
> * Streamlined NTRU Prime 4591^761: 1218 bytes + 1047 bytes.
> * LOTUS 128: 658944 bytes + 1144 bytes.
> * Titanium CCA lite: 14720 bytes + 3008 bytes.
> * Round2 n1 l1: 3455 bytes + 4837 bytes.
> * Frodo 640: 9616 bytes + 9736 bytes.
> * EMBLEM II.c: 10016 bytes + 14792 bytes.
> * Lizard N663: 1390592 bytes + 10896 bytes.
> * Odd Manhattan 128: 1626240 bytes + 180224 bytes.
>
> NTRU Prime is the only lattice submission that eliminates cyclotomic
> structure while still fitting each public key and each ciphertext into
> the guaranteed IPv6 MTU, 1280 bytes.

ThreeBears is also a non-cyclotomic lattice scheme, where for
claimed class II the 804-byte public key and 917-byte ciphertext
each fit into a 1280-byte IPv6 packet. Its computational efficiency
is competitive with NTRU HRSS, NTRU Prime, and even the
cyclotomic schemes.

It also shares NTRU Prime's feature that the ring is a field (and thus,
is mod an irreducible polynomial) and that the field has no nontrivial
automorphisms. This is true both in the obvious sense that Z/NZ
can’t have nontrivial automorphisms, and also in that Q(alpha) for
alpha a root of x^D - x^(D/2) - 1 (D >= 4) doesn't have nontrivial
automorphisms. Thanks to Hart Montgomery and Arnab Roy for
pointing this out.

Unlike NTRU Prime’s field, this field doesn’t have the full S_p Galois
group. I also don’t mean to say that these features make ThreeBears
a more conservative scheme than all the others. Integer MLWE hasn’t
been studied much, but I attempted to offset this by using a field and
by setting a relatively high security target.



ThreeBears has an additional ease-of-analysis feature, which is
common but not universal among the lattice KEMs: it calculates its
coins as H(pk.seed,m) instead of just H(m). This protects against
multi-target issues such as the concern about LAC. I mention this
because I think this technique is cheap and effective enough to be
worth putting in essentially every LPR-like system.
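
A minimal sketch of the difference, with SHAKE-256 standing in for
whatever hash a real scheme would specify (the labels and output length
are illustrative):

    import hashlib

    def coins_unseparated(m):
        # coins = H(m): the same hash work applies to every public key at once.
        return hashlib.shake_256(b"coins" + m).digest(32)

    def coins_separated(pk_seed, m):
        # coins = H(pk.seed, m): each hash evaluation is bound to one public
        # key, so an attacker computing many hashes targets one key per hash.
        return hashlib.shake_256(b"coins" + pk_seed + m).digest(32)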

There is a possible exception for perfectly-correct schemes that
transport 256-bit keys, such as NTRU LPRime, but even in this case
there would be a small security benefit: it salts the Dent hash and the
final key, which makes it harder for an attacker to use rainbow tables
to exploit RNG failures.

ThreeBears does not have all the other ease-of-analysis features
in NTRU Prime and HRSS. I’m reluctant to make any major changes
because the system is in a reasonable local optimum, and I don’t want
to present too many parameter sets or a moving target. I will probably
add some kind of implicit rejection if ThreeBears survives to the second
round — I don’t believe it’s required for security, but it removes a footgun.
The other features will probably remain the same or similar, depending
on the state of analysis at the time:

* Probably no Dent hash unless there’s a really compelling side channel
use case — I think the usefulness in proofs is mostly an artifact.

* Hopefully a more rigorous failure analysis. Depending on the results,
possibly a small parameter change to reduce (or increase?) the failure
rate. Probably not perfect correctness, but I’ll calculate the cost of this.



Out of curiosity, have the NTRU Prime folks (or anyone else) calculated
the efficiency difference between the current x^p - x - 1 and a
ThreeBears-style ring like x^p - x^((p-1)/2) - 1? You would represent this
ring as, e.g., x^((1+p)/2) - 1 - x^((1-p)/2), which reduces the error
amplification by something like 25%. This should give a small efficiency
boost without compromising the other nice properties of NTRU (L)Prime.



On an unrelated note, thanks to Dan and the other SUPERCOP
folks for your work on benchmarking. Merging other people’s code is
a lot of effort.

Cheers,
— Mike

John Schanck

Dec 3, 2018, 11:25:32 AM
to pqc-...@list.nist.gov
Dear pqc-forum,

A more complete description of the KEMs that will be included in the
merged NTRU-HRSS / NTRUEncrypt submission is now available as part of a
tech report titled "A Comparison of NTRU Variants."

https://eprint.iacr.org/2018/1174

Note that the parameter sets that I have highlighted in Section 7 are
ones that I personally find interesting -- nothing more. In particular,
there is no guarantee that these parameter sets will be recommended in
any future submission to NIST.

Cheers,
John



D. J. Bernstein

Dec 3, 2018, 7:22:04 PM
to pqc-...@list.nist.gov
Mike Hamburg writes:
> Q(alpha) for alpha a root of x^D - x^(D/2) - 1 (D >= 4) doesn't have
> nontrivial automorphisms.

Not true. If D/2 is even then x |-> -x is a nontrivial automorphism. If
D/2 is odd then x |-> -1/x is a nontrivial automorphism.
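
A quick check of both maps with sympy, for small illustrative D (D = 8
with D/2 even, D = 10 with D/2 odd):

    from sympy import symbols, simplify

    x = symbols('x')
    P = lambda D, t: t**D - t**(D//2) - 1

    # D/2 even: x |-> -x fixes the defining polynomial, so it permutes the roots.
    assert simplify(P(8, -x) - P(8, x)) == 0

    # D/2 odd: x |-> -1/x maps roots to roots (clear denominators by -x^D).
    assert simplify(P(10, -1/x) * (-x**10) - P(10, x)) == 0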

There's also a much longer list of automorphisms lurking only slightly
behind the curtain. As stated in the NTRU Prime paper:

... having a Galois group of size, say, 2p means that one can write
down a degree-2p extension field with 2p automorphisms, and one can
then try to apply these automorphisms to build many units,
generalizing the construction of cyclotomic units.

These are automorphisms of a small extension field, not automorphisms of
the original field, but this distinction doesn't matter if computations
of the automorphisms are usable to build short units for an attack.

Have you analyzed whether Gentry ThreeBears, a Gentry-style system using
x^D-x^(D/2)-1, is breakable in this way? Offhand I'd say that (1) the
Galois closure has degree O(D^2), small enough to allow fast
computations; and (2) there are a ton of easily computed short units,
including the golden ratio phi, the (D/2)th cyclotomic units, and
differences such as phi-zeta_{D/2}.

> this field doesn’t have the full S_p Galois group.

We're talking about O(p^2), I think, versus factorial(p). The NTRU Prime
paper requires factorial(p) and explains the advantage as follows:

Having a much larger Galois group means that P will have at most a
small number of roots in any field of reasonable degree. This
eliminates all known methods of efficiently performing computations
with more than a small number of automorphisms.

For O(p^2) this is not an obstacle: computations on elements of the
Galois closure, including all automorphisms, take time polynomial in p.

> ThreeBears is also a non-cyclotomic lattice scheme

We had classified ThreeBears as "integer ring" separately from "lattice"
so it wasn't part of my comparison here. Also, if the obvious attack
strategy against Gentry ThreeBears exploits cyclotomic units, then is it
useful to say "non-cyclotomic"?

More importantly, NTRU Prime explicitly avoids not just cyclotomics, but
all Galois groups of non-maximal size. This didn't matter for the
comparison I was stating, where all systems were either

(1) small and cyclotomic,
(2) small and NTRU Prime, or
(3) much larger,

but if you're going to extend the comparison then you should account for
the full NTRU Prime design criteria.

> ThreeBears has an additional ease-of-analysis feature, which is
> common but not universal among the lattice KEMs: it calculates its
> coins as H(pk.seed,m) instead of just H(m). This protects against
> multi-target issues such as the concern about LAC.

I agree that we've seen examples of decryption-failure attacks that
would have been drastically slowed down by this extra hash input. But
how is this an "ease-of-analysis feature"?

The following approach to multi-target attacks sounds much easier to
analyze: "Our general strategy for handling multi-target attacks is to
aim for a very high single-target security level, and then rely on the
fact that T-target attacks gain at most a factor T. We have not
introduced complications aimed at making multi-target attacks even more
difficult."

That's a quote from the NTRU Prime submission. Obviously this approach
isn't available for submissions that are, for efficiency, cutting things
close with decryption failures.

> Out of curiosity, have the NTRU prime folks (or anyone else)
> calculated the efficiency difference between the current x^p - x - 1
> and a ThreeBears style ring like x^p - x^((p-1)/2) - 1?

The size loss of x^p-x-1 compared to NTRU Classic is measurable but
small. See the interesting graphs just posted by John Schanck:

https://eprint.iacr.org/2018/1174

The loss of x^p-x-1 compared to other trinomials should be even smaller.

My impression is that John didn't account for equivalent keys, which
speed up known attacks against NTRU Classic more than they speed up
known attacks against x^p-x-1. This should somewhat offset the size loss
from higher product weights.

Anyway, to answer your question, the NTRU Prime perspective is that the
risk of further attacks exploiting automorphisms---and the difficulty
that this risk creates for security analysis---outweighs any of these
small efficiency differences. The NTRU Prime submission also skips the
following 2016 suggestion from Dan Shepherd, which might interest you:

* Generate a random vector mod 2 rather than mod 3.
* Hash the vector to select some bits of the vector to negate. This
produces the original range -1,0,1.
* In decapsulation, recover the random vector mod 2, and use the same
hash to recover each -1,0,1.

This complication slightly improves efficiency.
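
A rough Python sketch of the trick; the hash choice and encoding are
placeholders, and the point is only that both sides can redo the hash step:

    import hashlib

    def to_ternary(bits):
        # bits: list of 0/1 entries (the random vector mod 2).  Hash the
        # vector to decide which nonzero entries to negate; this maps the
        # vector into the range -1, 0, 1.
        mask = hashlib.shake_256(bytes(bits)).digest(len(bits))
        return [0 if b == 0 else (1 if mask[i] & 1 else -1)
                for i, b in enumerate(bits)]

    # Decapsulation recovers the vector mod 2 (i.e., which entries are
    # nonzero), then applies the same hash to recover each -1, 0, 1 exactly.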

---Dan

Mike Hamburg

Dec 3, 2018, 9:36:48 PM
to D. J. Bernstein, pqc-...@list.nist.gov


> On Dec 3, 2018, at 4:21 PM, D. J. Bernstein <d...@cr.yp.to> wrote:
>
> Mike Hamburg writes:
>> Q(alpha) for alpha a root of x^D - x^(D/2) - 1 (D >= 4) doesn't have
>> nontrivial automorphisms.
>
> Not true. If D/2 is even then x |-> -x is a nontrivial automorphism. If
> D/2 is odd then x |-> -1/x is a nontrivial automorphism.

Whoops, my mistake.

> Have you analyzed whether Gentry ThreeBears, a Gentry-style system using
> x^D-x^(D/2)-1, is breakable in this way? Offhand I'd say that (1) the
> Galois closure has degree O(D^2), small enough to allow fast
> computations; and (2) there are a ton of easily computed short units,
> including the golden ratio phi, the (D/2)th cyclotomic units, and
> differences such as phi-zeta_{D/2}.

I have not, but it might be worth looking into.

> More importantly, NTRU Prime explicitly avoids not just cyclotomics, but
> all Galois groups of non-maximal size. This didn't matter for the
> comparison I was stating, where all systems were either
>
> (1) small and cyclotomic,
> (2) small and NTRU Prime, or
> (3) much larger,
>
> but if you're going to extend the comparison then you should account for
> the full NTRU Prime design criteria.

Of the systems proposed to the NIST process which satisfy the NTRU
Prime design criteria, NTRU Prime is the smallest.

>> ThreeBears has an additional ease-of-analysis feature, which is
>> common but not universal among the lattice KEMs: it calculates its
>> coins as H(pk.seed,m) instead of just H(m). This protects against
>> multi-target issues such as the concern about LAC.
>
> I agree that we've seen examples of decryption-failure attacks that
> would have been drastically slowed down by this extra hash input. But
> how is this an "ease-of-analysis feature"?
>
> The following approach to multi-target attacks sounds much easier to
> analyze: "Our general strategy for handling multi-target attacks is to
> aim for a very high single-target security level, and then rely on the
> fact that T-target attacks gain at most a factor T. We have not
> introduced complications aimed at making multi-target attacks even more
> difficult."
>
> That's a quote from the NTRU Prime submission. Obviously this approach
> isn't available for submissions that are, for efficiency, cutting things
> close with decryption failures.

Since NTRU Prime doesn’t have decryption failures, this feature isn’t required
to thwart a known attack. As I said in my previous email, it is less relevant
to NTRU Prime, but it isn’t completely irrelevant either.

Adding the public key (or its hash or seed) to the hash means that, if an
attacker is calculating many hashes toward some attack, then he is only
going after a specific target (or at most a few, in the unlikely event of a
collision in the hash or seed of the public key). Therefore, you don’t have
to worry about multi-target attacks on the FO mode, just on the underlying
KEM.

The ideal case would be to prove a theorem that says:
Hardness of underlying problem
-> Multi-target CPA security of KEM
-> Multi-target CCA security of KEM+FO
at least in the [Q]ROM, with as little tightness loss as possible. In principle
the FO step could be generic. This is already difficult, but seems much
more difficult if you don’t domain separate the hash. The fact that quantum
FO analysis already incurs a large tightness loss doesn’t help this state of
affairs.

One can certainly also just divide the estimated security by T, but that requires
designing for Class III single-target security (e.g.) and then using the resulting
system as Class I, or V -> III, and overdesigning this way is unappealing.
NTRU Prime’s evaluation technique is already less conservative than other
systems’ — they would call its parameter set class II or III instead of V. Just
saying that you can divide that by T=2^64 and still be OK is unconvincing —
someone will actually have to do the analysis, and that analysis will be easier
with domain separation.

Alternatively, one can just claim that there are no known multi-target attacks,
but it seems better to have simple and concrete reasons why multi-target
attacks are impossible.

Each feature like this adds complexity and may hurt performance, while
possibly improving security. The same is true for Dent hashes, implicit
rejection, etc. Some such features will be worthwhile and others not, but
I don’t think hashing the public key should be dismissed out of hand as a
pointless complication.

>> Out of curiosity, have the NTRU prime folks (or anyone else)
>> calculated the efficiency difference between the current x^p - x - 1
>> and a ThreeBears style ring like x^p - x^((p-1)/2) - 1?
>
> The size loss of x^p-x-1 compared to NTRU Classic is measurable but
> small. See the interesting graphs just posted by John Schanck:
>
> https://eprint.iacr.org/2018/1174
>
> The loss of x^p-x-1 compared to other trinomials should be even smaller.
>
> My impression is that John didn't account for equivalent keys, which
> speed up known attacks against NTRU Classic more than they speed up
> known attacks against x^p-x-1. This should somewhat offset the size loss
> from higher product weights.
>
> Anyway, to answer your question, the NTRU Prime perspective is that the
> risk of further attacks exploiting automorphisms---and the difficulty
> that this risk creates for security analysis---outweighs any of these
> small efficiency differences.

I’m not up to date on my Galois theory — is it not possible for

Q[x] / (x^p - x^((p-1)/2) - 1)

to have a full Galois group? Or is it common that it will have the full S_p but
hard to prove? That field doesn’t have the obvious automorphism you
noticed in ThreeBears, and arithmetic in this ring should be just as easy to
implement as mod x^p - x - 1 (depending on your Karatsuba strategy, etc.,
of course).

> The NTRU Prime submission also skips the
> following 2016 suggestion from Dan Shepherd, which might interest you:
>
> * Generate a random vector mod 2 rather than mod 3.
> * Hash the vector to select some bits of the vector to negate. This
> produces the original range -1,0,1.
> * In decapsulation, recover the random vector mod 2, and use the same
> hash to recover each -1,0,1.
>
> This complication slightly improves efficiency.

That is an interesting little trick, thanks for the reminder.

> —Dan


Cheers,
— Mike

D. J. Bernstein

Dec 7, 2018, 11:17:27 PM
to pqc-...@list.nist.gov
Focusing here on comparing the following two responses to multi-target
attacks:

* Response #1: include a key prefix in the hash used in the CCA2
conversion, a long enough prefix that prefixes of user keys don't
collide.

* Response #2: "aim for a very high single-target security level, and
then rely on the fact that T-target attacks gain at most a factor
T".

My impression is that neither of these will be part of decisions at the
end of round 1---

* https://groups.google.com/a/list.nist.gov/d/msg/pqc-forum/WxRmVPhAENw/I5yhtr9dAgAJ
indicates that NIST is treating CCA2 conversion choices as "minor
changes (tweaks)" for the beginning of round 2.

* NIST also doesn't seem concerned with the exact security levels in
round 1: "We’re not going to kick out a scheme just because they
set their parameters wrong".

---but surely these issues will have to receive more attention as the
post-quantum standardization project progresses. It will be useful for
the community to resolve disputes regarding the relative merits of the
two responses before round-2 tweaks are due. Also, the principles here
have an impact beyond multi-target attacks.

Mike Hamburg writes, regarding Response #2:
> overdesigning this way is unappealing

Response #2 is tremendously appealing for security evaluation, auditing,
and actually solving the security problem (see below), so I presume that
what you're claiming here is a performance problem. Let's look at the
performance numbers.

Say the best single-target attack is a search for one solution in a
space of size 2^k. There are other attack shapes, but this one maximizes
the performance gap that you're alluding to.

Here's the standard list of how big k needs to be for four single-target
security levels:

* Security 2^64 pre-quantum: k=64.
* Security 2^128 pre-quantum: k=128.
* Security 2^64 post-quantum: k=128.
* Security 2^128 post-quantum: k=256.

(Of course this list measures security with naive operation counts.
Depth limits and various quantum overheads mean that, e.g., k=256
actually has a much higher post-quantum security level.)

Now let's use Response #2 to protect against 2^64-target attacks, saying
that a 2^64-target attack implies a 1-target attack with probability
1/2^64. Here's what this does to k:

* Security 2^64 pre-quantum: k=128. This is 2x larger.
* Security 2^128 pre-quantum: k=192. This is 1.5x larger.
* Security 2^64 post-quantum: k=192. This is 1.5x larger.
* Security 2^128 post-quantum: k=320. This is 1.25x larger.

Sure, 2x sounds significant, but that's for an archaic 2^64 pre-quantum
security level. The gap rapidly shrinks as the target security level
increases. Do you really claim that 1.25x is "overdesigning"?

For typical lattice systems the key size scales as something like
k log k, so the 1.25x in search space turns into 1.30x in key size. This
doesn't noticeably change the picture. The gap can be even smaller when
the best attacks have worse probability dropoffs than Grover, but the
simple analysis already shows that the gap is small.
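
For concreteness, the 1.25x and 1.30x figures come from arithmetic like
the following (a back-of-the-envelope check, using k log k as a rough
proxy for key size):

    from math import log2

    k_single = 256   # 2^128 post-quantum, single-target
    k_multi = 320    # add 64 bits to absorb a 2^64-target attack

    print(k_multi / k_single)                                       # 1.25
    print((k_multi * log2(k_multi)) / (k_single * log2(k_single)))  # ~1.30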

[ regarding Response #1: ]
> Each feature like this adds complexity and may hurt performance, while
> possibly improving security.

There's a metric missing from this list, namely the complexity of a
comprehensive _review_ of security. One way to see the importance of
this metric is as follows.

Is NIST going to end up standardizing---and are users going to end up
deploying---a post-quantum system that turns out to be devastatingly
insecure, allowing feasible attacks? Quite possibly, yes. My assessment
is that the most likely way for this to happen is as follows:

* as a direct result of design decisions, security review of the
standard will be much more complicated than necessary;

* as a result, the limited human resources available for security
review will run out of time to seriously review everything;

* as a result, serious vulnerabilities will be missed simply because
nobody ever looked at the vulnerable part of the standard.

As an analogy, consider the destruction of OCB2 this year, 14 years
after Rogaway's Asiacrypt 2004 security "proof" and 9 years after ISO
standardization. The Inoue--Minematsu OCB2 attack is simple and fast,
but this doesn't mean it was easy to find: OCB2 is only one of many
cryptographic standards, and it has many interacting components that
needed attack analysis.

Compared to serious security review for OCB2, serious security review
for a typical post-quantum system is another big step up in difficulty,
further increasing the risk of security disasters. The difficulty varies
from one system to another, motivating design changes.

Here are some of the things that happen when we identify the simplicity
of security review as an important goal, and systematically pursue this
goal in designing a KEM for CCA2 security.

We eliminate decryption failures in the underlying PKE. This takes away
a huge, painful, error-prone part of the review process. (My tally,
before eprint 2018/1089 and eprint 2018/1172, said that 25% of the
unscathed encryption submissions eliminate decryption failures.)

What about Dent's hash---plaintext confirmation? Try a security review
with this hash, vs. a security review with an earlier FO variant, and
you'll immediately see how the hash simplifies review:

* The PKE allows the attacker to modify ciphertexts to produce new
ciphertexts that are often valid. This is the starting point for
chosen-ciphertext attacks, and we've seen again and again that
analyzing the consequences is complicated.

* Plaintext confirmation makes it very easy to see that the only way
for an attacker to produce a new valid ciphertext is to already
know the plaintext. The hash also gives a tight ROM proof of
IND-CCA2 from OW-CPA. Security review of OW-CPA is simpler than
security review of IND-CPA.

Of course, evaluators have to check the proof, have to think about
whether there can be better non-ROM attacks, have to think about how an
attacker could use the extra information provided by the hash, etc., but
all of this is much simpler than having to think through
consequences of valid modified ciphertexts for the underlying PKE.

What about implicit rejection? Implicit rejection makes it very easy to
see that an attacker modifying ciphertexts will obtain random-looking
session keys, whether the modified ciphertexts are valid or not, whether
there's plaintext confirmation or not. This also gives a tight ROM proof
of IND-CCA2 from OW-CPA, even simpler than Dent's proof.

Understanding whether it's best to do plaintext confirmation, or
implicit rejection, or both, requires a closer look at the details. My
current assessment---given the analysis in the tightkem paper, plus the
extra argument that I mentioned regarding side channels---is that

* plaintext confirmation by itself is good,
* implicit rejection by itself is better, and
* doing both implicit rejection and plaintext confirmation is best,

but it's possible that the ranking of these three options will change
with more research.

> The ideal case would be to prove a theorem that says:
> Hardness of underlying problem
> -> Multi-target CPA security of KEM
> -> Multi-target CCA security of KEM+FO
> at least in the [Q]ROM, with as little tightness loss as possible.

It's not at all clear that a tight QROM proof exists. We don't know how
to generate such a proof from Response #1. More to the point, we don't
even know how to use Response #1 to guarantee that multi-target attacks
against the final KEM are tightly equivalent to single-target attacks.

More importantly, even if we did have a proof, there would be a much
bigger problem for security reviewers. Cryptographic systems are full of
constructions that haven't been reviewed for multi-target security. When
review does happen, the reviewers produce one attack after another. See,
e.g.,

https://blog.cr.yp.to/20151120-batchattacks.html
https://eprint.iacr.org/2016/360

for various examples of multi-target attacks against NIST cryptographic
standards---attacks faster than all known single-target attacks. The
result of further review isn't going to be confidence in multi-target
security; it will be a continuing list of further multi-target attacks.

A KEM is one tiny piece of an end-user cryptographic system. Response #1
is looking at this tiny piece, and adding a complication that stops some
multi-target attacks against that piece ("stops" meaning that they are
no more effective than single-target attacks). Maybe at some point we'll
have a proof that this stops all possible multi-target attacks against
this tiny piece. But what about the rest of the system, with tons of
other multi-target weaknesses and potential multi-target weaknesses?

As an example, the New Hope submission claims that it is "rather
uncomfortable to have the security of all connections rely on a single
instance of a lattice problem" since the multiple targets might be
broken by a "massive precomputation". It claims that one can "avoid"
this "pitfall" by having each user generate a separate parameter "a".
But the new ideal-lattice attack algorithm from

https://nitaj.users.lmno.cnrs.fr/Caen2018/Alice_Pellet_MaryCaen2018.pdf

speeds up attacks against _every_ target after a single massive lattice
precomputation. The attack focuses on Ideal-SVP, but the supposed
hardness of Ideal-SVP is the main justification for the claimed hardness
of Ring-LWE. I'd guess that the precomputation is quantitatively larger
than a typical attack, but the New Hope argument isn't quantitative. I'm
not saying that the attack precisely disproves the New Hope multi-target
claim; I'm simply saying that the claim isn't proven and could be wrong.

If a multi-target lattice attack is faster than a single-target lattice
attack, then what exactly is gained by the extra hashing in Response #1?
More broadly, when people combine KEMs with many other components to
produce complete end-user cryptographic systems, there will surely be
many different multi-target weaknesses. What's the strategy to remove
all these weaknesses, and how is a security evaluator supposed to gain
confidence in this approach?

Response #2, unlike Response #1, is a complete solution to the problem:
assume that there will be multi-target attacks, and choose the security
level to protect against the maximum possible effect of those attacks.
This eliminates the hard problem of evaluating multi-target security.
The security evaluator can safely focus on single-target attacks.

I understand how Response #1 can feel like progress. I've even advocated
an analogous complication---but that was in a situation where

* it was easy to see how the complication eliminated multi-target
attacks against a complete end-user cryptographic system;
* this turned into a simple, easily reviewable, proof of multi-target
security for the complete system; and
* the target security level was only 2^128 pre-quantum.

This KEM hash is a very different story. It is, at best, addressing a
much smaller part of a system that has many other multi-target
weaknesses. I doubt that anyone has ever constructed a similar system
with a proof of multi-target security, even if a proof is possible for
this small piece. Furthermore, we're talking about post-quantum KEMs
aiming for a much higher security level, so the performance impact of
Response #2 is relatively small.

> NTRU Prime’s evaluation technique is already less conservative than
> other systems’ — they would call its parameter set class II or III
> instead of V. Just saying that you can divide that by T=2^64 and
> still be OK is unconvincing

Let me see if I understand, quantitatively, what you're claiming here.
You're starting from 2^155 pre-quantum security for Streamlined NTRU
Prime 4591^761, dividing by 2^64 for a 2^64-target attack, and saying
that 2^91 is not OK? Or you're doing something similar starting from
2^140 post-quantum security?

The starting points 2^155 and 2^140 are wild underestimates of the cost
of known single-target attacks. Checking the literature shows, as stated
in my undisputed summary from February, that each of these numbers comes
from

* taking a _small piece_ of an attack,
* dividing this piece into _expensive operations_,
* _asymptotically_ counting the number of these operations, and
* replacing o(1) in the asymptotics with 0.

Note that replacing o(1) with 0 can produce _overestimates_; there's no
guarantee that the o(1) is positive. The reason we think that the
overall effect of the above four steps is a wild underestimate is that
some people have done much more work for much more serious estimates of
attack costs, consistently finding much higher attack costs than the
underestimates indicate (although there are still many unresolved
questions about the details).

This also means that someone doing a security review has to check the
serious estimates (and worry about the unresolved questions). The wild
underestimates are certainly much simpler _formulas_, but this doesn't
mean that they simplify _security evaluation_.

Of course underestimating security levels has the general effect of
increasing security parameters. This could protect users against future
attack speedups. You call this "conservative", suggesting that it's a
good approach. I have two questions and a comment:

* How do you think this approach is different from "overdesigning",
which a moment ago you called "unappealing"? What's the principle
for deciding what's "overdesigning" and what's "conservative"?

* If you think that using a wild underestimate is good, do you think
that using an even wilder underestimate is better? What's the
principle for deciding when to stop?

* There's a different approach, which I'll call the "accurate
conservative" approach, that (1) clearly produces better results
and (2) doesn't use these underestimates.

The accurate conservative approach is as follows:

* Accurately evaluate the costs of the attacks.
* On the basis of this evaluation, choose the highest-security
parameters that users can afford.

The accurate conservative approach, by definition, is maximizing the
users' security against the best attacks considered.

For comparison, you seem to be advocating what I'll call the "inaccurate
conservative" approach, which replaces the accurate evaluations with
something less accurate---in particular, simplified formulas that turn
out to be underestimates. This replacement generally _damages_ security,
as explained in Appendix B.1.2 of

https://cr.yp.to/papers.html#nonuniform

(cited in the NTRU Prime submission under "Warning: underestimates are
dangerous"), and it can't improve security.

The previous paragraph is talking about the damage done to security
against _current_ attacks. The best bet we have is that this will also
damage security against _future_ attacks. Of course it's possible that
advances in attacks will magically turn out to be better correlated with
the current underestimates than with accurate evaluations of today's
attack costs---but what's the argument for betting on this possibility?

"I have an argument: underestimate U of the security level is smaller
than accurate estimate A, and future attacks will also be faster than
A!" ---To see that this argument is nonsense, consider a more radical
underestimate such as A/3. (If A/3 isn't below U, replace it with some
even smaller function of A.) The argument claims that replacing A with U
is an improvement, and that replacing U with A/3 is an improvement---but
A/3 gives exactly the same parameter choices as A, so the argument has
directly contradicted itself.

There are cases where the literature looks more closely at the details
of attacks and makes well-informed assessments of risks of various types
of speedups. Superficially, this might sound like recommending
underestimates---but there's an essential difference in methodology.
Saying "we want a smaller estimate because smaller is good", or "we want
a simpler formula because simpler is good", is very different from
saying "we see a risk of the following type of attack improvement and we
want to accurately evaluate the impact". Often the conclusion is that
the impact is so devastating that proper risk management requires
avoiding some types of cryptosystems. Consider, e.g., the statement

It is anticipated that these techniques can be used to produce
collisions for MD5 and perhaps also for RIPEMD

in https://link.springer.com/content/pdf/10.1007%2F3-540-60865-6_44.pdf
twenty years ago.

There are also cases where we can't reasonably evaluate security against
some types of attacks since they're beyond the attack model that the
community normally considers. Timing attacks were like this twenty years
ago, with very high risks that people claiming timing-attack security
were disastrously wrong:

Table lookups: This instruction is not susceptible to a timing attack
...
--- Daemen and Rijmen, Resistance against implementation attacks: a
comparative study of the AES proposals, 1999
https://web.archive.org/web/20000816072451/http://csrc.nist.gov:80/encryption/aes/round1/conf2/papers/daemen.pdf

Table lookup: not vulnerable to timing attacks.
--- NIST, Report on the development of the Advanced Encryption
Standard (AES), 2001

Multi-target attacks are like this today. They're a fun research topic,
and maybe in twenty years we'll know how to stop multi-target attacks;
but most components of today's post-quantum cryptosystems were designed
by people who weren't even thinking about, let alone trying to stop,
multi-target attacks. Response #2 recognizes this, gives us confidence
that we're protected, and doesn't cost much in the context of
high-security post-quantum cryptography.

---Dan

Mike Hamburg

Dec 8, 2018, 3:24:15 PM
to D. J. Bernstein, pqc-...@list.nist.gov
Hi Dan,

I’m sorry that I don’t have time to address every point of your essay.
Below I have picked out a few points that seem most relevant to the
discussion of domain separated hashes in the CCA2 conversion.

> On Dec 7, 2018, at 8:17 PM, D. J. Bernstein <d...@cr.yp.to> wrote:
>
> Focusing here on comparing the following two responses to multi-target
> attacks:
>
> * Response #1: include a key prefix in the hash used in the CCA2
> conversion, a long enough prefix that prefixes of user keys don't
> collide.
>
> * Response #2: "aim for a very high single-target security level, and
> then rely on the fact that T-target attacks gain at most a factor
> T".
> […]
> Mike Hamburg writes, regarding Response #2:
>> overdesigning this way is unappealing
> […]
> Sure, 2x sounds significant, but that's for an archaic 2^64 pre-quantum
> security level. The gap rapidly shrinks as the target security level
> increases. Do you really claim that 1.25x is "overdesigning”?

> [later in your reply]
> Response #2, unlike Response #1, is a complete solution to the problem:
> assume that there will be multi-target attacks, and choose the security
> level to protect against the maximum possible effect of those attacks.
> This eliminates the hard problem of evaluating multi-target security.
> The security evaluator can safely focus on single-target attacks.

I would further split Response #2 into two categories.

Response #2a: Nobody actually needs Class V security as NIST
recommends it (2^256 classical work even to find one target of many).
The difference between 256 bits and any realistic attack is enough to
cover any likely multi-target attack.

This is how I would categorize NTRU Prime’s multi-target defense.

You describe this as if “aim[ing] for a very high single-target security
level” is exclusive of Response #1, or somehow obviates it. But this
isn’t true. The security margin between 2^256 and reality is there to
counter many possible threats, and multi-target attacks are only one
of them. You also have to consider improvements in lattice reduction,
lack of tightness in the FO transform (almost universal in the QROM),
lack of tightness in higher-level protocols using the KEM, and other
threats. As a result, almost every submission “aim[s] for a very high
single-target security level”.



Response #2b: If we want 2^256 multi-target security, then aim for
2^320 single-target. I didn’t notice any schemes doing this — some
made higher security claims or estimates, but I didn’t see anyone
claiming this as their multi-target defense.

And yes, I would call this “overdesigning”. Overdesigning can be a
virtue, and I see that as the purpose of Class V security in the first place:
it’s a hedge against unknown (expected or unexpected) attacks.

But adding 64 extra bits of security against lattice attacks, specifically to
counter an attack that probably doesn’t apply in full, seems unappealing
to me, because it comes at a significant performance cost.

> [ regarding Response #1: ]
>> Each feature like this adds complexity and may hurt performance, while
>> possibly improving security.
>
> There's a metric missing from this list, namely the complexity of a
> comprehensive _review_ of security. One way to see the importance of
> this metric is as follows.

Response #1 is domain-separating the hash function. This is a
long-standing security technique and has formal, informal and
practical benefits:

* Formally, in other contexts, it’s often required in security proofs.
Formally, it has a clear benefit in security proofs for KEMs.

* Informally, it means that if an attacker is computing lots of hashes, he
has chosen one single target per hash. This reduces the set of attacks
that must be considered.

* Practically, it is part of several systems’ defense against CCA attacks,
and the lack of this measure contributed to borderline-feasible CCA
attacks on other systems, all in this very standardization process.
I am aware that NTRU Prime isn’t one of those systems.

Furthermore, domain separation is really simple. It costs no bandwidth,
little to no extra memory, very few extra cycles (tens to thousands
depending on whether it pushes you into a new hash block), very little
code, no extra DPA concern, and little complexity in the spec. And it
clearly doesn’t harm security.

As a result, I simply do not agree that domain separation is a mere
“complication” — on the contrary, I think it helps to focus analysis.
I recognize that full, tight, end-to-end proofs of the effectiveness of
domain separation for KEMs (or all systems that contain KEMs? Is
that the requirement??) are not currently known, and may never be
known, but I expect that domain separation will figure in proofs of
security for components of the system (or perhaps the entire KEM).

> I understand how Response #1 can feel like progress. I've even advocated
> an analogous complication---but that was in a situation where
>
> * it was easy to see how the complication eliminated multi-target
> attacks against a complete end-user cryptographic system;
> * this turned into a simple, easily reviewable, proof of multi-target
> security for the complete system; and
> * the target security level was only 2^128 pre-quantum.

Are you referring to Ed25519 here? You advocated domain-separating
the hash in 2011 by saying:

“…; the use of [the public key] A [in the hash] is an inexpensive way to
alleviate concerns that several public keys could be attacked
simultaneously;”.

This sounds an awful lot like my informal justification for hashing the
seed above, rather than foresight that it would “[turn] into a simple,
easily reviewable, proof of multi-target security for the complete
system”.

> This KEM hash is a very different story. It is, at best, addressing a
> much smaller part of a system that has many other multi-target
> weaknesses. I doubt that anyone has ever constructed a similar system
> with a proof of multi-target security, even if a proof is possible for
> this small piece. Furthermore, we're talking about post-quantum KEMs
> aiming for a much higher security level, so the performance impact of
> Response #2 is relatively small.

I don’t understand this argument either. It’s a smaller ratio than at
lower security levels, but surely it has a higher absolute performance
impact?

>> NTRU Prime’s evaluation technique is already less conservative than
>> other systems’ — they would call its parameter set class II or III
>> instead of V. Just saying that you can divide that by T=2^64 and
>> still be OK is unconvincing
>
> Let me see if I understand, quantitatively, what you're claiming here.
> You're starting from 2^155 pre-quantum security for Streamlined NTRU
> Prime 4591^761, dividing by 2^64 for a 2^64-target attack, and saying
> that 2^91 is not OK? Or you're doing something similar starting from
> 2^140 post-quantum security?

Quantitatively, the requested specification of Class V is 2^256 effort to
break one key or ciphertext out of many.

Starting from your estimate of 2^256 single-target security, which is based
on your best estimates of real performance of real attacks currently known
today, you cannot simply subtract 64 bits and claim that you have met
the specification. Instead, you can argue that the specification is silly,
and have done, but this doesn’t help an auditor who wishes to determine
whether you’ve met that specification.

Furthermore, let’s grant that for realistic systems, Class III security
(multi-target, as specified) is certainly enough; as I understand it,
this is really what you’re arguing when you say that Class V
multi-target is silly. An auditor who wishes to determine that you’ve
met Class III will then need to either:

* Evaluate the possibility of multi-target attacks -- the evaluation will likely
be easier if the hash is domain-separated; or
* (Response #2) Check that your claimed Class V instance actually
achieves its claimed 2^256 single-target security, and therefore reaches
Class III multi-target.

Response #2 is easier for an auditor to accept if your system includes
some security margin, rather than being designed to hit exactly 256-bit
security against currently known attacks. It fits a system that keeps a
margin over its claims better than one designed without such a margin.

NTRU Prime is different from other proposals in that it includes only
one parameter set (per problem type), with little margin between the
best estimate for the security of that parameter set and Class V. I’m
not arguing that this is a bad decision, only that it doesn’t make the
multi-target evaluation easier. It also seems inconsistent with your
strategy to “accurately evaluate the costs of the attacks” and then
“choose the highest-security parameters that users can afford”, since
different users can afford different parameters.

[…]
> Of course underestimating security levels has the general effect of
> increasing security parameters. This could protect users against future
> attack speedups. You call this "conservative", suggesting that it's a
> good approach. I have two questions and a comment:

I am not advocating for underestimating security, and I hope that my
argument above clarifies this. That is a completely separate topic of
conversation.

> * How do you think this approach is different from "overdesigning",
> which a moment ago you called "unappealing"? What's the principle
> for deciding what's "overdesigning" and what's "conservative”?

It is overdesigning. Overdesigning by an additional 64 bits, just to avoid
including any multi-target defenses, is what’s unappealing.

Cheers,
— Mike

D. J. Bernstein

unread,
Dec 14, 2018, 9:46:10 PM12/14/18
to pqc-...@list.nist.gov
Changing subject line to emphasize the focus on multi-target security.
Here are four examples of how hard it is to

(1) prove,
(2) evaluate claims of, and
(3) actually achieve

tight multi-target security (i.e., multi-target security as high as
single-target security) for deployable encryption systems. For
simplicity I'll focus on pre-quantum security levels here, but similar
comments apply to post-quantum security levels.

Example 1: Dave is a designer pulling together components designed for
2^256 single-user security. The user data is encrypted with AES-256-CTR.
The 256-bit session keys are produced by a lattice KEM designed for
2^256 security. Of course more components are needed for authentication
etc., but these details will be enough to explain the example.

Dave then learns about 2^64-target attacks that "break" the system in
time "only" 2^192. Somehow Dave gets the crazy idea that this number is
a problem. The craziness of this idea isn't my main point here; let's
assume that NIST is requiring 2^256 security against 2^64-target attacks
(even though NIST's current standards are full of failures to achieve
this), and that the security reviewer wants to check for this.

One response---making the security reviewer's job easy---is to simply
multiply each internal 256 by an extra 1.25, scaling up the internal
components appropriately, leaving a 2^64 buffer beyond 2^256 security.
It's hard to find users who will care about the performance difference.

But Dave doesn't do this. Dave has heard that it's "best practice" to
"randomize" all the internal components, with the goal of making
multi-target attacks as hard as single-target attacks:

* Dave sticks the public key, and everything else he can think of,
into the hash that generates the session key.

* Dave chooses a random 96-bit IV for AES-256-CTR.

* Et cetera.
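
Concretely, Dave's measures amount to something like the following
sketch (the function names and hash are placeholders, and the KEM and
AES-256-CTR calls themselves are elided; no particular submission is
being quoted):

    import hashlib, secrets

    def dave_session_key(public_key: bytes, ciphertext: bytes,
                         kem_secret: bytes) -> bytes:
        # Measure 1: hash the public key, the ciphertext, and anything
        # else available into the session key.
        return hashlib.sha3_256(public_key + ciphertext + kem_secret).digest()

    def dave_pick_iv() -> bytes:
        # Measure 2: a random 96-bit IV for AES-256-CTR.
        return secrets.token_bytes(12)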

But do these complications actually achieve the goal of tight
multi-target security for Dave's encryption system? Or is this yet
another entry in the long line of failed cryptographic design ideas?

We could gain some confidence from seeing experienced cryptanalysts
seriously trying and failing to break Dave's encryption system. But
nobody is even trying. Most of the cryptanalysts have their hands full
finding security problems in simpler, smaller systems and components.
Occasionally cryptanalysts tackle pieces of HTTPS because it's so widely
deployed, but this doesn't mean they're going to look at Dave's system.

So Dave hopes to have a _proof_ of tight multi-target security. But he
doesn't have a proof. Here's one of many reasons why.

Often users encrypt structured plaintexts: e.g., ASCII plaintexts where
each byte has top bit 0. Assume that AES-256 has W>0 incredibly weak
keys that, in counter mode starting from any IV, turn these structured
plaintexts into a quickly detectable ciphertext pattern---e.g., let's
say the top output bit in each block is correlated with a counter bit.
I'd be surprised by such a weakness, but we certainly don't have a proof
that no such weakness exists. Also assume that this is the only weakness
in AES-256.

For single-user security, this weakness produces a fast attack with
success probability about W/2^256. For U-user security, the same
weakness produces a fast attack with success probability about
UW/2^256. That's the worst possible multi-user degradation factor.
Contrary to Dave's intuition, randomizing the IV does _not_ stop this
attack.
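
For completeness, the arithmetic: each user independently has a weak
key with probability W/2^256, so the chance that at least one of U
users does is

    1 - (1 - W/2^256)^U  ~=  U*W/2^256    (when U*W is far below 2^256),

and once such a user exists, spotting the ciphertext pattern is cheap.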

Of course, this also means that the multi-user security of Dave's
encryption system as a whole is limited to 2^256/UW, so Dave _isn't_
obtaining 2^256 security against 2^64-target attacks. Also, the goal of
having multi-target security as high as single-target security hasn't
been achieved, unless Dave's system is "broken" in some other way that
limits single-target security to 2^256/UW, contrary to the original
objective of 2^256 security. Any supposed proof of tight multi-target
security for Dave's encryption system as a whole is going to have to
deal with this proof obstacle (and much more---see below).

As part of this post-quantum standardization project, NIST recommended
AES-GCM with a randomized nonce---even though this randomization is not
required in the relevant NIST standards. NIST claimed incorrectly that
eprint 2017/435 proves tight multi-target security for this form of
AES-GCM. I pointed out the above obstacle to a proof.

More than a year later, we again seem to be faced with a claim that
Dave's approach "has a clear benefit in security proofs" for encryption.
This claim is not based on any explanation of how a tight multi-target
security proof is supposed to avoid the above obstacle.

Example 2: For ECC, tight random self-reductions give a proof that
breaking _one of many_ keys is as difficult as breaking the first key;
but there are still big speedups for breaking _more_ of the keys. Does
this mean that the one-of-many security notions are asking the wrong
question? Similarly, even if there's just one key, should we be looking
at the cost of breaking one of many ciphertexts, or the cost of breaking
more of the ciphertexts? Real-world attackers of course collect all the
keys they can, _and_ all the ciphertexts they can.
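
For readers who haven't seen the self-reduction, here is a toy sketch
of the blinding trick, written over a tiny multiplicative group so that
the "one-of-many breaker" can be brute force; the identical trick works
in elliptic-curve groups:

    import secrets

    p = 227            # small prime; 2 generates the full group Z_p^*
    g = 2
    order = p - 1

    def break_one_of_many(targets):
        # Stand-in for an attacker that recovers the discrete log of
        # some target of its choosing.
        for i, t in enumerate(targets):
            for x in range(order):
                if pow(g, x, p) == t:
                    return i, x

    # Reduction: to attack a single key Q = g^x, hand the attacker many
    # independently re-randomized copies Q*g^r; any solved copy gives x.
    x = secrets.randbelow(order)
    Q = pow(g, x, p)
    rs = [secrets.randbelow(order) for _ in range(8)]
    copies = [(Q * pow(g, r, p)) % p for r in rs]
    i, xi = break_one_of_many(copies)
    assert (xi - rs[i]) % order == x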

Example 3: For more conventional multiplicative-group discrete logs, the
occasional work on batch attacks keeps producing changes in the
_exponents_ of attacks, and nobody in the area would claim to be
confident regarding the best possible exponents, never mind the more
sophisticated question of concrete costs. Differences in security levels
of different primes are even more poorly understood.

Example 4: For lattices, the research community can't even manage to pin
down single-target security levels, never mind the more sophisticated
questions of the security levels of multiple ciphertexts and multiple
keys. The worst-case-to-average-case reductions are much weaker than the
ECC worst-case-to-average-case reductions.

There's much more to say about these examples, and there are many other
examples of multi-target security being below single-target security.
See my blog post https://blog.cr.yp.to/20151120-batchattacks.html for an
introduction to this difficult research area, including several examples
of serious errors in the literature.

Mike Hamburg writes:
> Informally, it means that if an attacker is computing lots of
> hashes, he has chosen one single target per hash. This reduces the
> set of attacks that must be considered.

This is barely clipping the edge of a vast field of multi-target
attacks. Response #2---"aim for a very high single-target security
level, and then rely on the fact that T-target attacks gain at most a
factor T"---allows the reviewer to skip the entire field.
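
The generic argument behind "at most a factor T" fits in a paragraph:
given a T-target attack that costs C and breaks at least one of T
independent targets with probability eps, plant the one real target at
a random position among T-1 targets you generate yourself (for
public-key encryption you can generate honest keys and ciphertexts),
and run the attack; by symmetry the broken target is the real one with
probability at least eps/T, giving a single-target attack with cost
about C and success probability at least eps/T. So 2^(256+64)
single-target security implies 2^256 security against 2^64-target
attacks, with no further analysis of the system's internals.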

> the lack of this measure contributed to borderline-feasible CCA
> attacks on other systems, all in this very standardization process.

You're talking about attacks against systems that used nonzero
decryption-failure rates and that never justified the claim that those
rates are safe. This is an extremely dangerous practice that can easily
cause any amount of damage to security. It makes no sense to blame this
on multi-target options: the maximum possible multi-target protection
can't turn this handling of decryption failures into something safe.

Here are two ways to handle decryption failures safely:

(A) Eliminate them. Require an easy proof (unlike, say, the proof
originally claimed by LIMA). Check the proof carefully.

(B) Alternatively, if you really need slightly smaller ciphertexts:

* Figure out how to convincingly calculate very small nonzero
failure rates, taking account of the difficulties shown in the
literature (most recently https://eprint.iacr.org/2018/1172).

* Analyze attacks exploiting nonzero failure rates (most recently
https://eprint.iacr.org/2018/1089), and make sure your rates
are so tiny that these attacks are at an acceptable security
level. Note that the evaluation of an acceptable security level
depends on the multi-target issues that we're discussing.

* Make sure that the rates are small enough for the known ROM
theorems to provide an acceptable security level. Do _not_
tolerate speculation that the theorems will be quantitatively
improved. Check proofs carefully---which is not easy for these
theorems. Consider, e.g., https://cr.yp.to/papers.html#tightkem,
which gives counterexamples to two of the claimed theorems;
what exactly are the proof errors that accounted for this, and
how do these interact with decryption failures?

Evidently B is much more painful for the security reviewer than A---and
then imagine how much worse B becomes if you also have to think through
multi-target attacks at every step!

> You describe this as if “aim[ing] for a very high single-target security
> level” is exclusive of Response #1, or somehow obviates it.

You started by describing Response #1 as an "ease-of-analysis feature".
This is not correct. Response #2 makes security evaluation much easier.

It's certainly possible to do both Response #1 and Response #2. I never
claimed or suggested otherwise. I also don't see how this is responsive
to anything I'm saying. The combination doesn't make the reviewer's job
easier than simply doing Response #2.

Could Response #1 improve security compared to doing nothing? Perhaps.
Could the #1+#2 combination improve security compared to #2? Perhaps.
But the evidence for these alleged security improvements is flimsy, and
leaves any skeptical security reviewer with a torrent of questions about
multi-target security that are not going to be answered in the timeframe
of this standardization project.

> The security margin between 2^256 and reality is there to
> counter many possible threats, and multi-target attacks are only one
> of them. You also have to consider improvements in lattice reduction,
> lack of tightness in the FO transform (almost universal in the QROM),
> lack of tightness in higher-level protocols using the KEM, and other
> threats.

It's important to consider each threat from the perspective of the time
spent for thorough security review. I explained in my previous message
how this is connected to the risk of NIST standardizing something that's
devastatingly insecure.

For example, even if the reviewer sees enough evidence of cryptanalysts
settling on an OW-CPA security level, how is the reviewer supposed to
gain confidence in the QROM KEM security level? The lack of tightness
for QROM proofs is quantitatively so severe that the proofs say nothing
about typical parameter choices. There are certainly important cases
(e.g., preimages) where QROM attacks are much faster than ROM attacks,
so the reviewer has to worry that the tight ROM proofs are blind to much
faster QROM attacks. We really need cryptanalysts studying the QROM KEM
security, but there are very few experts on quantum cryptanalysis.

Multi-target threats are special because we have a simple, cheap,
totally convincing way to get all of their complications out of the
security-analysis picture: simply ask the reviewer to check for a very
high single-target security level.

You're arguing that we need a buffer of B bits to protect against other
threats. You want B to be large given the number of different risks. So
figure out which B gives you enough confidence about all these risks,
and then turn it into B+64 to _confidently_ protect against multi-target
attacks too. The performance gap becomes less and less noticeable as B
increases.
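
To put illustrative numbers on this (the numbers are made up for the
example): if B = 32 gives you enough confidence (i.e., you design for
2^288 against currently known single-target attacks while claiming
2^256), then design for 2^(288+64) = 2^352 instead. Even an attack
extracting the full factor 2^64 from 2^64 targets still costs 2^288,
and the reviewer never has to examine the multi-target question.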

> I would further categorize response #2 into two categories.
> Response #2a: Nobody actually needs Class V security as NIST
> recommends it (2^256 classical work even to find one target of many).

Which recommendation are you referring to?

As noted above, NIST's standards are full of failures to achieve this
security level. There's very little above 2^256 single-target security,
and there are tons of multi-target attack vectors, some of which are
clearly exploitable and many more of which haven't been studied.

In NIST's final call for post-quantum proposals, the "security strength
categories" are defined with respect to the "relevant security
definition", which is "IND-CCA2 security" for the KEMs we're talking
about. IND-CCA2 security is a single-target security notion.

Some readers will understand you to be saying that NIST recommends
category V. I'm not aware of any NIST statement to this effect. NIST has
repeatedly used phrases such as "probably overkill" in referring to
category V.

NIST lists tight multi-target security in a separate section of
"additional security properties" that are claimed to be "desirable". My
understanding is that their argument for this is as follows:

* Starting from single-target security 2^128, multi-target attack
speedups are a problem.

* We can't throw this security level away, so we have to complicate
everything to get rid of every multi-target speedup we find.

* Starting from single-target security 2^256, we need the same
complications, since otherwise people will be confused about why
the 256-bit standards are simpler than the 128-bit standards.

Without bothering to evaluate the general merits of this argument, let's
simply look at what the argument means for a post-quantum proposal that
starts from a higher security level:

* First step: Sure, there could be U-target speedups, perhaps even as
large as a factor U.

* Second step: Um, the proposal _did_ throw the low security level
away. Is there a problem here?

* Third step: The proposal doesn't have the complications, so there's
nothing to be confused about.

This sort of proposal is just fine for quantitative security, and is a
big win for the ease of security analysis.

> This sounds an awful lot like my informal justification for hashing
> the seed above

There's a superficial level at which every let's-hash-another-input
argument sounds alike. However, hashing every input that people think of
creates bloated standards and _definitely_ interferes with security
review. One has to think more carefully about the details to see that
some of the inputs are useful and that most of them are not.

---Dan

Mike Hamburg

unread,
Dec 14, 2018, 11:32:22 PM12/14/18
to D. J. Bernstein, pqc-...@list.nist.gov
OK. You’ve convinced me, or at least exhausted me. Hashing the public key or its seed is only:

* A required security feature if your system exhibits decryption failures; and
* A useful feature to reduce multi-target concerns if your system has any parameter sets aimed at class <= III, give or take.

But in systems that have only class V security targets and no decryption failures, it’s a mere complication that doesn’t hurt security and only mitigates attacks in irrelevant corner cases.

Do I have that right?

— Mike