462 views

Skip to first unread message

Nov 29, 2018, 7:05:23 PM11/29/18

to pqc-...@list.nist.gov

Dear pqc-forum,

I am pleased to announce that the NTRU-HRSS-KEM and NTRUEncrypt teams

plan to merge their submissions.

In short, the merged submission will include the variant of

NTRU-HRSS-KEM that was proposed by Saito, Xagawa, and Yamakawa in [1].

(The changes are limited to the CCA conversion). It will also include a

variant of NTRUEncrypt that mirrors this construction as closely as

possible. The ntruhrss701 parameter set will be included. The NTRU-443

and NTRU-743 parameter sets will be replaced with parameter sets that

provide perfect correctness.

Here is a more detailed sketch of our planned merger.

* Design decisions that follow NTRU-HRSS-KEM:

- The parameter sets that we propose will all provide perfect

correctness.

- We will include parameter sets that are compatible with sampling

f, g, r, and m from the uniform distribution.

- We will eliminate the need for invertibility checks during key

generation.

- We will ensure that public keys and ciphertexts map to zero under

the "evaluate at 1" homomorphism.

* Design decisions that follow NTRUEncrypt:

- We will include the option of sampling f, g, r, and m from sets of

fixed-type polynomials (i.e. with a prescribed number of +1, -1,

and 0 coefficients).

* Features of NTRU-HRSS-KEM that will _not_ be included:

- The length-preserving plaintext confirmation hash.

(An artifact of the old CCA conversion.)

- The technique of sampling r from the hash of m, and the prescribed

sampling routine that was needed for the CCA conversion.

* Features of NTRUEncrypt that will _not_ be included:

- Parameters with decryption failures.

- The direct construction of a CCA PKE using the NAEP transformation.

- The ss-ntru-pke and ss-ntru-kem constructions.

- Private keys of the form 1+pF.

Best,

John (on behalf of the NTRU-HRSS-KEM and NTRUEncrypt teams)

[1] https://eprint.iacr.org/2017/1005

I am pleased to announce that the NTRU-HRSS-KEM and NTRUEncrypt teams

plan to merge their submissions.

In short, the merged submission will include the variant of

NTRU-HRSS-KEM that was proposed by Saito, Xagawa, and Yamakawa in [1].

(The changes are limited to the CCA conversion). It will also include a

variant of NTRUEncrypt that mirrors this construction as closely as

possible. The ntruhrss701 parameter set will be included. The NTRU-443

and NTRU-743 parameter sets will be replaced with parameter sets that

provide perfect correctness.

Here is a more detailed sketch of our planned merger.

* Design decisions that follow NTRU-HRSS-KEM:

- The parameter sets that we propose will all provide perfect

correctness.

- We will include parameter sets that are compatible with sampling

f, g, r, and m from the uniform distribution.

- We will eliminate the need for invertibility checks during key

generation.

- We will ensure that public keys and ciphertexts map to zero under

the "evaluate at 1" homomorphism.

* Design decisions that follow NTRUEncrypt:

- We will include the option of sampling f, g, r, and m from sets of

fixed-type polynomials (i.e. with a prescribed number of +1, -1,

and 0 coefficients).

* Features of NTRU-HRSS-KEM that will _not_ be included:

- The length-preserving plaintext confirmation hash.

(An artifact of the old CCA conversion.)

- The technique of sampling r from the hash of m, and the prescribed

sampling routine that was needed for the CCA conversion.

* Features of NTRUEncrypt that will _not_ be included:

- Parameters with decryption failures.

- The direct construction of a CCA PKE using the NAEP transformation.

- The ss-ntru-pke and ss-ntru-kem constructions.

- Private keys of the form 1+pF.

Best,

John (on behalf of the NTRU-HRSS-KEM and NTRUEncrypt teams)

[1] https://eprint.iacr.org/2017/1005

Dec 1, 2018, 7:33:39 AM12/1/18

to pqc-...@list.nist.gov

In case people are interested, here's a brief comparison of (1) the NTRU

Prime submission and (2) the merged NTRU Encrypt/NTRU HRSS proposal. I'm

speaking for myself here.

1. The main point of the NTRU Prime submission is to reduce the attack

surface for small lattice-based systems. Here are five aspects of this:

Encrypt HRSS merged Prime feature

------- ---- ------ ----- ----------------------------------------------

no yes yes yes avoid decryption failures

no yes yes yes irreducible polynomial

no no no yes avoid nontrivial number-field automorphisms

no no no yes prime modulus q

no no no yes the ring is a field

HRSS already eliminated decryption failures, and already eliminated the

x-1 factor from traditional NTRU (by multiplying various polynomials by

x-1). The merged proposal preserves these features. However, the table

shows important differences in the polynomial choice (cyclotomic vs.

x^n-x-1), and in various aspects of the choice of q. See below.

2. There are further differences in options where every choice has

security concerns---there isn't a choice that clearly simplifies attack

analysis:

Encrypt HRSS merged Prime feature

------- ---- ------ ----- ----------------------------------------------

yes no yes yes options with fixed-weight multipliers

no yes yes no options with uniform multipliers

------- ---- ------ ----- ----------------------------------------------

yes yes yes no options with random errors/"RLWE" (Noisy NTRU)

no no no yes options with rounding/"RLWR" (Rounded NTRU)

------- ---- ------ ----- ----------------------------------------------

yes yes yes yes "NTRU" options (Quotient NTRU)

no no no yes "LPR"/"RLWE"/"RLWR" options (Product NTRU)

Of course submissions offering two options always argue that both

options are of interest, and submissions picking one option will argue

that this option is better. We've already seen arguments that

* fixed-weight multipliers produce smaller sizes,

* uniform multipliers produce better speeds,

* rounding produces smaller sizes,

* Quotient NTRU has better encapsulation and decapsulation speeds,

* Quotient NTRU has smaller ciphertext sizes,

* Product NTRU has better key-generation speeds, and

* Product NTRU has smaller key sizes.

I'm not saying that these choices don't _affect_ security analysis; I'm

saying that it isn't clear which choice _simplifies_ security analysis.

The bottom of this message runs through an example.

3. The biggest point of debate is the choice of cyclotomic vs. x^n-x-1.

The NTRU Prime submission

* points to various attacks exploiting automorphisms---

* the quantum break of Gentry's original STOC 2009 FHE system

for cyclotomics,

* the non-quantum break of an analogous multiquadratic system,

and

* the quantum leap halfway from exponential to polynomial in

approximation factors broken for cyclotomic Ideal-SVP

---as evidence that the security impact of automorphisms has not

been adequately studied;

* uses the idea (proposed before all of these attacks!) to switch

from a cyclotomic field to a number field with a large Galois

group, so that automorphism computations are hard; and

* shows that this switch is compatible with excellent performance.

There are several other lattice submissions that can protect against the

possibility of cyclotomics being broken, but NTRU Prime is by far the

most efficient. Here are representative examples of public-key size +

ciphertext size in non-cyclotomic options in lattice submissions:

* Streamlined NTRU Prime 4591^761: 1218 bytes + 1047 bytes.

* LOTUS 128: 658944 bytes + 1144 bytes.

* Titanium CCA lite: 14720 bytes + 3008 bytes.

* Round2 n1 l1: 3455 bytes + 4837 bytes.

* Frodo 640: 9616 bytes + 9736 bytes.

* EMBLEM II.c: 10016 bytes + 14792 bytes.

* Lizard N663: 1390592 bytes + 10896 bytes.

* Odd Manhattan 128: 1626240 bytes + 180224 bytes.

NTRU Prime is the only lattice submission that eliminates cyclotomic

structure while still fitting each public key and each ciphertext into

the guaranteed IPv6 MTU, 1280 bytes.

(I'm not saying that these other submissions are pointless. Titanium

argues that it can protect against cyclotomics being broken _and_ the

NTRU Prime ring being somehow broken, as long as there's some other

polynomial that isn't broken. The other submissions listed above argue

that they can protect against _all_ polynomials being broken, as long as

unstructured lattices aren't broken. Another protection option is stated

in the NTRU Prime paper: "one has to ask whether using the increased

costs in other ways, such as larger lattice dimensions, would produce a

larger security benefit".)

In the opposite direction, the obvious way to argue for submissions

using cyclotomics is that the cyclotomic structure of traditional NTRU

hasn't enabled any known attacks, beyond some lost bits of security from

the n-fold symmetry. Perhaps all these submissions worrying about the

cyclotomic structure are being unnecessarily paranoid. To be clear: I'm

not saying that any submissions are known to be breakable via the

cyclotomic structure.

4. Regarding the choice of modulus, there are also various attack ideas,

although they haven't killed any targets as high-profile as Gentry's

STOC 2009 FHE system for cyclotomics:

* 2004 Silverman--Smart--Vercauteren proposed an idea for exploiting

reductions mod powers of 2, with the traditional NTRU choice of the

modulus q as a power of 2.

* A series of papers starting with 2014 Eisenträger--Hallgren--Lauter

proposed an idea for exploiting factors of the polynomial modulo q,

and gave some examples where the idea works.

As far as I know, the following statement has been continuously true

since 2014: accurately describing the cases of Ring-LWE that are known

to be broken requires inspecting not just

* the error distribution,

* the size of q, and

* the size of the polynomial,

but also

* the factors of the polynomial modulo q.

Eliminating those factors limits the flexibility of these attacks,

simplifying attack analysis.

In the opposite direction, again the obvious way to argue for HRSS etc.

is to say that so far nobody has been able to use these factorizations

to break traditional NTRU (or newer proposals such as New Hope). Of

course there are cases where polynomial factors have been used to break

real PKE proposals (the x-1 factor in Giophantus; the x-1 factor in

Round5; I wouldn't be surprised if the x^504+1 example in R.EMBLEM is

similarly breakable), but these are glaring examples of polynomials

being reducible for all q. So far the possibility of an irreducible

polynomial having a q-dependent weakness doesn't seem to have caught any

real proposals.

5. I think the speed of multiplication in NTRU HRSS is noteworthy, and

is a major reason for NTRU HRSS being at the top of the benchmark list

for both encapsulation and decapsulation:

* Encapsulation: 36828 ntruhrss701, 45468 sntrup4591761, 72632

kyber512, etc.

* Decapsulation: 68460 ntruhrss701, 72288 kyber512, 94744

sntrup4591761, etc.

These are cycle counts on the standard Haswell benchmarking platform for

submissions aiming for security against chosen-ciphertext attacks. I

wouldn't be surprised if the merged proposal stays at the top (SXY has

slightly more work in decapsulation but there's now less hashing). On

the other hand, this speed requires a power-of-2 modulus.

6. To summarize, within small lattice-based systems, NTRU Prime and

NTRU Encrypt/NTRU HRSS have different and apparently incompatible goals.

All of the design choices in NTRU Prime are prioritizing the simplicity

of security analysis, even if this means sacrificing some speed.

7. Beyond this basic difference in goals, there's a discrepancy in CCA

conversion details, which I think can be easily resolved:

* I think everyone should be using "implicit rejection" as in

Persichetti/HHK/SXY, where the KEM failure oracle is eliminated in

favor of pseudorandom outputs.

* Beyond this, I think there's value in a 32-byte Dent hash. Section

17 of

https://cr.yp.to/papers/tightkem-20180528.pdf

has some arguments for this, and here's an additional argument

(which I learned from Lange): checking a hash right after

decryption seems to make it easier to stop side-channel attacks

against reencryption.

I don't think the much longer hash in HRSS is useful. This longer hash

isn't in the merged proposal.

8. Finally, here's the promised example of how two options can require

two different security analyses with different complications.

Here are two different approaches to obtaining tight ROM CCA2 proofs:

* Quotient NTRU: Streamlined NTRU Prime produces a deterministic PKE

as a natural side effect of rounding. This enables proofs from

OW-CPA. The merged proposal also does a little more work than HRSS

(as suggested by SXY) to produce a deterministic PKE.

* Product NTRU: NTRU LPRime uses the "T" transform to obtain a

deterministic PKE, so tight ROM CCA2 proofs have to

* assume IND-CPA rather than just OW-CPA or

* make a more complicated assumption of OW-CPA after "T".

As far as I know, all LPR-type submissions (and non-ring

submissions such as Frodo) have similar issues.

People often decompose the PKE-breaking problem, whether it's OW-CPA or

IND-CPA, into a key problem and a ciphertext problem:

* For Quotient NTRU, breaking the OW-CPA problem requires (1)

distinguishing the key from random or (2) solving the random-key

version of the OW-CPA problem.

* For Product NTRU, breaking the IND-CPA problem requires (1)

distinguishing the key from random or (2) solving the random-key

version of the IND-CPA problem.

These decompositions aren't two-way---e.g., a key distinguisher doesn't

necessarily break the original OW-CPA problem---but _if_ we've studied

the individual problems _and_ we think that those problems are hard then

these decompositions tell us that the PKEs are secure.

A closer look at the problem details shows several different problems

that aren't easy to analyze. Comparing these problems shows potential

security losses for Quotient NTRU, and potential security losses for

Product NTRU:

* The Quotient NTRU key problem is homogeneous where the Product NTRU

key problem is not. Maybe this makes the Quotient NTRU key problem

easier to break. There are known attacks on variants with extreme

parameters.

Stehle and Steinfeld proposed parameters with a proof of hardness

of the Quotient NTRU key problem, but this requires large keys.

* The Product NTRU ciphertext problem gives the attacker more

information than the Quotient NTRU ciphertext problem. Maybe this

makes the Product NTRU ciphertext problem easier to break. Again

there are known attacks on variants with extreme parameters (many

more "samples").

There are theorems relating both of these problems (and the Product

NTRU key problem) to certain lattice problems. However, even under

the questionable assumption that those problems are hard, the

theorems are so loose that they say nothing about small-key

parameters. Security analysis has to consider attacks against the

problems with the real parameters---otherwise NIST could end up

standardizing a weak system that nobody actually tried breaking!

* The Product NTRU ciphertext problem is an IND problem, as noted

above, while the Quotient NTRU ciphertext problem is an OW problem.

Maybe this makes the Product NTRU ciphertext problem easier to

break, as in some other IND-vs-OW situations. Again there are no

tight theorems.

The bottom line is that the attack analyses have complications both

ways. As noted in the NTRU Prime submission, it isn't at all clear which

way is more secure.

Of course, getting rid of unnecessary options is the most important way

to simplify security analysis, but at this point it's really hard to

guess whether further study of attacks will end up with a better picture

for Quotient NTRU or for Product NTRU. More research is required.

For comparison, consider, e.g., eliminating decryption failures. This is

a pure win from the perspective of the simplicity of attack analysis.

Decryption failures raise complicated questions of (1) how often the

attacker can trigger these failures, and (2) how this affects security.

Decryption failures don't create any compensating simplifications in the

attack analysis. Decryption failures have some performance benefit, but

that's a different issue.

---Dan

Prime submission and (2) the merged NTRU Encrypt/NTRU HRSS proposal. I'm

speaking for myself here.

1. The main point of the NTRU Prime submission is to reduce the attack

surface for small lattice-based systems. Here are five aspects of this:

Encrypt HRSS merged Prime feature

------- ---- ------ ----- ----------------------------------------------

no yes yes yes avoid decryption failures

no yes yes yes irreducible polynomial

no no no yes avoid nontrivial number-field automorphisms

no no no yes prime modulus q

no no no yes the ring is a field

HRSS already eliminated decryption failures, and already eliminated the

x-1 factor from traditional NTRU (by multiplying various polynomials by

x-1). The merged proposal preserves these features. However, the table

shows important differences in the polynomial choice (cyclotomic vs.

x^n-x-1), and in various aspects of the choice of q. See below.

2. There are further differences in options where every choice has

security concerns---there isn't a choice that clearly simplifies attack

analysis:

Encrypt HRSS merged Prime feature

------- ---- ------ ----- ----------------------------------------------

yes no yes yes options with fixed-weight multipliers

no yes yes no options with uniform multipliers

------- ---- ------ ----- ----------------------------------------------

yes yes yes no options with random errors/"RLWE" (Noisy NTRU)

no no no yes options with rounding/"RLWR" (Rounded NTRU)

------- ---- ------ ----- ----------------------------------------------

yes yes yes yes "NTRU" options (Quotient NTRU)

no no no yes "LPR"/"RLWE"/"RLWR" options (Product NTRU)

Of course submissions offering two options always argue that both

options are of interest, and submissions picking one option will argue

that this option is better. We've already seen arguments that

* fixed-weight multipliers produce smaller sizes,

* uniform multipliers produce better speeds,

* rounding produces smaller sizes,

* Quotient NTRU has better encapsulation and decapsulation speeds,

* Quotient NTRU has smaller ciphertext sizes,

* Product NTRU has better key-generation speeds, and

* Product NTRU has smaller key sizes.

I'm not saying that these choices don't _affect_ security analysis; I'm

saying that it isn't clear which choice _simplifies_ security analysis.

The bottom of this message runs through an example.

3. The biggest point of debate is the choice of cyclotomic vs. x^n-x-1.

The NTRU Prime submission

* points to various attacks exploiting automorphisms---

* the quantum break of Gentry's original STOC 2009 FHE system

for cyclotomics,

* the non-quantum break of an analogous multiquadratic system,

and

* the quantum leap halfway from exponential to polynomial in

approximation factors broken for cyclotomic Ideal-SVP

---as evidence that the security impact of automorphisms has not

been adequately studied;

* uses the idea (proposed before all of these attacks!) to switch

from a cyclotomic field to a number field with a large Galois

group, so that automorphism computations are hard; and

* shows that this switch is compatible with excellent performance.

There are several other lattice submissions that can protect against the

possibility of cyclotomics being broken, but NTRU Prime is by far the

most efficient. Here are representative examples of public-key size +

ciphertext size in non-cyclotomic options in lattice submissions:

* Streamlined NTRU Prime 4591^761: 1218 bytes + 1047 bytes.

* LOTUS 128: 658944 bytes + 1144 bytes.

* Titanium CCA lite: 14720 bytes + 3008 bytes.

* Round2 n1 l1: 3455 bytes + 4837 bytes.

* Frodo 640: 9616 bytes + 9736 bytes.

* EMBLEM II.c: 10016 bytes + 14792 bytes.

* Lizard N663: 1390592 bytes + 10896 bytes.

* Odd Manhattan 128: 1626240 bytes + 180224 bytes.

NTRU Prime is the only lattice submission that eliminates cyclotomic

structure while still fitting each public key and each ciphertext into

the guaranteed IPv6 MTU, 1280 bytes.

(I'm not saying that these other submissions are pointless. Titanium

argues that it can protect against cyclotomics being broken _and_ the

NTRU Prime ring being somehow broken, as long as there's some other

polynomial that isn't broken. The other submissions listed above argue

that they can protect against _all_ polynomials being broken, as long as

unstructured lattices aren't broken. Another protection option is stated

in the NTRU Prime paper: "one has to ask whether using the increased

costs in other ways, such as larger lattice dimensions, would produce a

larger security benefit".)

In the opposite direction, the obvious way to argue for submissions

using cyclotomics is that the cyclotomic structure of traditional NTRU

hasn't enabled any known attacks, beyond some lost bits of security from

the n-fold symmetry. Perhaps all these submissions worrying about the

cyclotomic structure are being unnecessarily paranoid. To be clear: I'm

not saying that any submissions are known to be breakable via the

cyclotomic structure.

4. Regarding the choice of modulus, there are also various attack ideas,

although they haven't killed any targets as high-profile as Gentry's

STOC 2009 FHE system for cyclotomics:

* 2004 Silverman--Smart--Vercauteren proposed an idea for exploiting

reductions mod powers of 2, with the traditional NTRU choice of the

modulus q as a power of 2.

* A series of papers starting with 2014 Eisenträger--Hallgren--Lauter

proposed an idea for exploiting factors of the polynomial modulo q,

and gave some examples where the idea works.

As far as I know, the following statement has been continuously true

since 2014: accurately describing the cases of Ring-LWE that are known

to be broken requires inspecting not just

* the error distribution,

* the size of q, and

* the size of the polynomial,

but also

* the factors of the polynomial modulo q.

Eliminating those factors limits the flexibility of these attacks,

simplifying attack analysis.

In the opposite direction, again the obvious way to argue for HRSS etc.

is to say that so far nobody has been able to use these factorizations

to break traditional NTRU (or newer proposals such as New Hope). Of

course there are cases where polynomial factors have been used to break

real PKE proposals (the x-1 factor in Giophantus; the x-1 factor in

Round5; I wouldn't be surprised if the x^504+1 example in R.EMBLEM is

similarly breakable), but these are glaring examples of polynomials

being reducible for all q. So far the possibility of an irreducible

polynomial having a q-dependent weakness doesn't seem to have caught any

real proposals.

5. I think the speed of multiplication in NTRU HRSS is noteworthy, and

is a major reason for NTRU HRSS being at the top of the benchmark list

for both encapsulation and decapsulation:

* Encapsulation: 36828 ntruhrss701, 45468 sntrup4591761, 72632

kyber512, etc.

* Decapsulation: 68460 ntruhrss701, 72288 kyber512, 94744

sntrup4591761, etc.

These are cycle counts on the standard Haswell benchmarking platform for

submissions aiming for security against chosen-ciphertext attacks. I

wouldn't be surprised if the merged proposal stays at the top (SXY has

slightly more work in decapsulation but there's now less hashing). On

the other hand, this speed requires a power-of-2 modulus.

6. To summarize, within small lattice-based systems, NTRU Prime and

NTRU Encrypt/NTRU HRSS have different and apparently incompatible goals.

All of the design choices in NTRU Prime are prioritizing the simplicity

of security analysis, even if this means sacrificing some speed.

7. Beyond this basic difference in goals, there's a discrepancy in CCA

conversion details, which I think can be easily resolved:

* I think everyone should be using "implicit rejection" as in

Persichetti/HHK/SXY, where the KEM failure oracle is eliminated in

favor of pseudorandom outputs.

* Beyond this, I think there's value in a 32-byte Dent hash. Section

17 of

https://cr.yp.to/papers/tightkem-20180528.pdf

has some arguments for this, and here's an additional argument

(which I learned from Lange): checking a hash right after

decryption seems to make it easier to stop side-channel attacks

against reencryption.

I don't think the much longer hash in HRSS is useful. This longer hash

isn't in the merged proposal.

8. Finally, here's the promised example of how two options can require

two different security analyses with different complications.

Here are two different approaches to obtaining tight ROM CCA2 proofs:

* Quotient NTRU: Streamlined NTRU Prime produces a deterministic PKE

as a natural side effect of rounding. This enables proofs from

OW-CPA. The merged proposal also does a little more work than HRSS

(as suggested by SXY) to produce a deterministic PKE.

* Product NTRU: NTRU LPRime uses the "T" transform to obtain a

deterministic PKE, so tight ROM CCA2 proofs have to

* assume IND-CPA rather than just OW-CPA or

* make a more complicated assumption of OW-CPA after "T".

As far as I know, all LPR-type submissions (and non-ring

submissions such as Frodo) have similar issues.

People often decompose the PKE-breaking problem, whether it's OW-CPA or

IND-CPA, into a key problem and a ciphertext problem:

* For Quotient NTRU, breaking the OW-CPA problem requires (1)

distinguishing the key from random or (2) solving the random-key

version of the OW-CPA problem.

* For Product NTRU, breaking the IND-CPA problem requires (1)

distinguishing the key from random or (2) solving the random-key

version of the IND-CPA problem.

These decompositions aren't two-way---e.g., a key distinguisher doesn't

necessarily break the original OW-CPA problem---but _if_ we've studied

the individual problems _and_ we think that those problems are hard then

these decompositions tell us that the PKEs are secure.

A closer look at the problem details shows several different problems

that aren't easy to analyze. Comparing these problems shows potential

security losses for Quotient NTRU, and potential security losses for

Product NTRU:

* The Quotient NTRU key problem is homogeneous where the Product NTRU

key problem is not. Maybe this makes the Quotient NTRU key problem

easier to break. There are known attacks on variants with extreme

parameters.

Stehle and Steinfeld proposed parameters with a proof of hardness

of the Quotient NTRU key problem, but this requires large keys.

* The Product NTRU ciphertext problem gives the attacker more

information than the Quotient NTRU ciphertext problem. Maybe this

makes the Product NTRU ciphertext problem easier to break. Again

there are known attacks on variants with extreme parameters (many

more "samples").

There are theorems relating both of these problems (and the Product

NTRU key problem) to certain lattice problems. However, even under

the questionable assumption that those problems are hard, the

theorems are so loose that they say nothing about small-key

parameters. Security analysis has to consider attacks against the

problems with the real parameters---otherwise NIST could end up

standardizing a weak system that nobody actually tried breaking!

* The Product NTRU ciphertext problem is an IND problem, as noted

above, while the Quotient NTRU ciphertext problem is an OW problem.

Maybe this makes the Product NTRU ciphertext problem easier to

break, as in some other IND-vs-OW situations. Again there are no

tight theorems.

The bottom line is that the attack analyses have complications both

ways. As noted in the NTRU Prime submission, it isn't at all clear which

way is more secure.

Of course, getting rid of unnecessary options is the most important way

to simplify security analysis, but at this point it's really hard to

guess whether further study of attacks will end up with a better picture

for Quotient NTRU or for Product NTRU. More research is required.

For comparison, consider, e.g., eliminating decryption failures. This is

a pure win from the perspective of the simplicity of attack analysis.

Decryption failures raise complicated questions of (1) how often the

attacker can trigger these failures, and (2) how this affects security.

Decryption failures don't create any compensating simplifications in the

attack analysis. Decryption failures have some performance benefit, but

that's a different issue.

---Dan

Dec 1, 2018, 12:55:26 PM12/1/18

to pqc-...@list.nist.gov

---------- Message transféré ----------

De :

Date : samedi 1 décembre 2018

Objet : [pqc-forum] Announcement of NTRU-HRSS-KEM and NTRUEncrypt merger

À : John Schanck <jsch...@uwaterloo.ca>

Hello

I'm pleased to inform you that i increase the speed performance of NTRUencrypt1024, more +500, with my oun multiplication algorithm rather than using NTT algorithm.

If you want to merge your release i will be pleased to collaborate with you.

Best Regards.

Hassan from Mohammed First University Morocco.

--

You received this message because you are subscribed to the Google Groups "pqc-forum" group.

To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.

Visit this group at https://groups.google.com/a/list.nist.gov/group/pqc-forum/.

Dec 2, 2018, 2:31:27 AM12/2/18

to D. J. Bernstein, pqc-...@list.nist.gov

Changing the title because this response isn’t about NTRU HRSS.

> On Dec 1, 2018, at 4:33 AM, D. J. Bernstein <d...@cr.yp.to> wrote:

>

> There are several other lattice submissions that can protect against the

> possibility of cyclotomics being broken, but NTRU Prime is by far the

> most efficient. Here are representative examples of public-key size +

> ciphertext size in non-cyclotomic options in lattice submissions:

>

> * Streamlined NTRU Prime 4591^761: 1218 bytes + 1047 bytes.

> * LOTUS 128: 658944 bytes + 1144 bytes.

> * Titanium CCA lite: 14720 bytes + 3008 bytes.

> * Round2 n1 l1: 3455 bytes + 4837 bytes.

> * Frodo 640: 9616 bytes + 9736 bytes.

> * EMBLEM II.c: 10016 bytes + 14792 bytes.

> * Lizard N663: 1390592 bytes + 10896 bytes.

> * Odd Manhattan 128: 1626240 bytes + 180224 bytes.

>

> NTRU Prime is the only lattice submission that eliminates cyclotomic

> structure while still fitting each public key and each ciphertext into

> the guaranteed IPv6 MTU, 1280 bytes.

ThreeBears is also a non-cyclotomic lattice scheme, where for

claimed class II the 804-byte public key and 917-byte ciphertext

each fit into a 1280-byte IPv6 packet. Its computational efficiency

is competitive with NTRU HRSS, NTRU Prime, and even the

cyclotomic schemes.

It also shares NTRU Prime's feature that the ring is a field (and thus,

is mod an irreducible polynomial) and that the field has no nontrivial

automorphisms. This is true both in the obvious sense that Z/NZ

can’t have nontrivial automorphisms, and also in that Q(alpha) for

alpha a root of x^D - x^(D/2) - 1 (D >= 4) doesn't have nontrivial

automorphisms. Thanks to Hart Montgomery and Arnab Roy for

pointing this out.

Unlike NTRU Prime’s field, this field doesn’t have the full S_p Galois

group. I also don’t mean to say that these features make ThreeBears

a more conservative scheme than all the others. Integer MLWE hasn’t

been studied much, but I attempted to offset this by using a field and

by setting a relatively high security target.

ThreeBears has an additional ease-of-analysis feature, which is

common but not universal among the lattice KEMs: it calculates its

coins as H(pk.seed,m) instead of just H(m). This protects against

multi-target issues such as the concern about LAC. I mention this

because I think this technique is cheap and effective enough to be

worth putting in essentially every LPR-like system.

There is a possible exception for perfectly-correct schemes that

transport 256-bit keys, such as NTRU LPRime, but even in this case

there would be a small security benefit: it salts the Dent hash and the

final key, which makes it harder for an attacker to use rainbow tables

to exploit RNG failures.

ThreeBears does not have all the other ease-of-analysis features

in NTRU Prime and HRSS. I’m reluctant to make any major changes

because the system is in a reasonable local optimum, and I don’t want

to present too many parameter sets or a moving target. I will probably

add some kind of implicit rejection if ThreeBears survives to the second

round — I don’t believe it’s required for security, but it removes a footgun.

The other features will probably remain the same or similar, depending

on the state of analysis at the time:

* Probably no Dent hash unless there’s a really compelling side channel

use case — I think the usefulness in proofs is mostly an artifact.

* Hopefully a more rigorous failure analysis. Depending on the results,

possibly a small parameter change to reduce (or increase?) the failure

rate. Probably not perfect correctness, but I’ll calculate the cost of this.

Out of curiosity, have the NTRU prime folks (or anyone else) calculated

the efficiency difference between the current x^p - x - 1 and a ThreeBears

style ring like x^p - x^((p-1)/2) - 1? You would represent this ring as eg

x^((1+p)/2) - 1 - x^((1-p)/2), which reduces the error amplification by

something like 25%. This should give a small efficiency boost without

compromising the other nice properties of NTRU (L)Prime.

On an unrelated note, thanks to Dan and the other SUPERCOP

folks for your work on benchmarking. Merging other people’s code is

a lot of effort.

Cheers,

— Mike

> On Dec 1, 2018, at 4:33 AM, D. J. Bernstein <d...@cr.yp.to> wrote:

>

> There are several other lattice submissions that can protect against the

> possibility of cyclotomics being broken, but NTRU Prime is by far the

> most efficient. Here are representative examples of public-key size +

> ciphertext size in non-cyclotomic options in lattice submissions:

>

> * Streamlined NTRU Prime 4591^761: 1218 bytes + 1047 bytes.

> * LOTUS 128: 658944 bytes + 1144 bytes.

> * Titanium CCA lite: 14720 bytes + 3008 bytes.

> * Round2 n1 l1: 3455 bytes + 4837 bytes.

> * Frodo 640: 9616 bytes + 9736 bytes.

> * EMBLEM II.c: 10016 bytes + 14792 bytes.

> * Lizard N663: 1390592 bytes + 10896 bytes.

> * Odd Manhattan 128: 1626240 bytes + 180224 bytes.

>

> NTRU Prime is the only lattice submission that eliminates cyclotomic

> structure while still fitting each public key and each ciphertext into

> the guaranteed IPv6 MTU, 1280 bytes.

claimed class II the 804-byte public key and 917-byte ciphertext

each fit into a 1280-byte IPv6 packet. Its computational efficiency

is competitive with NTRU HRSS, NTRU Prime, and even the

cyclotomic schemes.

It also shares NTRU Prime's feature that the ring is a field (and thus,

is mod an irreducible polynomial) and that the field has no nontrivial

automorphisms. This is true both in the obvious sense that Z/NZ

can’t have nontrivial automorphisms, and also in that Q(alpha) for

alpha a root of x^D - x^(D/2) - 1 (D >= 4) doesn't have nontrivial

automorphisms. Thanks to Hart Montgomery and Arnab Roy for

pointing this out.

Unlike NTRU Prime’s field, this field doesn’t have the full S_p Galois

group. I also don’t mean to say that these features make ThreeBears

a more conservative scheme than all the others. Integer MLWE hasn’t

been studied much, but I attempted to offset this by using a field and

by setting a relatively high security target.

ThreeBears has an additional ease-of-analysis feature, which is

common but not universal among the lattice KEMs: it calculates its

coins as H(pk.seed,m) instead of just H(m). This protects against

multi-target issues such as the concern about LAC. I mention this

because I think this technique is cheap and effective enough to be

worth putting in essentially every LPR-like system.

There is a possible exception for perfectly-correct schemes that

transport 256-bit keys, such as NTRU LPRime, but even in this case

there would be a small security benefit: it salts the Dent hash and the

final key, which makes it harder for an attacker to use rainbow tables

to exploit RNG failures.

ThreeBears does not have all the other ease-of-analysis features

in NTRU Prime and HRSS. I’m reluctant to make any major changes

because the system is in a reasonable local optimum, and I don’t want

to present too many parameter sets or a moving target. I will probably

add some kind of implicit rejection if ThreeBears survives to the second

round — I don’t believe it’s required for security, but it removes a footgun.

The other features will probably remain the same or similar, depending

on the state of analysis at the time:

* Probably no Dent hash unless there’s a really compelling side channel

use case — I think the usefulness in proofs is mostly an artifact.

* Hopefully a more rigorous failure analysis. Depending on the results,

possibly a small parameter change to reduce (or increase?) the failure

rate. Probably not perfect correctness, but I’ll calculate the cost of this.

Out of curiosity, have the NTRU prime folks (or anyone else) calculated

the efficiency difference between the current x^p - x - 1 and a ThreeBears

style ring like x^p - x^((p-1)/2) - 1? You would represent this ring as eg

x^((1+p)/2) - 1 - x^((1-p)/2), which reduces the error amplification by

something like 25%. This should give a small efficiency boost without

compromising the other nice properties of NTRU (L)Prime.

On an unrelated note, thanks to Dan and the other SUPERCOP

folks for your work on benchmarking. Merging other people’s code is

a lot of effort.

Cheers,

— Mike

Dec 3, 2018, 11:25:32 AM12/3/18

to pqc-...@list.nist.gov

Dear pqc-forum,

A more complete description of the KEMs that will be included in the

merged NTRU-HRSS / NTRUEncrypt submission is now available as part of a

tech report titled "A Comparison of NTRU Variants."

https://eprint.iacr.org/2018/1174

Note that the parameter sets that I have highlighted in Section 7 are

ones that I personally find interesting -- nothing more. In particular,

there is no guarantee that these parameter sets will be recommended in

any future submission to NIST.

Cheers,

John

* John Schanck <jsch...@uwaterloo.ca> [2018-11-29 19:05:14 -0500]:

A more complete description of the KEMs that will be included in the

merged NTRU-HRSS / NTRUEncrypt submission is now available as part of a

tech report titled "A Comparison of NTRU Variants."

https://eprint.iacr.org/2018/1174

Note that the parameter sets that I have highlighted in Section 7 are

ones that I personally find interesting -- nothing more. In particular,

there is no guarantee that these parameter sets will be recommended in

any future submission to NIST.

Cheers,

John

* John Schanck <jsch...@uwaterloo.ca> [2018-11-29 19:05:14 -0500]:

Dec 3, 2018, 7:22:04 PM12/3/18

to pqc-...@list.nist.gov

Mike Hamburg writes:

> Q(alpha) for alpha a root of x^D - x^(D/2) - 1 (D >= 4) doesn't have

> nontrivial automorphisms.

Not true. If D/2 is even then x |-> -x is a nontrivial automorphism. If
> Q(alpha) for alpha a root of x^D - x^(D/2) - 1 (D >= 4) doesn't have

> nontrivial automorphisms.

D/2 is odd then x |-> -1/x is a nontrivial automorphism.

There's also a much longer list of automorphisms lurking only slightly

behind the curtain. As stated in the NTRU Prime paper:

... having a Galois group of size, say, 2p means that one can write

down a degree-2p extension field with 2p automorphisms, and one can

then try to apply these automorphisms to build many units,

generalizing the construction of cyclotomic units.

These are automorphisms of a small extension field, not automorphisms of

the original field, but this distinction doesn't matter if computations

of the automorphisms are usable to build short units for an attack.

Have you analyzed whether Gentry ThreeBears, a Gentry-style system using

x^D-x^(D/2)-1, is breakable in this way? Offhand I'd say that (1) the

Galois closure has degree O(D^2), small enough to allow fast

computations; and (2) there are a ton of easily computed short units,

including the golden ratio phi, the (D/2)th cyclotomic units, and

differences such as phi-zeta_{D/2}.

> this field doesn’t have the full S_p Galois group.

paper requires factorial(p) and explains the advantage as follows:

Having a much larger Galois group means that P will have at most a

small number of roots in any field of reasonable degree. This

eliminates all known methods of efficiently performing computations

with more than a small number of automorphisms.

For O(p^2) this is not an obstacle: computations on elements of the

Galois closure, including all automorphisms, take time polynomial in p.

> ThreeBears is also a non-cyclotomic lattice scheme

so it wasn't part of my comparison here. Also, if the obvious attack

strategy against Gentry ThreeBears exploits cyclotomic units, then is it

useful to say "non-cyclotomic"?

More importantly, NTRU Prime explicitly avoids not just cyclotomics, but

all Galois groups of non-maximal size. This didn't matter for the

comparison I was stating, where all systems were either

(1) small and cyclotomic,

(2) small and NTRU Prime, or

(3) much larger,

but if you're going to extend the comparison then you should account for

the full NTRU Prime design criteria.

> ThreeBears has an additional ease-of-analysis feature, which is

> common but not universal among the lattice KEMs: it calculates its

> coins as H(pk.seed,m) instead of just H(m). This protects against

> multi-target issues such as the concern about LAC.

would have been drastically slowed down by this extra hash input. But

how is this an "ease-of-analysis feature"?

The following approach to multi-target attacks sounds much easier to

analyze: "Our general strategy for handling multi-target attacks is to

aim for a very high single-target security level, and then rely on the

fact that T-target attacks gain at most a factor T. We have not

introduced complications aimed at making multi-target attacks even more

difficult."

That's a quote from the NTRU Prime submission. Obviously this approach

isn't available for submissions that are, for efficiency, cutting things

close with decryption failures.

> Out of curiosity, have the NTRU prime folks (or anyone else)

> calculated the efficiency difference between the current x^p - x - 1

> and a ThreeBears style ring like x^p - x^((p-1)/2) - 1?

small. See the interesting graphs just posted by John Schanck:

https://eprint.iacr.org/2018/1174

The loss of x^p-x-1 compared to other trinomials should be even smaller.

My impression is that John didn't account for equivalent keys, which

speed up known attacks against NTRU Classic more than they speed up

known attacks against x^p-x-1. This should somewhat offset the size loss

from higher product weights.

Anyway, to answer your question, the NTRU Prime perspective is that the

risk of further attacks exploiting automorphisms---and the difficulty

that this risk creates for security analysis---outweighs any of these

small efficiency differences. The NTRU Prime submission also skips the

following 2016 suggestion from Dan Shepherd, which might interest you:

* Generate a random vector mod 2 rather than mod 3.

* Hash the vector to select some bits of the vector to negate. This

produces the original range -1,0,1.

* In decapsulation, recover the random vector mod 2, and use the same

hash to recover each -1,0,1.

This complication slightly improves efficiency.

---Dan

Dec 3, 2018, 9:36:48 PM12/3/18

to D. J. Bernstein, pqc-...@list.nist.gov

> On Dec 3, 2018, at 4:21 PM, D. J. Bernstein <d...@cr.yp.to> wrote:

>

> Mike Hamburg writes:

>> Q(alpha) for alpha a root of x^D - x^(D/2) - 1 (D >= 4) doesn't have

>> nontrivial automorphisms.

>

> Not true. If D/2 is even then x |-> -x is a nontrivial automorphism. If

> D/2 is odd then x |-> -1/x is a nontrivial automorphism.

> Have you analyzed whether Gentry ThreeBears, a Gentry-style system using

> x^D-x^(D/2)-1, is breakable in this way? Offhand I'd say that (1) the

> Galois closure has degree O(D^2), small enough to allow fast

> computations; and (2) there are a ton of easily computed short units,

> including the golden ratio phi, the (D/2)th cyclotomic units, and

> differences such as phi-zeta_{D/2}.

> More importantly, NTRU Prime explicitly avoids not just cyclotomics, but

> all Galois groups of non-maximal size. This didn't matter for the

> comparison I was stating, where all systems were either

>

> (1) small and cyclotomic,

> (2) small and NTRU Prime, or

> (3) much larger,

>

> but if you're going to extend the comparison then you should account for

> the full NTRU Prime design criteria.

Prime design criteria, NTRU Prime is the smallest.

>> ThreeBears has an additional ease-of-analysis feature, which is

>> common but not universal among the lattice KEMs: it calculates its

>> coins as H(pk.seed,m) instead of just H(m). This protects against

>> multi-target issues such as the concern about LAC.

>

> I agree that we've seen examples of decryption-failure attacks that

> would have been drastically slowed down by this extra hash input. But

> how is this an "ease-of-analysis feature"?

>

> The following approach to multi-target attacks sounds much easier to

> analyze: "Our general strategy for handling multi-target attacks is to

> aim for a very high single-target security level, and then rely on the

> fact that T-target attacks gain at most a factor T. We have not

> introduced complications aimed at making multi-target attacks even more

> difficult."

>

> That's a quote from the NTRU Prime submission. Obviously this approach

> isn't available for submissions that are, for efficiency, cutting things

> close with decryption failures.

to thwart a known attack. As I said in my previous email, it is less relevant

to NTRU Prime, but it isn’t completely irrelevant either.

Adding the public key (or its hash or seed) to the hash means that, if an

attacker is calculating many hashes toward some attack, then he is only

going after a specific target (or at most a few, in the unlikely event of a

collision in the hash or seed of the public key). Therefore, you don’t have

to worry about multi-target attacks on the FO mode, just on the underlying

KEM.

The ideal case would be to prove a theorem that says:

Hardness of underlying problem

-> Multi-target CPA security of KEM

-> Multi-target CCA security of KEM+FO

at least in the [Q]ROM, with as little tightness loss as possible. In principle

the FO step could be generic. This is already difficult, but seems much

more difficult if you don’t domain separate the hash. The fact that quantum

FO analysis already incurs a large tightness loss doesn’t help this state of

affairs.

One can certainly also just divide the estimated security by T, but that requires

designing for Class III single-target security (eg) and then using the resulting

system as Class I, or V -> III, and overdesigning this way is unappealing.

NTRU Prime’s evaluation technique is already less conservative than other

systems’ — they would call its parameter set class II or III instead of V. Just

saying that you can divide that by T=2^64 and still be OK is unconvincing —

someone will actually have to do the analysis, and that analysis will be easier

with domain separation.

Alternatively, one can just claim that there are no known multi-target attacks,

but it seems better to have simple and concrete reasons why multi-target

attacks are impossible.

Each feature like this adds complexity and may hurt performance, while

possibly improving security. The same is true for Dent hashes, implicit

rejection, etc. Some such features will be worthwhile and others not, but

I don’t think hashing the public key should be dismissed out of hand as a

pointless complication.

>> Out of curiosity, have the NTRU prime folks (or anyone else)

>> calculated the efficiency difference between the current x^p - x - 1

>> and a ThreeBears style ring like x^p - x^((p-1)/2) - 1?

>

> The size loss of x^p-x-1 compared to NTRU Classic is measurable but

> small. See the interesting graphs just posted by John Schanck:

>

> https://eprint.iacr.org/2018/1174

>

> The loss of x^p-x-1 compared to other trinomials should be even smaller.

>

> My impression is that John didn't account for equivalent keys, which

> speed up known attacks against NTRU Classic more than they speed up

> known attacks against x^p-x-1. This should somewhat offset the size loss

> from higher product weights.

>

> Anyway, to answer your question, the NTRU Prime perspective is that the

> risk of further attacks exploiting automorphisms---and the difficulty

> that this risk creates for security analysis---outweighs any of these

> small efficiency differences.

Q[x] / (x^p - x^((p-1)/2 - 1)

to have a full Galois group? Or is common that it will have the full S_p but

hard to prove? That field doesn’t have the obvious automorphism you

noticed in ThreeBears, and arithmetic in this ring should be just as easy to

implement as mod x^p - x - 1 (depending on your Karatsuba strategy etc

of course).

> The NTRU Prime submission also skips the

> following 2016 suggestion from Dan Shepherd, which might interest you:

>

> * Generate a random vector mod 2 rather than mod 3.

> * Hash the vector to select some bits of the vector to negate. This

> produces the original range -1,0,1.

> * In decapsulation, recover the random vector mod 2, and use the same

> hash to recover each -1,0,1.

>

> This complication slightly improves efficiency.

> —Dan

Cheers,

— Mike

Dec 7, 2018, 11:17:27 PM12/7/18

to pqc-...@list.nist.gov

Focusing here on comparing the following two responses to multi-target

attacks:

* Response #1: include a key prefix in the hash used in the CCA2

conversion, a long enough prefix that prefixes of user keys don't

collide.

* Response #2: "aim for a very high single-target security level, and

end of round 1---

* https://groups.google.com/a/list.nist.gov/d/msg/pqc-forum/WxRmVPhAENw/I5yhtr9dAgAJ

indicates that NIST is treating CCA2 conversion choices as "minor

changes (tweaks)" for the beginning of round 2.

* NIST also doesn't seem concerned with the exact security levels in

round 1: "We’re not going to kick out a scheme just because they

set their parameters wrong".

---but surely these issues will have to receive more attention as the

post-quantum standardization project progresses. It will be useful for

the community to resolve disputes regarding the relative merits of the

two responses before round-2 tweaks are due. Also, the principles here

have an impact beyond multi-target attacks.

Mike Hamburg writes, regarding Response #2:

> overdesigning this way is unappealing

Response #2 is tremendously appealing for security evaluation, auditing,

and actually solving the security problem (see below), so I presume that

what you're claiming here is a performance problem. Let's look at the

performance numbers.

Say the best single-target attack is a search for one solution in a

space of size 2^k. There are other attack shapes, but this one maximizes

the performance gap that you're alluding to.

Here's the standard list of how big k needs to be for four single-target

security levels:

* Security 2^64 pre-quantum: k=64.

* Security 2^128 pre-quantum: k=128.

* Security 2^64 post-quantum: k=128.

* Security 2^128 post-quantum: k=256.

(Of course this list measures security with naive operation counts.

Depth limits and various quantum overheads mean that, e.g., k=256

actually has a much higher post-quantum security level.)

Now let's use Response #2 to protect against 2^64-target attacks, saying

that a 2^64-target attack implies a 1-target attack with probability

1/2^64. Here's what this does to k:

* Security 2^64 pre-quantum: k=128. This is 2x larger.

* Security 2^128 pre-quantum: k=192. This is 1.5x larger.

* Security 2^64 post-quantum: k=192. This is 1.5x larger.

* Security 2^128 post-quantum: k=320. This is 1.25x larger.

Sure, 2x sounds significant, but that's for an archaic 2^64 pre-quantum

security level. The gap rapidly shrinks as the target security level

increases. Do you really claim that 1.25x is "overdesigning"?

For typical lattice systems the key size scales as something like

k log k, so the 1.25x in search space turns into 1.30x in key size. This

doesn't noticeably change the picture. The gap can be even smaller when

the best attacks have worse probability dropoffs than Grover, but the

simple analysis already shows that the gap is small.

[ regarding Response #1: ]

comprehensive _review_ of security. One way to see the importance of

this metric is as follows.

Is NIST going to end up standardizing---and are users going to end up

deploying---a post-quantum system that turns out to be devastatingly

insecure, allowing feasible attacks? Quite possibly, yes. My assessment

is that the most likely way for this to happen is as follows:

* as a direct result of design decisions, security review of the

standard will be much more complicated than necessary;

* as a result, the limited human resources available for security

review will run out of time to seriously review everything;

* as a result, serious vulnerabilities will be missed simply because

nobody ever looked at the vulnerable part of the standard.

As an analogy, consider the destruction of OCB2 this year, 14 years

after Rogaway's Asiacrypt 2004 security "proof" and 9 years after ISO

standardization. The Inoue--Minematsu OCB2 attack is simple and fast,

but this doesn't mean it was easy to find: OCB2 is only one of many

cryptographic standards, and it has many interacting components that

needed attack analysis.

Compared to serious security review for OCB2, serious security review

for a typical post-quantum system is another big step up in difficulty,

further increasing the risk of security disasters. The difficulty varies

from one system to another, motivating design changes.

Here are some of the things that happen when we identify the simplicity

of security review as an important goal, and systematically pursue this

goal in designing a KEM for CCA2 security.

We eliminate decryption failures in the underlying PKE. This takes away

a huge, painful, error-prone part of the review process. (My tally,

before eprint 2018/1089 and eprint 2018/1172, said that 25% of the

unscathed encryption submissions eliminate decryption failures.)

What about Dent's hash---plaintext confirmation? Try a security review

with this hash, vs. a security review with an earlier FO variant, and

you'll immediately see how the hash simplifies review:

* The PKE allows the attacker to modify ciphertexts to produce new

ciphertexts that are often valid. This is the starting point for

chosen-ciphertext attacks, and we've seen again and again that

analyzing the consequences is complicated.

* Plaintext confirmation makes it very easy to see that the only way

for an attacker to produce a new valid ciphertext is to already

know the plaintext. The hash also gives a tight ROM proof of

IND-CCA2 from OW-CPA. Security review of OW-CPA is simpler than

security review of IND-CPA.

Of course, evaluators have to check the proof, have to think about

whether there can be better non-ROM attacks, have to think about how an

attacker could use the extra information provided by the hash, etc., but

all of this is much simpler than than having to think through

consequences of valid modified ciphertexts for the underlying PKE.

What about implicit rejection? Implicit rejection makes it very easy to

see that an attacker modifying ciphertexts will obtain random-looking

session keys, whether the modified ciphertexts are valid or not, whether

there's plaintext confirmation or not. This also gives a tight ROM proof

of IND-CCA2 from OW-CPA, even simpler than Dent's proof.

Understanding whether it's best to do plaintext confirmation, or

implicit rejection, or both, requires a closer look at the details. My

current assessment---given the analysis in the tightkem paper, plus the

extra argument that I mentioned regarding side channels---is that

* plaintext confirmation by itself is good,

* implicit rejection by itself is better, and

* doing both implicit rejection and plaintext confirmation is best,

but it's possible that the ranking of these three options will change

with more research.

> The ideal case would be to prove a theorem that says:

> Hardness of underlying problem

> -> Multi-target CPA security of KEM

> -> Multi-target CCA security of KEM+FO

> at least in the [Q]ROM, with as little tightness loss as possible.

It's not at all clear that a tight QROM proof exists. We don't know how

to generate such a proof from Response #1. More to the point, we don't

even know how to use Response #1 to guarantee that multi-target attacks

against the final KEM are tightly equivalent to single-target attacks.

More importantly, even if we did have a proof, there would be a much

bigger problem for security reviewers. Cryptographic systems are full of

constructions that haven't been reviewed for multi-target security. When

review does happen, the reviewers produce one attack after another. See,

e.g.,

https://blog.cr.yp.to/20151120-batchattacks.html

https://eprint.iacr.org/2016/360

for various examples of multi-target attacks against NIST cryptographic

standards---attacks faster than all known single-target attacks. The

result of further review isn't going to be confidence in multi-target

security; it will be a continuing list of further multi-target attacks.

A KEM is one tiny piece of an end-user cryptographic system. Response #1

is looking at this tiny piece, and adding a complication that stops some

multi-target attacks against that piece ("stops" meaning that they are

no more effective than single-target attacks). Maybe at some point we'll

have a proof that this stops all possible multi-target attacks against

this tiny piece. But what about the rest of the system, with tons of

other multi-target weaknesses and potential multi-target weaknesses?

As an example, the New Hope submission claims that it is "rather

uncomfortable to have the security of all connections rely on a single

instance of a lattice problem" since the multiple targets might be

broken by a "massive precomputation". It claims that one can "avoid"

this "pitfall" by having each user generate a separate parameter "a".

But the new ideal-lattice attack algorithm from

https://nitaj.users.lmno.cnrs.fr/Caen2018/Alice_Pellet_MaryCaen2018.pdf

speeds up attacks against _every_ target after a single massive lattice

precomputation. The attack focuses on Ideal-SVP, but the supposed

hardness of Ideal-SVP is the main justification for the claimed hardness

of Ring-LWE. I'd guess that the precomputation is quantitatively larger

than a typical attack, but the New Hope argument isn't quantitative. I'm

not saying that the attack precisely disproves the New Hope multi-target

claim; I'm simply saying that the claim isn't proven and could be wrong.

If a multi-target lattice attack is faster than a single-target lattice

attack, then what exactly is gained by the extra hashing in Response #1?

More broadly, when people combine KEMs with many other components to

produce complete end-user cryptographic systems, there will surely be

many different multi-target weaknesses. What's the strategy to remove

all these weaknesses, and how is a security evaluator supposed to gain

confidence in this approach?

Response #2, unlike Response #1, is a complete solution to the problem:

assume that there will be multi-target attacks, and choose the security

level to protect against the maximum possible effect of those attacks.

This eliminates the hard problem of evaluating multi-target security.

The security evaluator can safely focus on single-target attacks.

I understand how Response #1 can feel like progress. I've even advocated

an analogous complication---but that was in a situation where

* it was easy to see how the complication eliminated multi-target

attacks against a complete end-user cryptographic system;

* this turned into a simple, easily reviewable, proof of multi-target

security for the complete system; and

* the target security level was only 2^128 pre-quantum.

This KEM hash is a very different story. It is, at best, addressing a

much smaller part of a system that has many other multi-target

weaknesses. I doubt that anyone has ever constructed a similar system

with a proof of multi-target security, even if a proof is possible for

this small piece. Furthermore, we're talking about post-quantum KEMs

aiming for a much higher security level, so the performance impact of

Response #2 is relatively small.

> NTRU Prime’s evaluation technique is already less conservative than

> other systems’ — they would call its parameter set class II or III

> instead of V. Just saying that you can divide that by T=2^64 and

> still be OK is unconvincing

Let me see if I understand, quantitatively, what you're claiming here.

You're starting from 2^155 pre-quantum security for Streamlined NTRU

Prime 4591^761, dividing by 2^64 for a 2^64-target attack, and saying

that 2^91 is not OK? Or you're doing something similar starting from

2^140 post-quantum security?

The starting points 2^155 and 2^140 are wild underestimates of the cost

of known single-target attacks. Checking the literature shows, as stated

in my undisputed summary from February, that each of these numbers comes

from

* taking a _small piece_ of an attack,

* dividing this piece into _expensive operations_,

* _asymptotically_ counting the number of these operations, and

* replacing o(1) in the asymptotics with 0.

Note that replacing o(1) with 0 can produce _overestimates_; there's no

guarantee that the o(1) is positive. The reason we think that the

overall effect of the above four steps is a wild underestimate is that

some people have done much more work for much more serious estimates of

attack costs, consistently finding much higher attack costs than the

underestimates indicate (although there are still many unresolved

questions about the details).

This also means that someone doing a security review has to check the

serious estimates (and worry about the unresolved questions). The wild

underestimates are certainly much simpler _formulas_, but this doesn't

mean that they simplify _security evaluation_.

Of course underestimating security levels has the general effect of

increasing security parameters. This could protect users against future

attack speedups. You call this "conservative", suggesting that it's a

good approach. I have two questions and a comment:

* How do you think this approach is different from "overdesigning",

which a moment ago you called "unappealing"? What's the principle

for deciding what's "overdesigning" and what's "conservative"?

* If you think that using a wild underestimate is good, do you think

that using an even wilder underestimate is better? What's the

principle for deciding when to stop?

* There's a different approach, which I'll call the "accurate

conservative" approach, that (1) clearly produces better results

and (2) doesn't use these underestimates.

The accurate conservative approach is as follows:

* Accurately evaluate the costs of the attacks.

* On the basis of this evaluation, choose the highest-security

parameters that users can afford.

The accurate conservative approach, by definition, is maximizing the

users' security against the best attacks considered.

For comparison, you seem to be advocating what I'll call the "inaccurate

conservative" approach, which replaces the accurate evaluations with

something less accurate---in particular, simplified formulas that turn

out to be underestimates. This replacement generally _damages_ security,

as explained in Appendix B.1.2 of

https://cr.yp.to/papers.html#nonuniform

(cited in the NTRU Prime submission under "Warning: underestimates are

dangerous"), and it can't improve security.

The previous paragraph is talking about the damage done to security

against _current_ attacks. The best bet we have is that this will also

damage security against _future_ attacks. Of course it's possible that

advances in attacks will magically turn out to be better correlated with

the current underestimates than with accurate evaluations of today's

attack costs---but what's the argument for betting on this possibility?

"I have an argument: underestimate U of the security level is smaller

than accurate estimate A, and future attacks will also be faster than

A!" ---To see that this argument is nonsense, consider a more radical

underestimate such as A/3. (If A/3 isn't below U, replace it with some

even smaller function of A.) The argument claims that replacing A with U

is an improvement, and that replacing U with A/3 is an improvement---but

A/3 gives exactly the same parameter choices as A, so the argument has

directly contradicted itself.

There are cases where the literature looks more closely at the details

of attacks and makes well-informed assessments of risks of various types

of speedups. Superficially, this might sound like recommending

underestimates---but there's an essential difference in methodology.

Saying "we want a smaller estimate because smaller is good", or "we want

a simpler formula because simpler is good", is very different from

saying "we see a risk of the following type of attack improvement and we

want to accurately evaluate the impact". Often the conclusion is that

the impact is so devastating that proper risk management requires

avoiding some types of cryptosystems. Consider, e.g., the statement

It is anticipated that these techniques can be used to produce

collisions for MD5 and perhaps also for RIPEMD

in https://link.springer.com/content/pdf/10.1007%2F3-540-60865-6_44.pdf

twenty years ago.

There are also cases where we can't reasonably evaluate security against

some types of attacks since they're beyond the attack model that the

community normally considers. Timing attacks were like this twenty years

ago, with very high risks that people claiming timing-attack security

were disastrously wrong:

Table lookups: This instruction is not susceptible to a timing attack

...

--- Daemen and Rijmen, Resistance against implementation attacks: a

comparative study of the AES proposals, 1999

https://web.archive.org/web/20000816072451/http://csrc.nist.gov:80/encryption/aes/round1/conf2/papers/daemen.pdf

Table lookup: not vulnerable to timing attacks.

--- NIST, Report on the development of the Advanced Encryption

Standard (AES), 2001

Multi-target attacks are like this today. They're a fun research topic,

and maybe in twenty years we'll know how to stop multi-target attacks;

but most components of today's post-quantum cryptosystems were designed

by people who weren't even thinking about, let alone trying to stop,

multi-target attacks. Response #2 recognizes this, gives us confidence

that we're protected, and doesn't cost much in the context of

high-security post-quantum cryptography.

---Dan

attacks:

* Response #1: include a key prefix in the hash used in the CCA2

conversion, a long enough prefix that prefixes of user keys don't

collide.

* Response #2: "aim for a very high single-target security level, and

then rely on the fact that T-target attacks gain at most a factor

T".

My impression is that neither of these will be part of decisions at the
T".

end of round 1---

* https://groups.google.com/a/list.nist.gov/d/msg/pqc-forum/WxRmVPhAENw/I5yhtr9dAgAJ

indicates that NIST is treating CCA2 conversion choices as "minor

changes (tweaks)" for the beginning of round 2.

* NIST also doesn't seem concerned with the exact security levels in

round 1: "We’re not going to kick out a scheme just because they

set their parameters wrong".

---but surely these issues will have to receive more attention as the

post-quantum standardization project progresses. It will be useful for

the community to resolve disputes regarding the relative merits of the

two responses before round-2 tweaks are due. Also, the principles here

have an impact beyond multi-target attacks.

Mike Hamburg writes, regarding Response #2:

> overdesigning this way is unappealing

Response #2 is tremendously appealing for security evaluation, auditing,

and actually solving the security problem (see below), so I presume that

what you're claiming here is a performance problem. Let's look at the

performance numbers.

Say the best single-target attack is a search for one solution in a

space of size 2^k. There are other attack shapes, but this one maximizes

the performance gap that you're alluding to.

Here's the standard list of how big k needs to be for four single-target

security levels:

* Security 2^64 pre-quantum: k=64.

* Security 2^128 pre-quantum: k=128.

* Security 2^64 post-quantum: k=128.

* Security 2^128 post-quantum: k=256.

(Of course this list measures security with naive operation counts.

Depth limits and various quantum overheads mean that, e.g., k=256

actually has a much higher post-quantum security level.)

Now let's use Response #2 to protect against 2^64-target attacks, saying

that a 2^64-target attack implies a 1-target attack with probability

1/2^64. Here's what this does to k:

* Security 2^64 pre-quantum: k=128. This is 2x larger.

* Security 2^128 pre-quantum: k=192. This is 1.5x larger.

* Security 2^64 post-quantum: k=192. This is 1.5x larger.

* Security 2^128 post-quantum: k=320. This is 1.25x larger.

Sure, 2x sounds significant, but that's for an archaic 2^64 pre-quantum

security level. The gap rapidly shrinks as the target security level

increases. Do you really claim that 1.25x is "overdesigning"?

For typical lattice systems the key size scales as something like

k log k, so the 1.25x in search space turns into 1.30x in key size. This

doesn't noticeably change the picture. The gap can be even smaller when

the best attacks have worse probability dropoffs than Grover, but the

simple analysis already shows that the gap is small.

[ regarding Response #1: ]

> Each feature like this adds complexity and may hurt performance, while

> possibly improving security.

There's a metric missing from this list, namely the complexity of a
> possibly improving security.

comprehensive _review_ of security. One way to see the importance of

this metric is as follows.

Is NIST going to end up standardizing---and are users going to end up

deploying---a post-quantum system that turns out to be devastatingly

insecure, allowing feasible attacks? Quite possibly, yes. My assessment

is that the most likely way for this to happen is as follows:

* as a direct result of design decisions, security review of the

standard will be much more complicated than necessary;

* as a result, the limited human resources available for security

review will run out of time to seriously review everything;

* as a result, serious vulnerabilities will be missed simply because

nobody ever looked at the vulnerable part of the standard.

As an analogy, consider the destruction of OCB2 this year, 14 years

after Rogaway's Asiacrypt 2004 security "proof" and 9 years after ISO

standardization. The Inoue--Minematsu OCB2 attack is simple and fast,

but this doesn't mean it was easy to find: OCB2 is only one of many

cryptographic standards, and it has many interacting components that

needed attack analysis.

Compared to serious security review for OCB2, serious security review

for a typical post-quantum system is another big step up in difficulty,

further increasing the risk of security disasters. The difficulty varies

from one system to another, motivating design changes.

Here are some of the things that happen when we identify the simplicity

of security review as an important goal, and systematically pursue this

goal in designing a KEM for CCA2 security.

We eliminate decryption failures in the underlying PKE. This takes away

a huge, painful, error-prone part of the review process. (My tally,

before eprint 2018/1089 and eprint 2018/1172, said that 25% of the

unscathed encryption submissions eliminate decryption failures.)

What about Dent's hash---plaintext confirmation? Try a security review

with this hash, vs. a security review with an earlier FO variant, and

you'll immediately see how the hash simplifies review:

* The PKE allows the attacker to modify ciphertexts to produce new

ciphertexts that are often valid. This is the starting point for

chosen-ciphertext attacks, and we've seen again and again that

analyzing the consequences is complicated.

* Plaintext confirmation makes it very easy to see that the only way

for an attacker to produce a new valid ciphertext is to already

know the plaintext. The hash also gives a tight ROM proof of

IND-CCA2 from OW-CPA. Security review of OW-CPA is simpler than

security review of IND-CPA.

Of course, evaluators have to check the proof, have to think about

whether there can be better non-ROM attacks, have to think about how an

attacker could use the extra information provided by the hash, etc., but

all of this is much simpler than than having to think through

consequences of valid modified ciphertexts for the underlying PKE.

What about implicit rejection? Implicit rejection makes it very easy to

see that an attacker modifying ciphertexts will obtain random-looking

session keys, whether the modified ciphertexts are valid or not, whether

there's plaintext confirmation or not. This also gives a tight ROM proof

of IND-CCA2 from OW-CPA, even simpler than Dent's proof.

Understanding whether it's best to do plaintext confirmation, or

implicit rejection, or both, requires a closer look at the details. My

current assessment---given the analysis in the tightkem paper, plus the

extra argument that I mentioned regarding side channels---is that

* plaintext confirmation by itself is good,

* implicit rejection by itself is better, and

* doing both implicit rejection and plaintext confirmation is best,

but it's possible that the ranking of these three options will change

with more research.

> The ideal case would be to prove a theorem that says:

> Hardness of underlying problem

> -> Multi-target CPA security of KEM

> -> Multi-target CCA security of KEM+FO

> at least in the [Q]ROM, with as little tightness loss as possible.

to generate such a proof from Response #1. More to the point, we don't

even know how to use Response #1 to guarantee that multi-target attacks

against the final KEM are tightly equivalent to single-target attacks.

More importantly, even if we did have a proof, there would be a much

bigger problem for security reviewers. Cryptographic systems are full of

constructions that haven't been reviewed for multi-target security. When

review does happen, the reviewers produce one attack after another. See,

e.g.,

https://blog.cr.yp.to/20151120-batchattacks.html

https://eprint.iacr.org/2016/360

for various examples of multi-target attacks against NIST cryptographic

standards---attacks faster than all known single-target attacks. The

result of further review isn't going to be confidence in multi-target

security; it will be a continuing list of further multi-target attacks.

A KEM is one tiny piece of an end-user cryptographic system. Response #1

is looking at this tiny piece, and adding a complication that stops some

multi-target attacks against that piece ("stops" meaning that they are

no more effective than single-target attacks). Maybe at some point we'll

have a proof that this stops all possible multi-target attacks against

this tiny piece. But what about the rest of the system, with tons of

other multi-target weaknesses and potential multi-target weaknesses?

As an example, the New Hope submission claims that it is "rather

uncomfortable to have the security of all connections rely on a single

instance of a lattice problem" since the multiple targets might be

broken by a "massive precomputation". It claims that one can "avoid"

this "pitfall" by having each user generate a separate parameter "a".

But the new ideal-lattice attack algorithm from

https://nitaj.users.lmno.cnrs.fr/Caen2018/Alice_Pellet_MaryCaen2018.pdf

speeds up attacks against _every_ target after a single massive lattice

precomputation. The attack focuses on Ideal-SVP, but the supposed

hardness of Ideal-SVP is the main justification for the claimed hardness

of Ring-LWE. I'd guess that the precomputation is quantitatively larger

than a typical attack, but the New Hope argument isn't quantitative. I'm

not saying that the attack precisely disproves the New Hope multi-target

claim; I'm simply saying that the claim isn't proven and could be wrong.

If a multi-target lattice attack is faster than a single-target lattice

attack, then what exactly is gained by the extra hashing in Response #1?

More broadly, when people combine KEMs with many other components to

produce complete end-user cryptographic systems, there will surely be

many different multi-target weaknesses. What's the strategy to remove

all these weaknesses, and how is a security evaluator supposed to gain

confidence in this approach?

Response #2, unlike Response #1, is a complete solution to the problem:

assume that there will be multi-target attacks, and choose the security

level to protect against the maximum possible effect of those attacks.

This eliminates the hard problem of evaluating multi-target security.

The security evaluator can safely focus on single-target attacks.

I understand how Response #1 can feel like progress. I've even advocated

an analogous complication---but that was in a situation where

* it was easy to see how the complication eliminated multi-target

attacks against a complete end-user cryptographic system;

* this turned into a simple, easily reviewable, proof of multi-target

security for the complete system; and

* the target security level was only 2^128 pre-quantum.

This KEM hash is a very different story. It is, at best, addressing a

much smaller part of a system that has many other multi-target

weaknesses. I doubt that anyone has ever constructed a similar system

with a proof of multi-target security, even if a proof is possible for

this small piece. Furthermore, we're talking about post-quantum KEMs

aiming for a much higher security level, so the performance impact of

Response #2 is relatively small.

> NTRU Prime’s evaluation technique is already less conservative than

> other systems’ — they would call its parameter set class II or III

> instead of V. Just saying that you can divide that by T=2^64 and

> still be OK is unconvincing

You're starting from 2^155 pre-quantum security for Streamlined NTRU

Prime 4591^761, dividing by 2^64 for a 2^64-target attack, and saying

that 2^91 is not OK? Or you're doing something similar starting from

2^140 post-quantum security?

The starting points 2^155 and 2^140 are wild underestimates of the cost

of known single-target attacks. Checking the literature shows, as stated

in my undisputed summary from February, that each of these numbers comes

from

* taking a _small piece_ of an attack,

* dividing this piece into _expensive operations_,

* _asymptotically_ counting the number of these operations, and

* replacing o(1) in the asymptotics with 0.

Note that replacing o(1) with 0 can produce _overestimates_; there's no

guarantee that the o(1) is positive. The reason we think that the

overall effect of the above four steps is a wild underestimate is that

some people have done much more work for much more serious estimates of

attack costs, consistently finding much higher attack costs than the

underestimates indicate (although there are still many unresolved

questions about the details).

This also means that someone doing a security review has to check the

serious estimates (and worry about the unresolved questions). The wild

underestimates are certainly much simpler _formulas_, but this doesn't

mean that they simplify _security evaluation_.

Of course underestimating security levels has the general effect of

increasing security parameters. This could protect users against future

attack speedups. You call this "conservative", suggesting that it's a

good approach. I have two questions and a comment:

* How do you think this approach is different from "overdesigning",

which a moment ago you called "unappealing"? What's the principle

for deciding what's "overdesigning" and what's "conservative"?

* If you think that using a wild underestimate is good, do you think

that using an even wilder underestimate is better? What's the

principle for deciding when to stop?

* There's a different approach, which I'll call the "accurate

conservative" approach, that (1) clearly produces better results

and (2) doesn't use these underestimates.

The accurate conservative approach is as follows:

* Accurately evaluate the costs of the attacks.

* On the basis of this evaluation, choose the highest-security

parameters that users can afford.

The accurate conservative approach, by definition, is maximizing the

users' security against the best attacks considered.

For comparison, you seem to be advocating what I'll call the "inaccurate

conservative" approach, which replaces the accurate evaluations with

something less accurate---in particular, simplified formulas that turn

out to be underestimates. This replacement generally _damages_ security,

as explained in Appendix B.1.2 of

https://cr.yp.to/papers.html#nonuniform

(cited in the NTRU Prime submission under "Warning: underestimates are

dangerous"), and it can't improve security.

The previous paragraph is talking about the damage done to security

against _current_ attacks. The best bet we have is that this will also

damage security against _future_ attacks. Of course it's possible that

advances in attacks will magically turn out to be better correlated with

the current underestimates than with accurate evaluations of today's

attack costs---but what's the argument for betting on this possibility?

"I have an argument: underestimate U of the security level is smaller

than accurate estimate A, and future attacks will also be faster than

A!" ---To see that this argument is nonsense, consider a more radical

underestimate such as A/3. (If A/3 isn't below U, replace it with some

even smaller function of A.) The argument claims that replacing A with U

is an improvement, and that replacing U with A/3 is an improvement---but

A/3 gives exactly the same parameter choices as A, so the argument has

directly contradicted itself.

There are cases where the literature looks more closely at the details

of attacks and makes well-informed assessments of risks of various types

of speedups. Superficially, this might sound like recommending

underestimates---but there's an essential difference in methodology.

Saying "we want a smaller estimate because smaller is good", or "we want

a simpler formula because simpler is good", is very different from

saying "we see a risk of the following type of attack improvement and we

want to accurately evaluate the impact". Often the conclusion is that

the impact is so devastating that proper risk management requires

avoiding some types of cryptosystems. Consider, e.g., the statement

It is anticipated that these techniques can be used to produce

collisions for MD5 and perhaps also for RIPEMD

in https://link.springer.com/content/pdf/10.1007%2F3-540-60865-6_44.pdf

twenty years ago.

There are also cases where we can't reasonably evaluate security against

some types of attacks since they're beyond the attack model that the

community normally considers. Timing attacks were like this twenty years

ago, with very high risks that people claiming timing-attack security

were disastrously wrong:

Table lookups: This instruction is not susceptible to a timing attack

...

--- Daemen and Rijmen, Resistance against implementation attacks: a

comparative study of the AES proposals, 1999

https://web.archive.org/web/20000816072451/http://csrc.nist.gov:80/encryption/aes/round1/conf2/papers/daemen.pdf

Table lookup: not vulnerable to timing attacks.

--- NIST, Report on the development of the Advanced Encryption

Standard (AES), 2001

Multi-target attacks are like this today. They're a fun research topic,

and maybe in twenty years we'll know how to stop multi-target attacks;

but most components of today's post-quantum cryptosystems were designed

by people who weren't even thinking about, let alone trying to stop,

multi-target attacks. Response #2 recognizes this, gives us confidence

that we're protected, and doesn't cost much in the context of

high-security post-quantum cryptography.

---Dan

Dec 8, 2018, 3:24:15 PM12/8/18

to D. J. Bernstein, pqc-...@list.nist.gov

Hi Dan,

I’m sorry that I don’t have time to address every point of your essay.

Below I have picked out a few points that seem most relevant to the

discussion of domain separated hashes in the CCA2 conversion.

> On Dec 7, 2018, at 8:17 PM, D. J. Bernstein <d...@cr.yp.to> wrote:

>

> Focusing here on comparing the following two responses to multi-target

> attacks:

>

> * Response #1: include a key prefix in the hash used in the CCA2

> conversion, a long enough prefix that prefixes of user keys don't

> collide.

>

> * Response #2: "aim for a very high single-target security level, and

> then rely on the fact that T-target attacks gain at most a factor

> T".

> […]

Response #2a: Nobody actually needs Class V security as NIST

recommends it (2^256 classical work even to find one target of many).

The difference between 256 bits and any realistic attack is enough to

cover any likely multi-target attack.

This is how I would categorize NTRU Prime’s multi-target defense.

You describe this as if “aim[ing] for a very high single-target security

level” is exclusive of Response #1, or somehow obviates it. But this

isn’t true. The security margin between 2^256 and reality is there to

counter many possible threats, and multi-target attacks are only one

of them. You also have to consider improvements in lattice reduction,

lack of tightness in the FO transform (almost universal in the QROM),

lack of tightness in higher-level protocols using the KEM, and other

threats. As a result, almost every submission “aim[s] for a very high

single-target security level”.

Response #2b: If we want 2^256 multi-target security, then aim for

2^320 single-target. I didn’t notice any schemes doing this — some

made higher security claims or estimates, but I didn’t see anyone

claiming this as their multi-target defense.

And yes, I would call this “overdesigning”. Overdesigning can be a

virtue, and I see that is the purpose of Class V security in the first place:

it’s a hedge against unknown (expected or unexpected) attacks.

But adding 64 extra bits of security against lattice attacks, specifically to

counter an attack that probably doesn’t apply in full, seems unappealing

to me, because it comes at a significant performance cost.

> [ regarding Response #1: ]

>> Each feature like this adds complexity and may hurt performance, while

>> possibly improving security.

>

> There's a metric missing from this list, namely the complexity of a

> comprehensive _review_ of security. One way to see the importance of

> this metric is as follows.

Response #1 is domain-separating the hash function. This is a

long-standing security technique and has formal, informal and

practical benefits:

* Formally, in other contexts, it’s often required in security proofs.

Formally, it has a clear benefit in security proofs for KEMs.

* Informally, it means that if an attacker is computing lots of hashes, he

has chosen one single target per hash. This reduces the set of attacks

that must be considered.

* Practically, it is part of several systems’ defense against CCA attacks,

and the lack of this measure contributed to borderline-feasible CCA

attacks on other systems, all in this very standardization process.

I am aware that NTRU Prime isn’t one of those systems.

Furthermore, domain separation is really simple. It costs no bandwidth,

little to no extra memory, very few extra cycles (tens to thousands

depending on whether it pushes you into a new hash block), very little

code, no extra DPA concern, and little complexity in the spec. And it

clearly doesn’t harm security.

As a result, I simply do not agree that domain separation is a mere

“complication” — on the contrary, I think it helps to focus analysis.

I recognize that full, tight, end-to-end proofs of the effectiveness of

domain separation for KEMs (or all systems that contain KEMs? Is

that the requirement??) are not currently known, and may never be

known, but I expect that domain separation will figure in proofs of

security for components of the system (or perhaps the entire KEM).

> I understand how Response #1 can feel like progress. I've even advocated

> an analogous complication---but that was in a situation where

>

> * it was easy to see how the complication eliminated multi-target

> attacks against a complete end-user cryptographic system;

> * this turned into a simple, easily reviewable, proof of multi-target

> security for the complete system; and

> * the target security level was only 2^128 pre-quantum.

Are you referring to Ed25519 here? You advocated domain-separating

the hash in 2011 by saying:

“…; the use of [the public key] A [in the hash] is an inexpensive way to

alleviate concerns that several public keys could be attacked

simultaneously;”.

This sounds an awful lot like my informal justification for hashing the

seed above, rather than foresight that it would “[turn] into a simple,

> This KEM hash is a very different story. It is, at best, addressing a

> much smaller part of a system that has many other multi-target

> weaknesses. I doubt that anyone has ever constructed a similar system

> with a proof of multi-target security, even if a proof is possible for

> this small piece. Furthermore, we're talking about post-quantum KEMs

> aiming for a much higher security level, so the performance impact of

> Response #2 is relatively small.

I don’t understand this argument either. It’s a smaller ratio than at

lower security levels, but surely it has a higher absolute performance

impact?

>> NTRU Prime’s evaluation technique is already less conservative than

>> other systems’ — they would call its parameter set class II or III

>> instead of V. Just saying that you can divide that by T=2^64 and

>> still be OK is unconvincing

>

> Let me see if I understand, quantitatively, what you're claiming here.

> You're starting from 2^155 pre-quantum security for Streamlined NTRU

> Prime 4591^761, dividing by 2^64 for a 2^64-target attack, and saying

> that 2^91 is not OK? Or you're doing something similar starting from

> 2^140 post-quantum security?

Quantitatively, the requested specification of Class V is 2^256 effort to

break one key or ciphertext out of many.

Starting from your estimate of 2^256 single-target security, which is based

on your best estimates of real performance of real attacks currently known

today, you cannot simply subtract 64 bits and claim that you have met

the specification. Instead, you can argue that the specification is silly,

and have done, but this doesn’t help an auditor who wishes to determine

whether you’ve met that specification.

Furthermore, let’s consider that for realistic systems, Class III security

(multi-target, as specified) is certainly enough. As I understand it this

is really what you’re arguing here anyway when you say that Class V

multi-target is silly. If an auditor wishes to determine that you’ve met Class

III, then he or she will need to either:

* Evaluate the possibility of multi-target attacks -- the evaluation will likely

be easier if the hash is domain-separated; or

* (Response #2) Check that your claimed Class V instance actually

achieves its claimed 2^256 single-target security, and therefore reaches

Class III multi-target.

Response #2 is easier for an auditor to accept if your system includes

some security margin, instead of being designed to hit exactly 256-bit

security against currently-known attacks. It is therefore more consistent

with a system that has a security margin with respect to its claims, than

with one that is designed without a security margin.

NTRU Prime is different from other proposals in that it includes only

one parameter set (per problem type), with little margin between the

best estimate for the security of that parameter set and Class V. I’m

not arguing that this is a bad decision, only that it doesn’t make the

multi-target evaluation easier. It also seems inconsistent with your

strategy to “accurately evaluate the costs of the attacks” and then

“choose the highest-security parameters that users can afford”, since

different users can afford different parameters.

[…]

argument above clarifies this. That is a completely separate topic of

conversation.

> * How do you think this approach is different from "overdesigning",

> which a moment ago you called "unappealing"? What's the principle

> for deciding what's "overdesigning" and what's "conservative”?

It is overdesigning. Overdesigning by an additional 64 bits, just to avoid

including any multi-target defenses, is what’s unappealing.

Cheers,

— Mike

I’m sorry that I don’t have time to address every point of your essay.

Below I have picked out a few points that seem most relevant to the

discussion of domain separated hashes in the CCA2 conversion.

> On Dec 7, 2018, at 8:17 PM, D. J. Bernstein <d...@cr.yp.to> wrote:

>

> Focusing here on comparing the following two responses to multi-target

> attacks:

>

> * Response #1: include a key prefix in the hash used in the CCA2

> conversion, a long enough prefix that prefixes of user keys don't

> collide.

>

> * Response #2: "aim for a very high single-target security level, and

> then rely on the fact that T-target attacks gain at most a factor

> T".

> Mike Hamburg writes, regarding Response #2:

>> overdesigning this way is unappealing

> […]
>> overdesigning this way is unappealing

> Sure, 2x sounds significant, but that's for an archaic 2^64 pre-quantum

> security level. The gap rapidly shrinks as the target security level

> increases. Do you really claim that 1.25x is "overdesigning”?

> [later in your reply]
> security level. The gap rapidly shrinks as the target security level

> increases. Do you really claim that 1.25x is "overdesigning”?

> Response #2, unlike Response #1, is a complete solution to the problem:

> assume that there will be multi-target attacks, and choose the security

> level to protect against the maximum possible effect of those attacks.

> This eliminates the hard problem of evaluating multi-target security.

> The security evaluator can safely focus on single-target attacks.

I would further categorize response #2 into two categories.
> assume that there will be multi-target attacks, and choose the security

> level to protect against the maximum possible effect of those attacks.

> This eliminates the hard problem of evaluating multi-target security.

> The security evaluator can safely focus on single-target attacks.

Response #2a: Nobody actually needs Class V security as NIST

recommends it (2^256 classical work even to find one target of many).

The difference between 256 bits and any realistic attack is enough to

cover any likely multi-target attack.

This is how I would categorize NTRU Prime’s multi-target defense.

You describe this as if “aim[ing] for a very high single-target security

level” is exclusive of Response #1, or somehow obviates it. But this

isn’t true. The security margin between 2^256 and reality is there to

counter many possible threats, and multi-target attacks are only one

of them. You also have to consider improvements in lattice reduction,

lack of tightness in the FO transform (almost universal in the QROM),

lack of tightness in higher-level protocols using the KEM, and other

threats. As a result, almost every submission “aim[s] for a very high

single-target security level”.

Response #2b: If we want 2^256 multi-target security, then aim for

2^320 single-target. I didn’t notice any schemes doing this — some

made higher security claims or estimates, but I didn’t see anyone

claiming this as their multi-target defense.

And yes, I would call this “overdesigning”. Overdesigning can be a

virtue, and I see that is the purpose of Class V security in the first place:

it’s a hedge against unknown (expected or unexpected) attacks.

But adding 64 extra bits of security against lattice attacks, specifically to

counter an attack that probably doesn’t apply in full, seems unappealing

to me, because it comes at a significant performance cost.

> [ regarding Response #1: ]

>> Each feature like this adds complexity and may hurt performance, while

>> possibly improving security.

>

> There's a metric missing from this list, namely the complexity of a

> comprehensive _review_ of security. One way to see the importance of

> this metric is as follows.

long-standing security technique and has formal, informal and

practical benefits:

* Formally, in other contexts, it’s often required in security proofs.

Formally, it has a clear benefit in security proofs for KEMs.

* Informally, it means that if an attacker is computing lots of hashes, he

has chosen one single target per hash. This reduces the set of attacks

that must be considered.

* Practically, it is part of several systems’ defense against CCA attacks,

and the lack of this measure contributed to borderline-feasible CCA

attacks on other systems, all in this very standardization process.

I am aware that NTRU Prime isn’t one of those systems.

Furthermore, domain separation is really simple. It costs no bandwidth,

little to no extra memory, very few extra cycles (tens to thousands

depending on whether it pushes you into a new hash block), very little

code, no extra DPA concern, and little complexity in the spec. And it

clearly doesn’t harm security.

As a result, I simply do not agree that domain separation is a mere

“complication” — on the contrary, I think it helps to focus analysis.

I recognize that full, tight, end-to-end proofs of the effectiveness of

domain separation for KEMs (or all systems that contain KEMs? Is

that the requirement??) are not currently known, and may never be

known, but I expect that domain separation will figure in proofs of

security for components of the system (or perhaps the entire KEM).

> I understand how Response #1 can feel like progress. I've even advocated

> an analogous complication---but that was in a situation where

>

> * it was easy to see how the complication eliminated multi-target

> attacks against a complete end-user cryptographic system;

> * this turned into a simple, easily reviewable, proof of multi-target

> security for the complete system; and

> * the target security level was only 2^128 pre-quantum.

the hash in 2011 by saying:

“…; the use of [the public key] A [in the hash] is an inexpensive way to

alleviate concerns that several public keys could be attacked

simultaneously;”.

This sounds an awful lot like my informal justification for hashing the

seed above, rather than foresight that it would “[turn] into a simple,

easily reviewable, proof of multi-target security for the complete

system”.
> This KEM hash is a very different story. It is, at best, addressing a

> much smaller part of a system that has many other multi-target

> weaknesses. I doubt that anyone has ever constructed a similar system

> with a proof of multi-target security, even if a proof is possible for

> this small piece. Furthermore, we're talking about post-quantum KEMs

> aiming for a much higher security level, so the performance impact of

> Response #2 is relatively small.

lower security levels, but surely it has a higher absolute performance

impact?

>> NTRU Prime’s evaluation technique is already less conservative than

>> other systems’ — they would call its parameter set class II or III

>> instead of V. Just saying that you can divide that by T=2^64 and

>> still be OK is unconvincing

>

> Let me see if I understand, quantitatively, what you're claiming here.

> You're starting from 2^155 pre-quantum security for Streamlined NTRU

> Prime 4591^761, dividing by 2^64 for a 2^64-target attack, and saying

> that 2^91 is not OK? Or you're doing something similar starting from

> 2^140 post-quantum security?

break one key or ciphertext out of many.

Starting from your estimate of 2^256 single-target security, which is based

on your best estimates of real performance of real attacks currently known

today, you cannot simply subtract 64 bits and claim that you have met

the specification. Instead, you can argue that the specification is silly,

and have done, but this doesn’t help an auditor who wishes to determine

whether you’ve met that specification.

Furthermore, let’s consider that for realistic systems, Class III security

(multi-target, as specified) is certainly enough. As I understand it this

is really what you’re arguing here anyway when you say that Class V

multi-target is silly. If an auditor wishes to determine that you’ve met Class

III, then he or she will need to either:

* Evaluate the possibility of multi-target attacks -- the evaluation will likely

be easier if the hash is domain-separated; or

* (Response #2) Check that your claimed Class V instance actually

achieves its claimed 2^256 single-target security, and therefore reaches

Class III multi-target.

Response #2 is easier for an auditor to accept if your system includes

some security margin, instead of being designed to hit exactly 256-bit

security against currently-known attacks. It is therefore more consistent

with a system that has a security margin with respect to its claims, than

with one that is designed without a security margin.

NTRU Prime is different from other proposals in that it includes only

one parameter set (per problem type), with little margin between the

best estimate for the security of that parameter set and Class V. I’m

not arguing that this is a bad decision, only that it doesn’t make the

multi-target evaluation easier. It also seems inconsistent with your

strategy to “accurately evaluate the costs of the attacks” and then

“choose the highest-security parameters that users can afford”, since

different users can afford different parameters.

[…]

> Of course underestimating security levels has the general effect of

> increasing security parameters. This could protect users against future

> attack speedups. You call this "conservative", suggesting that it's a

> good approach. I have two questions and a comment:

I am not advocating for underestimating security, and I hope that my
> increasing security parameters. This could protect users against future

> attack speedups. You call this "conservative", suggesting that it's a

> good approach. I have two questions and a comment:

argument above clarifies this. That is a completely separate topic of

conversation.

> * How do you think this approach is different from "overdesigning",

> which a moment ago you called "unappealing"? What's the principle

> for deciding what's "overdesigning" and what's "conservative”?

including any multi-target defenses, is what’s unappealing.

Cheers,

— Mike

Dec 14, 2018, 9:46:10 PM12/14/18

to pqc-...@list.nist.gov

Changing subject line to emphasize the focus on multi-target security.

Here are four examples of how hard it is to

(1) prove,

(2) evaluate claims of, and

(3) actually achieve

tight multi-target security (i.e., multi-target security as high as

single-target security) for deployable encryption systems. For

simplicity I'll focus on pre-quantum security levels here, but similar

comments apply to post-quantum security levels.

Example 1: Dave is a designer pulling together components designed for

2^256 single-user security. The user data is encrypted with AES-256-CTR.

The 256-bit session keys are produced by a lattice KEM designed for

2^256 security. Of course more components are needed for authentication

etc., but these details will be enough to explain the example.

Dave then learns about 2^64-target attacks that "break" the system in

time "only" 2^192. Somehow Dave gets the crazy idea that this number is

a problem. The craziness of this idea isn't my main point here; let's

assume that NIST is requiring 2^256 security against 2^64-target attacks

(even though NIST's current standards are full of failures to achieve

this), and that the security reviewer wants to check for this.

One response---making the security reviewer's job easy---is to simply

multiply each internal 256 by an extra 1.25, scaling up the internal

components appropriately, leaving a 2^64 buffer beyond 2^256 security.

It's hard to find users who will care about the performance difference.

But Dave doesn't do this. Dave has heard that it's "best practice" to

"randomize" all the internal components, with the goal of making

multi-target attacks as hard as single-target attacks:

* Dave sticks the public key, and everything else he can think of,

into the hash that generates the session key.

* Dave chooses a random 96-bit IV for AES-256-CTR.

* Et cetera.

But do these complications actually achieve the goal of tight

multi-target security for Dave's encryption system? Or is this yet

another entry in the long line of failed cryptographic design ideas?

We could gain some confidence from seeing experienced cryptanalysts

seriously trying and failing to break Dave's encryption system. But

nobody is even trying. Most of the cryptanalysts have their hands full

finding security problems in simpler, smaller systems and components.

Occasionally cryptanalysts tackle pieces of HTTPS because it's so widely

deployed, but this doesn't mean they're going to look at Dave's system.

So Dave hopes to have a _proof_ of tight multi-target security. But he

doesn't have a proof. Here's one of many reasons why.

Often users encrypt structured plaintexts: e.g., ASCII plaintexts where

each byte has top bit 0. Assume that AES-256 has W>0 incredibly weak

keys that, in counter mode starting from any IV, turn these structured

plaintexts into a quickly detectable ciphertext pattern---e.g., let's

say the top output bit in each block is correlated with a counter bit.

I'd be surprised by such a weakness, but we certainly don't have a proof

that no such weakness exists. Also assume that this is the only weakness

in AES-256.

For single-user security, this weakness produces a fast attack with

success probability about W/2^256. For U-user security, the same

weakness produces a fast attack with probability about UW/2^256. That's

the worst possible multi-user degradation factor. Contrary to Dave's

intuition, randomizing the IV does _not_ stop this attack.

Of course, this also means that the multi-user security of Dave's

encryption system as a whole is limited to 2^256/UW, so Dave _isn't_

obtaining 2^256 security against 2^64-target attacks. Also, the goal of

having multi-target security as high as single-target security hasn't

been achieved, unless Dave's system is "broken" in some other way that

limits single-target security to 2^256/UW, contrary to the original

objective of 2^256 security. Any supposed proof of tight multi-target

security for Dave's encryption system as a whole is going to have to

deal with this proof obstacle (and much more---see below).

As part of this post-quantum standardization project, NIST recommended

AES-GCM with a randomized nonce---even though this randomization is not

required in the relevant NIST standards. NIST claimed incorrectly that

eprint 2017/435 proves tight multi-target security for this form of

AES-GCM. I pointed out the above obstacle to a proof.

More than a year later, we again seem to be faced with a claim that

Dave's approach "has a clear benefit in security proofs" for encryption.

This claim is not based on any explanation of how a tight multi-target

security proof is supposed to avoid the above obstacle.

Example 2: For ECC, tight random self-reductions give a proof that

breaking _one of many_ keys is as difficult as breaking the first key;

but there are still big speedups for breaking _more_ of the keys. Does

this mean that the one-of-many security notions are asking the wrong

question? Similarly, even if there's just one key, should we be looking

at the cost of breaking one of many ciphertexts, or the cost of breaking

more of the ciphertexts? Real-world attackers of course collect all the

keys they can, _and_ all the ciphertexts they can.

Example 3: For more conventional multiplicative-group discrete logs, the

occasional work on batch attacks keeps producing changes in the

_exponents_ of attacks, and nobody in the area would claim to be

confident regarding the best possible exponents, never mind the more

sophisticated question of concrete costs. Differences in security levels

of different primes are even more poorly understood.

Example 4: For lattices, the research community can't even manage to pin

down single-target security levels, never mind the more sophisticated

questions of the security levels of multiple ciphertexts and multiple

keys. The worst-case-to-average-case reductions are much weaker than the

ECC worst-case-to-average-case reductions.

There's much more to say about these examples, and there are many other

examples of multi-target security being below single-target security.

See my blog post https://blog.cr.yp.to/20151120-batchattacks.html for an

introduction to this difficult research area, including several examples

of serious errors in the literature.

Mike Hamburg writes:

> Informally, it means that if an attacker is computing lots of

> hashes, he has chosen one single target per hash. This reduces the

> set of attacks that must be considered.

This is barely clipping the edge of a vast field of multi-target

attacks. Response #2---"aim for a very high single-target security

> the lack of this measure contributed to borderline-feasible CCA

> attacks on other systems, all in this very standardization process.

You're talking about attacks against systems that used nonzero

decryption-failure rates and that never justified the claim that those

rates are safe. This is an extremely dangerous practice that can easily

cause any amount of damage to security. It makes no sense to blame this

on multi-target options: the maximum possible multi-target protection

can't turn this handling of decryption failures into something safe.

Here are two ways to handle decryption failures safely:

(A) Eliminate them. Require an easy proof (unlike, say, the proof

originally claimed by LIMA). Check the proof carefully.

(B) Alternatively, if you really need slightly smaller ciphertexts:

* Figure out how to convincingly calculate very small nonzero

failure rates, taking account of the difficulties shown in the

literature (most recently https://eprint.iacr.org/2018/1172).

* Analyze attacks exploiting nonzero failure rates (most recently

https://eprint.iacr.org/2018/1089), and make sure your rates

are so tiny that these attacks are at an acceptable security

level. Note that the evaluation of an acceptable security level

depends on the multi-target issues that we're discussing.

* Make sure that the rates are small enough for the known ROM

theorems to provide an acceptable security level. Do _not_

tolerate speculation that the theorems will be quantitatively

improved. Check proofs carefully---which is not easy for these

theorems. Consider, e.g., https://cr.yp.to/papers.html#tightkem,

which gives counterexamples to two of the claimed theorems;

what exactly are the proof errors that accounted for this, and

how do these interact with decryption failures?

Evidently B is much more painful for the security reviewer than A---and

then imagine how much worse B becomes if you also have to think through

multi-target attacks at every step!

> You describe this as if “aim[ing] for a very high single-target security

> level” is exclusive of Response #1, or somehow obviates it.

You started by describing Response #1 as an "ease-of-analysis feature".

This is not correct. Response #2 makes security evaluation much easier.

It's certainly possible to do both Response #1 and Response #2. I never

claimed or suggested otherwise. I also don't see how this is responsive

to anything I'm saying. The combination doesn't make the reviewer's job

easier than simply doing Response #2.

Could Response #1 improve security compared to doing nothing? Perhaps.

Could the #1+#2 combination improve security compared to #2? Perhaps.

But the evidence for these alleged security improvements is flimsy, and

leaves any skeptical security reviewer with a torrent of questions about

multi-target security that are not going to be answered in the timeframe

of this standardization project.

> The security margin between 2^256 and reality is there to

> counter many possible threats, and multi-target attacks are only one

> of them. You also have to consider improvements in lattice reduction,

> lack of tightness in the FO transform (almost universal in the QROM),

> lack of tightness in higher-level protocols using the KEM, and other

> threats.

It's important to consider each threat from the perspective of the time

spent for thorough security review. I explained in my previous message

how this is connected to the risk of NIST standardizing something that's

devastatingly insecure.

For example, even if the reviewer sees enough evidence of cryptanalysts

settling on an OW-CPA security level, how is the reviewer supposed to

gain confidence in the QROM KEM security level? The lack of tightness

for QROM proofs is quantitatively so severe that the proofs say nothing

about typical parameter choices. There are certainly important cases

(e.g., preimages) where QROM attacks are much faster than ROM attacks,

so the reviewer has to worry that the tight ROM proofs are blind to much

faster QROM attacks. We really need cryptanalysts studying the QROM KEM

security, but there are very few experts on quantum cryptanalysis.

Multi-target threats are special because we have a simple, cheap,

totally convincing way to get all of their complications out of the

security-analysis picture: simply ask the reviewer to check for a very

high single-target security level.

You're arguing that we need a buffer of B bits to protect against other

threats. You want B to be large given the number of different risks. So

figure out which B gives you enough confidence about all these risks,

and then turn it into B+64 to _confidently_ protect against multi-target

attacks too. The performance gap becomes less and less noticeable as B

increases.

> I would further categorize response #2 into two categories.

> Response #2a: Nobody actually needs Class V security as NIST

> recommends it (2^256 classical work even to find one target of many).

Which recommendation are you referring to?

As noted above, NIST's standards are full of failures to achieve this

security level. There's very little above 2^256 single-target security,

and there are tons of multi-target attack vectors, some of which are

clearly exploitable and many more of which haven't been studied.

In NIST's final call for post-quantum proposals, the "security strength

categories" are defined with respect to the "relevant security

definition", which is "IND-CCA2 security" for the KEMs we're talking

about. IND-CCA2 security is a single-target security notion.

Some readers will understand you to be saying that NIST recommends

category V. I'm not aware of any NIST statement to this effect. NIST has

repeatedly used phrases such as "probably overkill" in referring to

category V.

NIST lists tight multi-target security in a separate section of

"additional security properties" that are claimed to be "desirable". My

understanding is that their argument for this is as follows:

* Starting from single-target security 2^128, multi-target attack

speedups are a problem.

* We can't throw this security level away, so we have to complicate

everything to get rid of every multi-target speedup we find.

* Starting from single-target security 2^256, we need the same

complications, since otherwise people will be confused about why

the 256-bit standards are simpler than the 128-bit standards.

Without bothering to evaluate the general merits of this argument, let's

simply look at what the argument means for a post-quantum proposal that

starts from a higher security level:

* First step: Sure, there could be U-target speedups, perhaps even as

large as a factor U.

* Second step: Um, the proposal _did_ throw the low security level

away. Is there a problem here?

* Third step: The proposal doesn't have the complications, so there's

nothing to be confused about.

This sort of proposal is just fine for quantitative security, and is a

big win for the ease of security analysis.

> This sounds an awful lot like my informal justification for hashing

> the seed above

There's a superficial level at which every let's-hash-another-input

argument sounds alike. However, hashing every input that people think of

creates bloated standards and _definitely_ interferes with security

review. One has to think more carefully about the details to see that

some of the inputs are useful and that most of them are not.

---Dan

Here are four examples of how hard it is to

(1) prove,

(2) evaluate claims of, and

(3) actually achieve

tight multi-target security (i.e., multi-target security as high as

single-target security) for deployable encryption systems. For

simplicity I'll focus on pre-quantum security levels here, but similar

comments apply to post-quantum security levels.

Example 1: Dave is a designer pulling together components designed for

2^256 single-user security. The user data is encrypted with AES-256-CTR.

The 256-bit session keys are produced by a lattice KEM designed for

2^256 security. Of course more components are needed for authentication

etc., but these details will be enough to explain the example.

Dave then learns about 2^64-target attacks that "break" the system in

time "only" 2^192. Somehow Dave gets the crazy idea that this number is

a problem. The craziness of this idea isn't my main point here; let's

assume that NIST is requiring 2^256 security against 2^64-target attacks

(even though NIST's current standards are full of failures to achieve

this), and that the security reviewer wants to check for this.

One response---making the security reviewer's job easy---is to simply

multiply each internal 256 by an extra 1.25, scaling up the internal

components appropriately, leaving a 2^64 buffer beyond 2^256 security.

It's hard to find users who will care about the performance difference.

But Dave doesn't do this. Dave has heard that it's "best practice" to

"randomize" all the internal components, with the goal of making

multi-target attacks as hard as single-target attacks:

* Dave sticks the public key, and everything else he can think of,

into the hash that generates the session key.

* Dave chooses a random 96-bit IV for AES-256-CTR.

* Et cetera.

But do these complications actually achieve the goal of tight

multi-target security for Dave's encryption system? Or is this yet

another entry in the long line of failed cryptographic design ideas?

We could gain some confidence from seeing experienced cryptanalysts

seriously trying and failing to break Dave's encryption system. But

nobody is even trying. Most of the cryptanalysts have their hands full

finding security problems in simpler, smaller systems and components.

Occasionally cryptanalysts tackle pieces of HTTPS because it's so widely

deployed, but this doesn't mean they're going to look at Dave's system.

So Dave hopes to have a _proof_ of tight multi-target security. But he

doesn't have a proof. Here's one of many reasons why.

Often users encrypt structured plaintexts: e.g., ASCII plaintexts where

each byte has top bit 0. Assume that AES-256 has W>0 incredibly weak

keys that, in counter mode starting from any IV, turn these structured

plaintexts into a quickly detectable ciphertext pattern---e.g., let's

say the top output bit in each block is correlated with a counter bit.

I'd be surprised by such a weakness, but we certainly don't have a proof

that no such weakness exists. Also assume that this is the only weakness

in AES-256.

For single-user security, this weakness produces a fast attack with

success probability about W/2^256. For U-user security, the same

weakness produces a fast attack with probability about UW/2^256. That's

the worst possible multi-user degradation factor. Contrary to Dave's

intuition, randomizing the IV does _not_ stop this attack.

Of course, this also means that the multi-user security of Dave's

encryption system as a whole is limited to 2^256/UW, so Dave _isn't_

obtaining 2^256 security against 2^64-target attacks. Also, the goal of

having multi-target security as high as single-target security hasn't

been achieved, unless Dave's system is "broken" in some other way that

limits single-target security to 2^256/UW, contrary to the original

objective of 2^256 security. Any supposed proof of tight multi-target

security for Dave's encryption system as a whole is going to have to

deal with this proof obstacle (and much more---see below).

As part of this post-quantum standardization project, NIST recommended

AES-GCM with a randomized nonce---even though this randomization is not

required in the relevant NIST standards. NIST claimed incorrectly that

eprint 2017/435 proves tight multi-target security for this form of

AES-GCM. I pointed out the above obstacle to a proof.

More than a year later, we again seem to be faced with a claim that

Dave's approach "has a clear benefit in security proofs" for encryption.

This claim is not based on any explanation of how a tight multi-target

security proof is supposed to avoid the above obstacle.

Example 2: For ECC, tight random self-reductions give a proof that

breaking _one of many_ keys is as difficult as breaking the first key;

but there are still big speedups for breaking _more_ of the keys. Does

this mean that the one-of-many security notions are asking the wrong

question? Similarly, even if there's just one key, should we be looking

at the cost of breaking one of many ciphertexts, or the cost of breaking

more of the ciphertexts? Real-world attackers of course collect all the

keys they can, _and_ all the ciphertexts they can.

Example 3: For more conventional multiplicative-group discrete logs, the

occasional work on batch attacks keeps producing changes in the

_exponents_ of attacks, and nobody in the area would claim to be

confident regarding the best possible exponents, never mind the more

sophisticated question of concrete costs. Differences in security levels

of different primes are even more poorly understood.

Example 4: For lattices, the research community can't even manage to pin

down single-target security levels, never mind the more sophisticated

questions of the security levels of multiple ciphertexts and multiple

keys. The worst-case-to-average-case reductions are much weaker than the

ECC worst-case-to-average-case reductions.

There's much more to say about these examples, and there are many other

examples of multi-target security being below single-target security.

See my blog post https://blog.cr.yp.to/20151120-batchattacks.html for an

introduction to this difficult research area, including several examples

of serious errors in the literature.

Mike Hamburg writes:

> Informally, it means that if an attacker is computing lots of

> hashes, he has chosen one single target per hash. This reduces the

> set of attacks that must be considered.

attacks. Response #2---"aim for a very high single-target security

level, and then rely on the fact that T-target attacks gain at most a

factor T"---allows the reviewer to skip the entire field.
> the lack of this measure contributed to borderline-feasible CCA

> attacks on other systems, all in this very standardization process.

decryption-failure rates and that never justified the claim that those

rates are safe. This is an extremely dangerous practice that can easily

cause any amount of damage to security. It makes no sense to blame this

on multi-target options: the maximum possible multi-target protection

can't turn this handling of decryption failures into something safe.

Here are two ways to handle decryption failures safely:

(A) Eliminate them. Require an easy proof (unlike, say, the proof

originally claimed by LIMA). Check the proof carefully.

(B) Alternatively, if you really need slightly smaller ciphertexts:

* Figure out how to convincingly calculate very small nonzero

failure rates, taking account of the difficulties shown in the

literature (most recently https://eprint.iacr.org/2018/1172).

* Analyze attacks exploiting nonzero failure rates (most recently

https://eprint.iacr.org/2018/1089), and make sure your rates

are so tiny that these attacks are at an acceptable security

level. Note that the evaluation of an acceptable security level

depends on the multi-target issues that we're discussing.

* Make sure that the rates are small enough for the known ROM

theorems to provide an acceptable security level. Do _not_

tolerate speculation that the theorems will be quantitatively

improved. Check proofs carefully---which is not easy for these

theorems. Consider, e.g., https://cr.yp.to/papers.html#tightkem,

which gives counterexamples to two of the claimed theorems;

what exactly are the proof errors that accounted for this, and

how do these interact with decryption failures?

Evidently B is much more painful for the security reviewer than A---and

then imagine how much worse B becomes if you also have to think through

multi-target attacks at every step!

> You describe this as if “aim[ing] for a very high single-target security

> level” is exclusive of Response #1, or somehow obviates it.

This is not correct. Response #2 makes security evaluation much easier.

It's certainly possible to do both Response #1 and Response #2. I never

claimed or suggested otherwise. I also don't see how this is responsive

to anything I'm saying. The combination doesn't make the reviewer's job

easier than simply doing Response #2.

Could Response #1 improve security compared to doing nothing? Perhaps.

Could the #1+#2 combination improve security compared to #2? Perhaps.

But the evidence for these alleged security improvements is flimsy, and

leaves any skeptical security reviewer with a torrent of questions about

multi-target security that are not going to be answered in the timeframe

of this standardization project.

> The security margin between 2^256 and reality is there to

> counter many possible threats, and multi-target attacks are only one

> of them. You also have to consider improvements in lattice reduction,

> lack of tightness in the FO transform (almost universal in the QROM),

> lack of tightness in higher-level protocols using the KEM, and other

> threats.

spent for thorough security review. I explained in my previous message

how this is connected to the risk of NIST standardizing something that's

devastatingly insecure.

For example, even if the reviewer sees enough evidence of cryptanalysts

settling on an OW-CPA security level, how is the reviewer supposed to

gain confidence in the QROM KEM security level? The lack of tightness

for QROM proofs is quantitatively so severe that the proofs say nothing

about typical parameter choices. There are certainly important cases

(e.g., preimages) where QROM attacks are much faster than ROM attacks,

so the reviewer has to worry that the tight ROM proofs are blind to much

faster QROM attacks. We really need cryptanalysts studying the QROM KEM

security, but there are very few experts on quantum cryptanalysis.

Multi-target threats are special because we have a simple, cheap,

totally convincing way to get all of their complications out of the

security-analysis picture: simply ask the reviewer to check for a very

high single-target security level.

You're arguing that we need a buffer of B bits to protect against other

threats. You want B to be large given the number of different risks. So

figure out which B gives you enough confidence about all these risks,

and then turn it into B+64 to _confidently_ protect against multi-target

attacks too. The performance gap becomes less and less noticeable as B

increases.

> I would further categorize response #2 into two categories.

> Response #2a: Nobody actually needs Class V security as NIST

> recommends it (2^256 classical work even to find one target of many).

As noted above, NIST's standards are full of failures to achieve this

security level. There's very little above 2^256 single-target security,

and there are tons of multi-target attack vectors, some of which are

clearly exploitable and many more of which haven't been studied.

In NIST's final call for post-quantum proposals, the "security strength

categories" are defined with respect to the "relevant security

definition", which is "IND-CCA2 security" for the KEMs we're talking

about. IND-CCA2 security is a single-target security notion.

Some readers will understand you to be saying that NIST recommends

category V. I'm not aware of any NIST statement to this effect. NIST has

repeatedly used phrases such as "probably overkill" in referring to

category V.

NIST lists tight multi-target security in a separate section of

"additional security properties" that are claimed to be "desirable". My

understanding is that their argument for this is as follows:

* Starting from single-target security 2^128, multi-target attack

speedups are a problem.

* We can't throw this security level away, so we have to complicate

everything to get rid of every multi-target speedup we find.

* Starting from single-target security 2^256, we need the same

complications, since otherwise people will be confused about why

the 256-bit standards are simpler than the 128-bit standards.

Without bothering to evaluate the general merits of this argument, let's

simply look at what the argument means for a post-quantum proposal that

starts from a higher security level:

* First step: Sure, there could be U-target speedups, perhaps even as

large as a factor U.

* Second step: Um, the proposal _did_ throw the low security level

away. Is there a problem here?

* Third step: The proposal doesn't have the complications, so there's

nothing to be confused about.

This sort of proposal is just fine for quantitative security, and is a

big win for the ease of security analysis.

> This sounds an awful lot like my informal justification for hashing

> the seed above

argument sounds alike. However, hashing every input that people think of

creates bloated standards and _definitely_ interferes with security

review. One has to think more carefully about the details to see that

some of the inputs are useful and that most of them are not.

---Dan

Dec 14, 2018, 11:32:22 PM12/14/18

to D. J. Bernstein, pqc-...@list.nist.gov

OK. You’ve convinced me, or at least exhausted me. Hashing the public key or its seed is only:

* A required security feature if your system exhibits decryption failures; and

* A useful feature to reduce multi-target concerns if your system has any parameter sets aimed at class <= III, give or take.

But in systems that have only class V security targets and no decryption failures, it’s a mere complication that doesn’t hurt security and only mitigates attacks in irrelevant corner cases.

Do I have that right?

— Mike

* A required security feature if your system exhibits decryption failures; and

* A useful feature to reduce multi-target concerns if your system has any parameter sets aimed at class <= III, give or take.

But in systems that have only class V security targets and no decryption failures, it’s a mere complication that doesn’t hurt security and only mitigates attacks in irrelevant corner cases.

Do I have that right?

— Mike

Reply all

Reply to author

Forward

0 new messages