
Jun 11, 2021, 12:18:23 PM6/11/21

to pqc-...@list.nist.gov

1. There are theorems _loosely_ saying LWR >= LWE >= SIVP. (Replacing

SIVP by, e.g., BDDwDGS is essentially orthogonal to this message.)

2. I recommend that NISTPQC give negative weight to the loose LWE >=

SIVP theorem, and negative weight to the loose LWR >= SIVP theorem;

i.e., actively downgrade submissions that advertise these theorems.

Rationale for not rewarding these theorems: Even if we assume that no

further advances are possible in SIVP attacks, these theorems are far

too loose to ensure security for the parameter sets proposed for

standardization. Furthermore, as the MQDSS security disaster

illustrates, sometimes attackers can exploit exactly the looseness that

allows a proof to be written down. Systems with this problem are more

likely to be selected if loose theorems are rewarded; given recent

attack progress, it's unreasonable to assume that all attacks will be

found before selection, standardization, and deployment. To summarize,

rewarding such loose theorems increases security risks.

Rationale for giving the theorems negative weight, rather than zero

weight: In general, submissions advertising theorems of this type used a

design process that actively selected designs to allow such theorems.

Rewarding these theorems at the design stage carries security risks

analogous to rewarding these theorems later in NISTPQC; downgrading such

designs is recognizing the dangers of the design process. Furthermore,

distraction is a systemic security risk: time put into these theorems is

time taken away from ensuring the security properties that the call for

submissions asked for. Finally, evaluations have already been corrupted

by years of advertising of these theorems; compensating for this

requires taking action.

The call for submissions stated that "The security provided by a

cryptographic scheme is the most important factor in the evaluation",

and that this would be evaluated based on how well schemes "appear to

provide" various security properties such as IND-CCA2. The call for

submissions said that proofs will be "considered" and "can" accomplish

certain things; this is compatible with giving some types of proofs

negative weight because those proofs add security risks. The call for

submissions also allows NIST to evaluate "the security/performance

tradeoffs involved in a given design approach".

3. The astonishing looseness of an LWE >= SIVP theorem was quantified in

2016 (https://eprint.iacr.org/2016/360). The MQDSS security disaster

mentioned above was announced on pqc-forum in 2019. The continued

advertisements of theorems such as LWE >= SIVP within NISTPQC are, as

far as I can see, not disputing the looseness and security risks.

What precisely is the argument for giving these theorems positive

weight in NISTPQC? I would like to see this argument spelled out and

compared to the security risks described above.

4. I'm particularly puzzled by a recent claim that the theorems loosely

stating LWR >= LWE >= SIVP should be treated as providing positive

assurance for LWR and _more_ assurance for LWE.

If we assume, arguendo, that loose theorems are given positive weight,

then why isn't the scorecard

* LWR >= LWE,

* LWE >= SIVP,

* LWR >= SIVP,

a pure win for LWR over LWE? Yes, LWE has LWE >= SIVP, but LWR has its

own LWR >= SIVP, and it has LWR >= LWE. How can these theorems be

claimed to provide _higher_ assurance for LWE than for LWR?

One possible answer would be that quantitatively looser theorems should

be given lower weight---but a closer look shows that the LWR >= LWE

looseness, quantitatively, isn't nearly as extreme as the LWE >= SIVP

looseness. How can this not end up as a win for LWR?

Even if "LWE >= LWR" were added to the picture, this would simply put

LWE and LWR on the same level, not justifying a claim of _higher_

assurance for LWE. Is someone claiming not just that loose theorems

should be given positive weight, with weight determined by quantifying

looseness, but also that it's somehow possible to prove an LWE >= LWR

theorem that's tighter than the LWR >= LWE theorem?

5. This message focuses on LWR and LWE, which are simpler than RLWR and

RLWE, which in turn are simpler than MLWR and MLWE. I don't think the

complications justify treating the cases differently regarding what I've

written above. However, to simplify the discussion, I would suggest

first settling what will happen with LWR and LWE, and then considering

any arguments for treating other cases differently.

---Dan

Jun 15, 2021, 11:04:04 PM6/15/21

to pqc-forum, D. J. Bernstein, pqc-...@list.nist.gov

I'm not familiar with the relation between reduction theorems and how security parameters are set; please educate me.

1/ If the cryptosystem is built on top of LWE and LWE >= SIVP, and the reduction ">=" loses k bits, then it's natural to set the parameters so that the attack cost against SIVP is (128 + k) bits, compensating for the reduction loss. In this scenario, I don't see any problem with a large k.
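The arithmetic here can be sketched in a few lines (a toy model; treating the reduction loss as a flat k bits of work is an assumption made purely for illustration):

```python
# Toy bit-security accounting (hypothetical numbers, not from any submission):
# if the reduction from SIVP to LWE loses k bits of work, then guaranteeing
# 128-bit LWE security via the theorem requires the SIVP attack cost at the
# chosen parameters to be (128 + k) bits.
def required_sivp_bits(target_lwe_bits: int, loss_k: int) -> int:
    """Bits of SIVP hardness the theorem needs in order to guarantee
    target_lwe_bits of LWE hardness."""
    return target_lwe_bits + loss_k

assert required_sivp_bits(128, 0) == 128    # tight reduction: no penalty
assert required_sivp_bits(128, 50) == 178   # 50-bit loss: need 2^178 SIVP cost
```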

2/ If the reduction theorem LWE >= SIVP is mentioned in the protocol but not used at all to set the parameters, then what role does the reduction theorem play in setting the parameters or estimating the security of the system?

- Quan

Jun 16, 2021, 1:44:24 AM6/16/21

to Quan Thoi Minh Nguyen, D. J. Bernstein, pqc-...@list.nist.gov

On Tue, Jun 15, 2021 at 8:04 PM Quan Thoi Minh Nguyen

<msunt...@gmail.com> wrote:

>

> I'm not familiar with the relation between reduction theorems and how security parameters are set; please educate me.

>

> 1/ If the cryptosystem is built on top of LWE and LWE >= SIVP, and the reduction ">=" loses k bits, then it's natural to set the parameters so that the attack cost against SIVP is (128 + k) bits, compensating for the reduction loss. In this scenario, I don't see any problem with a large k.

I think what DJB is pointing to is that often this wasn't done. When
setting parameters, constants and log and log log factors matter, and

an increase in parameter sizes often has a very direct impact on

performance. A reduction could be weak in the way the Fiat-Shamir

transform relating Schnorr signatures to DLog is, where very few advocate

setting parameters to account for the loss, or it could fail to rule out

several powerful practical attacks until the sizes grow enough to make

them impractical. This distinction matters.

>

> 2/ If the reduction theorem LWE >= SIVP is mentioned in the protocol but not used at all to set the parameters, then what role does the reduction theorem play in setting the parameters or estimating the security of the system?

Then the theorem says nothing about the scheme directly, but only about the attack that the theorem suggests and

rules out. Parameter setting should be based on the runtime of the

best known attacks, but that's always a function of the attention a

problem has received. For extra points, have the reduction only work

for instances lacking the special structure your scheme uses, so that it

doesn't actually say anything about the performance of the best known

algorithms against the particular problems used.

Sincerely,

Watson Ladd

--

Astra mortemque praestare gradatim

Jun 17, 2021, 4:10:35 PM6/17/21

to pqc-forum

Quan Thoi Minh Nguyen writes:

> 1/ If the cryptosystem is built on top of LWE and LWE >= SIVP, and the reduction

> ">=" loses k bits, then it's natural to set the parameters so that the attack cost

> against SIVP is (128 + k) bits, compensating for the reduction loss.

Right. As 1996 Bellare--Rogaway wrote, "we must move to a larger
security parameter" to use the theorem. 2001 Goldreich reiterated that

"one needs to know the exact complexity of the reduction". 2016

Chatterjee--Koblitz--Menezes--Sarkar initiated the study of the exact

reduction loss for LWE >= SIVP. For full citations and more analysis see

Section 9 of https://cr.yp.to/papers.html#latticeproofs.

Watson Ladd writes:

> I think what DJB is pointing to is that often this wasn't done.

None of the submissions advertising LWE >= SIVP (or various analogous

theorems) went through the procedure of (1) quantifying the LWE >= SIVP

reduction loss and (2) choosing parameters where the theorem applies,

given this loss. None of these submissions will tell you what k is for

this theorem.

Existing analysis suggests that it's hopeless to write down a complete

proof applicable to dimensions appearing in NISTPQC (<=1344). I noted in

https://twitter.com/hashbreaker/status/1142855190174928897 a claim made

years ago to a nap.edu committee that dimensions of "a few thousand"

would be enough; as far as I know, this claim remains unproven.

It's also structurally problematic if a submission advertises a theorem

only for large parameters while advertising performance only for small

parameters. Scientifically it's clear that the parameter sets have to be

split for analysis, but realistically the whole submission ends up being

rewarded for the theorem _and_ for the performance simply because these

two different cryptosystems have been given the same name. The situation

is more extreme for LWE >= SIVP: what's being advertised is unquantified

theorems that apply to unspecified parameters.

---Dan

Jun 17, 2021, 4:55:52 PM6/17/21

to pqc-forum

Existing analysis suggests that it's hopeless to write down a complete

proof applicable to dimensions appearing in NISTPQC (<=1344). I noted in

https://twitter.com/hashbreaker/status/1142855190174928897 a claim made

years ago to a nap.edu committee that dimensions of "a few thousand"

would be enough; as far as I know, this claim remains unproven.

Then I guess you don't know about this October 2018 Masters thesis, which analyzes parameters based entirely on known worst-case reductions, and concludes that dimensions n = 1460, 1870 can suffice for 128-bit security against "best known" and "paranoid" quantum algorithms, respectively.

I haven't verified the claims, but the treatment of Regev's reduction looks like it could be tightened up in various ways to get even better parameters. (The original reduction was not at all optimized for runtime tightness.) So, it's clearly not the case that "existing analysis suggests that it's hopeless to write down a complete proof applicable to dimensions appearing in NISTPQC (<=1344)."

Separately: worst-case-hardness theorems like SIVP <= LWE have value beyond the (in)applicability of their headline statements to setting concrete parameters. Much has been written about this (e.g., here and here), and if time permits I'll expand on some of these points.

The take-away message is that such theorems guide us through an enormous, highly fraught design space of "average-case problems" (i.e., probability distributions over lattices, as required for crypto) to ones that appear to be as hard as the worst-case lattice problems they are modeled as (e.g., approx-SVP or BDD). Informally, these problems don't have any "unexpected structural weaknesses" -- unlike the countless proposed average-case problems that lack such theorems, and have been catastrophically broken (without solving any lattice problems in the worst case).

Sincerely yours in cryptography,

Chris

Jun 17, 2021, 6:27:24 PM6/17/21

to pqc-forum

Christopher J Peikert writes:

> Then I guess you don't know about this October 2018 Masters thesis, which

> analyzes parameters based entirely on known worst-case reductions, and

> concludes that dimensions n = 1460, 1870 can suffice for 128-bit security

> against "best known" and "paranoid" quantum algorithms, respectively.

This thread is explicitly about the security of NISTPQC submissions. If
there's a claim that dimension-1460 Frodo parameters would be able to

claim 2^128 IND-CCA2 security on the basis stated above, please specify

the parameters and explain the logic for public review.

If, on the other hand, the paragraph above includes an unstated move of

the goalposts, switching to 2^128 security for a _different_ problem

(not IND-CCA2, for example, or not Frodo), please clearly acknowledge

the change of subject.

---Dan

Jun 17, 2021, 7:21:21 PM6/17/21

to pqc-forum

On Thu, Jun 17, 2021 at 6:27 PM D. J. Bernstein <d...@cr.yp.to> wrote:

Christopher J Peikert writes:

> Then I guess you don't know about this October 2018 Masters thesis, which

> analyzes parameters based entirely on known worst-case reductions, and

> concludes that dimensions n = 1460, 1870 can suffice for 128-bit security

> against "best known" and "paranoid" quantum algorithms, respectively.

This thread is explicitly about the security of NISTPQC submissions...

If, on the other hand, the paragraph above includes an unstated move of

the goalposts, switching to 2^128 security for a _different_ problem

(not IND-CCA2, for example, or not Frodo), please clearly acknowledge

the change of subject.

You know that people can read what you wrote, right? I replied directly to these statements of yours:

"Existing analysis suggests that it's hopeless to write down a complete

proof applicable to dimensions appearing in NISTPQC (<=1344). I noted in https://twitter.com/hashbreaker/status/1142855190174928897 a claim made

years ago to a nap.edu committee that dimensions of "a few thousand"

would be enough; as far as I know, this claim remains unproven."

I am the one who put forward that hypothesis to the NAP committee, in July 2017. The 2018 Masters thesis claims to establish it, with lots of room to spare: the dimensions are less than 1900. If that analysis is anywhere near correct, then it is not at all "hopeless to write down a complete proof applicable to dimensions appearing in NISTPQC (<=1344)."

My hypothesis referred to the hardness of the LWE problem that can be concluded solely from worst-case hardness theorems, and not to any NISTPQC submissions or parameters. Indeed, it couldn't possibly do so, since the talk preceded the Round 1 deadline (Nov 2017).

But, since "this thread is explicitly about the security of NISTPQC submissions," if you wish to bring up my hypothesis or discuss the claimed proof further, "please clearly acknowledge the change of subject."

Jun 18, 2021, 12:25:41 AM6/18/21

to Christopher J Peikert, pqc-forum

On Thu, Jun 17, 2021 at 4:21 PM Christopher J Peikert

<cpei...@alum.mit.edu> wrote:

>

> On Thu, Jun 17, 2021 at 6:27 PM D. J. Bernstein <d...@cr.yp.to> wrote:

>>

>> Christopher J Peikert writes:

>> > Then I guess you don't know about this October 2018 Masters thesis, which

>> > analyzes parameters based entirely on known worst-case reductions, and

>> > concludes that dimensions n = 1460, 1870 can suffice for 128-bit security

>> > against "best known" and "paranoid" quantum algorithms, respectively.

>>

>> This thread is explicitly about the security of NISTPQC submissions...

>>

>> If, on the other hand, the paragraph above includes an unstated move of

>> the goalposts, switching to 2^128 security for a _different_ problem

>> (not IND-CCA2, for example, or not Frodo), please clearly acknowledge

>> the change of subject.

>

>

> You know that people can read what you wrote, right? I replied directly to these statements of yours:

>

> "Existing analysis suggests that it's hopeless to write down a complete

> proof applicable to dimensions appearing in NISTPQC (<=1344). I noted in

> https://twitter.com/hashbreaker/status/1142855190174928897 a claim made

> years ago to a nap.edu committee that dimensions of "a few thousand"

> would be enough; as far as I know, this claim remains unproven."

>

> I am the one who put forward that hypothesis to the NAP committee, in July 2017. The 2018 Masters thesis claims to establish it, with lots of room to spare: the dimensions are less than 1900. If that analysis is anywhere near correct, then it is not at all "hopeless to write down a complete proof applicable to dimensions appearing in NISTPQC (<=1344)."

>

> My hypothesis referred to the hardness of the LWE problem that can be concluded solely from worst-case hardness theorems, and not to any NISTPQC submissions or parameters. Indeed, it couldn't possibly do so, since the talk preceded the Round 1 deadline (Nov 2017).

But why didn't, e.g., the Frodo submission cite the thesis, compute the
needed parameters from the worst-case hardness theorem, and propose

those? What the Frodo submission actually does is cite theorems

reducing the BDDwDGS problem to LWE in section 5, and maybe I'm being

too dense right now, but I don't understand why that's helpful in

evaluating the difficulty of BDDwDGS. Seems to me one would like to

argue that LWE hardness implies BDDwDGS hardness.

If a submission actually did that work, it would improve the security

picture for that submission, and likely others. Without that work, the

theorem doesn't say anything relevant to the concrete security

parameters, just as the existing CDH to DLOG reductions in groups

aren't that interesting.

Jun 18, 2021, 1:53:07 PM6/18/21

to Watson Ladd, pqc-forum

(As usual, speaking for myself here...)

On Fri, Jun 18, 2021 at 12:25 AM Watson Ladd <watso...@gmail.com> wrote:

On Thu, Jun 17, 2021 at 4:21 PM Christopher J Peikert

<cpei...@alum.mit.edu> wrote:

>

>

> I am the one who put forward that hypothesis to the NAP committee, in July 2017. The 2018 Masters thesis claims to establish it, with lots of room to spare: the dimensions are less than 1900. If that analysis is anywhere near correct, then it is not at all "hopeless to write down a complete proof applicable to dimensions appearing in NISTPQC (<=1344)."

>

> My hypothesis referred to the hardness of the LWE problem that can be concluded solely from worst-case hardness theorems, and not to any NISTPQC submissions or parameters. Indeed, it couldn't possibly do so, since the talk preceded the Round 1 deadline (Nov 2017).

But why didn't the e.g. Frodo submission cite the thesis, compute the

needed parameters from the worst-case hardness theorem, and propose

those?

One small reason is the unidirectional nature of time: the NISTPQC deadline was Nov 2017, and the thesis wasn't posted until almost a year later. (As for later NISTPQC rounds, I personally didn't learn of the thesis until near or after the Round 3 deadline.)

A more important reason is that the FrodoKEM submission doesn't view this task -- i.e., setting full parameters according to a highly time-unoptimized worst-case hardness theorem -- as a practically meaningful exercise. The document says as much, in Section 1.2.2:

"Worst-case/average-case reductions help guide the search for cryptographically hard problems in a large
design space, and offer (at minimum) evidence that the particular distribution of inputs does not introduce
any fundamental structural weaknesses. This is in contrast to several lattice-based proposals that lacked such
reductions, and turned out to be insecure because their distributions made 'planted' secrets easy to recover,
e.g., [121, 95, 34, 45]. Indeed, Micciancio and Regev [94] argue that a reduction from a hard worst-case
problem

'... assures us that attacks on the cryptographic construction are likely to be effective only for
small choices of parameters and not asymptotically. In other words, it assures us that there are no
fundamental flaws in the design of our cryptographic construction... In principle the worst-case
security guarantee can help us in choosing concrete parameters for the cryptosystem, although in
practice this leads to what seems like overly conservative estimates, and ... one often sets the
parameters based on the best known attacks.'"

In other words, FrodoKEM uses worst-case-hardness reductions to identify in the first place **which problem to parameterize** -- namely, LWE with Gaussian error -- and then does so according to (extrapolations of) the best known cryptanalysis.

In lattice cryptography, the importance of first identifying a sound problem to parameterize cannot be overstated. The design space is huge and subtle, and the history is littered with the corpses of ad-hoc designs -- i.e., average-case lattice problems -- that were later discovered to have structural weaknesses rendering them completely broken, or substantially weaker than the generic lattice problems they were seen as comparable to.

By "structural weakness" I mean something exploitable about its special random instances -- e.g., the specific distribution of lattices and/or planted short vectors -- that circumvents the need to solve the more general lattice problem that the problem was originally conceived as -- e.g., an approximate shortest/closest vector problem on generic lattices.

The FrodoKEM submission cites a few historical examples of unexpected structural weaknesses in ad-hoc lattice designs (lacking worst-case hardness theorems), but a full list is much longer. Other examples include:

- almost all early iterations of knapsack cryptography,
- various methods of generating a random lattice with a "short" basis,
- various multilinear map candidates (countless papers),
- subexponential attacks on "binary/small secret" versions of NTRU and LWE, and
- practically efficient attacks on "stretched" NTRU that exploit the presence of many short vectors in an NTRU lattice.

Based on this record, it's fair to conclude that we humans are just not very good at designing or identifying hard average-case lattice problems from scratch. The task is especially fraught when the problem has a "planted" short solution, as needed for public-key encryption/KEMs.

By contrast, problems having worst-case hardness reductions, like SIS and LWE, have long resisted intense scrutiny looking for structural weaknesses -- even for parameters that don't fully conform to the worst-case theorems, or which lack meaningful security bounds from these theorems alone. After a great deal of cryptanalysis, finding suitably short vectors in a random SIS lattice still appears essentially no easier than for a "generic" lattice of matching parameters (determinant, dimension). Similarly, solving LWE with (near-)Gaussian error appears essentially no easier than solving generic BDD with matching parameters.

It's worth emphasizing how remarkable this is. A priori, the definitions of SIS/LWE are highly non-obvious, and while their associated distributions over lattices seem natural enough, it's not clear why they should be better than anything else. But the history of broken proposals shows just how difficult it is to design truly hard average-case lattice problems. I don't think it's a coincidence that SIS/LWE have worst-case hardness theorems (albeit loose, unoptimized ones), and have also stood up so much better to cryptanalysis than to most other proposals. The theorems and the cryptanalysis are mutually supporting and point to very similar conclusions, though in different ways.

I also want to emphasize that without the worst-case hardness theorems, ***we would not have had so much focused study of the broad SIS/LWE problem family***. The theorems have unified most of the community around these problems as a commonly agreed "hard target." It is good to have such scrutiny on a single problem class, and more scrutiny should inspire more confidence. It would make no sense to penalize submissions that use a class that has held up so well against so much cryptanalytic scrutiny, just because it is also supported by worst-case hardness theorems that happen to be time-unoptimized and loose (more on this in a separate message).

What the Frodo submission actually does is cite theorems

reducing the BDDwDGS problem to LWE in section 5, and maybe I'm being

too dense right now, but I don't understand why that's helpful in

evaluating the difficulty of BDDwDGS. Seems to me one would like to

argue that LWE hardness implies BDDwDGS hardness.

This is backwards; the FrodoKEM submission is not interested in arguing that LWE hardness implies BDDwDGS hardness (it's LWE hardness that's desired, after all).

A reduction from problem X to problem Y says that any Y-solver can be transformed (with some overhead) into an X-solver; it follows that if X is hard, then Y is relatedly hard.

Here X=BDDwDGS, and Y=LWE with a certain "wide enough" (Gaussian) error distribution. X=BDDwDGS is a well studied conjectured-hard problem on worst-case lattices (though more study is always welcome). The submission takes this reduction as a rationale for using the associated LWE problem.
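The direction and cost accounting of such a reduction can be made concrete with a toy model (a hedged sketch; treating the overhead as a flat number of bits is an assumption made for illustration):

```python
# Toy model of a reduction "X <= Y with 2^k overhead": any Y-solver running
# in time 2^t yields an X-solver running in time about 2^(t + k). By the
# contrapositive, if every X-solver needs 2^x time, then every Y-solver
# needs at least 2^(x - k) time.
def guaranteed_y_bits(x_hardness_bits: float, overhead_bits: float) -> float:
    """Hardness of Y (in bits) guaranteed by the reduction, given the
    conjectured hardness of X and the reduction's overhead."""
    return max(x_hardness_bits - overhead_bits, 0.0)

assert guaranteed_y_bits(128, 0) == 128    # tight: hardness transfers fully
assert guaranteed_y_bits(128, 100) == 28   # loose: only 28 bits guaranteed
```

This is why runtime looseness matters: a large overhead leaves the guaranteed hardness of Y far below the conjectured hardness of X.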

Yes, the full reduction from X=BDDwDGS to Y=LWE is quite loose in running time (however, it's quite tight in approximation factor, which is a very important aspect of such reductions). I don't see this as a significant problem, but if one is inclined to improve the tightness, there are various things that can be tried.

I am highly confident that one or more of the following ideas could yield a significantly tighter reduction. It's plausible that they might even yield *meaningful* security bounds for some of FrodoKEM's proposed LWE parameters, based solely on the conjectured hardness of BDDwDGS -- at least for "realistic" LWE adversaries. Of course, this is just my informed intuition, and the details need to be worked out carefully.

1. Try to speed up Regev's classical reduction, which is all that the BDDwDGS-to-LWE reduction uses (so it's already tighter than the full quantum reduction). It was written with an emphasis on modularity and clarity, and certainly wasn't optimized for running time. There are likely ways to save some polynomial factors just by more refined algorithmics and accounting.

2. Observe that "realistic" LWE attackers actually solve search-LWE, not just decision-LWE. Using this, specialize Regev's reduction to search-LWE, cutting out the search-to-decision reduction and associated polynomial overhead.

3. Observe that a great deal of the reduction's overhead is devoted to producing **exactly the right error distribution** (neither too wide nor too narrow), but that "realistic" LWE attackers work just fine (better, even) when the error is narrower. Using this, specialize the reduction to cut out the expensive step that goes from "unknown error rate <= alpha" to "error rate =~ alpha".

In fact, using just ideas 2+3, the reduction for "realistic" adversaries seems to become rather tight, with a cheap one-to-one mapping from BDDwDGS samples to LWE samples.

If a submission actually did that work, it would improve the security

picture for that submission, and likely others. Without that work, the

theorem doesn't say anything relevant to the concrete security

parameters just as the existing CDH to DLOG reductions in groups

aren't that interesting.

Just to be clear: none of the remaining NISTPQC submissions claim that the existing theorems say any such thing; indeed, some (like FrodoKEM) explicitly *disclaim* it. Again, what the FrodoKEM submission says is that it uses the theorems as guidance to choose LWE with sufficiently wide Gaussian error distribution. For concrete parameters, this problem is then cryptanalyzed according to best known/extrapolated attacks.

Jun 18, 2021, 3:11:09 PM6/18/21

to pqc-forum

On Fri, Jun 11, 2021 at 12:18 PM D. J. Bernstein <d...@cr.yp.to> wrote:

1. There are theorems _loosely_ saying LWR >= LWE >= SIVP. (Replacing

SIVP by, e.g., BDDwDGS is essentially orthogonal to this message.)...

Furthermore, as the MQDSS security disaster

illustrates, sometimes attackers can exploit exactly the looseness that

allows a proof to be written down.

I don't think this is a good analogy. A much better analogy is based on the "statistical tightness" of the reductions, not their runtime tightness. But MQDSS's reduction is loose in this respect, while the SIVP<=LWE reduction can be made extremely tight with little effect on the approx-SIVP problem it solves.

The looseness of the MQDSS reduction reflects a probabilistic "bad event" in the random-oracle/forking-lemma analysis, when flattening an interactive protocol using a Fiat-Shamir-like transform. Importantly, the attacker can query the random oracle (hash function) on many inputs to try to induce the bad event, and this is essentially what the attack actually does (see Section 2.1 for a summary).

This is similar to many other loose reductions that have matching attacks: the attacker can interact with some kind of oracle, and may even have some control over what it receives from the cryptosystem. This lets it try to induce whatever bad event that makes the reduction fail, which is a main source of the reduction's looseness.

In the LWE problem, there are no interactive oracles (random or otherwise), and the attacker has no control over the input it receives. It is given a random LWE instance (A, b) and is charged with solving it; that's all. So, it has no way to induce a "bad event," unlike in MQDSS and similar scenarios.

In the SIVP <= LWE reduction, there is a slight *statistical* difference between the true LWE distribution and what the reduction provides to the adversary. It's not inconceivable that an attacker could try to exploit a noticeable difference (though I have no idea how, unless the difference was quite large). However, this kind of exploit can be prevented -- provably so -- by having the reduction simulate the LWE distribution to a close enough statistical distance. (In more detail, one would choose a suitably tiny epsilon>0 for the lattice "smoothing parameter" eta_epsilon.)

Importantly, this "statistical tightness" of the reduction is an almost entirely separate matter from its runtime tightness; it mainly affects the lattice approximation factor solved by the reduction. But even here the dependence is slight, because the factor eta_epsilon appearing in the approximation grows with sqrt(log(1/epsilon)). So, one could target an extremely small statistical distance -- say, 2^{-192} or even 2^{-256} -- and the reduction's approximation would degrade relatively little.
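To make the sqrt(log(1/epsilon)) growth concrete, here is a small numerical sketch (my own illustration; the specific target distances are just examples, not from any submission):

```python
import math

# The smoothing-parameter term eta_epsilon contributes a factor growing
# like sqrt(log(1/epsilon)) to the approximation solved by the reduction.
for bits in (64, 128, 192, 256):
    eps = 2.0 ** -bits
    factor = math.sqrt(math.log(1.0 / eps))
    print(f"statistical distance 2^-{bits}: growth factor ~= {factor:.2f}")
```

Going from 2^-64 all the way to 2^-256 only doubles the factor (sqrt(4) = 2), which is the sense in which the reduction's approximation "would degrade relatively little."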

Jun 18, 2021, 3:48:49 PM6/18/21

to pqc-forum

PS: the known LWE <= LWR reductions have rather bad statistical *and* "approximational" tightness: they reduce from LWE with very large modulus q and very small error rate alpha to LWR with the same q and a very large amount of rounding (corresponding to error rate >> alpha).

For example, for the BPR reduction (Theorem 3.2) to get statistical tightness of just 2^{-64} requires a modulus q somewhat larger than 2^{64}, and rounding away more than 64 bits. This is because we need each "rounded off" LWE sample to yield a corresponding correct LWR sample (modulo the statistical tightness), and the probability that it fails to do so is more than 1/q.
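As a toy illustration of this failure event (with tiny parameters of my own choosing, nowhere near BPR's; the bucket size and error range here are purely hypothetical), one can estimate by simulation how often adding a small LWE error changes the rounded value:

```python
import random

random.seed(1)
q = 2 ** 15        # toy modulus; a real BPR-style bound needs q > 2^64
bucket = 8         # rounding away 3 bits, i.e., buckets of size 2^3

def round_off(x):
    # The LWR rounding map: send x in Z_q to its bucket index.
    return x // bucket

trials = 100_000
fails = 0
for _ in range(trials):
    x = random.randrange(q)      # models a*s mod q as uniform
    e = random.randint(-2, 2)    # small LWE error
    if round_off((x + e) % q) != round_off(x):
        fails += 1               # rounded-off LWE sample != LWR sample

print(f"empirical per-sample failure rate: {fails / trials:.3f}")
print(f"compare 1/q = {1 / q:.6f}")
```

With error this close to the bucket size, the failure rate is around 15% per sample; driving it below 2^-64 forces the rounding buckets (and hence q) to be astronomically larger than the error, as described above.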

Of course, the NISTPQC submissions using LWR variants don't use parameters anywhere near this large, in either the modulus q or the amount of rounding. Their q's are in the several thousands, and they round away at most a few bits.

The known LWE <= LWR reductions don't yield any conclusions about such parameters. This is not because of *runtime* looseness (the reductions are actually tight in this respect!), but because of their *statistical* looseness -- and approximational looseness too.

Given that statistically loose reductions have not infrequently been matched with real attacks, I think we should take this issue seriously. In this respect, the SIVP<=LWE reduction is far better than the LWE<=LWR reduction.

Sincerely yours in cryptography,

Chris

Jun 19, 2021, 11:35:29 AM6/19/21

to pqc-forum

"Provable security" normally asks for a proof that a time-T

probability-eps attack against a proposed scheme X implies a time-T'

probability-eps' attack against a "well studied" problem Y, where T' is

close to T and eps' is close to eps.

The literature identifies several ways for this to fail: T' is far above

T; eps' is far below eps; the proof isn't actually for the proposed

scheme; the proof is only regarding a limited class of attacks; Y isn't

secure; the proof is wrong. For each failure mode, the question of

whether the failure occurs is applicable to every reduction, allowing

collection and analysis of data (see https://eprint.iacr.org/2019/1336)

regarding the prevalence and importance of that type of failure.

The quantitative T/eps issues in particular are called "looseness". The

impact of looseness is highlighted in, e.g., 1996 Bellare--Rogaway

(https://web.cs.ucdavis.edu/~rogaway/papers/exact.pdf, page 4) saying

that eps' is much smaller than eps in the RSA-FDH proof, so "we must

move to a larger security parameter"; and 2007 Damgaard (page 2 of

https://users-cs.au.dk/%7Eivan/positionpaper.pdf) saying that if T' is,

say, T^2 then "we need to have larger key sizes" to apply the proof.
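As a back-of-the-envelope illustration of Damgaard's point (my own arithmetic, not taken from his paper): with a reduction of quality T' ~ T^2, proving a 2^128 security target for the scheme requires the underlying problem to resist 2^256 work, since squaring doubles the exponent:

```python
# If a time-T attack on scheme X yields only a time-T' ~= T^2 attack on
# problem Y, then the proof rules out 2^128-time attacks on X only when
# Y is assumed to resist 2^(2*128) = 2^256 work.
target_bits = 128
required_bits_for_Y = 2 * target_bits
print(f"target for X: 2^{target_bits}; Y must resist 2^{required_bits_for_Y}")
```

This is the quantitative sense in which "we need to have larger key sizes" to apply such a proof.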

Within NISTPQC, MQDSS advertised a loose proof, violated the requirement

from the above papers to choose key sizes accordingly, and was broken.

Round2 advertised a proof for its CCA conversion, but the proof was

incorrect, and Round2 was broken. NISTPQC should respond by adding

procedural protections against loose proofs and unreviewed proofs. This

could have been done earlier, but better late than never.

The last few pqc-forum messages appear to be arguing that "worst-case"

proofs such as LWE >= SIVP, proofs that have been frequently advertised

in NISTPQC but do _not_ say anything about the parameter sets proposed

for standardization, should be given higher weight than

* the MQDSS proof,

* the LWR >= LWE proof, and

* the LWR >= SIVP proof.

Structurally, the argument sounds like it's trying to

(1) identify a criterion for acceptable security proofs,

(2) say that the MQDSS and LWR proofs flunk this criterion, and

(3) say that the LWE >= SIVP proof meets this criterion.

However, there isn't a clear statement of #1, so #2 and #3 become

impossible to evaluate. Clarification of #1 is thus required as a

preliminary step.

We are, for example, told that the large moduli required for the LWR

proofs, larger than the moduli proposed in NISTPQC, are a failure of

"approximational" tightness. What's the definition of "approximational"

for a general reduction? If the answer is something about the ratio

T/eps being much smaller than T'/eps', or something about X not being

the proposed scheme, then why isn't the failure shared by LWE >= SIVP?

For the MQDSS proof, there's an objection that "the attacker can

interact with some kind of oracle, and may even have some control over

what it receives from the cryptosystem"---but this sounds like it's

asking the reader to throw away, e.g., the proof that AES PRF security

implies the security of various standard modes of operation of AES.

There are also other ad-hoc objections, and again it isn't clear what

the evaluation criterion is supposed to be.

As a point of order, it's problematic if NISTPQC starts from a laundry

list of ad-hoc arguments for downplaying particular proofs, removes

arguments that are shredded, and then repeats the remaining arguments.

This is an example of https://en.wikipedia.org/wiki/Data_dredging

("p-hacking"), which is well known to encourage errors. There needs to

be a commitment to one clearly defined criterion, followed by an

analysis of that criterion.

---Dan


Jun 19, 2021, 4:25:43 PM6/19/21

to pqc-forum, D. J. Bernstein, pqc-...@list.nist.gov

Thanks everyone for the insights. For concreteness, let’s focus on Frodo-640, 128-bit security against classical attack.

From the proof's perspective, Frodo-640 aims to use SIVP => LWE => IND-CPA. However, the SIVP => LWE proof requires parameters that are *not* applicable to Frodo-640, so Frodo-640 has to rely on a *far less cryptanalyzed* problem than SIVP, namely BDDwDGS: BDDwDGS => LWE. Is that a fair summary? If it is, then it's not very reassuring that Frodo-640's LWE with concrete Gaussian error parameters is safe, even if we ignore all the running-time looseness.

From the perspective of cryptanalytic attacks, Frodo-640 uses SIVP => LWE. This has nothing to do with the hardness of BDDwDGS or the reduction BDDwDGS => LWE.

I'm confused by the above methodology, and I can't even convince myself that it is logical.

Regards,

- Quan

Jun 20, 2021, 7:04:49 PM6/20/21

to pqc-forum, D. J. Bernstein, pqc-...@list.nist.gov

Hi Dan,

This quote is nuts.

Apparently everyone but you understands the state of the science, and is willing to accept new results as they happen.

Stop propagandizing.

Best,

--Daniel Apon, NIST PQC.


Jun 20, 2021, 8:03:10 PM6/20/21

to daniel.apon, D. J. Bernstein, pqc-...@list.nist.gov

NIST's primary responsibility here is to produce an acceptably

secure selection. NIST's secondary responsibility is to assure that

the process is one that the public can trust. I don't feel that your

response does a lot to further either of these purposes.

Dan has raised a credible argument that loose (or inapplicable) proofs

may have degraded submissions. To the extent that tighter proofs may

exist it is not necessarily the case that everyone knows about them--

e.g., even in Peikert's case he states that he was unaware of the

particular work more recently discussed until near the round-3

deadline. Would Frodo-- or some other candidate-- have had a

different design if the new techniques were known previously and might

those design decisions ultimately make it weaker than another proposal

whose design was not guided by the tradeoffs or proof-restrictions

that were implied by a too-loose reduction? I have no idea-- but it's

a reasonable question to ask.

To some extent I think Dan's general argument is hard to argue with,

at least the process needs to be mindful of uselessly loose proofs or

non-parameter-compensated proposals. That is, unless you reject

Koblitz and Menezes' well-defended position that inappropriately

applied provable security has been shown to actively undermine

cryptosystem security in some instances.

The full strength version that submissions should be *penalized* for

using a weak argument (maybe even if later stronger ones were

discovered) is perhaps a bridge too far, but the science of high

integrity processes for selecting cryptosystems is far from

established and Dan's points are precisely the sort that should be

considered to help advance that field. Hearing and reasoning it out

should only increase confidence in the process and, ultimately, the

scheme and parameters it selects. After all, shutting down the

discussion now won't prevent the complaint from being raised later, it

will just make it impossible to argue that the point was considered

and that the decisions were made in light of it.

I think that strong of a post-submission change in perspective may

needlessly add noise to the process: Submitters didn't expect that

applying *additional* reasonable if-weak security arguments might

count against them. I don't think it would be at all unreasonable to

have a process that did so, but without a clear understanding upfront

you would reasonably expect proposals to randomly receive negative

weight. Worse, to the extent that proposals made multiple security

arguments the addition of a weaker argument might well be positively

correlated with the effort that went into the proposal, the skill of

the proposers, or the analyzability of their proposal. This would be

less true if people knew at submission time that making a complex

security argument that was found to be too weak to be useful would

actively count against them.



Jun 20, 2021, 11:00:54 PM6/20/21

to pqc-forum

Brief-ish summary of this message:

- In *all* the presented examples of loose reductions that are accompanied by actual demonstrated security losses (including MQDSS), **the loss is directly attributable to the reduction's "advantage loss," not its "time loss."** **It is a mistake to conflate these two notions of looseness into a single metric** (as Dan and others who've critiqued looseness have done), because this obscures their major qualitative and practical differences. Based on many empirical examples, **advantage-loose reductions are demonstrably risky**, whereas **advantage-tight (but poly-time-loose) reductions are not**; indeed, they are highly correlated with a *lack* of security loss in practice.
- Because of this, **NIST should tend to credit nontrivial "advantage-tight" reductions much more highly than "advantage-loose" ones** -- even if the former are (polynomially) time-loose and hence don't prove anything about proposed parameters.
- **The approxSIVP<=LWE and BDDwDGS<=LWE reductions are very advantage-tight** (though poly-time-loose) and **even apply to the proposed LWE parameters underlying FrodoKEM**. So, they should be taken as *meaningful evidence* (but not proof) in support of the security of those LWE parameters, complementing and reinforcing the extensive cryptanalysis.
- **The known LWE<=LWR reductions are inherently very advantage-loose** for the proposed parameters of LWR-style submissions (NTRU Prime and SABER). Obtaining even moderate advantage-tightness would require moving to dramatically larger parameters. So, these LWE<=LWR reductions should be seen as providing *very little evidence (if any)* for the security of LWR with the proposed parameters, or anything near them.
- To gain confidence in the absence of more meaningful reductions, we need a good deal more cryptanalysis specifically dedicated to LWR with the proposed parameters. (We don't currently have that, but that's a topic for another message...)

Expanded:

- A reduction from problem X to problem Y, written X<=Y, can be tight or loose in at least two different respects: time and "advantage." The latter is essentially the probability that the attack solves the problem. (It encompasses the "statistical" aspect of reductions that I've referred to in recent messages, so I'll use the term "advantage" from now on.)
- There are several cases where a (correct) **somehow-loose X<=Y reduction is known, but Y has turned out to be much easier to break than X**.
  - In an inventory of examples provided by critiques of looseness (see below), it turns out that *all* such examples are *advantage-loose* -- and most are even quite time-tight!
  - Moreover, **the attacks that break Y faster than X can easily be seen as directly exploiting this advantage-looseness.**
  - This is not saying that *all* advantage-loose reductions are exploitable by matching attacks, only that a good number have turned out to be -- and precisely due to that property.
  - In sum: **advantage-loose reductions are demonstrably risky.**
- There are many other cases where an *advantage-tight but (polynomially) time-loose* reduction X<=Y is known.
  - In *all* such examples provided by critiques of looseness (and all others I have thought of), X and Y are *about equally hard to break* according to the known cryptanalysis.
  - In other words, **the critiques have not provided a single example of an advantage-tight reduction where there is a demonstrated security gap due to poly-time-looseness**. (I'm not aware of any natural examples either, and would very much like to know if anyone has one; I suspect that contrived examples might exist.)
  - This is not saying that such reductions *prove* that X and Y are about equally hard; they certainly do not! It is saying that *empirically*, there is a very strong (perfect?) correlation in that direction.
  - In sum: **we have not seen any evidence that advantage-tight, time-loose reductions present an actual security risk;** indeed, empirically they are *highly correlated* with a lack of security loss in practice.
- Given all this -- and assuming that any additional examples tend to reinforce the above-described patterns -- I make the following conclusions and recommendations:
  - I think **it is a serious mistake to conflate time-looseness and advantage-looseness into one single metric of looseness** (as has been done in this thread's original message, elsewhere in pqc-forum, and in the literature critiquing looseness). Such conflation obscures a major qualitative and practical difference between these two kinds of looseness.
  - I recommend that **when considering reductions as potential evidence of security, NIST should tend to credit nontrivial advantage-tight reductions much more highly than advantage-loose ones** -- even if the former are polynomially time-loose and hence don't prove anything about proposed concrete parameters, and the latter are time-tight. Of course, reductions that are tight in both time and advantage (and any other relevant metrics) are best of all.
  - I don't have a scientific or mathematical explanation for why advantage-tight reductions correlate so well with a lack of demonstrated security loss. I am simply observing that from an empirical, risk-focused perspective, their time-looseness does not seem to have ever been a security problem, even across a wide variety of very different kinds of problems and reductions. **The demonstrated risks of looseness have all been due to advantage-looseness, so that is what we should be most cautious about.**
- On lattice reductions specifically:
  - **The approxSIVP<=LWE reduction can be made extremely advantage-tight**, with little negative effect on its other aspects; the same goes for the GapSVP<=LWE reduction and the BDDwDGS<=LWE reduction. (See my previous messages in this thread for details.)
  - Note: the BDDwDGS<=LWE reduction highlighted in the FrodoKEM submission is classical (not quantum) and substantially time-tighter, though still rather loose -- but see my previous message for promising approaches to make it much tighter, under stronger but realistic assumptions about the LWE attacker.
  - **The previous point applies even to the proposed LWE parameters underlying FrodoKEM**. That is, there are advantage-tight {approxSIVP,GapSVP,BDDwDGS}<=LWE reductions for concrete LWE dimensions n=640, 976, 1344, moduli q=2^15, 2^16, 2^16, and Gaussian error with standard deviation sigma=2.8, 2.3, 1.4 (respectively). Again, these reductions do not *prove* any concrete security for such parameters (due to their time-looseness), but they should be taken as *meaningful evidence* that complements and reinforces the extensive cryptanalysis in support of such a conclusion.
  - By contrast, **the known LWE<=LWR reductions are inherently very advantage-loose**, by more than an additive 1/q term (where q is the modulus). For the concrete values of q used in, e.g., the LWR-style NTRU Prime submission, the advantage loss is more than 2^{-13}.
  - Note: this 2^{-13} advantage loss is *much larger* than that of other advantage-loose reductions having demonstrated security losses.
  - This is not saying that LWR with such parameters is insecure, only that we should take the highly advantage-loose LWE<=LWR reduction as very limited evidence (if any) for its security.

At this point I should say that none of the observations in this message are especially deep, and I would not be surprised if some or all of them have been made before. But I am not aware of any literature highlighting the point that practical security gaps arising from loose reductions have always been due to advantage-looseness, not time-looseness. I also don't know of any systematic treatment of the qualitative difference between the two notions, and would welcome any references. (If no such paper exists, maybe one needs to be written...)

Technical substantiation for the above claims:

A reduction X<=Y comes with a theorem of the form "any attack that breaks Y in time T' with advantage eps' can be transformed into an attack that breaks X in time T with advantage eps," where T, eps are some functions of T', eps'.

Here are some typical forms of T:

- T = T' + [something relatively small].
  - In this case the reduction is essentially optimally time-tight. The reduction breaks X in essentially the same time as the original attack breaks Y.
- T = [small constant] * T' + [small].
  - Still very time-tight: the reduction breaks X in nearly the same time as the attack breaks Y.
- T = poly(secparam, 1/eps, other quantities) * T' + poly(...).
  - Asymptotically, the reduction is "polynomially time-tight."
  - But for concrete parameters, it could be anywhere from moderately time-tight to time-loose, depending on how those poly terms grow.
  - In the loose case, a fast attack against Y may only yield an attack against X that takes too much time to be interesting. So, the reduction doesn't prove anything useful for these parameters.
- T = poly(secparam, 1/eps, others, T') + poly(...).
  - Now T' is inside the poly(), but asymptotically the reduction is still poly-time-tight.
  - Yet concretely this is often even looser than the previous form, e.g., if the dependence on T' is quadratic or more.
- T = subexponential(secparam, 1/eps, others) * T' + ...
  - Here the reduction is asymptotically quite time-loose (not even polynomially tight), and likely concretely so as well. Again, it depends on the specific terms.
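To see how different these shapes are concretely, here is a small comparison (with hypothetical numbers of my own choosing) of the time T at which the reduction breaks X, given one and the same attack on Y:

```python
import math

# One hypothetical attack on Y: time T' = 2^100, advantage eps' = 2^-10.
Tp = 2.0 ** 100
eps_p = 2.0 ** -10

forms = {
    "T = T' + [small]":        Tp + 2.0 ** 30,          # additive overhead
    "T = poly(...)/eps' * T'": (2.0 ** 40 / eps_p) * Tp, # multiplicative poly
    "T quadratic in T'":       Tp ** 2,                  # T' inside the poly
}
for name, T in forms.items():
    print(f"{name:28s} -> X broken in ~2^{math.log2(T):.0f} time")
```

Only the additive form turns the 2^100 attack on Y into a comparable attack on X; the other shapes say something only if X is already assumed to resist 2^150 or 2^200 work.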

Here are some typical forms of eps:

- eps = eps' - [something small versus the smallest eps' of interest].
  - Here the reduction is advantage-tight.
- eps = (eps' - [small vs eps']) / [small constant].
  - Still very advantage-tight.
- eps = eps' / poly(secparam, ...).
  - Here the reduction could be anywhere from moderately advantage-tight to (concretely) advantage-loose, depending on those poly terms.
  - Concretely in the loose case, an attack that succeeds against Y with good advantage eps' may only yield an attack against X with much too small advantage eps to be interesting, so the reduction doesn't prove anything useful for these parameters.
  - As we see below, **some examples of demonstrated security losses are directly due to this kind of advantage-looseness**.
- eps = eps' - [something >= the smallest eps' of interest].
  - Here we have only the *meaningless* advantage bound eps <= 0.
  - This may be a big problem! An attack against Y with relevant advantage ~eps' (or less) **may not yield any attack against X at all** -- not even a slow and/or low-advantage one!
  - **More examples of major demonstrated security losses are directly due to this kind of advantage-looseness.**
- eps = (eps' - [something >= smallest eps']) / poly(...).
  - This inherits the problems of both of the previous two items.

From the critiques of looseness, here is an inventory of somehow-loose reductions X<=Y where the known attacks against Y are significantly better than those against X. In *every case*, the reduction is **advantage-loose** -- though often quite time-tight! -- and **the security gap between X and Y is directly due to that advantage-looseness**.

- X = a certain MQ problem, Y = NISTPQC signature candidate MQDSS.
  - Here the reduction is advantage-loose in the sense of eps = eps' - [something >= smallest eps'] (it's actually worse, but this suffices for the point).
  - The subtracted term reflects some "bad event" in the random-oracle analysis, which causes the reduction to completely fail to break X.
  - The attack makes enough random-oracle queries to make the bad event likely to occur (so the reduction would fail). It does not even try to break the MQ problem X -- instead, it just circumvents attacking X by making the subtracted term big enough to force eps <= 0.
- X = a block cipher with 64-bit block size, Y = various modes of operation for symmetric encryption and/or authentication.
  - Here again the reductions are advantage-loose in the sense of eps = eps' - [something >= smallest eps'].
  - The subtracted term reflects a "birthday paradox" bad event in the block-cipher-to-PRF switch that tends to occur after about 2^32 block-cipher invocations, which causes the reduction to completely fail to break X.
  - The Sweet32 attacks get enough block-cipher inputs to make the birthday-paradox "bad event" likely to occur (so the reduction would fail). They do not even try to break the block cipher X -- instead, they just circumvent attacking X by making the subtracted term big enough to force eps <= 0.
- X = single-user security, Y = multi-user security (for various primitives), as explored in https://eprint.iacr.org/2011/442 .
  - Here the reductions are advantage-loose in the sense of eps = eps' / N, where N is the number of users.
  - The factor-of-N loss reflects the "bad event" that the reduction fails to correctly "guess" which user the attacker will ultimately target.
  - The paper's attack directly exploits this advantage loss: in about 1/N of the time it takes to complete a single-user key search, it finds *some* user's secret key with probability eps' ~= 1/4. By the reduction, this means it can find a *specific* user's secret key in 1/N of the time with probability eps ~= 1/(4N).
  - Succeeding with ~1/N probability in 1/N the time is *exactly* the trivial time-success tradeoff for brute-force key search. So, the attack does not even try to break single-user security -- instead, it exploits the reduction's advantage-looseness, and the fact that obtaining a lesser advantage trivially requires less time.
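A quick sanity check (my own arithmetic) of the birthday bound behind the Sweet32-style bad event: after roughly 2^32 blocks under a 64-bit block cipher, a collision is likely, so the subtracted switching term swamps any relevant eps':

```python
import math

block_bits = 64
n = 2 ** 32  # blocks processed under one key
# Standard birthday approximation: Pr[collision] ~= 1 - exp(-n(n-1)/2^(b+1)).
p = 1.0 - math.exp(-n * (n - 1) / 2.0 ** (block_bits + 1))
print(f"collision probability after 2^32 blocks: {p:.3f}")
```

At roughly 39%, the "bad event" term already exceeds any eps' of interest, which is exactly how the advantage bound eps <= 0 becomes vacuous.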

Finally, here is an inventory of advantage-tight, (concretely) time-loose reductions X<=Y drawn from the critiques of looseness, along with some others I thought of. As far as I can tell, there is **not a single case** in which Y has been shown to be substantially easier to break than X, according to the best cryptanalysis. (Again, I would welcome hearing about natural or even contrived examples where there is a demonstrated security gap in the presence of an advantage-tight reduction.)

- DLOG <= Blum-Micali hard-core bit, Factoring <= Blum-Blum-Shub hard-core bit, other hard-core bits for number-theoretic problems, Goldreich-Levin hard-core bit, etc.
- All these reductions are advantage-tight, but concretely very time-loose for typical parameters.
- But to my knowledge, the best known way to attack these hard-core bits is to break their underlying problems (DLOG, Factoring, etc.).
- LWE search <= decision reductions.
- These reductions are actually "uber"-advantage-tight: non-negligible advantage eps' against decision can be transformed to advantage ~1 against search.
- Their time overhead is a moderate polynomial (roughly n*q/eps'), which corresponds to at least 24 bits of security for typical parameters.
- Yet we don't know how to solve decision-LWE significantly faster than search.
- approxSIVP (etc.) <= LWE reductions.
- These reductions are also "uber"-advantage-tight.
- Their time overhead is a rather large polynomial that corresponds to hundreds of bits of security for typical parameters.
- Yet we don't know how to solve the LWE problem significantly faster than its underlying approxSIVP.
- In fact, the LWE problem may even be substantially *harder* than its underlying approxSIVP. (This would not be in conflict with the reduction.)
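The "24 bits" figure in the search-to-decision item above can be reproduced with hypothetical "typical" parameters (n and q roughly in the range used by LWE-based submissions; eps' = 1 is the most favorable case for the reduction):

```python
import math

# Time overhead of the LWE search-to-decision reduction, ~ n*q/eps'.
# Parameters are illustrative stand-ins, not tied to any one submission.
n, q, eps = 640, 2 ** 15, 1.0    # eps' = 1 minimizes the overhead
overhead_bits = math.log2(n * q / eps)
assert 24 < overhead_bits < 25   # "at least 24 bits" of time overhead
```

Any smaller eps' only increases the overhead, so 24 bits is a floor, not an estimate.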

Sincerely yours in cryptography,

Chris

--

You received this message because you are subscribed to the Google Groups "pqc-forum" group.

To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.

To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/20210619153506.644871.qmail%40cr.yp.to.

Jun 20, 2021, 11:42:03 PM

to Greg Maxwell, daniel.apon, D. J. Bernstein, pqc-...@list.nist.gov

Greg, a few points of clarification:

On Sun, Jun 20, 2021 at 8:03 PM Greg Maxwell <gmax...@gmail.com> wrote:

Dan has raised a credible argument that loose (or inapplicable) proofs

may have degraded submissions.

I don't think that it is a credible argument, as I explain in my latest email highlighting the distinction between advantage-loose reductions (demonstrably risky, with multiple serious security losses) and advantage-tight, poly-time-loose ones (no demonstrated security losses, to my knowledge). From the very start, the argument is premised on a conflation of the two notions.

To the extent that tighter proofs may

exist it is not necessarily the case that everyone knows about them--

e.g. Even in Peikert's case he states that he was unaware of the

particular work more recently discussed until near the round-3

deadline.

To be clear, that particular work did *not* give a new or tighter proof (reduction) for LWE, nor new techniques. It analyzed the existing proofs from the perspective of concrete parameters.

Would Frodo-- or some other candidate-- have had a

different design if the new techniques were known previously

So, the premise behind this question -- that this later work contained new techniques that might have affected the design -- is false.

To some extent I think Dan's general argument is hard to argue with,

at least the process needs to be mindful of uselessly loose proofs or

non-parameter-compensated proposals.

I think Dan's argument is not so hard to dispute, once one recognizes the big difference between time-looseness and advantage-looseness.

I do agree that the process needs to be mindful of uselessly loose proofs. That is why I recommend that, given the history, nontrivial advantage-tight reductions be given much more credit than advantage-loose ones.

That is, unless you reject

Koblitz and Menezes well defended position that inappropriately

applied provable security has been shown to actively undermine

cryptosystem security in some instances.

I don't reject that position, but I do highlight that in all such instances they have put forward related to looseness, the blame lies in advantage-looseness, not time-looseness.

In what I have read in their critiques, they do not distinguish between the two notions; they conflate them into a single quantity. I think this is a mistake, because it obscures the true source of the demonstrated security losses, and inappropriately tars one type of looseness with the failures of the other.

The full strength version that submissions should be *penalized* for

using a weak argument (maybe even if later stronger ones were

discovered) is perhaps a bridge too far, but the science of high

integrity processes for selecting cryptosystems is far from

established and Dan's points are precisely the sort that should be

considered to help advance that field.

I agree that Dan's points should be carefully considered and evaluated, and I have done so (as you can see from my lengthy replies).

Ultimately, I find his comparison of the (advantage-)loose MQDSS reduction to the (advantage-tight, time-)loose SIVP<=LWE reduction inapt and unpersuasive. This is because the comparison is premised on the faulty notion of a single combined metric of "looseness" that does not distinguish between time and advantage.

As an empirical matter, these two kinds of looseness carry very different levels of risk -- advantage-looseness is the source of several major demonstrated security losses, whereas time-looseness (on its own) is not. So, the process should treat them differently.

Jun 21, 2021, 5:00:32 AM

to pqc-forum

I appreciate the clarity of the new claim that "advantage-tight (but

poly-time-loose) reductions" are not "demonstrably risky".

To disprove the claim, I'll specify problems X and Y; exhibit an

advantage-tight but poly-time-loose reduction converting any attack

against X into an attack against Y; and note that known attacks against

X are in fact much faster than known attacks against Y.

This is a basic textbook example. It's a pre-quantum example---the same

structure used inside the reduction turns out, with more work, to enable

a poly-time quantum break, which is another reason to be concerned about

such reductions---but this suffices to disprove the claim. (The examples

selected to support the claim included pre-quantum examples. Retroactive

narrowing of the claim to post-quantum examples would be p-hacking.)

Problem Y: Compute discrete logs in a group G(lambda) with generator

g(lambda), where the discrete logs are chosen uniformly at random mod

#G(lambda), and lambda is the security parameter.

Problem X: Same except that the discrete logs are chosen uniformly at

random between 0 and E(lambda)-1 where E(lambda) <= #G(lambda) and the

ratio #G(lambda)/E(lambda) is Theta(lambda^10). (In context, this would

be the proposed problem that someone, presumably not knowing any better

attacks, is trying to convince us is at least as secure as Problem Y.)

Reduction: Given an attack A against Problem X, here's an attack B

against Problem Y. The Y input is h. Abbreviate g(lambda) as g. Take R

in Theta(lambda^10), and repeat the following R times: compute e =

A(hg^r) for uniform random r mod #G; if g^e = hg^r, output e-r mod #G

and stop. (Some people would find it cleaner to insert a test for e

being between 0 and E(lambda)-1, but one-sided bounds don't need this.)

Advantage tightness of the reduction: The probability that hg^r = g^e

for some e between 0 and E(lambda)-1 is E(lambda)/#G(lambda). If this

occurs, then hg^r is a valid A input, and A(hg^r) recovers e with

conditional probability eps by assumption, in which case B sees that e

is between 0 and E(lambda)-1, sees that g^e = hg^r, and outputs e-r,

which is a correct discrete log of h. Overall each loop succeeds with

probability eps E(lambda)/#G(lambda). Failures are independent across

loops, so the overall failure chance is (1-eps E(lambda)/#G(lambda))^R,

which is (1-eps/Theta(lambda^10))^Theta(lambda^10), i.e., 1-Theta(eps),

for total success chance Theta(eps).

Time looseness of the reduction: B takes Theta(lambda^10) times as long

as A, plus poly overhead for computing g^r etc. (Maybe B gets lucky and

succeeds more quickly, but looseness is defined by the upper bound.)

Loss of security against known attacks: For reasonable choices of G,

the best attacks known against Problem Y take Theta(sqrt(#G(lambda)))

group operations, while the best attacks known against Problem X (by the

"kangaroo method") are Theta(lambda^5) times faster than that. For

concrete sizes this could easily make attacks trillions of times faster,

depending on the Theta constants.

---Dan
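Dan's reduction can be run end-to-end on a toy instantiation. Everything below is an illustrative stand-in for the asymptotic objects in the message: a 16-bit prime group instead of a cryptographic one, and a lookup table playing the role of a perfect attack A (eps = 1) against Problem X:

```python
import random

p = 65537                    # toy prime; group Z_p^* has order n = p - 1
g, n = 3, p - 1              # 3 generates Z_65537^*
E = 2 ** 8                   # Problem X: exponents in [0, E); ratio n/E = 256

# A "perfect" attack on Problem X: table lookup over the short exponent range.
TABLE = {pow(g, e, p): e for e in range(E)}

def attack_B(h):
    """The reduction: an attack on Problem Y (uniform exponent) wrapping A.
    Advantage-tight up to constants, but ~n/E times slower than A."""
    R = 16 * (n // E)                        # Theta(#G/E) re-randomized trials
    for _ in range(R):
        r = random.randrange(n)
        e = TABLE.get(h * pow(g, r, p) % p)  # run A on the input h * g^r
        if e is not None:
            return (e - r) % n               # g^e = h*g^r, so e - r is a dlog
    return None

x = random.randrange(n)      # uniform Problem-Y instance
h = pow(g, x, p)
assert attack_B(h) == x      # succeeds with overwhelming probability
```

Shrinking E relative to #G makes B proportionally slower than A while preserving its advantage, which is exactly the time-looseness the example isolates.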


Jun 21, 2021, 11:58:04 AM

to D. J. Bernstein, pqc-forum

Hi Dan,

Would it be fair to generalize this counterexample as follows?

"""

Chris claims a distinction that reductions with advantage loss are risky,

but those with polynomial time loss are usually fine. But this criterion

can’t be accurate for certain domains (especially: random-self-reducible

search problems with efficiently checkable answers), because the

advantage can be boosted at the cost of time by repeatedly running

the reduction with different randomness. So an advantage-loose

reduction implies a time-loose but advantage-tight one.

You could try to make some kind of structural distinction to disallow

boosting, but it would inevitably fail to be rigorous.

"""
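The boosting step in the quoted generalization can be sketched concretely. Everything here (the solver, the verifier, the numbers) is a hypothetical stand-in, not any real scheme:

```python
import math
import random

SECRET = 42                              # hypothetical search-problem answer

def weak_solver(eps):
    """Succeeds with probability eps; otherwise returns garbage."""
    return SECRET if random.random() < eps else random.randrange(10 ** 6)

def check(candidate):
    """Efficient verifier for candidate answers (the enabling assumption)."""
    return candidate == SECRET

def boosted_solver(eps, delta=2 ** -40):
    """Advantage ~1 - delta at a time overhead factor R ~ ln(1/delta)/eps:
    an advantage-loose solver becomes a time-loose, advantage-tight one."""
    R = math.ceil(math.log(1 / delta) / eps)
    for _ in range(R):
        a = weak_solver(eps)
        if check(a):
            return a
    return None

assert boosted_solver(0.01) == SECRET
```

The repetition count R is where the advantage loss reappears as time loss, which is the crux of the quoted argument.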

A couple more points, since I’ve gotten involved. First, I consider the

LWE >= SIVP/BDDwDGS and LWR >= LWE reductions not to say that

LWE or LWR are better choices for cryptosystems than those other

problems, precisely because the reductions are loose. On the contrary,

if we could base a key exchange directly on worst-case SIVP, with

similar parameters to an LWE cryptosystem (both in terms of S[I]VP

dimension and cryptosystem performance), then this would be

preferable in my opinion. Instead, the reduction excludes a way that

LWE might fall down spectacularly, namely an attack that breaks LWE

with Gaussian error in polynomial time but doesn't break worst-case

SIVP.

In the case of Frodo, the reduction is loose with respect to dimension

and modulus. So perhaps an attack could take advantage of Frodo’s

small modulus or small dimension, but probably not purely its choice

of Gaussian noise. This isn’t a convincing argument for Frodo’s

security, but it is an argument that the authors didn’t screw up certain

aspects of the design, much as a (Q)ROM proof mostly exists to show

that you didn’t screw up the padding scheme.

Likewise with the LWR >= LWE reduction, this doesn’t say that LWR is

a better choice than LWE, because the parameters are much worse.

On the contrary, if we could build LWE with the same simplicity and

performance as LWR at the same parameter sizes, then this would

be preferable in my opinion. Rather, the reduction (weakly) addresses

the concern that LWR might be somehow structurally much worse

than LWE, by ruling out attacks that break LWR despite extreme

amounts of rounding, but not LWE with smaller amounts of noise.

In general, I think it’s perfectly acceptable to use such theorems to

justify design decisions — they just aren’t meaningful as security

arguments for the whole system. There are other contexts where

we use similar reasoning to choose or to justify aspects of the

design. For example, we use (Q)ROM proofs to justify padding

schemes, even though the real hash isn’t a random oracle. Some

systems avoid unnecessary structure (e.g. cyclotomics, or reducible

or composite moduli, or elliptic curves with small CM discriminant)

even when it isn’t known to lead to an attack on the system, because

it rules out a class of hypothetical attacks which would take advantage

of those structures.

It’s certainly possible to use these arguments dishonestly, but I don’t

think it’s fair to presume dishonesty without evidence. It’s also possible

that they lead to a system which is somehow worse — usually slower

or more complicated, e.g. I would not argue for adding an extra bit to

RSA-FDH to make the ROM reduction go through — but conceivably

also less secure. But again, I would not presume without evidence

that a system is less secure due to a reduction-motivated choice, and

the performance and complexity are already part of the evaluation.

So I don’t think it’s at all fair to say that we should give such proofs

a negative weight in the evaluation.

Best regards,

— Mike


> --

> You received this message because you are subscribed to the Google Groups "pqc-forum" group.

> To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.

> To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/20210621090014.782571.qmail%40cr.yp.to.

Jun 21, 2021, 12:02:27 PM

to pqc-forum

On Mon, Jun 21, 2021 at 5:00 AM D. J. Bernstein <d...@cr.yp.to> wrote:

I appreciate the clarity of the new claim that "advantage-tight (but

poly-time-loose) reductions" are not "demonstrably risky".

To disprove the claim, I'll specify problems …

Dan, thanks for the example. As I said, I suspect that contrived examples exist (of advantage-tight reductions X<=Y where Y is significantly easier than X in a concrete sense), and this one is certainly contrived! That is immediately apparent from the lambda^10 appearing in the problem description; see also the end of this message for a quick concrete analysis.

However, I don't think that this kind of contrived example demonstrates any real risk. See below for my reasoning, but first...

=== Interlude

Do you agree with the following of my other points, which you have not remarked upon (and are not dependent on the point you did address)? To save time, I would be happy to interpret your silence as agreement (and to do the same for the unanswered points made in this, this, this, this, this, this, and this).

- In all the previously presented examples of loose reductions that are accompanied by demonstrated security losses -- including attacks on real-world proposals MQDSS, symmetric modes of operation, and multi-user systems -- the loss is directly attributable to the reduction's "advantage loss," not its "time loss." So, advantage-loose reductions are demonstrably risky.
- Advantage-tight (but poly-time-loose) reductions are highly correlated with a *lack* of security loss in practice (i.e., in real, non-contrived systems). Indeed, the critiques of looseness have presented *zero* counterexamples to this phenomenon.
- For these reasons, it is a mistake to conflate time-looseness and advantage-looseness into a single metric. In particular, your comparison between the advantage-loose MQDSS reduction and advantage-tight lattice reductions is inapt.
- For the same reasons, NIST should tend to credit nontrivial "advantage-tight" reductions much more highly than "advantage-loose" ones.
- The {approxSIVP,GapSVP,BDDwDGS}<=LWE reductions are very advantage-tight and even apply to the proposed LWE parameters underlying FrodoKEM.

- The known LWE<=LWR reductions are inherently very advantage-loose for the proposed parameters of LWR-style submissions (NTRU Prime and SABER).

=== (now back to Dan's example)

Contrived examples have their place---for example, there are contrived and even much-less-contrived counterexamples to the random-oracle heuristic. Yet in practice we treat real hash functions as random oracles all over the place, where we judge that this represents an acceptably low risk compared to the benefits.

I have not heard of any real standards or systems that received good expert design/scrutiny of their random-oracle modeling, and ended up suffering a significant security loss due to a previously unknown failure mode. (To be clear: the possibility that the hash function will fail to have standard properties like collision resistance has always been a *known* failure mode.) In this sense, despite the existence of contrived counterexamples, we have not yet seen any *demonstrated risk* of the random-oracle heuristic, applied competently.

Similarly, I don’t think your example demonstrates any real risk. It would only do so if:

- it was not contrived, i.e., it had a shred of a chance of being proposed for real use, independent of its role as an artificial example, and
- the proposer relied primarily on the reduction X<=Y to justify the concrete security of the (contrived) problem Y, and
- there was not adequate cryptanalysis of problem Y that would reveal its relative weakness.

It's worth noting that none of these apply to LWE parameters covered by a worst-case hardness theorem, as used by FrodoKEM. (1) LWE is obviously not contrived. (2) The FrodoKEM submission explicitly disclaims using the worst-case-hardness reduction for concrete security; it uses cryptanalysis. (3) There has been enormous cryptanalytic effort on LWE, most of which even targets FrodoKEM's specific family of error distributions (Gaussians).

Interestingly, I think your example actually illustrates why there is *less risk* when using problems supported by worst-case hardness reductions (even time-loose ones), as FrodoKEM's LWE parameters are.

- In the proposed reduction X<=Y, problem X is discrete log with uniformly random exponent, and Y is dlog with an ad-hoc, highly restricted exponent (chosen from a relatively much smaller interval).
- Since the 1970s we have known that X has a kind of "worst-case-hard" exponent distribution, i.e., solving dlog (in the fixed group) for uniform exponent is at least as hard as for any other exponent distribution, via a tight reduction. In particular, Y<=X (tightly).
- The ad-hoc exponent distribution in Y has a significant non-obvious weakness, which turns out to be exploitable by a clever cryptanalytic attack (Pollard's kangaroo).
- Clearly, problem X (or something with near-uniform exponent distribution) is the less risky choice.

By analogy:

- X is to Y, as LWE (with worst-case-hard parameters) is to ad-hoc lattice problems seen in other NISTPQC submissions---e.g., using quotients; using very small binary/ternary errors, perhaps of fixed Hamming weight; using small deterministic rounding instead of random errors; etc.
- In recent years we have seen multiple clever cryptanalytic attacks that can exploit each one of these ad-hoc choices to various degrees in certain parameter regimes---even completely breaking some parameters that were thought to be quite secure! And some choices, like small rounding, have not received much devoted cryptanalytic effort. Our understanding of these ad-hoc choices and their various combinations is not at all mature.
- By contrast, and despite a lot of cryptanalytic effort, we have not seen any attacks that do better on LWE than on generic BDD for corresponding lattice parameters.
- On this evidence, LWE (or something distributionally close to it) is the less risky choice.

The only gap in the analogy is that LWE's worst-case-hardness reduction is time-loose (but advantage-tight), whereas discrete log's is fully tight. But even the critics have not provided a single example where this kind of distinction has led to a significant security gap in a non-contrived proposal. As a matter of demonstrated risk, the time-looseness is far less risky than the ad-hoc choices described above.

To close, here is a concrete analysis of the contrived example. (Note: the labeling of X,Y and eps, eps’ are swapped between Dan’s and my messages, but for consistency I will swap them back when I quote him, so that we're always talking about reductions X<=Y.)

Problem X: Compute discrete logs in a group G(lambda) with generator

g(lambda), where the discrete logs are chosen uniformly at random mod

#G(lambda), and lambda is the security parameter.

Problem Y: Same except that the discrete logs are chosen uniformly at

random between 0 and E(lambda)-1 where E(lambda) <= #G(lambda) and the

ratio #G(lambda)/E(lambda) is Theta(lambda^10).

The lambda^10 already makes this contrived. More generally, we must use a concretely big ratio P = #G(lambda)/E(lambda) to make the example work, because the X-versus-Y security gap (in bits) is roughly (log P)/2.

For Y to have lambda bits of security against brute-force attacks, we need E(lambda) >= 2^lambda. Let’s take a typical lambda of 128=2^7. Then in the stipulated group G, discrete logs are about 2*lambda=256 bits long.

So, in Y the discrete log is 70 bits shorter than in X (using P=lambda^10). In other words, it fixes more than 27% of the exponent's bits to zero. This is highly unnatural, and doesn't even pass a basic sanity check (even if we didn't know about Pollard's kangaroo attack).
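A quick check of the arithmetic above (lambda = 128, P = lambda^10):

```python
import math

lam = 128                        # lambda
P = lam ** 10                    # ratio #G(lambda)/E(lambda)
gap_bits = math.log2(P) / 2      # X-versus-Y security gap ~ (log P)/2
exp_bits = 2 * lam               # ~256-bit discrete logs in the group G
zero_bits = math.log2(P)         # exponent bits fixed to zero in Problem Y

assert gap_bits == 35.0
assert zero_bits == 70.0                  # "70 bits shorter"
assert zero_bits / exp_bits > 0.27        # "more than 27% of the bits"
```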

In sum, the example does not demonstrate any real risk, especially to a highly scrutinized standardization process.

Jun 21, 2021, 5:57:36 PM

to pqc-forum

Security claims regarding NISTPQC submissions have to be stated clearly

for public review. When a reviewer disproves a security claim, the claim

has to be withdrawn, and the withdrawal has to properly credit the

disproof. Any other approach disincentivizes security review.

There was a clear claim that "advantage-tight (but poly-time-loose)

reductions" are not "demonstrably risky". This claim had no ill-defined

escape hatches. I wrote a detailed disproof of the claim. Where's the

retraction?

> So, in Y the discrete log is 70 bits shorter than in X (using P=lambda^10). In

> other words, it fixes more than 27% of the exponent's bits to zero. This is

> highly unnatural, and doesn't even pass a basic sanity check (even if we didn't

> know about Pollard's kangaroo attack).

In fact, there are many short-exponent discrete-log proposals in the

literature, for example driven by efficiency concerns (along with some

assessment of known attacks). How is it not _natural_ to save time?

Cryptographers also have different ideas regarding what qualifies as a

"sanity check".

The reduction between discrete logs and short discrete logs is

well-known textbook material. If a short-discrete-log proposer hears a

claim that "advantage-tight (but poly-time-loose) reductions" are a good

thing, isn't the _natural_ response to

* write down such a reduction and then

* specialize the polynomial to fit the proposal's parameters?

Doing this work for a real short-discrete-log proposal would come out

with different polynomials, depending on the proposal details---maybe

lambda^10 to fit one proposal, maybe 100*lambda^50 to fit another. The

looseness of the theorems _for these parameters_ would be quantified and

reviewable, helping everyone see the limits of the theorems.

We're being told that seeing a lambda^10 makes it "immediately apparent"

that we're looking at something "contrived". But how are these specific

numbers not the _natural_ result of people designing their proposals for

this advertising mechanism---including following through and quantifying

their theorems, as per Goldreich's statement that "one needs to know the

exact complexity of the reduction"?

If NISTPQC proposals advertising "worst-case" theorems were required to

match up their parameter sets to quantified theorem statements then we'd

see all sorts of specific numbers there too. By an amazing coincidence,

in this very discussion someone highlighted a thesis where Theorem 1 has

looseness factor 276 x^2 y n^8 m q log(...), which sounds like total

degree 13, including 8 just for n. If the definition of, e.g., Frodo had

been modified to allow this theorem to apply---while, subject to this

constraint, minimizing space and time---then wouldn't we see related

numbers in the definition of the cryptosystem?

> To save time, I would be happy to interpret your silence as agreement

> (and to do the same for the unanswered points made in this, this,

> this, this, this, this, and this).

I'm too busy at the moment to give those additional points the treatment

that they clearly deserve. My apologies.

Since you seem to have some time to spare, perhaps you could consider

addressing the unanswered points on this topic already made two years

ago in https://cr.yp.to/papers.html#latticeproofs, which identifies four

of your claims as erroneous, and a fifth as unclear.

Let me also suggest an alternative way to save time, namely a simple

requirement of _clarity_ for every step of NISTPQC evaluations and

decisions. For example, instead of spending time arguing about what's

"natural", we can simply agree that the lack of clarity of the word

forces it to be eliminated from the discussion.

This is _almost_ required by the call for proposals, which says that

"NIST will perform a thorough analysis of the submitted algorithms in a

manner that is open and transparent to the public"---the analysis can't

be transparent if the public can't figure out what the words mean. The

only question, then, is whether NIST should be allowed to evade this

requirement by relying on unclear evaluations from outside NIST.

---Dan

for public review. When a reviewer disproves a security claim, the claim

has to be withdrawn, and the withdrawal has to properly credit the

disproof. Any other approach disincentivizes security review.

There was a clear claim that "advantage-tight (but poly-time-loose)

reductions" are not "demonstrably risky". This claim had no ill-defined

escape hatches. I wrote a detailed disproof of the claim. Where's the

retraction?

> So, in Y the discrete log is 70 bits shorter than in X (using P=lambda^10). In

> other words, it fixes more than 27% of the exponent's bits to zero. This is

> highly unnatural, and doesn't even pass a basic sanity check (even if we didn't

> know about Pollard's kangaroo attack).

literature, for example driven by efficiency concerns (along with some

assessment of known attacks). How is it not _natural_ to save time?

Cryptographers also have different ideas regarding what qualifies as a

"sanity check".
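The square-root attack cost behind that sanity check is easy to demonstrate. The kangaroo method achieves it with low memory; the sketch below instead uses its deterministic textbook cousin, baby-step giant-step restricted to the short interval. The modulus, generator, and bound are toy values chosen purely for illustration, not taken from any proposal.

```python
from math import isqrt

def short_dlog(g, h, p, bound):
    """Solve g^x = h (mod p) for x in [0, bound) in about sqrt(bound)
    group operations: baby-step giant-step over the short interval."""
    m = isqrt(bound) + 1
    baby = {}
    e = 1
    for j in range(m):            # baby steps: remember g^j -> j
        baby.setdefault(e, j)
        e = e * g % p
    giant = pow(g, -m, p)         # g^(-m) mod p (Python 3.8+)
    y = h
    for i in range(m + 1):        # giant steps: test h * g^(-i*m)
        if y in baby:
            return i * m + baby[y]
        y = y * giant % p
    return None

p = 2**31 - 1                     # toy Mersenne-prime modulus
g = 7                             # a primitive root mod p
x = 123456                        # a 17-bit "short" exponent
h = pow(g, x, p)
assert short_dlog(g, h, p, 2**20) == x   # ~2^10 steps, vs ~2^15.5 generically
```

Shortening the exponent by k bits cuts this generic attack cost by 2^(k/2), which is exactly why fixing over a quarter of the exponent's bits to zero fails the quoted sanity check.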

The reduction between discrete logs and short discrete logs is

well-known textbook material. If a short-discrete-log proposer hears a

claim that "advantage-tight (but poly-time-loose) reductions" are a good

thing, isn't the _natural_ response to

* write down such a reduction and then

* specialize the polynomial to fit the proposal's parameters?

Doing this work for a real short-discrete-log proposal would come out

with different polynomials, depending on the proposal details---maybe

lambda^10 to fit one proposal, maybe 100*lambda^50 to fit another. The

looseness of the theorems _for these parameters_ would be quantified and

reviewable, helping everyone see the limits of the theorems.

We're being told that seeing a lambda^10 makes it "immediately apparent"

that we're looking at something "contrived". But how are these specific

numbers not the _natural_ result of people designing their proposals for

this advertising mechanism---including following through and quantifying

their theorems, as per Goldreich's statement that "one needs to know the

exact complexity of the reduction"?

If NISTPQC proposals advertising "worst-case" theorems were required to

match up their parameter sets to quantified theorem statements then we'd

see all sorts of specific numbers there too. By an amazing coincidence,

in this very discussion someone highlighted a thesis where Theorem 1 has

looseness factor 276 x^2 y n^8 m q log(...), which sounds like total

degree 13, including 8 just for n. If the definition of, e.g., Frodo had

been modified to allow this theorem to apply---while, subject to this

constraint, minimizing space and time---then wouldn't we see related

numbers in the definition of the cryptosystem?
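For a sense of scale, the factor's total degree can be tallied directly, and even the n^8 term alone is enormous at practical dimensions. The value n = 640 below is a hypothetical Frodo-like dimension chosen purely for illustration; the remaining variables are ignored, which only understates the factor:

```python
from math import log2

# Degree tally for the looseness factor 276 * x^2 * y * n^8 * m * q * log(...)
degrees = {"x": 2, "y": 1, "n": 8, "m": 1, "q": 1}
assert sum(degrees.values()) == 13     # total degree 13, with 8 from n alone

# The n^8 term by itself, at a hypothetical dimension n = 640:
bits_from_n = 8 * log2(640)
assert 74 < bits_from_n < 75           # ~2^74.6 of looseness just from n^8
```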

> To save time, I would be happy to interpret your silence as agreement

> (and to do the same for the unanswered points made in this, this,

> this, this, this, this, and this).

I'm too busy at the moment to give those additional points the treatment that they clearly deserve. My apologies.

Since you seem to have some time to spare, perhaps you could consider

addressing the unanswered points on this topic already made two years

ago in https://cr.yp.to/papers.html#latticeproofs, which identifies four

of your claims as erroneous, and a fifth as unclear.

Let me also suggest an alternative way to save time, namely a simple

requirement of _clarity_ for every step of NISTPQC evaluations and

decisions. For example, instead of spending time arguing about what's

"natural", we can simply agree that the lack of clarity of the word

forces it to be eliminated from the discussion.

This is _almost_ required by the call for proposals, which says that

"NIST will perform a thorough analysis of the submitted algorithms in a

manner that is open and transparent to the public"---the analysis can't

be transparent if the public can't figure out what the words mean. The

only question, then, is whether NIST should be allowed to evade this

requirement by relying on unclear evaluations from outside NIST.

---Dan

Jun 21, 2021, 6:44:25 PM

to pqc-forum

On Mon, Jun 21, 2021 at 5:57 PM D. J. Bernstein <d...@cr.yp.to> wrote:

There was a clear claim that "advantage-tight (but poly-time-loose)

reductions" are not "demonstrably risky". This claim had no ill-defined

escape hatches. I wrote a detailed disproof of the claim. Where's the

retraction?

It seems we have different ideas of what "demonstrably risky" means. Everything has non-zero risk, but not everything is "risky." In my reply I gave my reasoning why I don't think your example rises to that level, and what it would take for one to do so (conditions 1-3). It's not a good use of our (nor pqc-forum's) time to debate such semantics further.

I do think it would be useful to address more than just that single one of my points. These are the key ones that remain; hopefully the semantics are clear enough.

- In all the previously presented examples of loose reductions that are accompanied by demonstrated security losses -- including attacks on real-world proposals MQDSS, symmetric modes of operation, and multi-user systems -- the loss is directly attributable to the reduction's "advantage loss," not its "time loss." So, advantage-loose reductions are demonstrably risky.
- Advantage-tight (but poly-time-loose) reductions are highly correlated with a *lack* of security loss in practice (i.e., in real, non-contrived systems). Indeed, the critiques of looseness have presented *zero* counterexamples to this phenomenon.
- For these reasons, it is a mistake to conflate time-looseness and advantage-looseness into a single metric. In particular, your comparison between the advantage-loose MQDSS reduction and advantage-tight lattice reductions is inapt.
- For the same reasons, NIST should tend to credit nontrivial "advantage-tight" reductions much more highly than "advantage-loose" ones.
- The {approxSIVP,GapSVP,BDDwDGS}<=LWE reductions are very advantage-tight and even apply to the proposed LWE parameters underlying FrodoKEM.
- The known LWE<=LWR reductions are inherently very advantage-loose for the proposed parameters of LWR-style submissions (NTRU Prime and SABER).

Sincerely yours in cryptography,

Chris

Jun 22, 2021, 12:20:26 AM

to Christopher J Peikert, pqc-forum

On Mon, Jun 21, 2021 at 3:44 PM Christopher J Peikert

<cpei...@alum.mit.edu> wrote:

>

> On Mon, Jun 21, 2021 at 5:57 PM D. J. Bernstein <d...@cr.yp.to> wrote:

>>

>> There was a clear claim that "advantage-tight (but poly-time-loose)

>> reductions" are not "demonstrably risky". This claim had no ill-defined

>> escape hatches. I wrote a detailed disproof of the claim. Where's the

>> retraction?

>

>

> It seems we have different ideas of what "demonstrably risky" means. Everything has non-zero risk, but not everything is "risky." In my reply I gave my reasoning why I don't think your example rises to that level, and what it would take for one to do so (conditions 1-3). It's not a good use of our (nor pqc-forum's) time to debate such semantics further.

So you agree the theorems don't actually inform parameter setting that
directly because they don't describe the worst case attacker time

then?

We're interested in the actual security of a scheme, that is the best

possible attack. Now this is inherently unknowable. So what are we to

do? It seems to me that roughly there are four approaches:

1: We can have well studied schemes that people have put work into

analyzing as the basis of our schemes

2: We can argue from longevity

3: We can reduce the actual scheme to a well specified problem, that

is considered hard

4: We can argue that some analogous problems are hard, and that we've

avoided the problems from before.

NTRU and Classic McEliece would be an example of 1+2. Both are many

decades old, have survived quite a bit of analytic effort as shown by

the demise of similar schemes in the code world. SIKE would be an

example of 3+1: we know lots about quaternion algebras and the

computation of isogenies, and while the extra data might be worrisome,

it's been around a while with no real improvements. NTRU Prime would

be 4+1: we're avoiding particular problematic structures. This is a

rough schema, there's lots of overlap.

The best case example of setting parameters ignoring reduction loss

would be Schnorr signatures. We prove it in the ROM, the forking lemma

costs us square root of our soundness. But then we say that no one has

come up with an attack that can exploit this in however many years,

and no one has any idea how to use it, and so we're justified in using

the hardness of the DLP to set parameters. And I doubt that people

have a serious objection to this: it's quite hard to see how one would

use the fact that it's SHA-256 in any way to attack Schnorr.
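That square-root cost can be made concrete with a rough model (illustrative numbers; constants and the forger's hash-query count are suppressed):

```python
from math import sqrt

# Forking-lemma bookkeeping, roughly: a forger with advantage eps yields
# a discrete-log solver with advantage about eps^2. Equivalently, a DLP
# advantage bound of 2^-128 only caps forgery advantage near its square root.
adv_dlp = 2.0 ** -128            # assumed DLP advantage bound at the parameters
adv_forge_bound = sqrt(adv_dlp)
assert adv_forge_bound == 2.0 ** -64
```

Parameter setting for Schnorr ignores this gap, as the paragraph above says, because nobody has any idea how to exploit it.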

By contrast the LWE/SIVP reduction seems a bit fishy. We invoke it,

but then use arguments about the runtime of known LWE attacks to set

parameters. So what's the point of using the reduction, given that it

doesn't rule out a whole bunch of attacks we are worried about, if the

constants are sufficiently bad? And given that we're not really using

the reduction to set the parameters, why is the looseness of LWR to

LWE more worrisome than LWE to SIVP?

Sincerely,

Watson

>

> Sincerely yours in cryptography,

> Chris

>


Jun 22, 2021, 8:38:44 AM

to Watson Ladd, pqc-forum

Watson, this message directly answers the questions you have posed, but some of these questions were already answered near the beginnings of my recent messages in this thread. So, I hope you (and everyone else) will read this one carefully and follow the references for further details.

So you agree the theorems don't actually inform parameter setting that

directly because they don't describe the worst case attacker time

then?

I agree that the known worst-case-hardness theorems don't actually inform parameter setting that directly, and have said so publicly and explicitly more times than I can count (including in this very thread), and have never claimed or even suggested otherwise. The FrodoKEM submission says the same, in multiple places. The 2008 Micciancio-Regev survey on lattice-based cryptography says the same. If anyone has the impression that any of these sources say otherwise, they are badly misinformed.

The reason is not that the theorems "don't describe the worst-case attacker time" -- follow-ups to the seminal works have done exactly that -- but that the analyzed time is too large to draw any useful conclusions about the concrete parameters that have been proposed for practice.

However, the Master's thesis I previously linked to claims that one *can* draw useful conclusions about concrete parameters that are not exceedingly far from the proposed ones. If this is true, it indicates reasonable hope that the proposed parameters could one day be meaningfully supported by tighter worst-case-hardness theorems; for details see the latter half of my reply to your previous message in this thread.

We're interested in the actual security of a scheme, that is the best

possible attack. Now this is inherently unknowable. So what are we to

do? It seems to me that roughly there are four approaches:

1: We can have well studied schemes that people have put work into

analyzing as the basis of our schemes

2: We can argue from longevity

3: We can reduce the actual scheme to a well specified problem, that

is considered hard

4: We can argue that some analogous problems are hard, and that we've

avoided the problems from before.

Following your categorization of other NISTPQC KEMs, FrodoKEM would be an example of 1+2+3+4. (If one thinks the worst-case-hardness theorems have no meaning in practice, then one can focus on these properties instead.)

1. People have put a great deal of work into concretely analyzing the (plain) LWE problem, which is the direct basis of the FrodoKEM construction (see point 3). LWE and its sibling SIS are now the central problems that lattice cryptanalysis has focused on for many years, and this work builds on decades of prior lattice algorithms and cryptanalysis.

2. LWE using Gaussian error was formally defined in 2005, but the "LWE family" of problems -- differing mainly in their use of somewhat different error distributions -- has an even longer history:

- LWE is exactly the "dual"/"bounded-distance decoding" version of the SIS problem, which was proposed in 1996 (concurrently with NTRU).
- It is tightly equivalent to a "bounded random knapsack" [of density Omega(1) or even 1-o(1), depending on parameters], versions of which have been considered dating back to the 1980s.

3. FrodoPKE (the "pre-CCA-transform" version of FrodoKEM) is tightly equivalent to LWE with matching parameters; the latter is widely considered hard (see point 1). [NB: All the remaining NISTPQC KEMs rely on some CCA transform of a non-CCA base system, so they are similar in this respect.]

4. See point 2 for problems that LWE is tightly equivalent to, and add the analogous or even equivalent (via loose reduction) problems: high-density random subset-sum (1980s), Ajtai-Dwork (1997), NTRU (1996), approximate-GCD, and maybe others. All are considered hard, for appropriate parameters analogous to the ones proposed for LWE.

The best case example of setting parameters ignoring reduction loss

would be Schnorr signatures. We prove it in the ROM, the forking lemma

costs us square root of our soundness. But then we say that no one has

come up with an attack that can exploit this in however many years,

and no one has any idea how to use it, and so we're justified in using

the hardness of the DLP to set parameters.

I agree that this is a best-case example. In addition to being very advantage-loose, the reduction is also quite time-loose.

By contrast, the approxSIVP<=LWE reduction is very time-loose, but it is advantage-tight, doesn't use the ROM, and doesn't use the forking lemma. And despite lots of cryptanalytic effort, nobody has come up with an attack that exploits the specific LWE distribution any better than for BDD on generic lattices. So far, so good.

By contrast, people *have found ways* to nontrivially exploit certain ad-hoc choices made in other NISTPQC lattice proposals, e.g., binary/ternary secrets, NTRU with "stretched" parameters. These choices carry nontrivial demonstrated risk. Others, like small deterministic rounding (in lieu of random errors), appear not to have received much dedicated cryptanalytic attention at all.

By contrast the LWE/SIVP reduction seems a bit fishy. We invoke it,

but then use arguments about the runtime of known LWE attacks to set

parameters. So what's the point of using the reduction, given that it

doesn't rule out a whole bunch of attacks we are worried about, if the

constants are sufficiently bad?

I answered this question near the top of my previous reply to you in this thread:

"FrodoKEM uses worst-case-hardness reductions to identify in the first place *which problem to parameterize* -- namely, LWE with Gaussian error -- and then does so according to (extrapolations of) the best known cryptanalysis.

In lattice cryptography, the importance of first identifying a sound problem to parameterize cannot be overstated. The design space is huge and subtle, and the history is littered with the corpses of ad-hoc designs...

The FrodoKEM submission cites a few historical examples of unexpected structural weaknesses in ad-hoc lattice designs (lacking worst-case hardness theorems), but a full list is much longer..."

And given that we're not really using

the reduction to set the parameters, why is the looseness of LWR to

LWE more worrisome than LWE to SIVP?

I also answered this question in the top-level summary of this message, but to repeat: one big reason is that the LWE<=LWR reduction is *extremely advantage-loose* for the parameters used in NISTPQC submissions, whereas the SIVP<=LWE reduction is very advantage-tight (but poly-time-loose).

Empirically, advantage-loose reductions have been much riskier than advantage-tight (but poly-time-loose) ones:

- There are multiple examples of even moderately advantage-loose reductions being directly to blame for major security gaps, including on standardized and deployed systems.
- By contrast, I know of no examples where an advantage-tight but poly-time-loose reduction hid a significant security gap in a carefully proposed and cryptanalyzed system.
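The distinction between the two kinds of looseness can be sketched numerically (placeholders only, not the parameters of any submission):

```python
from math import sqrt

problem_time_bits = 128       # assumed time hardness of the source problem

# Time-loose, advantage-tight (SIVP <= LWE flavor): a 2^70 running-time
# factor erodes the guaranteed attack time but leaves success-probability
# bounds intact.
guaranteed_time_bits = problem_time_bits - 70
assert guaranteed_time_bits == 58

# Advantage-loose (square-root flavor): the guaranteed cap on attacker
# success probability collapses from 2^-128 to 2^-64, even for fast attacks.
adv_cap = sqrt(2.0 ** -128)
assert adv_cap == 2.0 ** -64
```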

Jun 22, 2021, 9:13:42 AM

to Watson Ladd, pqc-forum

Apologies for one more email, but I want to be as clear as I possibly can about this:

On Tue, Jun 22, 2021 at 8:38 AM Christopher J Peikert <cpei...@alum.mit.edu> wrote:

- By contrast, I know of no examples where an advantage-tight but poly-time-loose reduction hid a significant security gap in a carefully proposed and cryptanalyzed system.

By "hid," I mean concealed the gap from discovery until after the reduction was used to justify security, and after the careful cryptanalysis was done. In particular, a gap that was already known beforehand cannot be hidden (by anything). I hope this is clear enough.
