Guidelines for submitting tweaks for Third Round Finalists and Candidates

Moody, Dustin (Fed)

Jul 23, 2020, 8:02:12 AM
to pqc-forum

Guidelines for submitting tweaks for Third Round Finalists and Candidates 

Deadline: October 1, 2020 

 

Finalist and Candidate teams must meet the same submission requirements and minimum acceptability criteria as given in the original Call for Proposals. Submission packages must be sent to NIST at pqc-sub...@nist.gov by October 1, 2020. It would be helpful if submission teams provided NIST with a summary of their expected changes by August 10, 2020. If either of these deadlines poses a problem for a submission team, the team should contact NIST in advance. In particular, submissions should include a cover sheet, algorithm specifications (and other supporting documentation), and optical/digital media (e.g., implementations, known-answer test files, etc.), as described in Section 2 of the CFP. 

 

NIST does NOT need new signed IP statements unless new submission team members have been added or the status of intellectual property for the submission has changed. If either of these cases applies, NIST will need new signed IP statements (see Section 2.D of the CFP). These statements must be actual hard copies (not digital scans) and must be provided to NIST by the 3rd NIST PQC Standardization Conference. In particular, NIST will need new signed IP statements for new members of the merged Classic McEliece team. 

 

In addition, NIST requires a short document outlining the modifications introduced in the new submission. This document should be included in the Supporting Documentation folder of the submission (see Section 2.C.4 of the CFP). NIST will review the proposed changes to see whether they meet the submission requirements and minimum acceptability criteria, as well as whether they would significantly affect the design of the algorithm and thus require a major re-evaluation. As a general guideline, NIST expects any modifications to the seven finalists to be relatively minor, while allowing more latitude to the eight alternate candidate algorithms. Note, however, that larger changes may signal that an algorithm is not mature enough for standardization for some time.  

 

As performance will continue to play a large role in the third round, NIST offers the following guidance. Submitters must include the reference and optimized implementations (which can be the same) with their submission package. The reference implementation should still be in ANSI C; however, the optimized implementation is not required to be in ANSI C. NIST strongly recommends also providing an AVX2 (Haswell) optimized implementation and encourages other optimized software implementations (e.g., for microcontrollers) and hardware implementations (e.g., FPGAs). 

 

NIST is aware that some submission packages may be large. The email system for pqc-submi...@nist.gov can only handle files up to 25 MB. For larger files, you may upload your submission package somewhere of your choosing and send us the download link when you submit. If that option is not suitable, NIST has a file transfer system that can be used; to find out about this option, please send a message to pqc-co...@nist.gov. NIST will review the submitted packages as quickly as possible and post the candidate submission packages which are “complete and proper” on our webpage www.nist.gov/pqcrypto. Teams are encouraged to submit early. General questions may be asked on the pqc-forum. For more specific questions, please contact us at pqc-co...@nist.gov. 

 

The NIST PQC team 


D. J. Bernstein

Jul 23, 2020, 10:36:23 AM
to pqc-...@list.nist.gov
I'm trying to figure out what security levels submitters are supposed to
target in third-round tweaks. This would seem urgent, given that NIST is
asking for summaries of tweaks in just two weeks.

The call for proposals said the following:

NIST recommends that submitters primarily focus on parameters meeting
the requirements for categories 1, 2 and/or 3, since these are likely
to provide sufficient security for the foreseeable future. To hedge
against future breakthroughs in cryptanalysis or computing
technology, NIST also recommends that submitters provide at least one
parameter set that provides a substantially higher level of security,
above category 3.

However, NIST's latest document states a different rule:

While category 1, 2, and 3 parameters were (and continue to be) the
most important targets for NIST’s evaluation, NIST nevertheless
strongly encourages the submitters to provide at least one parameter
set that meets category 5.

There's a jump from "above category 3" to "category 5", plus a wording
change from "recommends" to "strongly encourages". There are comments
applying the new rule to specific submissions, although the wording is
strikingly different from one submission to another:

* NIST "encourages" Dilithium to add a category 5 parameter set.

* BIKE "did not provide category 5 parameters", and NIST "strongly
encourages" them to do so.

* Anything named after NTRU has been assigned failure wording, as if
category 5 had been requested in the first place: for example, one
of these submissions "lacks" category 5 parameters.

Did I miss a previous announcement of the change of rules? What is the
rationale for the change of rules? Should submission teams aim for some
parameter sets above category 5 in case the goalposts move again?

---Dan

Moody, Dustin (Fed)

Jul 24, 2020, 11:39:49 AM
to D. J. Bernstein, pqc-forum
Dan,

As I noted in a pqc-forum message on June 25, the call for proposals specified a set of evaluation criteria, not a set of "rules." While NISTIR 8309 does strongly encourage submitters to provide at least one parameter set that meets category 5, providing one is not a requirement. If the intention had been for it to be a requirement, it would have been stated as one, as was done, for example, with Rainbow: 
Before Rainbow can be ready for standardization, its parameters must be adjusted to ensure that it meets its claimed security targets.
The wording in the comments for Dilithium, BIKE, NTRU, and NTRU Prime may differ, but that should be interpreted as the result of the report having been written by 13 different authors, rather than a deliberate attempt to send subtly different messages to different teams.

While providing category 5 parameters is not a requirement, the call for proposals did note that "schemes with greater flexibility will meet the needs of more users than less flexible schemes, and therefore, are preferable." It particularly noted that flexibility may include that "It is straightforward to customize the scheme’s parameters to meet a range of security targets and performance goals." Providing category 5 parameters would help to demonstrate that a scheme offers this flexibility.

Dustin
NIST





D. J. Bernstein

Jul 28, 2020, 4:35:24 AM
to pqc-forum
Still puzzled as to which security levels NIST wants from submitters.

The call for proposals "recommends ... at least one parameter set ...
above category 3". However, the latest document "strongly encourages ...
at least one parameter set" in "category 5".

This is clearly a change (even if "recommends" and "strongly encourages"
mean the same thing, which is far from clear). The call for proposals
defines category 4. Category 4 was good enough for the request in the
call for proposals. It is not good enough for the new request.

What is NIST's rationale for this change?

Should submitters also provide something _above_ category 5, in case
NIST decides to define an even higher security level at the end of round
3 and reward submissions that target this security level?

For example, should submitters target the security level of SHA3-512
preimages? This security level was required in the SHA-3 competition, so
why wouldn't it suddenly appear in NISTPQC too? It's not that there's
anything difficult about targeting this security level; it's just yet
another distraction from the job at hand, namely protecting users
against quantum computers.

It would be helpful for submitters to see a rationale that (1) makes
clear why NIST's request changed from 4 to 5 and (2) makes clear that
the request won't change again.

'Moody, Dustin (Fed)' via pqc-forum writes:
> While providing category 5 parameters is not a requirement, the call for
> proposals did note that "schemes with greater flexibility will meet the needs
> of more users than less flexible schemes, and therefore, are preferable." It
> particularly noted that flexibility may include that "It is straightforward to
> customize the scheme’s parameters to meet a range of security targets and
> performance goals." Providing category 5 parameters would help to demonstrate
> that a scheme offers this flexibility.

This says that submitters should also target security levels 2^512 and
2^1024 to demonstrate _even more_ flexibility (plus an even larger
"hedge" against cryptanalysis), right? If not, why not?

Content-wise, I don't understand why flexibility to reach super-high
security levels is even a question at this point. Didn't every
submission already explain how to scale up parameters? Is there some
dispute about the details? (Sure, some submissions struggle to prove
their claimed failure rates for decryption, but this is already an issue
for category 1.)

I also don't understand how this is supposed to answer the rationale
question. Yes, the call for proposals requested flexibility---and
requested a parameter set above category 3. If flexibility was
supposed to justify a higher request for category 5, why didn't the call
for proposals already request category 5? Why did this request suddenly
appear in July 2020?

Meanwhile NIST didn't criticize Kyber for a much more clear lack of
flexibility to target dimensions strictly between 512 and 768, which
are claimed to reach category 1 and category 3. If some implausibly
speculated inability to target 5 deserves complaints, why does a much
more clear inability to target 2---which NIST says is a much more
important security level than 5---not deserve complaints? NIST _did_
criticize NewHope for having nothing between 1 and 5!

Even if we ignore NIST's private discussions with NSA, it is not a good
look for NIST to be (1) inconsistently applying its announced evaluation
criteria and (2) changing to new evaluation criteria without first
asking for public comment. Perhaps there's a clear explanation for what
happened here, but I don't see it, and this raises further questions
regarding what's going to happen in the future.

Round-3 submitters are being asked to summarize changes by 10 August. In
previous rounds, a submission structurally locked into a few security
levels (such as NewHope) didn't really have a choice of what to do, but
more flexible submissions were actively misled by NIST's call for
proposals requesting something "above category 3" when now it turns out
that NIST wants specifically "category 5". Now we all have to ask
whether NIST is suddenly going to change the criteria again in 2021.

---Dan

Kirk Fleming

Jul 28, 2020, 1:58:18 PM
to pqc-...@list.nist.gov
In the table on page 18 of the Call for Proposals, NIST gives the following classical estimates for each security category.
 
  Category 1:  2^143 gates
  Category 2:  2^146 gates
  Category 3:  2^207 gates
  Category 4:  2^210 gates
  Category 5:  2^272 gates
 
I think all reasonable readers would agree that while Category 4 is higher than Category 3 (210 is indeed greater than 207), it is not substantially so. Rainbow is a good example of why Category 4 parameters are not enough of a hedge against even slight improvements in cryptanalysis. They claimed the Rainbow-IIIc parameters met the requirements for Category 4, but the improved RBS attack drops them below Category 3.
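 
To put numbers on "not substantially so", here is a quick Python sketch of the classical gaps between adjacent category floors. The exponents are NIST's; the arithmetic and labels are mine, for illustration only.
 
  # Classical gate-count floors from the CFP table, as log2 values.
  floors = {1: 143, 2: 146, 3: 207, 4: 210, 5: 272}
 
  for lo, hi in ((1, 2), (2, 3), (3, 4), (4, 5)):
      print(f"Category {lo} -> Category {hi}: factor 2^{floors[hi] - floors[lo]}")
 
  # Category 1 -> Category 2: factor 2^3   (about 8x)
  # Category 2 -> Category 3: factor 2^61
  # Category 3 -> Category 4: factor 2^3   (about 8x)
  # Category 4 -> Category 5: factor 2^62
 
Classically, the step from Category 3 to Category 4 is a factor of 8, while the step from Category 4 to Category 5 is a factor of 2^62.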
 
Kirk
 
 

D. J. Bernstein

Jul 29, 2020, 4:13:23 AM
to pqc-...@list.nist.gov
Kirk Fleming writes:
> In the table on page 18 of the Call for Proposals NIST give the following
> classical estimates for each security category.
> Category 1: 2^143 gates
> Category 2: 2^146 gates
> Category 3: 2^207 gates
> Category 4: 2^210 gates
> Category 5: 2^272 gates

I agree that for non-quantum attacks it's basically 1=2, big jump to
3=4, big jump to 5=6 (where by 6 I mean the next line in the table).

However, the quantum-attack costs in the same table are much more like
1, big jump to 2=3, big jump to 4=5, big jump to 6. Specifically, the
call for proposals considers MAXDEPTH as large as 2^96, and for this
MAXDEPTH the table estimates the following costs (obviously not
accounting for subsequent quantum AES attack speedups):

category 1: 2^74 quantum gates
category 2: 2^146 non-quantum gates
category 3: 2^137 quantum gates
category 4: 2^210 non-quantum gates
category 5: 2^202 quantum gates
line 6: 2^274 non-quantum gates

The point is that categories 1/3/5 allow Grover speedups while 2/4
don't, so AES-128 is nowhere near category 2, and AES-192 is nowhere
near category 4.
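
(For anyone who wants to recompute the table above: the call for
proposals gives the quantum key-search costs as 2^170/MAXDEPTH,
2^233/MAXDEPTH, and 2^298/MAXDEPTH gates for AES-128/-192/-256, and
flat non-quantum collision costs for the SHA lines. A small Python
sketch, illustrative only:

   # Exponents from NIST's table: AES key search costs 2^e/MAXDEPTH
   # quantum gates; SHA collisions cost 2^e non-quantum gates.
   MAXDEPTH = 96  # log2 of the largest depth the call considers

   aes = {"category 1 (AES-128)": 170,
          "category 3 (AES-192)": 233,
          "category 5 (AES-256)": 298}
   sha = {"category 2 (SHA-256)": 146,
          "category 4 (SHA-384)": 210,
          "line 6 (SHA-512)": 274}

   for name, e in aes.items():
       print(f"{name}: 2^{e - MAXDEPTH} quantum gates")
   for name, e in sha.items():
       print(f"{name}: 2^{e} non-quantum gates")

Setting MAXDEPTH to 2^96 reproduces the 2^74, 2^137, 2^202 figures
for the Grover-affected categories; smaller MAXDEPTH pushes those
numbers up.)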

It's certainly possible that quantum gates will be so expensive and
MAXDEPTH so much smaller as to close this gap---meaning that Grover's
algorithm will never be of any use, and quantum computing will pretty
much consist of using Shor etc. to break cryptosystems---but maybe
quantum computing will turn out to not have such extreme overheads.

No matter where category 4 is positioned between category 3 and category
5, there's clearly a huge quantitative jump from category 3 (AES-192) to
category 5 (AES-256), leaving a ton of room between the requested "above
category 3" in the call for proposals and the requested "category 5" in
the latest document. I'm still puzzled about NIST's rationale for this
change, and I'm still trying to figure out whether this means submitters
should propose parameter sets _beyond_ category 5.

> I think all reasonable readers would agree that while Category 4 is higher than
> Category 3 (210 is indeed greater than 207) it is not substantially so.

One could equally well argue the opposite picture, looking at the
quantum numbers above: category 4 is _much_ higher than category 3,
while category 3 is not substantially higher than category 2. Most
submissions have some sort of Grover speedup, creating a big gap between
category 1 and category 2, and a big gap between category 3 and category
4, where the size of the gap depends on the Grover effectiveness.

SIKE, on the other hand, seems unaffected by Grover-type algorithms
under reasonable cost assumptions, so _for that submission_ category 1
is pretty much the same as category 2, and category 3 is pretty much the
same as category 4, which is the picture you described.

(On the other _other_ hand, in the "quantum gates" metric with one very
common definition, SIKE _is_ affected by known quantum algorithms! But
this doesn't mean that SIKE would be unable to claim categories 2/4---it
really means that categories 2 and 4 provide less security against known
attacks than NIST's table claims, in one of the oversimplified metrics
that the table claims to be using. Anyway, in reality there's no
security loss.)

It's awfully confusing to try to compare submissions according to
"categories" whose spacing varies from one submission to another. I
objected to this whole system when the draft call was issued: I said
that we should focus exclusively on post-quantum security, emphasize
post-quantum security levels between 2^64 and 2^128, and compare systems
on a two-dimensional graph of performance vs. post-quantum security
level. See

https://blog.cr.yp.to/20161030-pqnist.html
https://groups.google.com/a/list.nist.gov/d/msg/pqc-forum/lBc-Gj2rSx0/cGCFEz4UAAAJ

and subsequent messages.

I don't think it's too late in the process for NIST to scrap the
counterproductive pre-quantum requirements, and to scrap the
counterproductive blurring of security levels into "categories". Imagine
how much information would be lost from the graphs in

https://cr.yp.to/papers.html#paretoviz

if each dot were bumped to one of just five possible vertical positions!

In any case, NIST should _not_ be making and applying changes in the
evaluation criteria without first asking for public comments on those
changes.

> Rainbow is a good example of why Category 4 parameters are not enough
> of a hedge against even slight improvements in cryptanalysis.

One can similarly argue that category 5 is not enough, since a slight
improvement in cryptanalysis can drop a category-5 parameter set below
the category-5 floor _and_ below the category-4 floor, assuming
sufficient Grover applicability.

However, before issuing the final call for proposals, NIST issued the
following clarification regarding procedures:

We’re not going to kick out a scheme just because they set their
parameters wrong. ... We will respond to attacks that contradict the
claimed security strength category, but do not bring the maturity of
the scheme into question, by bumping the parameter set down to a
lower category, and potentially encouraging the submitter to provide
a higher security parameter set.

I haven't seen any announcements of proposed changes to this policy. In
the latest document, NIST is allowing various submissions to proceed
even with proposed parameter sets that seem below the floor of category
1 as defined in the call for proposals---which is much closer to actual
danger than the categories 3 vs. 4 vs. 5 that we're discussing.

Regarding the bleeding-edge parameters, I commented on this list a few
months ago that the incentives here are a problem: "There are many
reasons to think that the efficiency of bleeding-edge parameters will
attract attention and reward the submission as a whole, while complaints
about security will at worst have _those parameters_ removed, according
to NIST's rules." Sure enough, NIST fell into exactly this trap in its
latest report, praising the performance of specific parameter sets that
everyone can see claim pre-quantum Core-SVP security of only 2^100!

---Dan

Moody, Dustin (Fed)

Jul 29, 2020, 10:59:45 AM
to D. J. Bernstein, pqc-forum

Dan,

 

In our report we strongly encouraged submitters to provide at least one parameter set that meets category 5.  We have previously noted this is NOT a requirement.  As we've also already said, the call for proposals specified a set of evaluation criteria, not a set of "rules" or "requirements".  

 

Again, while providing category 5 parameters is not a requirement, the call for proposals did note that "schemes with greater flexibility will meet the needs of more users than less flexible schemes, and therefore, are preferable." It particularly noted that flexibility may include that "It is straightforward to customize the scheme’s parameters to meet a range of security targets and performance goals." Providing category 5 parameters would help to demonstrate that a scheme offers this flexibility.  

 

We are happy to discuss and get feedback from the community on this (and any other) issue.  In doing so, we strive to adhere to the principles, processes, and procedures set forth in NISTIR 7977, NIST Cryptographic Standards and Guidelines Development Process.  We want the PQC standardization process to be as open and transparent as possible.  We encourage discussion on the pqc-forum, but as not everything needs to be on the public forum, we also can be contacted directly at pqc-co...@nist.gov.  



Dustin Moody

NIST PQC

 






Mike Hamburg

Jul 29, 2020, 12:38:50 PM
to Moody, Dustin (Fed), D. J. Bernstein, pqc-forum
Hi all,

Here is my view on additional parameter sets.

If the security levels of the proposals are not eroded by future incremental advances in cryptanalysis, then Category 5 is excessive.  Nobody is going to do 2^137 quantum work to break a Category 3 scheme, and that’s with the rather implausible MAXDEPTH=2^96; 2^64 is a more likely limit.  However, I don’t think we can be so confident in the security of any of the proposals, with the possible exceptions of SPHINCS+, PICNIC and maybe McEliece.  I am most familiar with the lattice proposals, and there has been significant progress in lattice cryptanalysis over the past decade.  So I think it makes sense to specify a Category 5 parameter set in most or all systems, or at least a “Category 3 under plausible improvements in asymmetric cryptanalysis” parameter set if that’s somehow different (e.g., 192-bit symmetric secrets + 64-bit salt, but 256-bit core-SVP or whatever is your favorite metric).

A Category 2 parameter set, or rather “Category 2a: halfway between Category 1 and Category 3,” which is potentially quite different, would also be useful for the same reason.  NTRU and NTRU Prime’s ability to add this easily is an advantage over the likes of Kyber and Saber.  Lacking such a parameter set, concerned implementors will need to use Category 3 parameters instead of 2a, which generally won’t be a show-stopper, but it should be considered as reducing the performance advantage of Kyber and Saber.  This is especially true for Kyber, whose attainment of Category 1 is debatable and may push more users to Category 3.  NIST did point this out in their rationale document, but I also think that this is an important area of concern for Kyber to address in Round 3, by some combination of improved analysis and stronger parameters.

If there are no advances in cryptanalysis, I don’t think intermediate parameter sets will see much deployment, at least for lattices.  In the classical crypto realm, we have seen that elliptic curves tend to be deployed at discrete levels (256, 384, 521) corresponding to the AES security levels, and Ed448 got quite a bit of pushback for not hitting these levels exactly.  Even AES-192 is virtually unused, because everybody jumps from AES-128 “fast and probably good enough” to AES-256 “military-grade security (tm)" for only 40% more cost.  By contrast, RSA is deployed at levels that don’t correspond to (128, 192, 256)-bit security, owing to both an unclear correspondence and a roughly lambda^9 performance curve.  Most PQ proposals do not have this sort of performance curve, and they will at least initially be advertised at roughly AES-matching security levels due to the category scheme, so I expect PQ to play out more like ECC than like RSA.  But creeping improvements in cryptanalysis might make it look more like RSA, where slightly stronger parameter sets could be deployed according to the user’s budget.
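
(To put rough numbers on those curves, here is a small Python sketch. The lambda^9 exponent is my rough model from above, not a measurement, and the AES figure is just the 14-round vs 10-round cost ratio.

  # Relative cost of lambda-bit security under an RSA-like lambda^9
  # curve, versus AES, where AES-256 costs ~40% more than AES-128.
  for lam in (128, 192, 256):
      print(f"{lam}-bit: RSA-like ~{(lam / 128) ** 9:.0f}x the 128-bit cost")

  # 128-bit: ~1x, 192-bit: ~38x, 256-bit: ~512x

Under that curve every step up costs real money, which is why RSA deployments spread out across intermediate sizes; with AES, and with most PQ proposals, the jump to the top category is cheap.)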

From my point of view, it is strongly preferable for submitters to extend their parameter sets (to 2a and/or 5) now in Round 3, instead of later during standardization, even though we know that, e.g., NTRU Prime can add these sets just by running a script.  That way there can be public discussion of those parameter sets, and the precise performance implications will be clear.  I don’t want to see NIST adding parameter sets themselves during standardization, since then there will be accusations of a backdoor, and even NIST asking the authors for an extra one is suboptimal because it won’t give as much time for public discussion.

Best regards,
— Mike



D. J. Bernstein

Jul 30, 2020, 11:18:40 AM
to pqc-...@list.nist.gov
I think everyone agrees that the best argument for overkill security
levels is that the attacker's costs can be lower than what the public
thinks the costs are. Maybe the public is looking at an oversimplified
attack problem (e.g., single-target attacks), or has a worse algorithm
than the (current or future) attacker, or overestimates the number of
operations in an algorithm, or overestimates the costs of operations.
Complicated attack pictures tend to strengthen these arguments.

NIST's call for proposals picked an overkill security target, namely
"above category 3", and recommended providing such a parameter set:

To hedge against future breakthroughs in cryptanalysis or computing
technology, NIST also recommends that submitters provide at least one
parameter set that provides a substantially higher level of security,
above category 3.

The submission documents vary in how much they say about parameter
selection, but it's clear that many teams paid close attention to NIST's
rules and selected parameters accordingly.

Suddenly, more than three years later, NIST changes "above category 3"
to "category 5". Why? This doesn't matter for submissions that never had
the flexibility to target something intermediate, but it _does_ matter
for other submissions.

Procedurally, if there was some reason for this change in the announced
security target, then NIST should have posted the reason, asked for
public comments, and then (depending on the results of the discussion)
asked submitters for new parameters if necessary, while making sure to
avoid _any_ appearance of issuing retroactive complaints on the basis of
the new target. Unfortunately, this is not at all what happened.

Content-wise, the efforts that I've seen to justify "category 5" as a
meaningful hedge would also justify going even farther: e.g., to the 6th
line of NIST's table in the call for proposals, or to the 2^512 security
level that was required (for preimages) in the SHA-3 competition. Should
submitters do this? If not, why not?

This is an urgent question for NIST, since NIST has set an extremely
short timeline for summarizing round-3 changes. So far NIST hasn't even
managed to admit that, yes, asking for "category 5" is a change from
asking for "above category 3", never mind explaining _why_ this change
happened, never mind explaining what submitters can expect regarding
the possibility of further changes in the target at the end of round 3.

> From my point of view, it is strongly preferable for submitters to
> extend their parameter sets (to 2a and/or 5) now in Round 3, instead
> of later during standardization

I agree with Mike that having NIST add new parameters after selecting a
standard is much worse than having the parameters specified now. The
problem is that it still isn't clear which security levels NIST wants!

Maybe the safest approach for submitters is to present a huge spectrum
of parameter sets---but will NIST then complain that this is too many?

---Dan

Moody, Dustin (Fed)

Jul 31, 2020, 10:42:10 AM
to D. J. Bernstein, pqc-forum

In the original Call for Proposals published in 2016, NIST recommended submitters focus on categories 1, 2, and 3.  NIST also recommended submitters provide at least one parameter set above category 3.  In our latest report, we encouraged a few teams to include category 5 parameter sets.  We don't see this as a "sudden change".  These are NOT requirements.  Throughout the process we've been in dialogue with various teams as they have adjusted parameter sets.  The decisions are always made by the submitters, who can submit what they think best.  We gave our current recommendations in our 2nd round report, and don't anticipate making any suggestions for more parameter sets.  Submission teams can submit more parameter sets if they choose, although in general, NIST believes that too many parameter sets make evaluation and analysis more difficult.  

 

Dustin






D. J. Bernstein

Aug 2, 2020, 5:50:43 AM
to pqc-...@list.nist.gov
NIST's request for a summary of expected changes by 10 August was always
surprisingly rushed. Now this latest message makes it even _less_ clear
which security levels NIST is asking for.

'Moody, Dustin (Fed)' via pqc-forum writes:
> NIST believes that too many parameter sets make evaluation and
> analysis more difficult.

How many is "too many"? How did flexibility, which was portrayed as
purely positive in the call for proposals, turn into a bad thing for
NIST? The call for proposals explicitly allowed multiple parameter sets
_per category_, never suggesting that this would be penalized!

NIST's latest report complains about NewHope's lack of flexibility to
use dimensions strictly between 512 and 1024. If a submission team is
thinking "Aha, Kyber similarly suffers from its lack of flexibility to
target security levels strictly between maybe-2^128 and maybe-2^192, and
we can clearly show this to NIST by selecting parameter sets at several
intermediate security levels", then isn't this something NIST should be
interested in, rather than discouraging by making submitters worry that
this is "too many parameter sets"?

One might think that simply presenting graphs of the size vs. security
level for many more defined parameter sets---see, e.g.,

https://eprint.iacr.org/2018/1174.pdf

regarding the NTRU submission---would adequately make the point. Anyone
who looks at the numbers sees that there are interesting security levels
and useful metrics where the NTRU submission is more efficient than
Kyber. But NIST claimed the opposite---that NTRU is "not quite at the
level of the highest-performing lattice schemes"---and didn't comment on
Kyber's lack of flexibility, which is directly relevant to exactly this
comparison. The simplest explanation of what went wrong in the evaluation
process is that NIST didn't look at the numbers. So how do we make sure
that NIST will look at the numbers?

(I'm reminded of how one submitter, presumably not having looked at the
numbers, tried to downplay this as "fine-tuning". Useful rule of thumb:
whenever someone uses performance-comparison terminology with unclear
boundaries, ask him what the actual performance numbers are. Very often
such a lack of clarity is a cover for not knowing the facts.)

In general, NIST seems to be ignoring flexibility that isn't
demonstrated through the _selected_ parameter sets. For example, NIST's
latest report complains about some submissions not selecting any
category-5 parameter sets, and (post-report) NIST tries to justify this
by pointing to "flexibility". It's not that anyone is claiming that some
submissions didn't explain how to scale up security levels; it's simply
that the _selected_ parameter sets sometimes didn't go that far.

But wait a minute. In the same report, NIST praises Kyber for its
supposedly "straightforward" ability to vary "noise parameters"---which
is something that _doesn't_ vary across the selected Kyber parameter
sets (perhaps because it _isn't_ actually so easy to achieve a useful
"adjustment of the performance/security trade-off" by varying the
noise). If flexibility should be demonstrated through the selected
parameter sets, and some submissions are being criticized on this basis,
then why is Kyber being praised for flexibility that isn't shown in its
selected parameter sets?

Sure, round-1 Kyber varied its noise choice. But this doesn't answer the
question. Round-1 NTRU (NTRUEncrypt before the merge) also varied its
security levels, including category-5 proposals, but NIST still
complains in its latest report that NTRU "lacks a category 5 parameter
set proposal". Evidently NIST feels free to, and often does, ignore
flexibility that isn't visible in the _current_ selected parameter sets.

As these examples illustrate, NIST's evaluation of the desirability of
any particular selection of security levels is a very complicated black
box that doesn't seem to be following any clear public principles. The
black box is being probed experimentally, with a delay of 12-18 months
before we see the results of the experiments. The bits of information
that NIST is providing are often unclear and often seem contradictory.
This poses a serious problem for people trying to figure out which
security levels to submit, except for submissions that are locked into
just a few choices to begin with.

I'm baffled at NIST's apparent inability to understand this from a
submitter's perspective. I have a suggestion here: NIST should go
through what Silicon Valley calls "eating your own dog food"---reading
its own call for proposals and other announcements, preparing a sample
submission accordingly (for example, an ECDH submission from someone
feigning ignorance of Shor's algorithm), and pointing out the
difficulties that were encountered. Of course it would have been even
better for NIST to do this before the original submission deadline, but
it would still help now.

> In the original Call for Proposals published in 2016, NIST recommended
> submitters focus on categories 1, 2, and 3. NIST also recommended submitters
> provide at least one parameter set above category 3. In our latest report
> published, we encouraged a few teams to include category 5 parameter sets.

NIST's latest report also stated this in the general text regarding all
submissions, making readers think that this is something that was
already "strongly encouraged":

While category 1, 2, and 3 parameters were (and continue to be) the
most important targets for NIST’s evaluation, NIST nevertheless
strongly encourages the submitters to provide at least one parameter
set that meets category 5. Most of the candidate algorithms have
already done this; a few have not.

But anyone checking the call for proposals sees that "category 5" is a
change from what the call for proposals "recommended":

NIST recommends that submitters primarily focus on parameters meeting
the requirements for categories 1, 2 and/or 3, since these are likely
to provide sufficient security for the foreseeable future. To hedge
against future breakthroughs in cryptanalysis or computing
technology, NIST also recommends that submitters provide at least one
parameter set that provides a substantially higher level of security,
above category 3. [page break, no indication of paragraph break:]
Submitters can try to meet the requirements of categories 4 or 5, or
they can specify some other level of security that demonstrates the
ability of their cryptosystem to scale up beyond category 3.

There's no way that "above category 3 ... categories 4 or 5 ... beyond
category 3" in the call for proposals can be understood to simply mean
"category 5".

As far as I can tell, NIST's rationale for this change _still_ hasn't
been made public. Every guess I've seen for NIST's hidden rationale
strongly suggests that submitters should also go _beyond_ category 5.

> We don't see this as a "sudden change".

Please clarify. Is NIST denying that this was a change? Or is NIST
admitting that this was a change, but simply denying that it was
"sudden"? Did I miss some previous announcement where NIST proposed this
change and explained the rationale for the change?

> Throughout the process we've been in dialogue with various teams as
> they have adjusted parameter sets.

You're talking about _private_ discussions that NIST has had with
various teams? This is supposed to somehow be a replacement for having a
change in evaluation criteria proposed and discussed in public? Wow.
When did NIST announce that submitters were expected to use this private
source of information rather than relying on the public announcements?

I gave some talks in July on lattice-based cryptography, and was
thinking about risk-management failures and the public's inability, for
a stretch of at least three months, to correct whatever errors NIST
might have received privately in April 2020 in response to its call for
private input, which in retrospect wasn't the first warning signal. I
tweeted the following:

After NIST's Dual EC standard was revealed in 2013 to be an actual
(rather than just potential) NSA back door, NIST promised more
transparency. Why does NIST keep soliciting private #NISTPQC input?
(The submissions I'm involved in seem well positioned; that's not the
point.)

https://twitter.com/hashbreaker/status/1285922808392908800

This happened to be about 2 minutes before NSA sent its first message to
the mailing list, some hours before NIST announced round 3. Did NIST
tell NSA the timing of NIST's announcement? Did NIST show NSA a draft of
the report in advance? Did NIST ask NSA for comments on the draft? What
exactly has NSA told NIST regarding NISTPQC, regarding security levels
or otherwise?

---Dan

Mike Hamburg

Aug 2, 2020, 3:46:16 PM
to D. J. Bernstein, pqc-...@list.nist.gov


On Aug 2, 2020, at 2:50 AM, D. J. Bernstein <d...@cr.yp.to> wrote:

(I'm reminded of how one submitter, presumably not having looked at the
numbers, tried to downplay this as "fine-tuning". Useful rule of thumb:
whenever someone uses performance-comparison terminology with unclear
boundaries, ask him what the actual performance numbers are. Very often
such a lack of clarity is a cover for not knowing the facts.)

If this insult is against me, about our conversation on “fine-tuning”, then
it is only partially accurate.  I had written:

“””
… Fine-tuning is a useful feature, especially when you have
sharp bandwidth and security requirements.  But ThreeBears and
Saber and especially LAC are also inherently more bandwidth-
efficient than much of the field (using current sieve or enumeration
or hybrid estimates).  So use cases with sharp bandwidth constraints
are also likely to be favorable to these systems, even more favorable
to Round5, and unfavorable for NewHope regardless of CPA vs CCA
or tuning gap.
“””

And in fact, I had not generated all the intermediate performance data,
but of course I did know and had analyzed the performance of
existing parameter sets.

Then you wrote a paper that more or less directly accused me of
dishonesty. In this paper, you also did not generate or analyze all the
intermediate performance data: you only graphed the data that we
already had using stairsteps instead of as a scatterplot.  This paper
was worded to suggest that my claim about efficiency was wrong, but
the data you presented did not support your argument.

Later on this list, you tried to bring up this discussion: not because
we were comparing performance again, but to support a claim that
my analyses were oversimplified.

To this I replied:

“””
That paper doesn’t analyze smooth parameter spaces.  Furthermore,
it shows that NTRU LPRime has an advantage against ThreeBears,
by *one bit* of estimated core-sieve security, for 36% of the range of
parameters in the range of [BabyBear … MamaBear).  This range is
favorable to NTRU LPRime: cutting off at MamaBear’s 1307-byte
ciphertext size maximises the ratio.


This comparison would not be improved — except for the “by one bit”
part — by further exploring the smooth parameter space of NTRU
LPRime, because it is in some sense already tuned optimally for this
comparison (where it wins, it wins by the minimum amount).  It would
be made worse if ThreeBears were allowed to be tuned as well, because
a parameter set with D=260, d=3 would have around 1103 bytes of
ciphertext and likely more core-sieve security (my old spreadsheets
suggest 197 bits, but that was with an old version of the estimator)
than any NTRU LPRime instance in that graph.

The comparisons against LAC and SABER are even less favorable to
NTRU LPRime.
“””

Last I checked, this is where that argument stood.

So perhaps I am mistaken, and you are actually insulting someone
else?  Otherwise, in the future, please consider using arguments
where you were actually right when claiming that your colleagues
are lazy, dishonest, or ignorant.

Regards,
— Mike

D. J. Bernstein

Aug 2, 2020, 11:39:31 PM
to pqc-...@list.nist.gov
Mike Hamburg writes:
> In this paper, you also did not generate or analyze all the
> intermediate performance data: you only graphed the data that we
> already had using stairsteps instead of as a scatterplot.

Actually, I did generate, analyze, and graph various intermediate
performance data, and showed examples in the accompanying talk at the
NIST workshop:

https://cr.yp.to/talks.html#2019.08.23-1

In my example of the triangles vs. the squares, the correct stairsteps
show more area advantage for the triangles _or_ for the squares,
depending on the user's weighting of security levels; the fake lines
incorrectly make the squares look better at all security levels; if all
lines are omitted then the reader's eye tends to fill in the fake lines;
and adding the intermediate performance data for the triangles shifts
the comparison towards them. The talk slides show each of these graphs.
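
(For concreteness, here is how the honest version of such a graph can
be drawn with matplotlib. The data points are made up; the point is
the step plot versus the default line plot:

   # Hypothetical (security level, bytes) points for two schemes. The
   # honest curve is a stairstep: to reach a target of x bits you must
   # take the next parameter set at or above x, so each size holds on
   # the interval below its security level (where="pre").
   import matplotlib.pyplot as plt

   triangles = [(100, 800), (140, 1100), (180, 1500), (220, 2000)]
   squares = [(128, 900), (192, 1700), (256, 2600)]

   for pts, mark, name in ((triangles, "^", "triangles"),
                           (squares, "s", "squares")):
       sec, size = zip(*pts)
       plt.step(sec, size, where="pre", label=name)  # correct stairsteps
       plt.scatter(sec, size, marker=mark)
       # plt.plot(sec, size)  # the fake lines: straight interpolation

   plt.xlabel("security level (bits)")
   plt.ylabel("bytes")
   plt.legend()
   plt.show()

The commented-out plt.plot() call is what common tools draw by
default.)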

I don't think most people are aware of these effects before seeing the
graphs side by side. Common graphing tools make the fake lines easy to
add, often automatic, and the correct lines unnecessarily difficult to
add (and of course the intermediate data takes much more work to add).
Contrary to what you now claim, I didn't accuse you of being dishonest.

> Last I checked, this is where that argument stood.

Maybe next time try checking more carefully. Thanks in advance!

---Dan

Mike Hamburg

Aug 3, 2020, 12:45:42 AM
to D. J. Bernstein, pqc-...@list.nist.gov


> On Aug 2, 2020, at 8:39 PM, D. J. Bernstein <d...@cr.yp.to> wrote:
>
> Mike Hamburg writes:
>> In this paper, you also did not generate or analyze all the
>> intermediate performance data: you only graphed the data that we
>> already had using stairsteps instead of as a scatterplot.
>
> Actually, I did generate, analyze, and graph various intermediate
> performance data, and showed examples in the accompanying talk at the
> NIST workshop:
>
> https://cr.yp.to/talks.html#2019.08.23-1
>
> In my example of the triangles vs. the squares, the correct stairsteps
> show more area advantage for the triangles _or_ for the squares,
> depending on the user's weighting of security levels; the fake lines
> incorrectly make the squares look better at all security levels; if all
> lines are omitted then the reader's eye tends to fill in the fake lines;
> and adding the intermediate performance data for the triangles shifts
> the comparison towards them. The talk slides show each of these graphs.

Ah, sorry, I’d missed this because it wasn’t in the paper. It also appears to
be a graph of SNTRUP vs Kyber, which does not address my statement: I
stated specifically that the efficiency of ThreeBears, Saber and LAC would
generally outweigh their lack of tunability, and later I compared these
systems to NTRU LPRime.

> I don't think most people are aware of these effects before seeing the
> graphs side by side. Common graphing tools make the fake lines easy to
> add, often automatic, and the correct lines unnecessarily difficult to
> add (and of course the intermediate data takes much more work to add).
> Contrary to what you now claim, I didn't accuse you of being dishonest.

Thanks for clarifying. Does the same apply for your latest parenthetical?
Because that looks an awful lot like a claim that I was both dishonest and
ignorant.

Regards,
— Mike

Mike Hamburg

Aug 3, 2020, 1:29:10 AM
to D. J. Bernstein, pqc-...@list.nist.gov
Also, because I can’t resist sabotaging my own argument:

Specifically for the comparison NTRU-HPS vs SABER, SABER wins 67% of the time between its Class 1 and 3 parameters, and always above Class 3.  However, with all the data in the NTRU submission, NTRU-HPS wins 75% of the time between Class 1 and Class 3, but loses 71% of the time between Class 3 and Class 5.

So tunability does make a difference for that specific matchup.  It doesn’t work for comparing with ThreeBears, because ThreeBears does support additional parameter sets (e.g., Koala).

Cheers,
— Mike
