More SUPERCOP planning

D. J. Bernstein

Oct 6, 2021, 10:45:12 PM
to pqc-...@list.nist.gov
I'm currently trying to figure out a deadline for code updates for the
next SUPERCOP benchmarking run. The obvious goal is to have benchmark
numbers available by the end of October, ensuring direct comparability
across all submissions, especially for Intel Haswell. (SUPERCOP doesn't
cover Cortex-M4; see the pqm4 project for Cortex-M4 benchmarks.)

Dealing properly with a new NISTPQC speed-comparability problem is
making scheduling more difficult than usual. To understand the problem,
look at slide 8 of the SIKE update

https://csrc.nist.gov/CSRC/media/Presentations/sike-round-3-presentation/images-media/session-6-sike-de-feo.pdf

reporting performance numbers such as 9.7 million cycles for SIKEp434
encapsulation using "x86_64 asm". Given previous discussions regarding
comparability and NIST announcements (e.g., "NIST strongly recommends
also providing an AVX2 (Haswell) optimized implementation"), the reader
would expect the 9.7 million to be a Haswell measurement. However, my
understanding is that all SIKE code currently available for Haswell is
considerably slower than this, and that the slides silently substituted
a faster Intel CPU for improved SIKE speeds.

Of course, SUPERCOP Haswell results are clearly labeled, and anyone with
a Haswell machine can easily re-run the full benchmarks and spot-check
the results. However, unlike most submission teams, the SIKE team hasn't
submitted any post-round-1 code to SUPERCOP. See also the SUPERCOP
NISTPQC status updates dated 19 Oct 2019 20:47:47, 5 May 2020 22:36:59
+0200, and 19 Jun 2020 09:17:05 +0200.

NIST's submission requirements include a blanket authorization to use
the submitted code for benchmarking, so I've downloaded the latest SIKE
submission from the NIST web pages and plan to get the code from that
submission working in SUPERCOP, after which I'll try to figure out how
difficult it is (as a technical matter and as a legal matter) to include
the subsequent SIKE code updates. Schedule uncertainties at this point
include code-review time; code-integration time; and---since SUPERCOP
tries many implementations and compiler options to select the best for
each parameter set, and runs each benchmark many times for statistically
robust data---CPU time for SIKE benchmarking.
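
A minimal sketch of the selection step described above, assuming the median is the robust statistic and that each candidate is an (implementation, compiler options) pair. The names pick_fastest and replay, the candidate labels, and the cycle counts are made up for illustration; this is not SUPERCOP's actual code or data.

```python
import statistics
from typing import Callable, Dict, Tuple

# (implementation name, compiler options); both are hypothetical labels.
Candidate = Tuple[str, str]


def pick_fastest(candidates: Dict[Candidate, Callable[[], int]], runs: int = 15):
    """Measure every candidate `runs` times; return the one with the lowest median."""
    medians = {}
    for key, measure in candidates.items():
        samples = [measure() for _ in range(runs)]
        # The median is insensitive to a few slow outliers (interrupts, cache misses).
        medians[key] = statistics.median(samples)
    best = min(medians, key=medians.get)
    return best, medians[best]


def replay(values):
    """Stand-in for a real cycle counter: replays made-up cycle counts in order."""
    it = iter(values)
    return lambda: next(it)


if __name__ == "__main__":
    candidates = {
        ("ref", "-O3"): replay([120, 100, 110]),
        ("asm", "-O2"): replay([90, 500, 95]),  # one made-up outlier
    }
    print(pick_fastest(candidates, runs=3))
```

The total CPU time grows with (number of implementations) x (number of compiler options) x (number of runs) x (time per operation), which is why a slow primitive makes the schedule harder to predict.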

---Dan

David Jao

Oct 8, 2021, 9:21:08 AM
to pqc-...@list.nist.gov
Hi Dan,

(speaking on behalf of the SIKE team)

It was always my understanding that the talks from the Round 3
standardization conference were not meant to be standalone sources of
information, but rather simply a supplement to the written
specifications published on the NIST web site. As such, certain minor
details may be omitted from the presentation for valid reasons such as
lack of time given the brevity of the presentation slots, and it is
understood that interested parties should consult the written
documentation accompanying each submission for full details. Looking at
our supporting documentation, for example, it is clear that the slide
you referenced is taken from Table 2.1 of the round 3 SIKE specification
(this is clear because the numbers are identical), and in
said specification we clearly identify the CPU as being from the Skylake
microarchitecture. There was no intent to silently substitute a faster
CPU, nor would we have achieved this effect even if that had been our
intent. I am not aware of any requirement from NIST that only Haswell
chips may be used for benchmarking.

If other submission teams were consistently obeying some sort of
unwritten rule to report only Haswell numbers, then it would be
reasonable to call out our deviation in this respect. However, as far as
I can tell, other teams are doing exactly the same thing as what we are
doing. For example, compare slide 8 of
https://csrc.nist.gov/CSRC/media/Presentations/saber-round-3-presentation/images-media/session-7-saber-vercauteren.pdf
(which notably does not identify the CPU microarchitecture) with Table 6
of https://eprint.iacr.org/2020/1397.pdf which reports the exact same
numbers and clearly indicates Skylake.

The inference made that "slides silently substituted a faster Intel CPU
for improved SIKE speeds" is not true. To my knowledge, no member of the
SIKE team was contacted over this issue. In the future we ask you to
contact submission teams for clarification before making such unfounded
assumptions. We would also like to alert NIST that, in our opinion,
these accusations are being made in less than good faith with no attempt
to contact the respective submission teams beforehand. A simple email
could have clarified both the actual details of the performance claims
and the fact that what we are doing is no different from what everyone
else does.

-David

D. J. Bernstein

Oct 8, 2021, 11:52:24 AM
to pqc-...@list.nist.gov
My understanding is that SIKE's non-participation in the ongoing
SUPERCOP process is a policy decision by Brian LaMacchia. I sent him
email last year to discuss this---after all other teams had contributed
their latest code---and never heard back. More to the point, the public
record shows a series of calls for SUPERCOP submissions, and the SIKE
team has not submitted for whatever reason, so at this point something
really needs to be done to ensure proper SIKE benchmarking.

Initial experiments show SIKE on Haswell taking 50% more cycles than
SIKE on Skylake. This is not a "minor detail". Of course it's possible
that I'm looking at the wrong code, but I think it's fair to blame that
on the SIKE team not participating in the process. Normally the work to
integrate code into SUPERCOP is distributed across submission teams (who
can easily check that the performance results are what they expect)
rather than being shifted onto other people.

The decision to emphasize Haswell for comparability is not an "unwritten
rule": the decision is clearly visible in NIST pqc-forum messages dated

13 Dec 2018 16:06:59 +0000,
30 Jan 2019 12:38:55 +0000,
5 Mar 2019 14:09:54 +0000, and
23 Jul 2020 12:02:03 +0000,

in the quote "NIST strongly recommends also providing an AVX2 (Haswell)
optimized implementation" from the first message in this thread, and in
many NISTPQC-related SUPERCOP announcements. Creating too many targets
for optimization, failing to account for the amount of work required and
the limited human resources available, would create effectively random
slowdowns on each target.

Since publicly verifiable SABER results are already available in
SUPERCOP, I didn't look at the numbers that SABER selected for their
talk. If they switched from Haswell to Skylake without saying so, they
were unfairly gaining 10% for any readers looking at those numbers.

---Dan

David Jao

Oct 8, 2021, 12:32:25 PM
to pqc-...@list.nist.gov
(speaking for myself, not on behalf of the SIKE team)

The observation that SIKE on Haswell is slower than SIKE on Skylake by
more than the usual 10% differential is indeed a valid point -- however,
it is NOT a point that was made in your original email, in which no
mention of the magnitude of the differential was made. We provide
several levels of x64 assembly optimization for SIKE. The highest level
of optimization requires ADX instructions, which unfortunately were
introduced in Broadwell -- one generation after Haswell. We have never
hidden or concealed this fact.
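
A minimal sketch of how a build or benchmark script might check for ADX before choosing an optimization level, assuming a Linux machine whose /proc/cpuinfo lists ADX as the "adx" flag. The function names and the two level labels are hypothetical and are not taken from the SIKE code.

```python
def parse_x86_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags from Linux /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found (e.g. not an x86 Linux machine)


def choose_asm_level(flags: set) -> str:
    """Pick a hypothetical optimization level from the available CPU flags."""
    if "adx" in flags:
        return "x64 asm with ADX"  # Broadwell and later
    return "x64 asm without ADX"   # e.g. Haswell


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # file unavailable; falls back to the non-ADX level
    print(choose_asm_level(parse_x86_flags(text)))
```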

The quote that

"NIST strongly recommends also providing an AVX2 (Haswell) optimized
implementation"

does NOT (in my view) imply:

"NIST recommends against providing any additional optimized
implementations for any CPU generations beyond Haswell"

which is what you seem to be saying. Our submission includes several
levels of optimization, some applicable to Haswell, some applicable to
prior CPU generations, and some applicable to later generations.

If NIST wishes to penalize SIKE for providing too many options for
optimization, that is their prerogative. If NIST wishes to penalize SIKE
for not participating in SUPERCOP, that is their prerogative.

-David

D. J. Bernstein

Oct 8, 2021, 9:25:08 PM
to pqc-...@list.nist.gov
David Jao writes:
> The quote that
> "NIST strongly recommends also providing an AVX2 (Haswell) optimized
> implementation"
> does NOT (in my view) imply:
> "NIST recommends against providing any additional optimized
> implementations for any CPU generations beyond Haswell"

I agree, and already said so on pqc-forum last year. However, given
that NIST did say "NIST strongly recommends also providing an AVX2
(Haswell) optimized implementation", readers are entitled to expect
Haswell by default. We're now facing a serious speed-comparability
problem caused by SIKE

(1) gaining speed by selecting Skylake instead of Haswell and
(2) not participating in the community's primary mechanism for
collecting robust, easily verifiable, clearly labeled benchmarks
on each platform---in particular Haswell.

Selecting Skylake for comparison, even with Skylake being made perfectly
clear, wouldn't be fair to other teams that optimized for the designated
Haswell platform. This would be

* benchmarking crime A3, "Selective data set hiding deficiencies";
* benchmarking crime D3, "Unfair benchmarking of competitors"

in the classification of https://arxiv.org/pdf/1801.02381.pdf. We simply
don't know at this point whether other submissions can gain as much on
Skylake as SIKE did; human resources have to be taken into account.

So how do we get from here to obtaining direct comparability of all
submissions on Haswell? It needs to be possible to openly discuss the
problem here and propose solutions without being subjected to ad-hominem
attacks. I've said what I'm currently planning to do and why; this is
also a general scheduling issue since (1) the time isn't easy to predict
and (2) deadlines should be set the same way for everybody.

---Dan

P.S. For the record, the question that I sent Brian last year was "At
this point SIKE is the only remaining NISTPQC candidate that hasn't
submitted its latest code to SUPERCOP. Do I correctly understand that
this is because you'd like benchmarking done by, and only by, people
independent of the submitters?" There was no need for a quick reply (to
that message or, more importantly, the public calls for SUPERCOP input)
at that point; but several weeks ago NIST suddenly stated "We would
suggest October 31st as a suitable date" for supplying data.

P.P.S. Quote from the "More SUPERCOP results" announcement dated 7 Mar
2020 20:58:02 -0000: "Submission teams that want to go beyond NIST's
highlighted CPUs (Haswell and Cortex-M4) should be able to report, e.g.,
the Cortex-A7 speeds achieved---while refraining from comparing these to
unoptimized Cortex-A7 speeds of other submissions! (To avoid any
accusations of bias in supporting this option, I'll avoid advertising
speed results on non-NIST-highlighted CPUs for submissions I'm involved
in.)"

David Jao

Oct 8, 2021, 9:45:40 PM
to pqc-...@list.nist.gov
(speaking for myself)

I'll repeat what I said before. Even if we all generally accept "Haswell
by default" as correct practice, we clearly identified our numbers as
Skylake, therefore as something other than default. There is no silent
substitution going on.

The submission software and all other versions of PQCrypto-SIDH are
clearly licensed under the MIT license. There are no legal uncertainties
in working with this code for benchmarking or any other purposes, aside
from the usual concerns about third-party patent claims, which I cannot
control, and which apply equally to any other submission.

I appreciate your efforts in organizing and helping to collect
centralized benchmarks of the PQC candidates. I cannot speak for Brian
LaMacchia or anyone else identified in your email as to what motivations
they may have for not participating in the SUPERCOP process. I know that
I myself simply do not have the time to do this; there is no nefarious
reason on my part, just a complete lack of time. I know that it is
unfair to expect you to do it when I cannot; hence I do not expect this.
If as a result of this dilemma SIKE is unable to be represented in
SUPERCOP, that is certainly our loss more than your loss. Despite this
possible loss, we will continue to carry on as best we can.

-David