Second SUPERCOP results for NISTPQC

D. J. Bernstein

Dec 1, 2018, 9:49:14 AM
to pqc-...@list.nist.gov
Publicly verifiable SUPERCOP benchmarks are now available for hundreds
of implementations of many primitives from the following 39 submissions,
about 70% of the unbroken first-round NIST submissions:

BIG QUAKE, BIKE (see below), Classic McEliece, CRYSTALS-DILITHIUM,
CRYSTALS-KYBER, DAGS, EMBLEM and R.EMBLEM (partially), FrodoKEM,
GeMSS, Gravity-SPHINCS, Gui, HILA5, KINDI, LAC, LAKE, LEDAkem, LIMA
(partially), LOCKER, LOTUS, LUOV, McNie, Mersenne-756839, MQDSS,
NewHope, NTRUEncrypt (partially), NTRU-HRSS-KEM, NTRU Prime, NTS-KEM,
Odd Manhattan, Picnic, pqRSA, qTESLA, Rainbow, Ramstake, SABER, SIKE
(partially), SPHINCS+, Three Bears, Titanium.

As before, the results demonstrate many speedups, often huge speedups,
compared to non-SUPERCOP benchmarks. Full tables on a typical Haswell
appear here:

https://bench.cr.yp.to/results-kem.html#amd64-titan0
https://bench.cr.yp.to/results-encrypt.html#amd64-titan0
https://bench.cr.yp.to/results-sign.html#amd64-titan0

Comments regarding specific submissions:

* BIKE: Now integrated, but a few days ago I received new software
that I'm told is incompatible and faster for two reasons. First,
the previous software was actually level 5 instead of the labeled
level 1. Second, the new software includes an NTL-based
implementation. I'll integrate the new software soon.

* EMBLEM and R.EMBLEM: Now integrated, but see the caveats in my
"EMBLEM and R.EMBLEM" message.

* CRYSTALS-DILITHIUM, HILA5, SIKE: Code improvements since the
previous SUPERCOP benchmarks may have produced some speedups.

* The issues noted for other specific submissions in my "Re: First
SUPERCOP results for NISTPQC" message still apply.

To repeat four important general caveats from my "First SUPERCOP results
for NISTPQC" message:

* Security is job #1. One aspect of security is avoiding timing
attacks, but many implementations don't run in constant time (see
the constant-time sketch after this list). The slowdown for
constant-time code depends on the primitive.

* For most of the submissions, cost in typical applications is
dominated by key size, ciphertext size, etc., not by CPU time.

* The community has done serious Haswell optimization of _some_
primitives but certainly not all---never mind all the other CPU
microarchitectures of interest. Subsequent speedups will vary from
one primitive to another.

* There may be cases where existing software achieves speeds not
reflected in the current reports. Submitters should look for any
unexpected slowdowns and send updated software to the ebats mailing
list. If the software is already there but doesn't work, please
check the online error reports to see what went wrong.
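To make the constant-time caveat above concrete, here is a minimal C
sketch contrasting an early-exit comparison, whose running time depends
on the secret data, with a branch-free comparison that always processes
every byte. The function names are hypothetical and the code is not
taken from SUPERCOP or from any submission; real implementations also
have to avoid secret-dependent memory addresses and variable-time
instructions.

    #include <stddef.h>
    #include <stdint.h>

    /* NOT constant time: returns as soon as a byte differs, so the
       running time reveals how long a prefix of the inputs matches. */
    int leaky_verify(const uint8_t *x, const uint8_t *y, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            if (x[i] != y[i]) return -1;
        return 0;
    }

    /* Constant time with respect to the data: every byte is always
       read, differences are ORed into an accumulator, and the result
       is derived from the accumulator without a secret-dependent
       branch. Returns 0 if equal, -1 otherwise. */
    int ct_verify(const uint8_t *x, const uint8_t *y, size_t n)
    {
        uint32_t diff = 0;
        for (size_t i = 0; i < n; i++)
            diff |= x[i] ^ y[i];
        return (int)(1 & ((diff - 1) >> 8)) - 1;
    }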

Finally, I realize that NIST wants speed data, but I'm worried about how
the cycle counts are going to be used given the above caveats:

* Pretty much any use of the cycle counts is going to distract people
from the byte counts, even though the byte counts usually cost much
more. (SIKE is an exception.)

* Maybe a reasonably weighted sum such as 1000*#bytes+#cycles (a
small worked example appears after this list) will sometimes be
tipped one way or the other by the cycle counts, and
at some point we're going to have to switch to presuming that
submissions can't run noticeably faster than the latest software.
However, after looking through the current software, my assessment
is that for now this presumption is not reasonable. (The fastest
systems in the current benchmarks are exceptions, but they're also
so fast that the cycle counts hardly matter.)

* Performance comparisons will often be reversed when security levels
are taken into account---but pinning down security levels is even
more challenging than pinning down performance.
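As a purely arithmetic illustration of how such a weighted sum behaves,
the C sketch below computes 1000*#bytes + #cycles for two hypothetical
KEMs. The names and numbers are invented for illustration and are not
SUPERCOP measurements of any submission.

    #include <stdio.h>
    #include <stdint.h>

    /* Hypothetical per-operation costs; invented numbers, not
       SUPERCOP measurements. */
    struct cost {
        const char *name;
        uint64_t bytes;   /* e.g. public key + ciphertext bytes */
        uint64_t cycles;  /* e.g. keygen + enc + dec cycles */
    };

    /* The weighted sum mentioned above: 1000*#bytes + #cycles. */
    static uint64_t weighted(const struct cost *c)
    {
        return 1000 * c->bytes + c->cycles;
    }

    int main(void)
    {
        struct cost kems[2] = {
            { "hypothetical-A", 1100, 900000 }, /* smaller, slower */
            { "hypothetical-B", 1200, 150000 }, /* larger, faster  */
        };
        for (int i = 0; i < 2; i++)
            printf("%-15s  1000*%llu + %llu = %llu\n",
                   kems[i].name,
                   (unsigned long long)kems[i].bytes,
                   (unsigned long long)kems[i].cycles,
                   (unsigned long long)weighted(&kems[i]));
        return 0;
    }

In this made-up example the byte counts are close (1100 vs 1200), so
the 750000-cycle gap tips the weighted sums (2000000 vs 1350000); with
byte counts further apart, the bytes dominate.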

---Dan