[New paper] Hardware implementation of CRYSTALS-Dilithium


Krzysztof M. Gaj

Oct 31, 2021, 10:19:56 PM
to pqc-...@list.nist.gov
Hi,

It is our pleasure to announce the publication of our new paper:

"High-Performance Hardware Implementation of CRYSTALS-Dilithium"
by Luke Beckwith, Duc Tri Nguyen, and Kris Gaj
Cryptology ePrint Archive: Report 2021/1451

This paper will appear in the proceedings of the International Conference on Field-Programmable Technology, FPT 2021, to be held on December 6-10, 2021.

The major contributions of our paper are as follows:

1. We developed the first unified architecture of CRYSTALS-Dilithium supporting
    a) key generation, signature generation, and signature verification, and
    b) three security levels: 2, 3, and 5,
   with the choice among all operations and security levels made at runtime
   (rather than fixed at synthesis time).

2. Our design for CRYSTALS-Dilithium outperforms
    a) an earlier implementation by Ricci et al. [1] by factors of 2.7, 1.5, and 3.7 in terms of latency in time units,
       for key generation, signing, and verifying, respectively, at security level 2.
       Most resource utilization figures are lower in our design; the exception is the number of BRAMs, which is larger in our unified design
       than the number of BRAMs required specifically for key generation and signature verification in [1].
    b) an earlier implementation by Land et al. [2] by factors of 3.0, 2.2-2.4, and 3.0 in terms of latency in time units,
       for key generation, signing, and verifying, respectively, at security level 5.
       Our design uses 19% more LUTs, but 2.8x fewer DSP units and 6.5% fewer BRAMs.

3. We provide insights into the comparison between CRYSTALS-Dilithium and two other Round 3 signature schemes, Picnic and SPHINCS+ (see Table V).

In particular, the Picnic implementation by Kales et al. [3] at security level 5 has substantially lower performance than Dilithium.
While the Picnic design consumes no DSP units, it requires approximately 3x more LUTs and BRAMs than our design.

Amiet et al. [4] report multiple SPHINCS+ implementations for different parameter sets.
We selected the parameter set with the best performance for comparison with our design.
At security level 5, this SPHINCS+ implementation uses a comparable number of LUTs and substantially fewer BRAMs and DSPs.
Its verification time is 6x shorter than that of CRYSTALS-Dilithium.
However, signature generation in Dilithium is 12x faster in the best case and 5.3x faster in the average case than in SPHINCS+.
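
For readers comparing these figures: latency "in time units" means the clock-cycle count divided by the clock frequency, so a design can win on time even with a similar cycle count if it runs at a higher frequency. The short sketch below shows that arithmetic. The function names, cycle counts, and frequencies are placeholders chosen for illustration only; they are not values from the paper or from [1]-[4].

```python
def latency_us(cycles: int, freq_mhz: float) -> float:
    """Latency in microseconds: cycles divided by frequency in MHz."""
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    return cycles / freq_mhz


def speedup(baseline_us: float, candidate_us: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_us / candidate_us


# Hypothetical numbers, for illustration only (not measured results).
candidate = latency_us(cycles=20_000, freq_mhz=250.0)  # 80.0 us
baseline = latency_us(cycles=30_000, freq_mhz=125.0)   # 240.0 us

print(f"speedup: {speedup(baseline, candidate):.1f}x")  # speedup: 3.0x
```

In this made-up case the cycle counts differ by only 1.5x, but the 2x higher clock frequency brings the ratio in time units to 3.0x.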

These results suggest that CRYSTALS-Dilithium is more efficient in hardware than Picnic.
CRYSTALS-Dilithium is also more efficient than SPHINCS+ for signature generation but less efficient for signature verification. 
However, more investigation is required for a definitive ranking.
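
As a side illustration of contribution 1, the following software model sketches what "selected at runtime" means: one design holds the parameter sets for all three security levels and picks one, together with the operation, per invocation. This is only a conceptual sketch, not the interface of our hardware. The names `Op`, `Params`, and `select` are hypothetical, and the parameter values are quoted from memory of the Round 3 Dilithium specification, so they should be checked against it.

```python
from dataclasses import dataclass
from enum import Enum


class Op(Enum):
    """Operations a unified design would expose (hypothetical encoding)."""
    KEYGEN = 0
    SIGN = 1
    VERIFY = 2


@dataclass(frozen=True)
class Params:
    k: int      # rows of the public matrix A
    l: int      # columns of the public matrix A
    eta: int    # secret-key coefficient range
    tau: int    # number of +/-1 coefficients in the challenge
    omega: int  # maximum number of ones in the hint


# Assumed Round 3 parameter sets, indexed by NIST security level.
PARAMS = {
    2: Params(k=4, l=4, eta=2, tau=39, omega=80),
    3: Params(k=6, l=5, eta=4, tau=49, omega=55),
    5: Params(k=8, l=7, eta=2, tau=60, omega=75),
}


def select(op: Op, level: int) -> Params:
    """Pick the parameter set at call time rather than at build time."""
    if not isinstance(op, Op):
        raise TypeError("op must be an Op")
    if level not in PARAMS:
        raise ValueError(f"unsupported security level: {level}")
    return PARAMS[level]


p = select(Op.SIGN, 5)
print(f"matrix A is {p.k} x {p.l}")  # matrix A is 8 x 7
```

A design fixed at synthesis time would correspond to hard-coding a single entry of `PARAMS` and a single `Op`, and rebuilding to change either.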

Any comments and suggestions are very welcome!

Luke, Duc, and Kris
Cryptographic Engineering Research Group (CERG)
George Mason University

References:
[1] S. Ricci, L. Malina, P. Jedlicka, D. Smekal, J. Hajny, P. Cibik, and P. Dobias, "Implementing CRYSTALS-Dilithium Signature Scheme on FPGAs," Cryptology ePrint Archive 2021/108, Jan. 2021.
[2] G. Land, P. Sasdrich, and T. Guneysu, "A Hard Crystal - Implementing Dilithium on Reconfigurable Hardware," Cryptology ePrint Archive 2021/355, Mar. 2021.
[3] D. Kales, S. Ramacher, C. Rechberger, R. Walch, and M. Werner, "Efficient FPGA Implementations of LowMC and Picnic," in The Cryptographers' Track at the RSA Conference 2020, CT-RSA 2020, San Francisco, Springer, Feb. 2020.
[4] D. Amiet, L. Leuenberger, A. Curiger, and P. Zbinden, "FPGA-based SPHINCS+ Implementations: Mind the Glitch," in 23rd Euromicro Conference on Digital System Design, DSD 2020, Kranj, Slovenia, Aug. 2020, pp. 229–237.
