CROSS on AVX2 and TLS benchmarks


Marco Gianvecchio

Jan 20, 2026, 9:12:00 AM
to pqc-...@list.nist.gov
Hi all,

I'm reaching out from the CROSS team to share our recent work on optimizing CROSS for AVX2 platforms [1]. We focused on modular reductions, exponentiations, and vector-matrix multiplications. We'd appreciate any feedback on the implementation, as well as suggestions on how to improve it.

In Section 4.2, you'll also find a suite of TLS benchmarks comparing CROSS to other round-2 candidates through OpenSSL and oqs-provider. We found that it achieves quite competitive performance. We simulated four realistic network scenarios and then validated our results using AWS instances.

Notably, we discovered a bug [2] in the OpenSSL s_time utility that caused client-side validation of the certificate chain to be skipped entirely. The bug was fixed in our experiments and, more recently, in mainline OpenSSL. However, previous TLS experiments comparing PQ signatures may be affected [3][4].

We hope this publication proves useful and helps further the exploration of the on-ramp candidates.

Take care,
Marco Gianvecchio, Alessandro Barenghi, Gerardo Pelosi
on behalf of the CROSS team



Harry Hart

Jan 22, 2026, 5:09:41 AM
to pqc-forum, Marco Gianvecchio

We would also like to inform everyone about our recent work on reducing the memory footprint of the CROSS signature scheme [1].


Until now, the large memory footprint made it prohibitive to implement schemes like CROSS on small microcontrollers such as the Cortex-M4.


We proposed methods such as on-the-fly hashing, optimized GGM and Merkle tree algorithms, and variable usage optimization. Thanks to these optimizations, all versions of the CROSS signature scheme now run on the Cortex-M4.
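
As a rough illustration of the on-the-fly hashing idea mentioned above, the Python sketch below compares hashing a fully buffered input with feeding the same data to the hash piece by piece, so that only one piece needs to be held in memory at a time. It is not taken from the paper's implementation: the use of SHAKE-256, the 32-byte output length, and all function names are assumptions made for illustration only.

```python
import hashlib


def make_chunks(count, size=64):
    """Hypothetical data source: yields `count` chunks of `size` bytes each."""
    for i in range(count):
        yield i.to_bytes(4, "little") * (size // 4)


def digest_buffered(chunks, out_len=32):
    """Baseline: concatenate everything first, then hash.

    Peak memory grows with the total input length.
    """
    buffer = b"".join(chunks)
    return hashlib.shake_256(buffer).digest(out_len)


def digest_on_the_fly(chunks, out_len=32):
    """Absorb each chunk as soon as it is produced.

    Only the hash state and the current chunk are kept in memory.
    """
    state = hashlib.shake_256()
    for chunk in chunks:
        state.update(chunk)
    return state.digest(out_len)


if __name__ == "__main__":
    # Both approaches produce the same digest for the same byte stream.
    same = digest_buffered(make_chunks(100)) == digest_on_the_fly(make_chunks(100))
    print("digests match:", same)
```

Both functions return the same digest for the same byte stream. The difference is only in peak memory, which is the quantity that matters on a device like the Cortex-M4.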


We propose several levels of optimization. Our smallest implementation uses 15 KB to 111 KB of memory for Sign, a 92% reduction relative to the reference implementation, albeit with some trade-off in run-time. The balanced implementation reduces the memory footprint of Keygen/Sign/Verify by 95%/61%/85%, while maintaining or even improving speed, with gains of 0.2% to 33%.


We will submit our code as a pull request to the pqm4 library soon.


[1] https://eprint.iacr.org/2024/1929
