SLH-DSA (SPHINCS) Performance Optimization Techniques

conduition

Nov 24, 2025, 1:39:35 AM
to Bitcoin Development Mailing List
Hi devs,

I've spent the last several months implementing and benchmarking optimization techniques for the post-quantum hash-based signature scheme SLH-DSA (formerly SPHINCS+), which is being considered as a candidate for a quantum-resistant soft-fork upgrade to Bitcoin, re: BIP360.


[Image attachment: char1.png]

As a material result of my findings, I believe I now possess what may be the fastest publicly available implementation of SLH-DSA (at least on my hardware), and possibly also one of the fastest GPU implementations, though I've had difficulty finding comparable alternatives on that front. Its speed is owed to the Vulkan graphics programming API, often used by video game devs to squeeze performance out of gaming PCs and mobile phones.

The code: 

Using my CPU, this code can sign a message with SLH-DSA-SHA2-128s in just 11 milliseconds, and can generate keys in only 2 milliseconds (1ms if batched). Verification throughput approaches that of ECDSA, at around 15 microseconds (15,000 nanoseconds) per verification if properly batched. If you have a GPU with working drivers, everything runs even faster.

For perspective, the fastest open source SLH-DSA library I could find, PQClean, requires 94 milliseconds for SLH-DSA-SHA2-128s signing and 12ms for keygen on my CPU. PQClean can only achieve this speed on x86 CPUs, whereas Vulkan works on ARM devices, including Apple silicon.
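To put the figures quoted above on one scale, here is a small Python sketch that turns them into speedup ratios and a per-second verification rate. It uses only the numbers stated in this message; the constant and function names are illustrative, not part of any library.

```python
# Figures quoted in this message for SLH-DSA-SHA2-128s on the author's CPU.
VULKAN_SIGN_MS = 11.0      # signing time with the Vulkan-based code
VULKAN_KEYGEN_MS = 2.0     # key generation time (unbatched)
PQCLEAN_SIGN_MS = 94.0     # PQClean signing time
PQCLEAN_KEYGEN_MS = 12.0   # PQClean key generation time
BATCHED_VERIFY_NS = 15_000 # approximate time per batched verification


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """How many times faster the optimized figure is than the baseline."""
    return baseline_ms / optimized_ms


def ops_per_second(ns_per_op: float) -> float:
    """Convert a per-operation latency in nanoseconds to operations per second."""
    return 1e9 / ns_per_op


sign_speedup = speedup(PQCLEAN_SIGN_MS, VULKAN_SIGN_MS)
keygen_speedup = speedup(PQCLEAN_KEYGEN_MS, VULKAN_KEYGEN_MS)
verify_rate = ops_per_second(BATCHED_VERIFY_NS)

print(f"signing speedup: {sign_speedup:.1f}x")
print(f"keygen speedup:  {keygen_speedup:.1f}x")
print(f"batched verifications per second: {verify_rate:,.0f}")
```

These ratios only compare the two measurements reported here on one machine; they say nothing about other hardware.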

There are caveats. This technique is memory-hungry, requiring several megabytes of RAM for signing and keygen, so it will not help in resource-constrained environments like hardware wallets. Dedicated hash accelerator chips or FPGAs would be more appropriate for those use-cases.

Furthermore, there is a hefty startup penalty, owing to the need to compile shaders on-device at runtime, though this can be mitigated by on-disk caching and proper context scoping (e.g. don't compile verification shaders if you only need signing shaders). For daemon programs like bitcoind or lnd, I believe this would not be such a big issue, but it would be problematic for start-and-stop apps like CLI utilities.
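A minimal sketch of the on-disk caching idea, written in Python rather than real Vulkan code: compiled artifacts are stored under a key derived from the shader source, so a later process start can skip compilation. `compile_shader` is a hypothetical stand-in for the expensive driver compile step, and the cache layout is an assumption made for illustration, not how the implementation discussed here works. In an actual Vulkan program, `VkPipelineCache` objects serve a similar purpose.

```python
import hashlib
import tempfile
from pathlib import Path


def compile_shader(source: str) -> bytes:
    """Hypothetical stand-in for an expensive on-device shader compile."""
    return b"compiled:" + source.encode("utf-8")


class ShaderCache:
    """Caches compiled shader blobs on disk, keyed by SHA-256 of the source."""

    def __init__(self, cache_dir: str) -> None:
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.compile_count = 0  # number of cache misses in this process

    def _path_for(self, source: str) -> Path:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        return self.cache_dir / (digest + ".bin")

    def get(self, source: str) -> bytes:
        path = self._path_for(source)
        if path.exists():
            return path.read_bytes()  # cache hit: no compilation needed
        self.compile_count += 1
        blob = compile_shader(source)
        # Write to a temporary file first, then rename, so a crash
        # mid-write never leaves a truncated cache entry behind.
        tmp = path.with_suffix(".tmp")
        tmp.write_bytes(blob)
        tmp.replace(path)
        return blob


# Context scoping: a signing-only program requests only the signing shader.
cache_dir = tempfile.mkdtemp()
cache = ShaderCache(cache_dir)
first = cache.get("signing shader source")
second = cache.get("signing shader source")  # served from disk
print("compiles in this process:", cache.compile_count)
```

A fresh `ShaderCache` pointed at the same directory, as a restarted daemon would create, finds the existing entry and compiles nothing.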

More research is needed to gather additional data, and to assess the viability of this technique on diverse platforms. If you are interested in collaborating, please email me :)

regards,
conduition

Tim Ruffing

Nov 28, 2025, 10:47:44 AM
to conduition, Bitcoin Development Mailing List
Let me just leave a note here that this is awesome work!

I didn't expect that so much could be gained using SIMD, and that it
beats SHA-NI by such a large margin (even taking into account the
caveats you've mentioned).

Tim

On Sun, 2025-11-23 at 18:46 -0800, 'conduition' via Bitcoin Development
Mailing List wrote:
> Hi devs,
>
> I've spent the last several months implementing and benchmarking
> optimization techniques for the post-quantum hash-based signature
> scheme SLH-DSA (formerly SPHINCS+), which is being considered as a
> candidate for a quantum-resistant soft-fork upgrade to Bitcoin, re:
> BIP360.
>
> Survey article: https://conduition.io/code/fast-slh-dsa/
>
> --
> You received this message because you are subscribed to the Google
> Groups "Bitcoin Development Mailing List" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to bitcoindev+...@googlegroups.com.
> To view this discussion visit
> https://groups.google.com/d/msgid/bitcoindev/d463887f-3a9e-48a5-b61a-8680646a370an%40googlegroups.com
> .

Nagaev Boris

Dec 1, 2025, 3:36:25 AM
to conduition, hun...@surmount.systems, Bitcoin Development Mailing List
Hey Conduition,

Great work!

I just noticed that BIP360 doesn't appear to specify the exact hash function to be used in SPHINCS, and different implementations have chosen different functions.
This makes it difficult to compare the implementations directly.

Would it make sense for all implementations to target the same hash function?

CC: Hunter Beast

Best,
Boris

--
Best regards,
Boris Nagaev

conduition

Dec 1, 2025, 11:58:30 AM
to Nagaev Boris, hun...@surmount.systems, Bitcoin Development Mailing List
Thanks Boris!

> I just noticed that BIP360 doesn't appear to specify the exact hash function to be used in SPHINCS

BIP360 specifies a new witness version and address format: P2TSH (effectively taproot without the Elliptic Curve Crypto). It is agnostic to the cryptography or opcodes used in script. The changes which actually define the PQ crypto suites to add into Bitcoin consensus will be part of a new BIP that is still being drafted at the moment.


This is the same code used by PQClean, so any benchmark on libbitcoinpqc should be roughly equivalent to the PQClean performance shown in my article.*

> different implementations have chosen different functions.

While initially there was some debate about the choice of hash function to use for SLH-DSA, we ultimately decided to propose only SHA2, for a number of reasons: better legacy hardware support, better software performance, easier implementation, and less controversy. Any security benefit of using SHA3 would be nullified by the fact that the rest of Bitcoin already depends on the collision resistance of SHA2 for security.

regards,
conduition


* Since publishing my article, closer inspection has shown me that the PQClean team actually made some significant changes to the reference code, such as adding malloc/free calls on performance-critical code paths. Benchmarks on my PC show the SPHINCS+ team's reference code is noticeably faster than PQClean, especially their 'clean' (non-AVX) code, which signs in only 250ms on my CPU, compared to PQClean at 438ms. So PQClean is no longer the fastest open-source SLH-DSA implementation I know of. SLHVK still blows both of them out of the water, but nonetheless I plan to update the article once I find some time in the new year.