Updates on pre-hash for FIPS 204 and 205


Moody, Dustin (Fed)

Apr 19, 2024, 2:06:09 PM
to pqc-forum

Hello all,


As described at the 5th PQC Standardization Conference, NIST is currently planning to specify separate "pure" and "pre-hash" versions of the signature algorithms FN-DSA, ML-DSA, and SLH-DSA, as I described in a forum post in January. In each case, the base signature generation and signature verification functions will remain unchanged, but the input string will be modified in order to create domain separation.


Taking SLH-DSA as an example, the internal signing function, slh_sign_internal(M, SK, addrnd) will be the same as Algorithm 18 of FIPS 205 IPD (with the exception that if hedged signing is used, then the random value for opt_rand is passed as an input (addrnd) rather than being generated within the function). FIPS 205 will then define two API functions for signing and two API functions for verification, one of each for "pure" signing and one of each for "pre-hash" signing.


For "pure" signing, the API would be slh_sign(M, ctx, SK). For hedged signing this function would:

  • generate an n-byte random addrnd.
  • construct M' = octet(0) || octet(OLEN(ctx)) || ctx || M  // ctx must be at most 255 octets and is the empty string by default
  • return slh_sign_internal(M', SK, addrnd)  // both M' constructions are sketched in code below

For "pre-hash" signing, the API would be hash_slh_sign(M, ctx, PH, SK). For hedged signing this function would:

  • generate an n-byte random addrnd.
  • construct M' = octet(1) || octet(OLEN(ctx)) || ctx || OIDPH || PH(M)  // ctx must be at most 255 octets and is the empty string by default; PH is a NIST-approved hash function or XOF; OIDPH is the DER encoding of the OID for PH
  • return slh_sign_internal(M', SK, addrnd)
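For concreteness, here is a minimal sketch of the two constructions in Python (illustrative only: slh_sign_internal and the generation of the n-byte addrnd are assumed to be provided by the cryptographic module, and these helpers simply mirror the byte layout described above):

    def encode_pure(M: bytes, ctx: bytes) -> bytes:
        # M' = octet(0) || octet(OLEN(ctx)) || ctx || M
        assert len(ctx) <= 255
        return bytes([0, len(ctx)]) + ctx + M

    def encode_prehash(M: bytes, ctx: bytes, ph, oid_ph: bytes) -> bytes:
        # M' = octet(1) || octet(OLEN(ctx)) || ctx || OIDPH || PH(M)
        # ph is an approved hash function or XOF; oid_ph is the DER-encoded OID for ph
        assert len(ctx) <= 255
        return bytes([1, len(ctx)]) + ctx + oid_ph + ph(M)

    # slh_sign(M, ctx, SK)          -> slh_sign_internal(encode_pure(M, ctx), SK, addrnd)
    # hash_slh_sign(M, ctx, PH, SK) -> slh_sign_internal(encode_prehash(M, ctx, PH, OIDPH), SK, addrnd)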

If implementing hedged signing, then addrnd needs to be generated within the same cryptographic module as the one that performs slh_sign_internal (except when performing KAT testing). M' may be constructed outside of the cryptographic module.


For pre-hashing, the FIPS will allow any approved hash function or XOF to be used for PH. However, when defining OIDs for signatures, we plan to specify a limited number of options -- perhaps only one or two options for PH for each parameter set. OIDs will be posted to the NIST web site; they will not be specified in the FIPS.
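As an illustration of the shape of the OIDPH field (the actual permitted OIDs had not been posted at the time of writing, so SHA-256 here is only an assumed example): the DER encoding of the SHA-256 OID 2.16.840.1.101.3.4.2.1 is 11 bytes, so the tail of M' would be those 11 bytes followed by the 32-byte digest.

    import hashlib

    # DER encoding of the SHA-256 OID 2.16.840.1.101.3.4.2.1 (tag 0x06, length 0x09)
    OID_SHA256_DER = bytes.fromhex("0609608648016503040201")

    M = b"example message"
    tail = OID_SHA256_DER + hashlib.sha256(M).digest()  # OIDPH || PH(M): 11 + 32 bytes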


David Cooper

NIST PQC


Filippo Valsorda

Apr 20, 2024, 7:20:22 AM
to pqc-...@list.nist.gov
This, too, is excellent news. Built-in domain separation is not just a convenience, but the only way to ensure protocols don't expose the raw underlying signature operation in a way that makes it hard or impossible to separate later. Thank you!

Sebastien Riou

Apr 22, 2024, 8:58:19 AM
to pqc-forum, Moody, Dustin (Fed)
Hi,

Assuming we have the pre-hashing version, what is the added value of having the pure version?
I am afraid we will end up having to support both, even in constrained devices (smart cards, secure elements, ...).

Best regards,
Sebastien Riou

Bas Westerbaan

Apr 22, 2024, 9:34:28 AM
to Sebastien Riou, pqc-forum, Moody, Dustin (Fed)
On the contrary, I expect the prehashed API to see little use.

In many schemes, the messages to be signed are already short (TLS), or there is an existing mechanism for prehashing (X.509 certs). As there is no advantage in those cases to using the prehash API, I expect these to adopt the pure API.

There are clear downsides to using the prehash API:

1. We need to choose a hash function. That might require mapping existing negotiated hashes to OIDs and extra codepoints.
2. More variants (because of the choice of hash) to test.
3. It's cryptographically weaker in general. For instance: pure SLH-DSA does not assume collision resistance of the hash -- prehash SLH-DSA does.

Quoting RFC 8032 (2017) in which a prehashed variant of Ed25519 is defined:

The Ed25519ph and Ed448ph variants are prehashed. This is mainly
useful for interoperation with legacy APIs, since in most of the
cases, either the amount of data signed is not large or the protocol
is in the position to do digesting in ways better than just
prehashing (e.g., tree hashing or splitting the data). The
prehashing also makes the functions greatly more vulnerable to
weaknesses in hash functions used. These variants SHOULD NOT be
used.

Best,

Bas



Kampanakis, Panos

Apr 22, 2024, 10:46:14 AM
to Sebastien Riou, Moody, Dustin (Fed), pqc-forum, Bas Westerbaan

+1 on Bas' comment and the probable low adoption of prehash sigs. For reference, https://www.nccoe.nist.gov/sites/default/files/2023-12/pqc-migration-nist-sp-1800-38c-preliminary-draft.pdf (Appendix C) includes a summary of the pros and cons and some background on pure vs. prehash EdDSA. It is a moot point now because NIST decided differently, but IMO the limited use cases that needed prehashing could define it in their own specs.


Filippo Valsorda

Apr 22, 2024, 1:06:13 PM
to Kampanakis, Panos, Sebastien Riou, Moody, Dustin (Fed), pqc-forum, Bas Westerbaan
I also agree that the pre-hash variant will not be commonly used, and will nonetheless add implementation and testing complexity.

Maybe a solution would be to specify only the context string in the API, and then provide guidance on how to include the hash OID in the context string.

The hash is after all only one of the parameters that describe the message.
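A rough sketch of that idea, reusing the pure API from NIST's post (hypothetical: the layout of the context string here is mine, not part of any proposal; the OID bytes are those of SHA-256):

    import hashlib

    OID_SHA256_DER = bytes.fromhex("0609608648016503040201")

    def hash_then_sign(M: bytes, ctx: bytes, SK, slh_sign):
        # Fold the pre-hash OID into the context string, then sign the
        # digest as an ordinary "pure" message.
        return slh_sign(hashlib.sha256(M).digest(), ctx + OID_SHA256_DER, SK)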

Sebastien Riou

Apr 22, 2024, 5:15:05 PM
to Filippo Valsorda, Kampanakis, Panos, Moody, Dustin (Fed), pqc-forum, Bas Westerbaan
Thanks for your answers. I get that in your technical context the pure version is enough and you see pre-hash as unnecessary.

My question is about the value of keeping the pure version in a standard that has the pre-hash version. Isn't it just adding complexity to the standard? (It is simpler than pre-hash, but adding something, no matter how simple, is adding complexity.)

My understanding is that pre-hash can do everything that the pure version does, hence my question: what do we gain by keeping the pure version?

Best regards,

Sebastien Riou

Director, Product Security Architecture
PQShield Ltd

M: +33 782 320 285
E: sebasti...@pqshield.com
W: www.pqshield.com


Bas Westerbaan

Apr 23, 2024, 2:59:22 AM
to Sebastien Riou, Filippo Valsorda, Kampanakis, Panos, Moody, Dustin (Fed), pqc-forum
My understanding is that pre-hash can do everything that the pure version does, hence my question: what do we gain by keeping the pure version?

The pure version is strictly more secure for SLH-DSA (it doesn't require collision resistance of the hash) and it is more compatible. (If both sides support the pure version, you're golden -- if either side supports prehash, you also still need to support the same hash.)

Best,

Bas

Robin Larrieu

Apr 26, 2024, 9:44:49 AM
to pqc-...@list.nist.gov
Dear all,

On the one hand I understand the benefits of domain separation from a theoretical point of view, but on the other hand I am not convinced by the relevance of a pure vs prehash distinction at the algorithm level. From an abstraction point of view, I prefer to view the signature primitive as "authenticate some arbitrary-length data", rather than "authenticate some data, that may or may not be prehashed, with an optional context string"; the latter feels like a protocol consideration that could be handled at the protocol level. I believe that having this distinction at the algorithm level, with a separate API for each, may cause headaches for protocol designers and implementers. For example:
- the protocol requires authenticating some data, prepended with a context string. Should I use sign(M, ctx, SK), or sign(ctx || M, "", SK)? Notice that M' = (octet(0) || octet(OLEN(ctx)) || ctx || M) in the first case, while M' = (octet(0) || octet(0) || ctx || M) in the second case (see the sketch after this list).
- the protocol requires authenticating some data; that data happens to be a hash. Should I use a pure or a pre-hash signature?
- the protocol requires authenticating some data, which is formatted as in the proposal to provide domain separation. Should I use a pure signature, or should I pass that data directly, as per "M' may be constructed outside of the cryptographic module"?
- I need, for whatever reason, to replace ML-DSA/SLH-DSA in my application by some other algorithm that does not support this distinction and has a different API because of this (e.g. XMSS as per RFC 8391). What message should I sign with this new algorithm, especially if the context string is non-empty and/or if the data was pre-hashed?
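To make the first bullet concrete: a quick check with the encode_pure sketch from NIST's post above shows that the two candidate calls feed different inputs to the internal signing function, so the resulting signatures are incompatible.

    ctx, M = b"my-protocol-v1", b"payload"
    a = encode_pure(M, ctx)        # 0x00 || 0x0e || ctx || M
    b = encode_pure(ctx + M, b"")  # 0x00 || 0x00 || ctx || M
    assert a != b                  # same logical data, two incompatible signatures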

All of these ambiguities disappear when sticking to the definition of the signature primitive as "authenticate some arbitrary-length data", and letting the protocol handle domain separation if needed. Moreover, as others have noted, the prehash variant will probably see low adoption, like preHash-EdDSA; its main benefit in practice is that it makes implementation of multi-part interfaces trivial, while the two passes over the message performed by EdDSA and SLH-DSA are incompatible with this (and come with a performance overhead for long messages). But maybe there are other solutions to this multi-part API issue. I will start another thread on this topic.

As a side note, since "M' may be constructed outside of the cryptographic module", I guess that implementations of sign_internal/verify_internal are supposed to validate that the input is in one of these formats? If so, please mention it in the standard and prepare the test data accordingly (possibly including invalid inputs that must be rejected).

As another side note, what about the "signed envelope" interface (where the message is recovered from the signature)? After all, this was the interface that was imposed during the competition, and therefore used by every submission. If NIST does not consider this interface to be relevant, why did they use it for the submission template in the first place? I am not sure it is used by many protocols, but it may still be useful to support it; the problem is that it is not a generic construction: some submissions concatenated the message then the signature, some did the opposite, and some even used a specific encoding (e.g. Falcon); also, it is not clear how this combines with the pure vs pre-hash signature.

TL;DR: I would find it much more elegant (and simpler to use in higher-level protocols) if domain separation between the pure-signature and hash-then-sign scenarios was handled at the protocol level when needed, rather than in the signature algorithm itself. Perhaps the generic construction proposed here could be treated in a separate document giving recommendations for protocol designers instead.

Speaking for myself,
Robin Larrieu

Bobby McGee

Apr 26, 2024, 1:01:58 PM
to pqc-forum, Robin Larrieu
"Authenticate what is meant, not what is said (and know the difference)" is a sound principle, but where should it be enforced? More generally, where does NIST end and the rest of the world begin? What are NIST's guidelines or principles for where their standards end and where applications of those standards begin? Does NIST define "digital signature algorithm" (*-DSA seems to be the thing they're standardizing here)? Would it be beneficial to have some strict definition of the boundaries of what is being standardized?

Falko Strenzke

Apr 29, 2024, 2:02:55 AM
to Robin Larrieu, pqc-...@list.nist.gov
Hi Robin,


The problem with your suggestion is that, for a protocol that does not hash any metadata, specifically not the signature algorithm identifier, when computing the message digest, an ambiguity arises with respect to how to interpret the signature. A signature over the pre-hashed variant of SLH-DSA interpreted as the non-pre-hashed variant would lead to an existential forgery: if an adversary changes the signature algorithm identifier accordingly, it appears that the signer signed hash(m), while he in fact signed m. Similarly, the ctx allows one to distinguish between composite and stand-alone use of the signature algorithm in a sound way that doesn't allow stripping off one signature without this being detected.
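A toy illustration of the ambiguity described here (a stand-in MAC, not a real signature scheme; it only shows why the octet(0)/octet(1) prefix matters):

    import hashlib, hmac

    def toy_sign_internal(M_prime: bytes, sk: bytes) -> bytes:
        # Hypothetical raw signing primitive with no domain separator.
        return hmac.new(sk, M_prime, hashlib.sha256).digest()

    sk, m = b"secret key", b"pay Alice 10 euros"
    sig = toy_sign_internal(hashlib.sha256(m).digest(), sk)
    # The same sig passes verification both as a "pre-hash signature of m" and
    # as a "pure signature of the 32-byte message sha256(m)". Only the algorithm
    # identifier distinguishes the two readings, and here the attacker controls it.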

- Falko

--

MTG AG
Dr. Falko Strenzke
Executive System Architect

Phone: +49 6151 8000 24
E-Mail: falko.s...@mtg.de
Web: mtg.de



Robin Larrieu

Apr 29, 2024, 10:52:20 AM
to Falko Strenzke, pqc-...@list.nist.gov
Hi Falko,

Thank you for your feedback.
I just want to make it clear that I am aware of the importance of domain separation; I am simply raising some issues I see when doing domain separation *at the algorithm level* rather than at the protocol level.
The first issue is the ambiguity that may arise when integrating such algorithms in existing protocols. Concretely, if a protocol relies on an abstract signature scheme (without specific features like the pure vs. prehash distinction) to authenticate data, one would think that the default behavior should be to use the pure signature with no context string. However, the situation is not so clear if, when reading the protocol specification, one notices that the data to be authenticated is some 32-byte string that happens to be the output of a hash function: should one stick to the default behavior for arbitrary data, or should one switch to prehash mode? Similar questions arise when the protocol includes context strings, or formats the message to provide domain separation (see my initial message).
My second concern is what some call "crypto-agility", that is, the ability to easily change the underlying cryptographic algorithm. It turns out that many legacy applications have hard-coded RSA as their crypto layer, which makes migration to PQC a significant technical and financial effort. Introducing the new cryptographic algorithms could be the occasion to solve the issue and facilitate the next migration if needed; however, I believe that introducing such advanced features at the algorithm level (with the interface changes it implies) is just asking to make the same mistake of hard-coding algorithms again. For this reason, I believe that the interface of all crypto primitives should be kept as simple and uniform as possible, and tweaks or features specific to this or that algorithm should be avoided at all costs.

The proposed transform is interesting because it provides domain separation independently of the underlying signature scheme (provided that it supports signing arbitrary-length data). Therefore, I would put it in the same category as modes of operation for block ciphers, in the sense that some are more secure than others and one should use the appropriate one depending on the expected properties. Thus I would find it more elegant (and more practical, for the reasons above) if it were a tool for protocol design rather than a feature specific to FIPS 204/205 as an attempt to save insecure protocols.



I am aware that without proper domain separation, a valid SLH-DSA signature for SHA512(m) is also a valid SHA512-SLH-DSA signature for m, and that is why domain separation can be important. Considering these are two different signature schemes, it is debatable whether or not this qualifies as an existential forgery (in the EF-CMA game for SHA512-SLH-DSA, one does not get access to an SLH-DSA signature oracle), but I get your point.
However, I think this does not matter that much in practice. If a user is allowed to query the SLH-DSA signature for a random-looking 64-byte string against some private key, it is probably also allowed to query the signature for m against that same private key (either directly SLH-DSA-sign(m, sk), or SLH-DSA-sign(SHA512(m), sk), or SHA512-SLH-DSA-sign(m, sk)). I do not claim extensive knowledge of existing protocols, so I may be wrong. Maybe there are protocols where such an attack is relevant, typically if the same private key is used in several different protocols (but then you probably have bigger problems).
Notice also that the premise was "a protocol that does not hash any metadata, specifically not the signature algorithm identifier", so maybe the protocol itself should be fixed instead. Indeed, such a protocol would break with XMSS or LMS, which did not define pure/prehash variants.

Best,
Robin Larrieu

Sophie Schmieg

Apr 29, 2024, 12:27:23 PM
to Robin Larrieu, Falko Strenzke, pqc-...@list.nist.gov
I have to admit, I do not really see the point of the prehashed variant. It vastly complicates what an SLH-DSA key looks like and acts as an obstruction to a generic signature API.

Since the prehashed version requires an additional hash function, this parameter choice needs to be included in the public key, as it has to be independently authentic information; otherwise an attacker could switch to a hash function that is not collision-resistant. Even if an SLH-DSA key is never used prehashed, it would still need to include this information in the public key, as it might be.

On top of that, the prehashed and not-prehashed versions of SLH-DSA would be mutually incompatible, making the signature verification algorithm as currently stated not well-defined, as a verifier has to be aware of whether a message was signed prehashed or not in order to successfully verify it.

In practice, this means this information has to be either fixed for a use case, or included with the signature.
If a use case fixes the choice of prehashed or not prehashed, the domain separator is not needed, and instead of standardizing a prehashed variant of SLH-DSA, use cases that require prehashing could just specify to sign/verify the hash of the longer message.
If the choice is made dynamically, it has to be included with the message at protocol level, unless a verifier is expected to try both variants. This again would force the protocol to carry this information, making the domain separation at protocol level straightforward and even automatic, if the best practice of leaving no ambiguity for signature inputs is followed.

Furthermore, if the choice of SLH-DSA variant has to be taken dynamically, the API for SLH-DSA would be incompatible with any other signing scheme, as both signing and verifying now require an additional input indicating whether the prehashed variant should be used, which will break any implementation that generalizes signature schemes to allow for cryptographic agility.

There might be a concern, in the case of delegating prehashing completely to the protocol layer, that a caller who first uses the cryptographic module to compute the hash might not give the same hash back to the module to be signed/verified; but such a caller would be free to be dishonest about the message in any way they want in any case, so keeping the hash inside of the cryptographic module seems to have no security benefit.

Note that prehashed versions of EdDSA have been so unpopular that they were not even included in ISO 15118, even though the embedded use case there would ostensibly benefit from having the prehashed version.

I'm not sure if I am missing something, but to me, including the prehashed version of SLH-DSA seems to have very limited advantages, while on the other hand increasing the complexity of implementations considerably and leading to incompatibilities down the line.



--

Sophie Schmieg | Information Security Engineer | ISE Crypto | ssch...@google.com

Markku-Juhani O. Saarinen

Apr 29, 2024, 2:00:19 PM
to pqc-forum, Sophie Schmieg, Falko Strenzke, pqc-...@list.nist.gov, Robin Larrieu
Hi Sophie & Co,

I'd imagine that different "pqalg-prehash" combinations could be expressed similarly to different "parameter sets", so this does not complicate the APIs much. As you note, these are different algorithms from the interoperability viewpoint. The proposal from NIST was to include an "OID" (of some kind, not ASN.1) in the domain separation to specify the hash, and there would be a very limited number of options (NIST doesn't have that many suitable hash functions).

Example 1: There seem to be some users that really want to use SHA-2 with ML-DSA, and this prehashing/OID business is the only way to do it. For example, "ML-DSA-87-SHA512" is my interpretation of what the new CNSA 2.0 FAQ suggests (see "Q: Can I use SHA-3 as a hash?").

Another point: especially with SLH-DSA, we shouldn't be too focused on the basic PKI/TLS case -- where signed messages are short and computers are powerful (and SLH-DSA is unlikely to be used, except perhaps as a root, but certainly not for challenge-response authentication). There are applications where one can't hold the entire message data in memory but wants to use an incremental "update" hash API with signatures. The two-pass behavior of SLH-DSA makes things difficult without prehashing. If this option is not explicitly included in the standard, engineers are forced to do it in an ad hoc manner, which is dangerous. It's easy enough for an engineer to think "what's the danger of signing a hash?", and then all of a sudden we have the same potential issue of existential forgeries again. Alternatively, these engineers potentially have a product that can't be FIPS certified if strong language is put in place to prohibit this.

Additionally, there are also use cases where one can't access the signature or public key before starting to process (hash) the message, or hybridization, where the same message is signed by multiple algorithms, etc.

Example 2: I see no obvious way of implementing hybrid firmware verification or certain attestation (signing) tasks in a Root-of-Trust unit without prehashing. As with the CNSA/SHA-2 case, hybridization is a real use case that we must support -- it is seemingly required for compliance in Europe (BSI allows hash-based signatures without hybridization, but not other schemes, and I think ANSSI holds a similar line). Double-hashing (instead of computing one hash that serves both the classical signature and the PQ signature) in the secure boot process makes existing solutions unworkable. And there is no serious security rationale or actual attack to justify it.

As a side note, I don't see collision resistance as a problem, especially for SHAKE/SHA-3. It's not just that it generally has a higher security margin: with permutation-based hashes, one can turn a collision attack against the internal capacity bits into a second-preimage attack of the same complexity.

Cheers,
- markku

Sophie Schmieg

Apr 29, 2024, 4:24:18 PM
to Markku-Juhani O. Saarinen, pqc-forum, Falko Strenzke, Robin Larrieu
I do agree that prehashing is something someone might want to do, I just do not see why this needs to be part of FIPS 205. You can, after all, always hash a message and then sign the hash of this message, without requiring any changes to the underlying signature algorithm, as this is a protocol consideration. It will usually result in the hash being applied twice, but that doesn't really matter for security, and as you noted, that level of performance considerations is unlikely to matter in SLH-DSA's most common use cases. My problem with prehashing isn't the fact that things get prehashed by itself, or even the need for a streaming implementation, but rather that this should happen on the protocol layer, and not within the signature scheme itself, just as the many other considerations that need to be dealt with on the protocol layer.

From what I understand, the proposal here is to add a domain separator into the signature scheme itself, to capture whether the message in question is any kind of byte sequence or the byte sequence produced by a hash function. With the hash function unspecified, this does not even fix the length of the input, making the two scenarios fairly indistinguishable from the viewpoint of the signing scheme.
OTOH, domain separation only has a value when signing with the same key, as different keys are inherently separated, and even with the knowledge of both keys, it is usually impossible to produce a signature verifying under both of them (different from, say, GMAC, where this property is fairly straightforward, but a one-byte domain separator wouldn't help there either). Therefore, at least from what I've gathered, the intention must be to use the same key both to produce prehashed and not-prehashed signatures, leading to my statement that verification in this case is not well defined. After all, if it is a separate version similar to key size or f/s variant, there would be no need to introduce a domain separator. I do think mentioning that one can devise a protocol that hashes a message before signing it would be valuable; I'm mostly trying to argue against this protocol design decision being reflected within the algorithm itself.

David A. Cooper

Apr 30, 2024, 5:12:24 PM
to pqc-forum
Just speaking for myself, I am having trouble understanding some of the concerns about having both pure and pre-hash versions of SLH-DSA and ML-DSA. Some of the issues that are raised as potential problems don't seem to me to be any different from what we already have with RSA and ECDSA.

One of the concerns seems to be that one cannot determine exactly how a message was signed just by looking at the public key. Today with ECDSA, the public key is just identified as being an elliptic curve key, but there are multiple choices for the hash function that is used to compute the signature. Applications are never left to guess which hash function was used; that information is provided along with the signature. For example, there are different signature OIDs for ecdsaWithSHA1, ecdsaWithSHA256, ecdsaWithSHA384, etc. The situation with RSA is similar, but even more complicated, since the signature could be created using the PKCS #1 v1.5 padding scheme, or PSS, or something else. Just as with ECDSA, any necessary information is provided along with the signature -- the padding scheme, the hash algorithm used, and (in the case of PSS) the values of any other parameters associated with the signature.

The situation with SLH-DSA could be the same. When using OIDs, there could be one OID for each parameter set to identify the public key. The signature OID could then indicate whether pure or pre-hash signing was performed and, in the case of pre-hash, the hash function used. (Something similar could be done for protocols that do not use OIDs.) Another option (suggested in https://mailarchive.ietf.org/arch/msg/spasm/Ssk0hTwLm2ao0Fkvxa7jyfdMREg/) would be to use the same OID for both the public key and the signature. In that case, a given public key could only be used one way or the other, and the domain separation in the signature would be a matter of belts and suspenders in case, for example, the same public key was distributed as both a pure signing key and a pre-hash signing key.

There was a question about what a protocol should do if the message to be signed was a hash. But that same issue applies to ECDSA and RSA. What does one do with RSA if the message to be signed happens to be a hash? Suppose the message to be signed, M, happens to be the SHA-256 hash of something and the signature identifier is sha256WithRSAEncryption (i.e., PKCS #1 v1.5 padding with SHA-256 hash). Should the PKCS #1 v1.5 padding be applied to M and then modular exponentiation applied to the result, or should the PKCS #1 v1.5 padding be applied to SHA-256(M)?

There was a question of what one should do if the message to be signed happens to be formatted in the same way as the formatting for domain separation. Wouldn't the same question arise if one were signing using PKCS #1 v1.5 and the message happened to be formatted in the same way as that padding scheme?

I am not familiar with ISO 15118, so I cannot comment on that. It may be the case that the modules that perform the signing are constrained, but how large are the messages that need to be signed? It certainly makes sense that some applications will specify that only the pure version can be used. But can we definitely say that no applications will need the pre-hash? For all practical purposes, a cryptographic module cannot sign a message using SLH-DSA unless it has sufficient memory to store the entire message. So, if we didn't define a pre-hash variant, what should be done if the message to be signed is larger than the cryptographic module's memory? One can imagine changing SLH-DSA to allow the randomizer to be set to the output of a random number generator, so that signing could be performed with just one pass over the message, but this was never discussed during the multiple rounds of review, and no one proposed it during the public comment period either. This still wouldn't address the concerns of some about having a low bandwidth connection to the cryptographic module.

Another option would be to only allow pure signing and leave it to applications to ensure that the messages to be signed are never too large. For many applications this would not be an issue at all. But, for those applications for which it is an issue, wouldn't requiring the application to change in order to ensure that the message to be signed is small make the transition more difficult than if an option for signing large messages (pre-hashing) was made available?

Mike Ounsworth

Apr 30, 2024, 6:02:37 PM
to David A. Cooper, pqc-forum

+1 to David's comment.


> But can we definitely say that no applications will need the pre-hash?


IMO, "need" is a very strong word, and the reason this debate has circled so much. There are other potential solutions (modify all the protocols), so strictly speaking no applications "need" the pre-hash.


That said, I know that my colleagues in the HSM firmware space have been bemoaning the loss of pre-hashed signatures. The cases that have come up are:


  • CRLs -- can be large, several MB -- still within the RAM of any reasonable HSM -- and signed infrequently, like once a day.
  • OCSP responses -- are small, but a CA can need to sign potentially billions per day, potentially on-demand in the nonce-based mode; the bigger problem is that without pre-hashing, OCSP responses are variable length, which means you can't optimize on the size of those input packets. That seems like a minor performance/throughput issue, but it does add up over millions to billions per day.
  • Proprietary signed objects -- it's hard to reason about things that are not publicly specified, but I understand that there exist lots of kinds of proprietary signed things -- especially proprietary code and firmware signing -- that do not pre-hash the data (and why would you, when the data format was designed in the era of RSA and ECDSA?). When you're signing something like an operating system, which can be several GB over many thousands of library files signed individually, just the act of streaming all that data over the network into the high-security area where the HSM lives can have latency impacts on your entire build infrastructure.


These data formats are old enough to be effectively rusted shut. We can change the signing algorithms, but the ship has long sailed on changing the CRL data format to include a flag about whether the signature is over the raw data or over a pre-hash of it (and which hash function was used by the signer).


So nothing will catch fire if we don't get pre-hashed versions of SLH-DSA and ML-DSA, but performance and throughput of some specific high-volume and latency-sensitive use cases will suffer, even on big rack-mount HSMs.


But also, I'm sympathetic to the fact that pre-hashing undoes the domain separation and collision resistance that are built into SLH-DSA and ML-DSA (aka "Yes, it's not worse than RSA and ECDSA, but it's not the 1980s anymore and we can do better"). So I'm sympathetic to the view that providing pre-hashed versions is effectively NIST endorsing a weakening of the spec, because implementers will see the OIDs for the pre-hashed versions and use them without reading the fine-print warnings. Punting this to protocol designers, as Sophie suggests, gives a better chance that the people deciding to support pre-hashing will be aware of the fine print and can build replacement domain separators -- e.g., H(pk || m) -- into the protocol-level pre-hash. But, as stated above, that only works for protocols and data formats that are not rusted shut.


I see both sides of this. I honestly don't know what the right decision is.


---

Mike Ounsworth

Sophie Schmieg

Apr 30, 2024, 8:27:13 PM
to Mike Ounsworth, David A. Cooper, pqc-forum
I am not opposed to having support for prehashed signatures; I am specifically opposed to the specific way prehashed signatures are proposed here. If I understand things correctly, the idea is to have two versions of both ML-DSA and SLH-DSA, where one is Sign(0x00 || message) and the other is Sign(0x01 || Hash(message)), where "Sign" is the base version of ML-DSA/SLH-DSA, with this base version not available to callers. The reasons I dislike this idea differ between ML-DSA and SLH-DSA:

In the case of SLH-DSA (and EdDSA) it is not possible to prehash transparently, since both of these schemes pass twice over the input. In other words, it is impossible to create a prehashed interface that has the same property of being transparent to the verifier: the verifier needs to be aware of the mode used to sign. This is where the suggestion is to have this information in the OID or similar. While that works, it makes the usage of a domain separator superfluous, as a public key that is only ever used prehashed or only ever used not-prehashed would not require this type of domain separation. I.e., if this is the intended use case, we should just use Sign(message) or Sign(Hash(message)) and give those two different OIDs, which I would call a protocol decision, since the protocol needs to describe what an OID stands for. The algorithm here should just provide an interface for signing arbitrary byte sequences, and leave the hashing decision to that OID description. However, the inclusion of the domain separator suggests that this is not the intended way of using prehashed signature schemes, which leads to my comment that verification is not well-defined. Basically, either way, this should not have an influence over the definition of SLH-DSA or ML-DSA, since the only place where it matters is a higher-level protocol. Especially if the hash function is not determined by NIST, but left up to the protocol to choose, pushing this functionality into the signature verification/signing algorithms means far higher implementation complexity, as this now has a very different interface from a standard signature algorithm, requiring additional parameters, which could just be left to the protocol description/implementation, where the OID specifies the hash function used for prehashing.

For ML-DSA, there is an additional consideration. The big difference in that case (similar to ECDSA and RSA) is that the only thing needed by the signer is a hash of only public inputs, usually just a hash of the message. In the case of Dilithium/FIPS 204 we can do the same thing, with the slight caveat that we need to hash in the public key for Dilithium before handing this hash to the signer (called μ in the draft standard). This can be accomplished in a streaming interface, and the resulting signature would be no different from a non-prehashed Dilithium signature. This transparent prehashing option means less complexity in the implementation, as the verification logic does not need to be aware of the way the signing logic was done. On top of that, it also allows for things like moving an existing key from a software implementation to an HSM which only supports streaming inputs. (Ideally one would rotate the key while doing so, so this is a less important property in my mind.)
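For reference, a sketch of that transparent option (following the draft's definition of μ as a 64-byte SHAKE256 hash of tr || M, where tr is the stored hash of the public key; the exact lengths are per the FIPS 204 draft and may differ in the final text):

    import hashlib

    def compute_mu(tr: bytes, message_chunks) -> bytes:
        # mu = SHAKE256(tr || M, 64 bytes). The message can be fed in chunks,
        # so it never has to fit in memory at once,
        # e.g. mu = compute_mu(tr, [chunk1, chunk2, ...]).
        h = hashlib.shake_256()
        h.update(tr)
        for chunk in message_chunks:
            h.update(chunk)
        return h.digest(64)

    # Only the 64-byte mu needs to cross into the module holding the ML-DSA
    # private key; the resulting signature is indistinguishable from one
    # computed over the full message inside that module.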

In summary, using this specific pattern for prehashing will lead to higher implementation complexity, while not really solving the problem it is supposed to solve.

Mike Ounsworth

Apr 30, 2024, 8:55:52 PM
to Sophie Schmieg, David A. Cooper, pqc-forum

Hi Sophie,


> In the case of Dilithium/FIPS 204 we can do the same thing, with the slight caveat that we need to hash in the public key for Dilithium before handing this hash to the signer (called μ in the draft standard). This can be accomplished in a streaming interface, and the resulting signature would be no different from a non-prehashed Dilithium signature.


I have a question about this (probably because I have not been following all of this thread in detail).


One of the advantages of (external) pre-hashing with RSA is that you can do the pre-hash outside of the FIPS boundary -- e.g., in the PKCS#11 client library, which may be in a different datacentre from the FIPS module (HSM), so that you only need to transmit the hash over the wire. As far as FIPS 186-5 (which chains to RFC 8017 for RSA-PSS) is concerned, that is the message m, and whether it's actually a message or the hash of something else is completely irrelevant to your FIPS certification -- i.e., the hash function used to compute the pre-hash is not mentioned in FIPS 186-5 / RFC 8017 and is therefore not subject to FIPS validation.


My concern / question about FIPS 204 (and the ECDSA part of FIPS 186-5, for that matter) is: if you take the pre-hash step (step 6 of FIPS 204 section 6, or step 1 of FIPS 186-5 section 6.4.1) and perform it externally to your FIPS module, have you horribly violated your FIPS boundary in a way that will never pass a FIPS validation? In other words, I feel like you can't actually "lift" the pre-hash step out of the FIPS module; you can only add an extra round of hashing in front, which cannot be done transparently to the verifier -- but I would love for someone with more clarity to confirm.


---

Mike Ounsworth

D. J. Bernstein

May 1, 2024, 5:07:15 AM
to pqc-...@list.nist.gov
'Sophie Schmieg' via pqc-forum writes:
> we should just use Sign(message) or Sign(Hash(message)) and give those
> two different OIDs, which I would call a protocol decision

If both options exist then, sure, interoperability boils down to the
protocol knowing which option it's using for each signature.

But what happens to _security_ if the attacker takes Sign(H(m)) and
sends that along with the other OID? Does this end up delivering H(m),
instead of m, to the application, with perhaps catastrophic effects?

People disagree regarding the best way to stop this sort of attack. Here
are some approaches:

 * Whatever mechanism is used to authenticate the pk, use that for the
   (OID,pk) pair, prohibiting any other OID being used for that pk.

 * Set up the signature details so that a signature under one OID
   isn't (and can't be converted into) a signature verifying under the
   other---or just settle on one variant with one OID.

 * Don't set up a protocol with the "agility" to choose between
   Sign(m) and Sign(H(m)). Don't use the same pk across protocols.

These approaches ask different parts of the ecosystem to take action
(the signature-system designers for the second approach, the protocol
designers for the first and third approaches), so each side has an
easier default path of taking zero action and blaming the other side.

---D. J. Bernstein

Sophie Schmieg

May 1, 2024, 1:32:15 PM
to pqc-...@list.nist.gov
A key should only be used with one protocol as it is. Crypto agility that switches to a different OID needs to switch to a different key as well.


Mike Ounsworth

May 1, 2024, 1:53:06 PM
to Sophie Schmieg, pqc-...@list.nist.gov

Using the same OID for the signature and the public key, and baking into the OID whether it's direct or pre-hash, goes a long way toward guaranteeing that a single key doesn't mix modes.


---

Mike Ounsworth


David A. Cooper

May 6, 2024, 1:07:15 PM
to Mike Ounsworth, pqc-forum
Hello Mike,

On 4/30/24 5:55 PM, Mike Ounsworth wrote:

One of the advantages of (external) pre-hashing with RSA is that you can do the pre-hash outside of the FIPS boundary -- e.g., in the PKCS#11 client library, which may be in a different datacentre from the FIPS module (HSM), so that you only need to transmit the hash over the wire. As far as FIPS 186-5 (which chains to RFC 8017 for RSA-PSS) is concerned, that is the message m, and whether it's actually a message or the hash of something else is completely irrelevant to your FIPS certification -- i.e., the hash function used to compute the pre-hash is not mentioned in FIPS 186-5 / RFC 8017 and is therefore not subject to FIPS validation.

This is not exactly correct. For both RSA and ECDSA, FIPS 186-5 says that an approved hash function shall be used, but doesn't specify what that hash function is. For ECDSA, computing the hash of the message is step 1 in Section 6.4.1 of FIPS 186-5. For RSA, computing the hash of the message is step 2 in Section 9.1.1 of RFC 8017 for PSS and step 1 in Section 9.2 for PKCS #1 v1.5. For FIPS validation of an implementation of RSA, the hash function(s) supported are considered as part of the testing (see Section 6.3 of https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Algorithm-Validation-Program/documents/dss2/rsa2vs.pdf).

FIPS validation does, however, allow for the hash to be computed in a different cryptographic module. This is addressed at https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/component-testing, which covers cases in which the implementation of a cryptographic algorithm may be split across multiple cryptographic modules.

My concern / question about FIPS 204 (and the ECDSA part of FIPS 186-5, for that matter) is: if you take the pre-hash step (step 6 of FIPS 204 section 6, or step 1 of FIPS 186-5 section 6.4.1) and perform it externally to your FIPS module, have you horribly violated your FIPS boundary in a way that will never pass a FIPS validation? In other words, I feel like you can't actually "lift" the pre-hash step out of the FIPS module; you can only add an extra round of hashing in front, which cannot be done transparently to the verifier -- but I would love for someone with more clarity to confirm.

For ECDSA, see https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/component-testing#ECDSASigGenPrim.

For ML-DSA, Ray Perlner noted during the FIPS 204 Update at the 5th PQC Standardization Conference that the message representative, μ, could be computed in a separate cryptographic module. We will include a note about that in the final version of FIPS 204.

Mike Ounsworth

May 7, 2024, 1:26:22 PM
to David A. Cooper, pqc-forum

That clarifies things. Thank you, David!


---

Mike Ounsworth



