Whether to hash-then-sign with Dilithium and Falcon?

784 views
Skip to first unread message

Mike Ounsworth

unread,
Aug 17, 2022, 1:27:13 PM8/17/22
to LAMPS, cf...@irtf.org, pqc-forum, jak...@amazon.com, kpa...@amazon.com, se...@ssn3rd.com, b...@westerbaan.name
Hi Jake, Panos, Sean, Bas,


We notice that your IETF draft-massimo-lamps-pq-sig-certificates-00 has the following security consideration:

> Within the hash-then-sign paradigm, hash functions are used as a domain restrictor over the message to be signed. By pre-hashing, the onus of resistance to
> existential forgeries becomes heavily reliant on the collision-resistance of the hash function in use. As well as this security goal, the hash-then-sign paradigm also
> has the ability to improve performance by reducing the size of signed messages. As a corollary, hashing remains mandatory even for short messages and assigns a
> further computational requirement onto the verifier. This makes the performance of hash-then-sign schemes more consistent, but not necessarily more efficient.
> Dilithium diverges from the hash-then-sign paradigm by hashing the message during the signing procedure (at the point in which the challenge polynomial).
> However, due to the fact that Dilithium signatures may require the signing procedure to be repeated several times for a signature to be produced, Dilithium
> implementations can make use of pre-hashing the message to prevent rehashing with each attempt.


First, quoting from the Dilithium NIST Round 3 submission documents:

> Since our signing procedure may need to
> be repeated several times until a signature is produced, we also append a counter in order
> to make the SHAKE-256 output differ with each signing attempt of the same message.

So it seems like the Dilithium designers explicitly want the hash to differ across repeated attempts.



Second, we had a similar discussion within the context of composite signatures when figuring out how to combine Dilithium and Falcon with ECDSA and RSA. We came out with a different conclusion; that hash-then-sign reduces the security properties of Dilithium and Falcon down to the collision resistance of the hash function used to pre-hash.

We would like community opinion on this.


Here's the Security Consideration text that we're working on:




In the hash-then-sign paradigm, the message to be signed is hashed externally to the signature primitive, and then the hash value is signed.

The hash-then-sign paradigm is required, for example, with RSA signatures in order to sign messages larger than the RSA modulus. Hash-then-sign also gives performance and bandwidth benefits, for example, when the signature is performed by a networked cryptographic appliance since you only need to send a small hash value rather than streaming the entire message.

With Dilithium and Falcon signatures it is not recommended to pre-hash for the following reasons:


The Dilithium construction includes

~~~
Sign(sk,M):
10: mu \in {0, 1}^384 := CRH(tr || M)
~~~

where `CRH` is any collision-resistant hash function and `tr` is a component of the secret key. This provides strong security against pre-computed collision attacks since an attacker has no a-priori knowledge of `r` and provides per-key hash-domain separation of the message to be signed.


The Falcon construction includes

~~~
Sign (m, sk, beta^2):
1: r <- {0, 1}^320 uniformly
2: c <- HashToPoint(r || m, q, n)
~~~

where `HashToPoint` is a SHAKE-256-based construct. This provides strong security against pre-computed collision attacks since an attacker has no a-priori knowledge of `r` and provides per-signature hash-domain separation of the message to be signed.

If the message to be signed is pre-hashed, for example `m0 = SHA256(m)` and then m0 provided to Dilithium or Falcon to sign, then you have re-introduced the collision problem since two messages m1 and m2 where SHA256(m1) == SHA256(m2) hash value will result a single Falcon or Dilithium signature value which is simultaneously valid for both m1 and m2. This removes the extra collision resistance built in to the Dilithium and Falcon primitives and reduces it to the collision resistance strength of the underlying hash function. For this reason it is in general not recommended to pre-hash when using Dilithium or Falcon except in cases where the implementor is comfortable with this reduction in security.

Therefore, for the purpose of interoperability of composite signatures, implementations MUST NOT pre-hash messages for Dilithium and Falcon. If pre-hashed versions of these signatures are desired, then separate signature algorithms will need to be defined.






Third, I can imagine that some applications (like TLS) will want to use non-pre-hashed versions of Dilithium and Falcon, but other applications (like code-signing) would prefer pre-hashed versions. These are not interoperable with each other. Is NIST planning to produce algorithm definitions, OIDs, Codepoints, etc, for both versions?


---
Mike Ounsworth
Software Security Architect, Entrust

John Gray
Sr Prin Software Developer, Entrust
Any email and files/attachments transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed. If this message has been sent to you in error, you must not copy, distribute or disclose of the information it contains. Please notify Entrust immediately and delete the message from your system.

Scott Fluhrer (sfluhrer)

unread,
Aug 17, 2022, 1:50:16 PM8/17/22
to Mike Ounsworth, LAMPS, cf...@irtf.org, pqc-forum, jak...@amazon.com, kpa...@amazon.com, se...@ssn3rd.com, b...@westerbaan.name
Hmmm, I don't see that in Dilithium; are they referring to the internal ExpandMask function? That isn't applied to the input message.

In any case, it's easy to derive SHAKE( M || 1 ), SHAKE( M || 2 ), ... without multiple passes through M; you compute the partial SHAKE state after process M, and then apply that partial state to 1, 2, ...

>
>
> Second, we had a similar discussion within the context of composite
> signatures when figuring out how to combine Dilithium and Falcon with
> ECDSA and RSA. We came out with a different conclusion; that hash-then-
> sign reduces the security properties of Dilithium and Falcon down to the
> collision resistance of the hash function used to pre-hash.
>
> We would like community opinion on this.
>
>
> Here's the Security Consideration text that we're working on:
>
>
>
>
> In the hash-then-sign paradigm, the message to be signed is hashed
> externally to the signature primitive, and then the hash value is signed.
>
> The hash-then-sign paradigm is required, for example, with RSA signatures in
> order to sign messages larger than the RSA modulus. Hash-then-sign also
> gives performance and bandwidth benefits, for example, when the signature
> is performed by a networked cryptographic appliance since you only need to
> send a small hash value rather than streaming the entire message.
>
> With Dilithium and Falcon signatures it is not recommended to pre-hash for
> the following reasons:
>
>
> The Dilithium construction includes
>
> ~~~
> Sign(sk,M):
> 10: mu \in {0, 1}^384 := CRH(tr || M)
> ~~~
>
> where `CRH` is any collision-resistant hash function and `tr` is a component
> of the secret key.

A hash of the public key, actually; see line 7 of the key generation process (which explicitly computes it from the components of the public key) - Dilithium stores it in the private key so the signer doesn't need to recompute it every time.

> This provides strong security against pre-computed
> collision attacks since an attacker has no a-priori knowledge of `r` and
> provides per-key hash-domain separation of the message to be signed.

Rather, it limits the usability of any found collision to a specific public key; however it does nothing to frustrate a collision attack against a specific public key.

Now, it does probably add a constant factor to any attack that searches for a simultaneous collision between the hash that RSA/ECDSA uses (without the prepend) and the hash that Dilithium uses (with the known prepend) - I would hesitate to give a value to that constant factor, but it is likely not large.

Mike Ounsworth

unread,
Aug 17, 2022, 2:42:47 PM8/17/22
to LAMPS, cf...@irtf.org, pqc-forum, jak...@amazon.com, kpa...@amazon.com, se...@ssn3rd.com, b...@westerbaan.name
I want to break out and expand our third point as it is actually a question to NIST and not to the IETF authors.


> Third, I can imagine that some applications (like TLS) will want to use non-pre-hashed versions of Dilithium and Falcon, but other applications (like code-signing) would prefer pre-hashed versions. These are not interoperable with each other. Is NIST planning to produce algorithm definitions, OIDs, Codepoints, etc, for both versions?

Expanding on the code-signing example: the messages to be signed can be very large; consider a several GB firmware image. Assuming our understanding below is correct, a direct-sign algorithm would require the entire thing to be streamed to a network HSM for signing and to a TPM for verification. Conversely code-signing environments often include counter-signatures from Time Stamping Authorities which protect against future discovery of collision attacks against the hash function -- as an example, Windows still accepts RSA-SHA1 signatures produced before SHA1 was deprecated. I can imagine that the code-signing community will decide that the performance gains of hash-then-sign outweigh the security loss.

So, will NIST standardize both direct-sign and some variant of hash-then-sign for PQC signature primitives?

---
Mike Ounsworth

-----Original Message-----
From: 'Mike Ounsworth' via pqc-forum <pqc-...@list.nist.gov>
Sent: August 17, 2022 12:27 PM
To: 'LAMPS' <sp...@ietf.org>; cf...@irtf.org; pqc-forum <pqc-...@list.nist.gov>; jak...@amazon.com; kpa...@amazon.com; se...@ssn3rd.com; b...@westerbaan.name
Subject: [EXTERNAL] [pqc-forum] Whether to hash-then-sign with Dilithium and Falcon?

WARNING: This email originated outside of Entrust.
DO NOT CLICK links or attachments unless you trust the sender and know the content is safe.

______________________________________________________________________
--
You received this message because you are subscribed to the Google Groups "pqc-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.
To view this discussion on the web visit https://urldefense.com/v3/__https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/CH0PR11MB5739393F19DD5282E3D7EF549F6A9*40CH0PR11MB5739.namprd11.prod.outlook.com__;JQ!!FJ-Y8qCqXTj2!cQPnK1SOIs1r8xM1OYWVawbIa-o1FJJaQBpP6v4DaGtIq7GF7FJZwl4KqY3locbLDLUiMJJ0TDa7D8fgscKx8lKgHPb5$ .

Massimo, Jake

unread,
Aug 17, 2022, 3:40:30 PM8/17/22
to Scott Fluhrer (sfluhrer), Mike Ounsworth, LAMPS, cf...@irtf.org, pqc-forum, Kampanakis, Panos, se...@ssn3rd.com, b...@westerbaan.name
Thanks Mike, Scott.

I've added to the github repo so we can track discussions on this topic https://github.com/jakemas/draft-massimo-pq-pkix-00/issues/23

>> So it seems like the Dilithium designers explicitly want the hash to differ
>> across repeated attempts.
>>

> Hmmm, I don't see that in Dilithium; are they referring to the internal ExpandMask function? That isn't applied to the input message.
>In any case, it's easy to derive SHAKE( M || 1 ), SHAKE( M || 2 ), ... without multiple passes through M; you compute the partial SHAKE state after process M, and then apply that partial state to 1, 2, ...

I think we are referring to different parts of the signing process here. For reference, my security consideration was referring to page 4 of the Dilithium spec that states:
"Our full scheme in Fig. 4 also makes use of basic optimizations such as pre-hashing the message M so as to not rehash it with every signing attempt." and Figure 4 itself.

It was my understanding that the signing procedure may need to be repeated several times to produce a signature, and thus pre-hashing would prevent the need to individually hash the input message with each attempt. I believe the desired differing of the hash you mentioned is within the internals of the signing procedure and not on the input message itself.

>> Third, I can imagine that some applications (like TLS) will want to use non-pre-hashed versions of Dilithium and Falcon, but other applications (like code-signing) would prefer pre-hashed versions. These are not interoperable with each other. Is NIST planning to produce algorithm definitions, OIDs, Codepoints, etc, for both versions?

>Expanding on the code-signing example: the messages to be signed can be very large; consider a several GB firmware image. Assuming our understanding below is correct, a direct-sign algorithm would require the entire thing to be streamed to a network HSM for signing and to a TPM for verification. Conversely code-signing environments often include counter-signatures from Time Stamping Authorities which protect against future discovery of collision attacks against the hash function -- as an example, Windows still accepts RSA-SHA1 signatures produced before SHA1 was deprecated. I can imagine that the code-signing community will decide that the performance gains of hash-then-sign outweigh the security loss.

>So, will NIST standardize both direct-sign and some variant of hash-then-sign for PQC signature primitives?

I do agree that there may be optimizations that users may wish to make dependent on the context, i.e., hash-then-sign vs direct-sign. It's for this reason I tried to give an overview of the security of each option in the draft, but ultimately leave that up to the user. It is a good point regarding NISTs perspective on what should be explicitly standardized here.

>> This provides strong security against pre-computed
>> collision attacks since an attacker has no a-priori knowledge of `r` and
>> provides per-key hash-domain separation of the message to be signed.

>Rather, it limits the usability of any found collision to a specific public key; however it does nothing to frustrate a collision attack against a specific public key.

Right, more details on the advantages of message binding on the PQC-forum from C. Peikert's https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/eAaiJO1qzkA/m/K66R_ftNBwAJ. It was this discussion I was trying to encompass in the draft.

Cheers,
Jake

------


On 17/08/2022, 10:51, "'Scott Fluhrer (sfluhrer)' via pqc-forum" <pqc-...@list.nist.gov> wrote:

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
--
You received this message because you are subscribed to the Google Groups "pqc-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.
To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/CH0PR11MB5444B9D3A0CB6E447A2FA3E5C16A9%40CH0PR11MB5444.namprd11.prod.outlook.com.

Phillip Hallam-Baker

unread,
Aug 18, 2022, 11:29:30 PM8/18/22
to John Gray, Tadahiko Ito, Massimo, Jake, Scott Fluhrer (sfluhrer), Mike Ounsworth, LAMPS, cf...@irtf.org, pqc-forum, Kampanakis, Panos, se...@ssn3rd.com, b...@westerbaan.name


On Thu, Aug 18, 2022 at 11:29 AM John Gray <John.Gray=40entr...@dmarc.ietf.org> wrote:

Thanks Tadahiko,

 

I read through your paper, and it covers exactly the usability issues we have come across!   We were wondering if it is possible to perform the specific hashing external to the server (which could be an HSM as in your paper, or timestamp server, etc).   For example, for Dilithium the mu := CRH( tr || M) and for Falcon it would be  c <- HashToPoint(r || m, q, n).    Your paper answers that question, it can be done for Falcon, but  not Dilithium (without changing the signature output).   So part of our question is whether using a regular external Hash as we do today for RSA and ECDSA (and what you call a boundary type B) somehow reduces the security and we shouldn’t recommend it.   We are interested in this because we are looking at defining composite pairs or triples which combine existing signature algorithms like RSAWithSHA256 and ECDSAWithSHA256 with Falcon or Dilithium.    Having to change the operational paradigm for an HSM or something like a timestamping server would result in large amounts of data having to be piped across the internet for signatures (as you point out in your paper).   

 

For our composite signature use case it brings up similar questions.   We can support a mode where external hashing is done once, and then individually signed by the components (this makes it much more efficient) both internally and externally for the HSM, timestamping, code signing use-cases.   However, in the case of Dilithium there would need to be two signature modes Sig = Dilithium (Message) and the other would be Sig = Dilithium ( HASH (Message)).   I don’t think that is necessarily a bad thing as long as it is standardized and secure.   Alternatively, we could support independent hashing for each component, but that gets strange if you are doing an external hash for ECDSA, but then need to send the whole data for Dilithium.   We would likely have to end up supporting sending the whole data if external hashing compromises security of the PQC composites, but then it is even more inefficient as each component would need to hash independently.    You also covers this in section 4.3 your paper:

 

“We can construct type B cryptographic boundaries by adding one more hash function before the execution of PQC’s signature generation algorithm… This approach would improve the efficiency of lattice based digital signature schemes deployed in HSM. It would have a greater impact on Dilithium, but also be applicable to Falcon and other digital signature schemes. Two modes of PQC algorithms utilizing this approach will be able to exist, namely, a PQC algorithm without an additional hash (i.e. original PQC algorithm) and a PQC algorithm with an additional hash. If there are two modes of a digital signature scheme, then the asymmetric operation for those two modes must not be identical. The reason is that, obtaining a signature from the mode with an additional hash function would help attackers who can attack another mode which is without the additional hash function.”


That is not a new issue and it is one that we discussed at great length for RSA. The hash of the digest is part of the signature payload to prevent substitution attacks.

I am finding the description of the problem here to be making assumptions about what 'formal proofs' are demonstrating that are probably demonstrating something else. If you assume there is only one true digest function, you 'solve' the digest substitution problem by pretending it does not exist...


In my view, the signature function should be over the payload and the additional authenticated data. Whether this is the OID of the digest algorithm alone or something more depends on the signature scheme. That is what JOSE etc. have to sort out. 

Preventing a digest downgrade attack may be considered the concern of the cryptographic envelope format because there may be more than one form of downgrade attack to be considered. But it is also something we have traditionally included in the signature...


Mike Ounsworth

unread,
Aug 19, 2022, 1:56:13 PM8/19/22
to Phillip Hallam-Baker, John Gray, Tadahiko Ito, Massimo, Jake, Scott Fluhrer (sfluhrer), LAMPS, cf...@irtf.org, pqc-forum, Kampanakis, Panos, se...@ssn3rd.com, b...@westerbaan.name

Phillip,

 

> That is not a new issue and it is one that we discussed at great length for RSA. The hash of the digest is part of the signature payload to prevent substitution attacks.

 

 

I think you are in fact mistaking the problem that we’re trying to solve / trying not to un-solve. Nobody is suggesting hash downgrade or substitution attacks. As you point out, that is already solved by existing protocols.

 

Taking Falcon as an example, the problem as I understand it is that the first internal step of Falcon is:

 

Sign(m):

  r = rand(320);

  c = SHAKE(r || m);

  …

  output (r, s)

 

Verify(r, s, m):

  c = SHAKE(r || m);

 

 

this has the nice property that `r` is generated at signing time by the signer. So should SHAKE develop a collision attack similar to the one we saw for SHA1, then the collision pre-computation as still intractable for the attacker because the attacker cannot not know `r` in advance. (Disclaimer: I am just an engineer trying to wrap my head around this, corrections welcome!).

 

So the concern with hash-then-sign is that if `m` above is itself a message digest, ie `m = SHA256(M)`, then you have un-done the collision resistance that is built in to the Falcon primitive, essentially reducing the security properties of Falcon. Adding the message digest to the signed content does not help if the attacker is holding two messages with the same SHA256 hash.

 

My simplified-to-the-point-of-being-wrong understanding is that this would be no worse than what we do with RSA and therefore might be acceptable in some use cases, but it is also ignoring cryptographic improvements beyond RSA.

 

Disclaimer: this whole email should be viewed as a question that we are seeking confirmation of.

 

---

Mike Ounsworth

Any email and files/attachments transmitted with it are confidential and are intended solely for the use of the individual or entity to whom they are addressed. If this message has been sent to you in error, you must not copy, distribute or disclose of the information it contains. Please notify Entrust immediately and delete the message from your system.

Phillip Hallam-Baker

unread,
Aug 19, 2022, 3:44:38 PM8/19/22
to Mike Ounsworth, LAMPS, cf...@irtf.org, pqc-forum
On Fri, Aug 19, 2022 at 1:55 PM Mike Ounsworth <Mike.Ou...@entrust.com> wrote:

Phillip,

 

> That is not a new issue and it is one that we discussed at great length for RSA. The hash of the digest is part of the signature payload to prevent substitution attacks.

 

 

I think you are in fact mistaking the problem that we’re trying to solve / trying not to un-solve. Nobody is suggesting hash downgrade or substitution attacks. As you point out, that is already solved by existing protocols.

 

Taking Falcon as an example, the problem as I understand it is that the first internal step of Falcon is:

 

Sign(m):

  r = rand(320);

  c = SHAKE(r || m);

  …

  output (r, s)

 

Verify(r, s, m):

  c = SHAKE(r || m);



Oh right. So far I am up the Kyber pass and have not done Dilithium yet.

this has the nice property that `r` is generated at signing time by the signer. So should SHAKE develop a collision attack similar to the one we saw for SHA1, then the collision pre-computation as still intractable for the attacker because the attacker cannot not know `r` in advance. (Disclaimer: I am just an engineer trying to wrap my head around this, corrections welcome!).


Meh, seems like an unnecessary concern. And I am not sure how this actually helps against collision attacks in general.

If SHAKE is broken, then we are going to have so many problems that the security of the signature algorithm is irrelevant.

As a matter of protocol composition, the signature algorithm should not be trying to fix a potential issue in the digest algorithm because things are going to be breaking all over.

 

So the concern with hash-then-sign is that if `m` above is itself a message digest, ie `m = SHA256(M)`, then you have un-done the collision resistance that is built in to the Falcon primitive, essentially reducing the security properties of Falcon. Adding the message digest to the signed content does not help if the attacker is holding two messages with the same SHA256 hash.


If this is a useful approach, then move it out of the signature primitive and push it into the envelope format.

D. J. Bernstein

unread,
Aug 20, 2022, 2:29:40 PM8/20/22
to pqc-...@list.nist.gov, sp...@ietf.org, cf...@irtf.org
Phillip Hallam-Baker writes:
> If this is a useful approach, then move it out of the signature
> primitive and push it into the envelope format.

Sure, people converting a signature system m |-> Sign(m) into a one-pass
signature system m |-> Sign(H(m)) can maybe be convinced to instead
convert it into a safer one-pass signature system m |-> r,Sign(H(r,m))
where r is chosen randomly at signing time. Both conversions are
generic, and the performance differences are minor.

But moving this _out_ of the underlying signature system is dangerous.
Applications will often expose the underlying signature system directly
to attackers. For example, an RSA HSM that returns h^d given h, trusting
the environment to choose h as a hash, is breakable by essentially the
attack of https://link.springer.com/article/10.1007/s00145-015-9205-5.

> If SHAKE is broken, then we are going to have so many problems that the
> security of the signature algorithm is irrelevant.

Certainly we'd like to end up in a world where everyone is using a hash
function that we can confidently assume is collision-resistant. However,
avoiding this assumption turns out to simplify security review:

* Avoiding collision resistance means that cryptanalysts can avoid
worrying about some important, hard-to-analyze attack avenues
(e.g., what Wang dubbed "message modification").

* Some components of security analyses can be computer-verified
today, but (because of well-known formalization difficulties)
assuming collision resistance creates problems for this.

* In the opposite direction, the random r above creates its own
complications in security review, but looking at the details shows
that these complications are _relatively_ easy to analyze.

Simplifying security review is important because the overall situation
in post-quantum cryptography is that security reviewers are terribly
overloaded. Compare, e.g., https://eprint.iacr.org/2021/543 ("A decade
unscathed") to https://eprint.iacr.org/2022/975. For many more examples
see https://ntruprime.cr.yp.to/latticerisks-20211031.pdf.

If a cryptosystem takes 10 unnecessary risks, each independently
incurring a 10% chance of disaster, then the cryptosystem has, overall,
a 65% chance (1-0.9^10) of disaster from these risks. Systematically
avoiding risks is much safer.

---D. J. Bernstein
Reply all
Reply to author
Forward
0 new messages