Scalable ciphertext compression techniques for FrodoKEM, Kyber, Saber, SIKE and other schemes

880 views
Skip to first unread message

Thomas Prest

unread,
Sep 16, 2020, 5:22:51 AM9/16/20
to pqc-...@list.nist.gov
Dear all,

We would like to draw your attention to our recent work [1]:

    "Scalable Ciphertext Compression Techniques for Post-Quantum KEMs
and their Applications"

We revisit the notion of multi-recipient KEM (or mKEM). At a high level,
a mKEM encapsulates the same session key K to a group of n recipients,
by creating a ciphertext that can be sent to the n recipients and
decrypted by any of them. If the ciphertext size is much smaller than n
times the size of individual ciphertexts, a mKEM can be a more efficient
alternative to a parallel use of n KEMs (one per recipient).

In our work, we propose a generic construction of an IND-CCA secure mKEM
from any IND-CPA secure PKE satisfying a mild multi-recipient and
decomposability flavour (which we call mPKE). Our IND-CCA secure mKEM is
proven secure in the (quantum) random oracle model, and the underlying
mPKE can be instantiated from versatile assumptions, including
post-quantum ones such as variants of LWE, LWR and (C)SIDH-related
assumptions. Notably, we can base our generic construction on the
following Round 3 KEM/PKE schemes:

- FrodoKEM;

- Kyber;

- Saber;

- SIKE;

as well as Round 2 candidates LAC, NewHope, Round5, ThreeBears and (not
a NIST candidate) CSIDH. Our mKEM variants require making some changes
to the original KEMs, such as modifying the CCA transform and making the
public matrix A fixed across all recipients (for lattice-based schemes).
For large numbers of recipients, we found that the mKEM variants were
more efficient in bandwidth and computation than their standard KEM
counterparts by one order of magnitude, sometimes two. As an example, if
Alice wants to encapsulate the same session key to n = 64 recipients,
she needs to broadcast:

- 1,007,616 bytes with FrodoKEM-976, and 23,808 bytes with its mKEM variant;

- 69,632 bytes with Kyber-768 or Saber, and 9,152 bytes with either's
mKEM variant;

- 25,728 bytes with SIKE-p503, and 1,914 bytes with its mKEM variant.

A natural application for mKEMs is secure group messaging. As an
illustration, we describe how mKEMs can be combined with the TreeKEM
sub-protocol of the IETF draft MLS [2] and provide significant savings.
Due to the conceptual simplicity of mKEMs, we find it plausible that
many other applications exist.

Best regards,
Shu, Kris, Federico and Thomas


[1] Shuichi Katsumata, Kris Kwiatkowski, Federico Pintore, Thomas Prest:
Scalable Ciphertext Compression Techniques for Post-Quantum KEMs and
their Applications. ASIACRYPT (2020)
https://eprint.iacr.org/2020/1107


[2] Various authors: Messaging Layer Security (mls). IETF
https://datatracker.ietf.org/wg/mls/about/

Christopher J Peikert

unread,
Sep 16, 2020, 10:42:13 AM9/16/20
to Thomas Prest, pqc-forum
Dear Thomas, thanks for sharing this interesting work.

From a brief skim through the paper, I did not see any mention of code-based or NTRU-style lattice schemes. Do you think these could also be adapted straightforwardly, or is there some structural difficulty that would need to be overcome?

Sincerely yours in cryptography, Chris

--
You received this message because you are subscribed to the Google Groups "pqc-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.
To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/1952cb42-739c-8587-e1ba-ea695401aa28%40pqshield.com.

Thomas Prest

unread,
Sep 17, 2020, 10:01:09 AM9/17/20
to Christopher J Peikert, pqc-forum

Dear Chris,

Thanks for this comment. We didn't study code-based schemes due to time constraints, however there are indeed a few code-based NIST candidates that seem amenable to our construction, such as BIKE-3, HQC, and Round 2 candidates ROLLO-3 and RQC. For BIKE-3 and HQC, it seems likely that security would rely on a variant of Ring-LPN, and for ROLLO-3 and RQC it would be variants of rank metric problems.

Regarding NTRU-style schemes, it would indeed be very interesting if there was a way to obtain efficient mKEMs from them. We didn't find a way to do it so it would likely require new insights.

We also want to thank Dan Bernstein for pointing out to us that our construction can be applied to NTRU LPRime. We will update the ePrint article shortly. In the meanwhile here are the numbers that one would get for NTRU LPRime in Table 1 of our article (similarly to NewHope, the CCA variant of NTRU LPRime includes a 32-byte hash for key confirmation, hence |ct| = |ct_0| + |^ct_i| + 32):

    ntrulpr653:    |ct_0| = 865,     |^ct_i| = 128,    |ct| = 1025,    k_comm = 8.0
    ntrulpr761:    |ct_0| = 1007,    |^ct_i| = 128,    |ct| = 1167,    k_comm = 9.1
    ntrulpr857:    |ct_0| = 1152,    |^ct_i| = 128,    |ct| = 1312,    k_comm = 10.3



Best regards,
Shu, Kris, Federico and Thomas

Christopher J Peikert

unread,
Sep 17, 2020, 11:44:25 AM9/17/20
to Thomas Prest, pqc-forum
Dear Thomas and coauthors,

thanks very much for this detailed reply! If I'm understanding correctly, your multi-recipient KEM (mKEM) template can also be applied to code-based schemes like BIKE-3 etc., and NTRU LPRime. However, the hardness assumptions needed for security are different from those of the original (single-recipient) KEMs.

Importantly, it appears that the requisite assumptions may be broken in some cases, so the resulting mKEM would be insecure.

For instance, all variants of NTRU Prime generate ciphertexts using Ring-LWR with ternary rounding errors (and secret), i.e., having coefficients in {-1,0,1}. We know from Arora-Ge [1] and follow-up work [2] that breaking this is feasible when given about n^2 samples---though many fewer likely suffice [2]---where n is the ring dimension. However, the mKEM requires hardness for more than N samples, where N is the number of recipients (see Theorem 5.1).

So, if the number of recipients exceeds the number of samples for which "ternary" R-LWR can be broken---which seems plausible for some use cases---then instantiating the mKEM with NTRU Prime will be insecure.

The same considerations apply for other lattice-based KEMs, except that they tend to use "larger" errors, which significantly increases the number of samples required to mount Arora-Ge-style attacks. For example, FrodoKEM-640 takes its errors from {-12, ..., 12}, so something on the order of 640^25 ~= 2^233 samples would be needed for Arora-Ge. Kyber has errors from {-2, ..., 2}, which provides some significant margin over ternary, but perhaps not enough for total comfort.

There are several applications of LWE/LWR beyond ordinary KEM that require hardness for "many" samples, so larger errors are needed for them as well. Given that variants of NIST-approved algorithms are likely to be adopted for such applications, I think it's very important to consider the robustness of the underlying LWE/LWR problems to variations like this.

Sincerely yours in cryptography, Chris

D. J. Bernstein

unread,
Sep 17, 2020, 12:55:43 PM9/17/20
to pqc-...@list.nist.gov
Thomas Prest writes:
> We also want to thank Dan Bernstein for pointing out to us that our
> construction can be applied to NTRU LPRime.

For the record: That's not what I said. I haven't reviewed the
construction from this paper, and I don't vouch for any of its claimed
properties regarding Frodo, Kyber, Saber, SIKE, or NTRU LPRime. I simply
asked the following question: "Is there some reason you didn't include
the NTRU LPRime options from NTRU Prime?"

I also have some clarification questions regarding the followup argument
that cryptosystem parameters must be chosen to also be safe in different
cryptosystems. Doesn't this prohibit Kyber's choice of distribution as
{-2,...,2}, given the requirement to remain secure with 2^64 data?
Doesn't this also prohibit Kyber's choice of ring shape as a cyclotomic,
since the same rings are broken in quantum polynomial time in Gentry's
STOC 2009 cryptosystem? Doesn't this also prohibit Frodo's choice of its
dimension as being <=1344, since such small dimensions are broken in
various FHE systems? Doesn't this also prohibit typical choices of curve
for isogeny-based cryptosystems, since the same curves are broken when
used in conventional ECC? If not, why not?

---Dan
signature.asc

Christopher J Peikert

unread,
Sep 17, 2020, 1:06:03 PM9/17/20
to pqc-forum
the followup argument
that cryptosystem parameters must be chosen to also be safe in different
cryptosystems.

As you put it, "That's not what I said."

What I did say: "Given that variants of NIST-approved algorithms are likely to be adopted for such applications, I think it's very important to consider the robustness of the underlying LWE/LWR problems to variations like this [namely, hardness for many samples]."

Martin R. Albrecht

unread,
Sep 17, 2020, 1:21:19 PM9/17/20
to pqc-...@list.nist.gov
Hi all,

Is this is an okay summary of the setup:

- In the mKEM for Kyber, we have {-2,-1,0,1,2} (d=5) and n=768
- We get n equations of degree d in the secret
- We get m*n equations of degree d when m recipients are selected
- For MLS we have m ≤ 50000
- Thus we have:

sage: load("https://bitbucket.org/malb/lwe-estimator/raw/master/estimator.py")
sage: m = 50000
sage: n = 768
sage: d = 5
sage: gb_cost(n, ((d, n + m*n),))
rop: 2^342.4
Dreg: 28
mem: 2^342.4

http://aleph.sagemath.org/?q=ryyhsw

[This doesn’t evaluate on aleph as that doesn’t have magma installed which is used for the power series evaluation when available]

At m = 10000000 I’m getting 2^160 with this but I might have the wrong assumptions above?

The source code for gb_cost is here

https://bitbucket.org/malb/lwe-estimator/src/master/estimator.py#lines-3193

Cheers,
Martin
--

_pgp: https://keybase.io/martinralbrecht
_www: https://malb.io

daniel.apon

unread,
Sep 17, 2020, 2:49:26 PM9/17/20
to pqc-forum, martinr...@googlemail.com
Hi all,

First -- Thomas -- this is very interesting work. Thank you for sharing!

A brief aside (given the discussion so far in this thread):
Last year, colleagues and I had early, more theoretical work on this style of topic at PQCrypto 2019; cf https://eprint.iacr.org/2019/398.pdf

A few brief comments, that someone out there may find helpful for framing the new discussion here:

1) We were not aiming at exactly the same protocol as Shu, Kris, Federico and Thomas -- whereas they are constructing multi-recipient KEMs, we were aiming for a Group-KEM that is "contributory" (where each party contributes entropy to the shared secret). We don't exactly formalize this in the work, or what security properties it might imply, or whether a "contributory" protocol is even inherently useful compared to an mKEM, but it's clear from inspection that our goal was a little different than this new work. It could be that dropping the inherent "contributory" requirement is needed to get a practically useful multi-party protocol..

2) We performed a mostly theoretical analysis, whereas this work aims to take actual NIST PQC candidates and instantiate them practically.

3) For simplicity, we focused on a case study of cyclotomic Ring-LWE (although, it seemed/still seems our results would translate to other ring structures, and also to modules, and also to LWR type schemes).

4) We briefly looked at achieving the same type of "contributory" Group-KEM from other assumptions e.g. isogenies, and were not able to make it work at the time..

5) A key difficulty in our RLWE-based protocol was managing Renyi divergence arguments in the context of a large number of RLWE samples N, where N is the number of parties involved in the protocol. Given our (now-old) techniques, this required us to increase the modulus size to a point where the protocol was no longer obviously practical..

So, the main thing I wanted to say -- Especially in view of this 5th comment, it seems that the preceding discussion in this thread is especially on point. =)

Cheers,
--Daniel

Thomas Prest

unread,
Sep 18, 2020, 1:52:26 PM9/18/20
to Christopher J Peikert, pqc-forum

Dear Chris and others,

Thank you for all these interesting comments. The potential concrete security loss is not something that we discussed in our paper, but it is indeed something very important. We hope our work can be a starting point for research on post-quantum KEMs, and see the points Chris mentioned (concrete security, code-based schemes) as important research directions. In particular, we would be interested in seeing the evolution of the security of NIST candidates when used as mKEMs, as well as potential bounds on the number of recipients.


Regarding NTRU Prime, an interesting phenomenon happens. With the notations of [1], Section 2.4, a (single-recipient) ciphertext is of the form (ct_0, ^ct_i), where:

- ct_0 = Round(b * G) performs a "ternary rounding" of b * G

- ^ct_i is a rounding of b * A, but this rounding is much coarser: only 4 top bits per coefficient are preserved (also, only 256 of the p={653,761,857} coefficients of b * A are considered, but this is not too unimportant for our discussion). Since the modulus q verifies q > 2^12, the (deterministic) added error is rather large.

A mKEM ciphertext with N recipients will be of the form (ct_0, ^ct_1, ..., ^ct_N), and therefore will give p "LWE" samples with ternary errors (given by ct_0), and 256 * N samples with larger errors (given by the ^ct_i). Because most of the equations are "super noisy" and not "ternary noisy", we don't see an obvious way of applying Arora-Ge, but we haven't seen this unbalanced problem considered before, so there might be room for improvement.

Fun fact: dropping bits in the right part of the ciphertext is also done in Kyber, Saber, etc. While it is good for efficiency, it also seems to be helpful for security in the mKEM case, at least with respect to the Arora-Ge attack.


Regarding TreeKEM, in our paper we recommend to use mKEM with a number of recipients that is *independent* of the number of users and only depends on the (m)KEM used. Indeed, TreeKEM arranges the N users as the leaves of a m-ary tree, and send one mKEM ciphertext per level of the tree. We suggest to take (m - 1) ~= (|pk| + |ct_0|) / |^ct_i|. If we take Martin's example with Kyber-768 and N = 50000 users, it would entail m ~= 18 and therefore a tree of depth ceil(log(N, m)) = 4. One important point is that the ciphertext corresponding to each level encapsulates a different session key, so for N = 50000 there would be 4 (multi-recipient) ciphertexts, each encapsulating a different session key and each having at most m - 1 = 17 recipients, not 50000. So TreeKEM is actually an example where the number of recipients per ciphertext would be really small.


Best regards,
Shu, Kris, Federico and Thomas

PS: many thanks to Eamonn Postlethwaite, Martin Albrecht and Chris Peikert for useful discussions


[1] Daniel J. Bernstein, Chitchanok Chuengsatiansup, Tanja Lange, Christine van Vredendaal
NTRU Prime: round 2
https://ntruprime.cr.yp.to/nist/ntruprime-20190330.pdf

--

Dr. Thomas PREST
Senior Cryptography Researcher

PQShield Ltd
.
E:
            thomas...@pqshield.com
W:
          www.pqshield.com

This email and any attachment contain information that is private and confidential and is intended for the addressee only. If you are not an addressee, you are not authorised to read, copy or use the email or any attachment. If you have received this email in error, please notify the sender by return email and then destroy it. Any views expressed in this e-mail are not that of PQShield Ltd or its employees unless specifically stated. As e-mail communications are known to be unsecure, we accept no liability for any legal issues that may arise as a result of this e-mail. It is advised that you carry out a virus check with your own software before opening any attachments sent to you.

Christopher J Peikert

unread,
Sep 18, 2020, 3:22:52 PM9/18/20
to Thomas Prest, pqc-forum
Dear Thomas and all, thanks for pointing out the mixed ternary/large rounding that is used on the ciphertext components in NTRU LPRime. That does indeed complicate the picture for applying Arora-Ge-style attacks, so we can't say anything conclusive about it.

(As far as I can tell, Streamlined NTRU Prime exclusively uses ternary rounding, but it's not clear whether your mKEM template applies to it.)

Sincerely yours in cryptography, Chris
Reply all
Reply to author
Forward
0 new messages