Low-Order Factors for NTRUPrime


Jacob Alperin-Sheriff

Nov 19, 2019, 9:56:52 AM
to pqc-forum

Hi all,

NTRU Prime has been extensively advertised as avoiding the potential pitfalls of automorphisms and other extra structure that may plague the other “lattice realm” submissions, and as using a modulus q over which the polynomial x^p-x-1 is irreducible.

However, you can modulus-switch to a modulus over which the polynomial x^p-x-1 does factor. Moreover, unlike in the case of power-of-2 cyclotomics (where properties of cyclotomic polynomials and roots of unity make this provably impossible), it appears possible for x^p-x-1 to have a linear factor (x-a) where a has very low order modulo q (e.g. a=4). This would be almost as good as having a factor (x-1): you can take the public key modulo (x-a), work in the basis [1,a], and guarantee the error will be very short with respect to this basis.

Note that modulus switching can be significantly less costly in this setting than in its traditional use in homomorphic encryption, where the modulus being switched to is significantly smaller; e.g., the additional error from rounding in each coefficient can [when switching to certain moduli, anyway] be upper bounded at significantly less than 1/2, unlike in the homomorphic encryption case.

At any rate, if the proper q and a can be found (assuming they do exist), it would quite possibly yield a distinguishing attack, and if many such q and a can be found, it might even potentially yield a key recovery attack.

Now it’s possible that for the parameters chosen (or even for any parameters) this never actually happens in practice. I would have loved to investigate this in more detail, but unfortunately last Friday was my last day at NIST, so I won’t be able to (though I will still follow the forum in my new position, I will be otherwise occupied and unable to devote time to research).

Anyway, someone ought to either investigate this or explain why it definitely couldn’t lead to a weakness. Thanks.

—Jacob Alperin-Sheriff

daniel.apon

Nov 21, 2019, 12:45:51 PM
to pqc-forum
Hi all,

Jacob and I have discussed this direction previously (prior to his leaving NIST), and I've also discussed this general line with others.

If you are already/currently researching this direction and would like to collaborate, please feel free to contact me one-on-one by email. (Alternatively, if you would prefer to investigate independently, please do continue by all means!)

As yet, I'm not convinced this particular attack vector fundamentally fails, and I consider it a high priority (among all NIST-PQC-related lattice-crypto research) for near-term cryptanalytic investigation.
Any new insights in this line (in either direction) will be quite valuable to us.

Best,
--Daniel Apon, NIST PQC

D. J. Bernstein

Nov 22, 2019, 3:46:15 AM
to pqc-...@list.nist.gov
This attack appeared in https://eprint.iacr.org/2014/784. It's covered
in the NTRU Prime paper (including the modulus switch) and in earlier
NTRU Prime postings. It's much less effective against NTRU Prime than
the lattice attacks used to select NTRU Prime parameters. A closer look
shows that NTRU Prime is easier to review for security against this
attack strategy than cyclotomics are.

Take, e.g., a secret s in the ring (\Z/4591)[x]/(x^761-x-1), with weight
250 and all coefficients in {-1,0,1}. Imagine that an attacker is given
two random-looking elements A_1 and A_2 of the ring and noisy multiples
B_1 = A_1 s + e_1 and B_2 = A_2 s + e_2, where e_1 and e_2 are secrets
with coefficients in {-1,0,1}. (The cryptosystem actually has a much
wider range of e_2, but let's be generous to the attacker here.)

Let's switch modulus to, say, 5801. The attacker computes

A'_1 = round(5801 A_1/4591), B'_1 = round(5801 B_1/4591),
A'_2 = round(5801 A_2/4591), B'_2 = round(5801 B_2/4591)

and considers the differences

e'_1 = B'_1 - A'_1 s,
e'_2 = B'_2 - A'_2 s,

in the ring (\Z/5801)[x]/(x^761-x-1). Compared to the original small
differences e_1 = B_1 - A_1 s and e_2 = B_2 - A_2 s, we've

* multiplied by 5801/4591,
* added B'_i-B_i, and
* added (A'_i-A_i)s.

Let's look more closely at the third contribution:

* Each coefficient of A'_i-A_i is a random-looking number in the
real interval [-1/2,1/2] known to the attacker---typically around
1/4 in absolute value, sometimes smaller, sometimes larger.

* These coefficients are then multiplied by the 250 nonzero
coefficients of the secret s, random +-1 coefficients unknown to
the attacker.

* The resulting 190250 products in [-1/2,1/2] are distributed across
the 761 output coefficients, adding on average 250 of these
products to each output coefficient. (Plus extra weight about half
the time because the modulus is x^761-x-1 rather than x^761-1, but
let's not bother counting this.)

The sum of 250 random elements of [-1/2,1/2] has standard deviation
around 4.56, several times larger than the original noise. The Sage
script below prints out a random example of an actual e'_1 such as

... - 4*x^9 - 2*x^8 + 6*x^7 + 7*x^6 - 8*x^4 - 12*x^3 - x^2 - x

immediately illustrating how much the noise has increased compared to
the original {-1,0,1} noise.
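The 4.56 figure above is just sqrt(250/12). A quick pure-Python check (my own sketch, not part of the thread's Sage script) confirms it analytically and by simulation:

```python
import math
import random

# A uniform sample from [-1/2,1/2] has variance 1/12, so the sum of
# 250 independent samples has standard deviation sqrt(250/12).
analytic = math.sqrt(250 / 12)
print(analytic)  # ≈ 4.564

# Monte Carlo check of the same quantity.
random.seed(0)
trials = [sum(random.uniform(-0.5, 0.5) for _ in range(250))
          for _ in range(20000)]
mean = sum(trials) / len(trials)
sd = math.sqrt(sum((t - mean) ** 2 for t in trials) / len(trials))
print(sd)  # close to 4.56
```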

This extra noise is what the NTRU Prime paper is referring to in the
attack description when it says "an attacker switching from q to another
modulus will noticeably increase noise". This also illustrates the
advantage of selecting an inert q to start with: attackers trying to use
factorizations are immediately faced with the extra noise from a modulus
switch.

Of course it's still conceivable that some severe weakness in *LWE is
able to handle such large errors for polynomial x^761-x-1 and modulus
5801. So let's move on to looking at the algebraic part of the attack.

The polynomial x^761-x-1 has, modulo 5801, a root 3738 of order "only"
25. The fact that 3738 is a root means that the equations

e'_1 = B'_1 - A'_1 s,
e'_2 = B'_2 - A'_2 s

in the ring (\Z/5801)[x]/(x^761-x-1) imply equations

e'_1(3738) = B'_1(3738) - A'_1(3738) s(3738),
e'_2(3738) = B'_2(3738) - A'_2(3738) s(3738)

in \Z/5801. The fact that 3738 has order 25 means that the noise
e'_1(3738) is a weighted sum of the 25 terms

t_0 = e'_1[0] + e'_1[25] + e'_1[50] + ...,
t_1 = e'_1[1] + e'_1[26] + e'_1[51] + ...,
...
t_24 = e'_1[24] + e'_1[49] + e'_1[74] + ...

with weights 1, 3738, 3738^2=3836, ..., 3738^24=5188 respectively. The
computation concluding that e'_1[i] has standard deviation around 4.56
(counting only the third contribution and ignoring the extra -x) also
says that t_j has standard deviation above 20. To confirm this, the Sage
script also prints out the polynomial t_0 + t_1 x + t_2 x^2 + ....
Example output:

32*x^24 + 35*x^23 + 45*x^22 + 26*x^21 + 49*x^20 - 9*x^19 + 17*x^18 -
59*x^17 - 2*x^16 - 26*x^14 - 49*x^13 - 35*x^12 + 28*x^11 + 39*x^10 +
46*x^9 - 14*x^8 - 25*x^7 + 13*x^6 - 22*x^5 - 47*x^4 - 45*x^3 + 12*x^2
+ 25*x + 6
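The folding into t_0, ..., t_24 can be checked outside Sage. Here is a pure-Python sketch (mine, not from the thread) verifying that evaluating a length-761 coefficient vector at 3738 mod 5801 agrees with evaluating the folded degree-24 polynomial, precisely because 3738 has order 25:

```python
import random

q2, a, k, p = 5801, 3738, 25, 761

# As stated in the text: 3738 is a root of x^761 - x - 1 mod 5801,
# of multiplicative order exactly 25.
assert (pow(a, p, q2) - a - 1) % q2 == 0
assert pow(a, k, q2) == 1
assert all(pow(a, j, q2) != 1 for j in range(1, k))

# Fold an arbitrary coefficient vector e': t_j = e'[j] + e'[j+25] + ...
e = [random.randrange(-12, 13) for _ in range(p)]
t = [sum(e[i] for i in range(j, p, k)) for j in range(k)]

# Since a^25 = 1, a^i depends only on i mod 25, so both evaluations agree.
direct = sum(c * pow(a, i, q2) for i, c in enumerate(e)) % q2
folded = sum(tj * pow(a, j, q2) for j, tj in enumerate(t)) % q2
assert direct == folded
print(direct, folded)
```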

Evidently t_0 is still far from uniform, but adding 3738 t_1 gives a
much better spread of possibilities, and then adding 3836 t_2 gets
closer to uniform, and so on. As a simplified model, let's imagine that
each t_i is chosen independently and uniformly at random from
{-32,-31,...,0,1,...,32}. A short calculation of the exact distribution
of the sum t_0 + 3738 t_1 + ... + 5188 t_{24} then shows that each
possibility for the sum has probability very close to 1/q, the maximum
difference being slightly below 2^(-135). One would expect to use about
2^270 samples to observe this with good probability.
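That short calculation can be reproduced with exact integer counts. A plain-Python sketch (mine; it convolves the 25 weighted uniform distributions one at a time and takes a minute or so to run) checks the reported 2^(-135) bound:

```python
from fractions import Fraction

q2, k = 5801, 25
weights = [pow(3738, j, q2) for j in range(k)]  # 1, 3738, 3836, ..., 5188

# Exact distribution of w_0*t_0 + ... + w_24*t_24 mod q2, with each t_j
# independent and uniform on {-32,...,32}; counts stay exact integers
# out of a total of 65^25 equally likely choices.
counts = [0] * q2
counts[0] = 1
for w in weights:
    shifts = [(w * t) % q2 for t in range(-32, 33)]
    new = [0] * q2
    for r, c in enumerate(counts):
        if c:
            for sft in shifts:
                new[(r + sft) % q2] += c
    counts = new

total = 65 ** k
# Maximum deviation of any outcome's probability from the uniform 1/q2;
# the text reports this as slightly below 2^-135.
dev = max(abs(Fraction(c, total) - Fraction(1, q2)) for c in counts)
print(float(dev))
print(dev < Fraction(1, 2 ** 135))
```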

For a better estimate, one can run more experiments to see the actual
distribution of each 3738^i t_i, and then convolve those distributions,
but the point is that there's obviously far too much randomness in each
t_i to be able to see anything from the weighted sum until the number of
samples is gigantic. The Arora--Ge attack uses far fewer samples and
much less time. For comparison, the cryptosystem gives the attacker a
grand total of 2 samples.

There were two major reasons for the sample blowup here: first, the
modulus switch; second, the large order---25 is large in this context
because, basically, it ends up in the exponent of the attack cost. This
is why the papers on this topic

* emphasize much smaller orders (the 2014 paper focuses on order 1
and only briefly considers larger orders), and

* select q's where the polynomial already has a factorization, rather
than paying for a modulus switch.

These are the best cases for the attack. Given the first point, let's
look at which primes q exist for order 1, 2, 3, 4:

* Can there be a prime q for which x^p-x-1 has 1 as a root modulo q?
No, there can't: no prime q divides 1^p-1-1 = -1.

* How about order 2, i.e., -1 as a root modulo q? Nope: no prime q
divides (-1)^p-(-1)-1 = -1.

* How about order 3, i.e., b^p-b-1 = 0 mod q with b^2+b+1 = 0 mod q?
Plug b^3 = 1 mod q into b^p-b-1 = 0 mod q to see that b^2-b-1 = 0
if p mod 3 is 2, forcing b = -1, contradiction; or b-b-1 = 0 if p
mod 3 is 1, again a contradiction.

* How about order 4, i.e., b^p-b-1 = 0 mod q with b^2+1 = 0 mod q?
The same calculation allows q=5 if p is 3 mod 4, but q=5 is far too
small to be useful compared to the noise. There's no q if p is 1
mod 4.
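For p = 761 these four cases can also be confirmed by brute force. A small pure-Python search (a sketch; the cutoff 2000 is an arbitrary choice of mine) over primes q and elements b with b^4 = 1 mod q:

```python
p = 761

def is_prime(n):
    # trial division; fine for this small range
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

hits = []
for q in range(2, 2000):
    if not is_prime(q):
        continue
    for b in range(1, q):
        if pow(b, 4, q) != 1:     # order of b must divide 4
            continue
        if (pow(b, p, q) - b - 1) % q == 0:
            hits.append((q, b))
print(hits)  # [] : no root of order dividing 4 for p = 761
```

The search comes back empty, as the case analysis above predicts: since 761 is 1 mod 4, any b with b^4 = 1 satisfies b^761 = b, so b^p - b - 1 = -1 is never 0 mod q.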

More generally, for any order, one can simply factor the resultant of
the polynomial with the appropriate cyclotomic polynomial to see all of
the possible primes q: e.g., for x^761-x-1, the only possibilities are
q=199 for order 11, q=61 for order 15, q=257 for order 16, q=307 for
order 17, q=19 for order 18, q=43 for order 21, q=139 for order 23,
q=461 for order 23, q=5801 for order 25 (my example above), etc.
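This resultant computation is easy to reproduce. A sketch using sympy (my tooling choice, not the thread's; it first reduces x^761 modulo the k-th cyclotomic polynomial via x^k = 1, keeping the resultant small):

```python
from sympy import factorint, resultant, symbols
from sympy.polys.specialpolys import cyclotomic_poly

x = symbols('x')
p = 761

def primes_for_order(k):
    """Primes q for which x^p - x - 1 and the k-th cyclotomic polynomial
    share a root mod q, i.e. candidates for a root of order k."""
    # Modulo Phi_k(x) we have x^k = 1, so x^p - x - 1 reduces to:
    g = x ** (p % k) - x - 1
    r = int(resultant(cyclotomic_poly(k, x), g, x))
    return sorted(factorint(abs(r))) if r not in (0,) else []

print(primes_for_order(1))    # []  (the resultant is f(1) = -1)
print(primes_for_order(11))   # contains 199
print(primes_for_order(16))   # contains 257
print(primes_for_order(25))   # contains 5801
```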

To summarize: What the attack wants, namely a large q with a tiny order,
simply doesn't exist for these polynomials. This is conceptually the
same as a previously published calculation concluding that the attack
also doesn't affect cyclotomics.

Even more generally, given the polynomial f, one can try to find a prime
q and a divisor d of f mod q for which the reduced error e mod d is
visibly non-uniform. Compared to cyclotomics, NTRU Prime gives security
reviewers extra tools to analyze this generalized attack:

* For NTRU Prime, one can enumerate all divisors d for a huge range
of q and, for each d, compute various numerical features of e mod d
to see if these features detect deviations from uniform.

* For cyclotomics, one quickly bumps into q with an exponential
number of divisors d, and then it's unclear how to even get started
on figuring out whether there's some exploitable d. Security here
has _not_ been proven. The cases that have been analyzed here are
only for very small degrees of d.

Of course one _hopes_ that the generalized attack doesn't work beyond
the cases identified in the literature, but it's hard to justify any
confidence in this, so it's good to be able to investigate _all_
divisors d for a huge range of q.

> the additional error from rounding in each coefficient can [when
> switching to certain moduli, anyway] be upper bounded at significantly
> less than 1/2

How can this be true, given that many coefficients in [-1/2,1/2] from
A'_i-A_i are multiplied by many coefficients of s and added up in each
position as additional noise?

If the intention is to say that this large noise isn't necessarily
dominant compared to the scaling of the original ternary noise: Sure,
switching to a _much_ larger modulus would limit the impact of rounding,
but then what's the next step of the attack supposed to be?

> if the proper q and a can be found (assuming they do exist), it would
> quite possibly yield a distinguishing attack

The 2014 attack is inapplicable to NTRU Prime, for very much the same
reasons that it's inapplicable to cyclotomics. See above.

The generalized attack could conceivably be applicable to cyclotomics or
to NTRU Prime. The generalized attack applied to NTRU Prime has a more
restricted parameter space---and is easier for security reviewers to
check---than the generalized attack applied to cyclotomics. See above.

> if many such q and a can be found, it might even potentially yield a
> key recovery attack.

This is why the attack papers on this topic, beyond the basic objective
of finding examples of number fields with *LWE distinguishers, have set
the more ambitious objective of finding _Galois_ examples (degree-n
number fields with Galois groups of size only n). There are *LWE cases
broken more efficiently by this line of attacks than by any other known
attacks. For cryptanalysts interested in understanding the state of the
art, I recommend reading the original sequence of attack papers:

https://eprint.iacr.org/2014/784
https://eprint.iacr.org/2015/106
https://eprint.iacr.org/2015/971
https://eprint.iacr.org/2016/193

Of course, Galois number fields---such as cyclotomic fields---are
extremely rare among all number fields. The NTRU Prime fields were
designed from the outset to be as far as possible from being Galois.

---Dan (speaking for myself)


p = 761
w = 250
q = 4591
q2 = 5801

ZZx.<x> = ZZ[]
R.<y> = ZZx.quotient(x^p-x-1)

def qreduce(c):
    return ((c+(q//2))%q)-(q//2)

def q2reduce(c):
    return ((c+(q2//2))%q2)-(q2//2)

s = [0]*p
while sum(si**2 for si in s) < w:
    s[randrange(p)] = randrange(2)*2-1

s = R(s)
e1 = R([randrange(3)-1 for i in range(p)])
A1 = R([qreduce(randrange(q)) for i in range(p)])

B1 = A1*s+e1
B1 = R([qreduce(B1[i]) for i in range(p)])

A1prime = R([round(A1[i]*q2/q) for i in range(p)])
B1prime = R([round(B1[i]*q2/q) for i in range(p)])

e1prime = B1prime-A1prime*s
e1prime = R([q2reduce(e1prime[i]) for i in range(p)])

print(e1prime)
print(lift(e1prime) % (x^25-1))

D. J. Bernstein

Nov 22, 2019, 5:14:57 AM
to pqc-...@list.nist.gov
I wrote the rounding error for

A'_i = round(5801 A_i/4591)

as "A'_i-A_i"; of course this should say "A'_i-5801 A_i/4591". Same for
the B'_i rounding error. This has no effect on the stated calculations
and conclusions: the hand calculations simply use that the rounding
error is between -1/2 and 1/2, and the Sage script does an end-to-end
calculation of the actual noise after the modulus switch.

---Dan

Jacob Alperin-Sheriff

Nov 22, 2019, 9:06:18 AM
to pqc-...@list.nist.gov
As far as I can tell, the “Weak Instances of PLWE” paper contains this idea only for the case where f(x) has degree-1 factors modulo q (i.e., roots a with f(a) = 0 mod q where a has low order modulo q).

It’s not fully clear to me how this plays out for factors g(x) of f(x) when the degree of g(x) is greater than 1 and where, e.g., the multiplicative subgroup of Z_q^* generated by the coefficients of g(x) has quite low order.

In particular, I was able to find (using rather naive exhaustive strategies) some moduli q’ near your chosen moduli q for NTRU Prime such that, for your chosen p (in categories 3 and 5), there existed a factor of x^p-x-1 modulo q’ of the form (x^2+x+a) where the subgroup generated by the coefficients had order 4, namely {a, -a, 1, -1}.

While the rate of growth of the error/noise (when taking the degree-(p-1) polynomials modulo x^2+x+a) was too high here to yield a reasonable distinguisher, it does appear that the growth rate (in terms of a basis [1,a]) is subexponential in p and possibly even polynomial in p. I did not get a chance to prove it, but it amounts to an alternating sum of binomial coefficients in a way that should yield such a bound; it would be nice to see an explicit proof of bounds.

This might end up being relevant only if q is quite large in terms of p (as might happen in homomorphic encryption applications), versus roughly p log p as in NTRU Prime.

However, tldr is that this doesn’t appear to have been covered in the previous attack literature. 

--
-Jacob Alperin-Sheriff

D. J. Bernstein

Nov 24, 2019, 5:17:46 PM
to pqc-...@list.nist.gov
This message gives a (fully specified) cyclotomic attack example sharing
the set of features claimed in the most recent message for a (not fully
specified) NTRU Prime attack example.

This message also explains why these features are mis-aimed from the
perspective of everything known about the attack performance---there's
no reason to think that either attack example works.

Regarding the possibility of the general attack strategy succeeding for
unknown reasons, this message explains in detail why this possibility
causes more problems for security reviewers in the cyclotomic case than
it does in the NTRU Prime case.

Jacob Alperin-Sheriff writes:
> tldr is that this doesn’t appear to have been covered in the previous
> attack literature. 

On the contrary, the attack is already covered in the NTRU Prime paper,
including citations to the relevant attack papers. This includes the
special case from the first message in this thread, the attack in the
latest message, and the generalized attack in my previous message.

> As far as I can tell, the “Weak Instances of PLWE” paper contains this
> idea only for the case where f(x) has degree-1 factors modulo q (i.e.,
> roots a with f(a) = 0 mod q where a has low order modulo q).

This covers the message that started this thread ("it appears possible
for x^p-x-1 to have a linear factor (x-a) where a has very low order
modulo q"). That's why I wrote that the attack had appeared in this 2014
paper. An easy computation shows that this attack fails for NTRU Prime,
contrary to what the first message suggested.

Followups to the 2014 paper considered the more general idea of using
higher-degree factors of the polynomial. For example, here's a quote from
the 2016 paper:

   The basic principle of this family of attacks is to find a
   homomorphism ρ: Rq → F to some small finite field F, such that the
   error distribution on Rq is transported by ρ to a non-uniform
   distribution on F.

Obviously one can generalize further to drop the requirement of "small"
degree and to drop the field requirement. This work is cited and
summarized in the NTRU Prime paper as using "ring homomorphisms from
$(\Z/q)[x]/P$" to a "smaller nonzero ring"---same paragraph that ends by
pointing out the noise increase from modulus switching. This covers the
full scope of the attacks discussed in this thread. I had already posted
various messages in 2015 and 2016 summarizing the advances in exploiting
"factors of f mod q" and encouraging further attention to the ideas, for
example posting the following:

   I don't mean to overstate the breadth of the attacks so far---none of
   them are things you'd run into by accident---but I doubt that this is
   the end of the story of how an attacker can exploit these extra ring
   homomorphisms.

Let me emphasize that _no_ polynomials have been proven immune to this
more general attack. From the perspective of a security reviewer looking
at the more general attack, NTRU Prime has three advantages over earlier
ring choices:

* For NTRU Prime, one can try a huge range of q; for each q,
enumerate all divisors d of the polynomial mod q; and, for each d,
apply all known statistical tests to the error modulo d (perhaps
first applying Bleichenbacher/BKW steps for large-degree d). For
cyclotomics, the security reviewer trying to gain this confidence
runs into a huge problem---see concrete examples below.

* If one restricts attention to divisors d of degree, say, 1 or 2
(which can be enumerated for any polynomial for any given q), then
cyclotomics enable a standard strategy to convert a powerful enough
distinguishing attack modulo each divisor into a key-recovery
attack. For NTRU Prime, this conversion doesn't apply: any q small
enough to be useful won't have enough small divisors to combine.

* NTRU Prime chooses an inert q, so the attacker has to start with a
modulus switch before applying factorizations. This increases
noise (as the NTRU Prime paper says), so the reviewer can limit
attention to _super-effective_ *LWE attacks against the new
modulus. For comparison, cyclotomic proposals often choose a split
q, giving the attacker many free factorizations for the original
size of error.

I'm not aware of any way that cyclotomics make this attack strategy
_easier_ to analyze. The notion that cyclotomics are better protected
seems to come from an analysis saying that a special case of the attack
setup doesn't work for cyclotomics; a conceptually identical analysis
says that the same special case doesn't work for NTRU Prime. This is
hardly a surprise given how rare the attacked polynomials are.

> It’s not fully clear to me how this plays out for factors g(x) of f(x)
> when the degree of g(x) is greater than 1 and where, e.g., the
> multiplicative subgroup of Z_q^* generated by the coefficients of g(x)
> has quite low order.

Such as x^4+x^2+4489, which divides the NewHope polynomial x^512+1
modulo 10753, and has 4489^2 = -1? That's degree above 1, and the
multiplicative subgroup generated by {1, 1, 4489} has only 4 elements.

Could a NewHope error, after a switch to modulus 10753, have a visibly
non-uniform distribution in (\Z/10753)[x]/(x^4+x^2+4489)? For example,
could it fail to hit some of the 13369605210286081 elements of this
quotient ring?

The usual error-compression argument obviously doesn't apply to this
example (see below), but maybe there's some other reason that the
distribution misses some elements, enabling a distinguishing attack with
probability large enough to disprove the hypotheses of the NewHope
security analysis. This is feasible for a large-scale attacker to check
experimentally for this ring, but looks quite expensive for the rest of
us to check, given the size of the ring. Is there an algorithm that can
check ring coverage more efficiently? And this is just one divisor; how
do we handle the rapid increase in the number of degree-j divisors of
x^512+1 as j grows?

Regarding "the usual error-compression argument": The way that the
papers on this topic started generating _successful_ examples of this
whole attack strategy was by forcing x to have only k different powers
in the quotient ring for some tiny k. The error is a small combination
of powers of x, so in the quotient ring it's a fairly small combination
of just k different powers of x---see the example in my previous message
with k=25. If k is small enough compared to q (which the 25 obviously
wasn't compared to 5801) then the error terms can line up enough to
enable an attack.

This argument requires the divisor d to also divide x^k-1 for some tiny
k. Here are two cases to consider:

* If d is a binomial x^j-b then asking for d to divide x^k-1 is the
same as asking for j to divide k and b^(k/j) to be 1; i.e., b has
to have very low order. This is why, e.g., the 2014 paper asks for
a "root" of "small order".

* In more general cases, asking for the coefficients of d to have low
order is neither necessary nor sufficient for d to divide some tiny
x^k-1. This is why the usual error-compression argument fails for,
e.g., x^4+x^2+4489 mod 10753: the order of x in the quotient ring
is 1024. The low order of 4489 doesn't limit the order of x.
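The order-1024 claim is mechanical to check. A pure-Python sketch (mine, not from the thread) doing arithmetic in (\Z/10753)[x]/(x^4+x^2+4489) by rewriting x^4 as -x^2-4489:

```python
q = 10753

def mul(a, b):
    """Multiply two degree-<4 polynomials mod x^4 + x^2 + 4489 and mod q."""
    c = [0] * 7
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % q
    for i in range(6, 3, -1):        # x^i = x^(i-4) * (-x^2 - 4489)
        c[i - 2] = (c[i - 2] - c[i]) % q
        c[i - 4] = (c[i - 4] - 4489 * c[i]) % q
        c[i] = 0
    return c[:4]

def xpow(e):
    """x^e in the quotient ring, by square and multiply."""
    r, base = [1, 0, 0, 0], [0, 1, 0, 0]
    while e:
        if e & 1:
            r = mul(r, base)
        base = mul(base, base)
        e >>= 1
    return r

# Per the text: x^512 = -1 in this ring, so x has order exactly 1024.
print(xpow(512))   # expected [10752, 0, 0, 0], i.e. -1
print(xpow(1024))  # expected [1, 0, 0, 0]
```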

Of course this argument isn't necessarily the end of the story. In this
example, since x^512 is -1, the errors obviously start lining up after
"only" 512 powers of x. Maybe something interesting happens even sooner.
Furthermore, as the divisor degree grows, the error distribution is
forced to become statistically non-uniform, and maybe someone will
figure out a way to detect this faster than by the attacks known today.

Suppose we're looking at examples of divisors and trying to study them
to find a weakness. With x^761-x-1 mod 4591, the divisors are

* _one_ divisor (the original polynomial, no degree reduction) for
the original small errors mod 4591,
* some divisors for larger errors after switching to modulus 2,
* some divisors for larger errors after switching to modulus 3,
* etc.,

and we can compute all the divisors up to very large moduli. With
x^1024+1 mod 12289, the divisors are

* 2^1024 divisors for the original small errors mod 12289,
* a few divisors for larger errors after switching to modulus 2,
* a few divisors for larger errors after switching to modulus 3,
* ...,
* 2^128 divisors for larger errors after switching to modulus 257,
* ...,
* 2^1024 divisors for larger errors after switching to modulus 18433,
* etc.,

so it's easy to imagine how we could miss a divisor-specific attack
simply because there are too many divisors to try for small moduli.

The Galois group plays an important role in this analysis, since the
chance of a polynomial splitting into linear factors mod q is the
reciprocal of the size of the Galois group. (One can similarly compute
the chance of other factorization patterns.) Obviously this says that
_eventually_ there will be some split q even for a maximum-size Galois
group, but switching to a gigantic modulus lets a security review skip
past all of the attack strategies that guess elements of {0,1,...,q-1}.

> While the rate of growth of the error/noise (when taking the
> degree-(p-1) polynomials modulo x^2+x+a) was too high here to yield a
> reasonable distinguisher, it does appear that the growth rate (in terms
> of a basis [1,a]) is subexponential in p and possibly even polynomial in p.

I must be misunderstanding what you mean to say here, since the first
few interpretations I've tried are easy to prove false by standard proof
techniques. In context this seems to be describing an experiment, so can
you please post a numerical example and point to the effects you're
talking about in this example?

---Dan