Yes, there is an implementation of the NTT in almost the same ring as mentioned by Gregor (see https://falcon-sign.info/impl/falcon-vrfy.c.html, function mq_NTT_ternary).
Thomas
And just to avoid any misunderstandings about our claims -- we are not
saying that we invented a new NTT technique.
Multiplication uses 3
FFTs (and fewer if some FFTs can be cached); the usual reduction from
arbitrary polynomials to FFT-friendly polynomials sacrifices a factor of
2 (e.g., multiply modulo x^761-x-1 by multiplying modulo an FFT-friendly
polynomial of degree 1536); the usual reduction from arbitrary primes to
FFT-friendly primes sacrifices a factor of 2 when one input is small (as
it is in NTRU), or 3 in general.
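To make the folding step concrete, here is a minimal Python sketch (schoolbook multiplication stands in for the length-1536 FFT/NTT mentioned above; the point is only the final reduction using x^761 = x + 1):

    def mul_mod_x761_x_1(f, g, q):
        # Multiply f, g in (Z/q)[x]/(x^761 - x - 1), given as length-761
        # coefficient lists (constant term first). In a real implementation
        # the degree-<=1520 product below would come from transforms of
        # length >= 1521, e.g. modulo an FFT-friendly polynomial of
        # degree 1536.
        h = [0] * (2 * 761 - 1)
        for i, fi in enumerate(f):
            for j, gj in enumerate(g):
                h[i + j] = (h[i + j] + fi * gj) % q
        # Fold back: x^k = x^(k-760) + x^(k-761) for k >= 761.
        for k in range(len(h) - 1, 760, -1):
            h[k - 760] = (h[k - 760] + h[k]) % q
            h[k - 761] = (h[k - 761] + h[k]) % q
            h[k] = 0
        return h[:761]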
> a really efficient KEM
Of course a full analysis would also have to study the security
level, which leads to many open questions, but presumably your security
loss from a larger q outweighs the slight difference in dimension etc.
Have you considered Rounded NTRU as a smaller alternative to Noisy NTRU?
How about a smaller q, and fixed weight instead of variable weight?
One of the reasons for upwards pressure on your q is that you're using
the old idea of taking f as 1+3*small instead of small. Your paper says
that (\Z/3)[x]/(x^768-x^384+1) doesn't support an NTT, but there are
many fast multiplication methods---for example, as in the HRSS software,
you could simply reuse your (\Z/q)[x]/(...) multiplier. As for division,
that's only an issue for keygen, and it's going to be quite fast anyway
since x^768-x^384+1 factors mod 3 as (x^64+x^32-1)^6 (x^64-x^32-1)^6.
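For anyone who wants to check that factorization without algebra software, here is a short self-contained Python script (plain schoolbook arithmetic mod 3):

    def polymul_mod3(a, b):
        # Multiply two polynomials mod 3; lists with constant term first.
        c = [0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                c[i + j] = (c[i + j] + ai * bj) % 3
        return c

    def polypow_mod3(a, e):
        # e-th power mod 3 by repeated multiplication (e is small here).
        r = [1]
        for _ in range(e):
            r = polymul_mod3(r, a)
        return r

    # x^64 + x^32 - 1 and x^64 - x^32 - 1, coefficients mod 3.
    p1 = [2] + [0]*31 + [1] + [0]*31 + [1]
    p2 = [2] + [0]*31 + [2] + [0]*31 + [1]

    product = polymul_mod3(polypow_mod3(p1, 6), polypow_mod3(p2, 6))

    target = [0] * 769            # x^768 - x^384 + 1 mod 3
    target[0], target[384], target[768] = 1, 2, 1
    assert product == target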
Your polynomial x^768-x^384+1 is irreducible over \Z but has 2^256 monic
divisors (and more non-monic divisors) modulo the prime 7681 that you're
using. Have you at least analyzed the impact that this has upon security
against _known_ NTRU attacks? For example:
* Do you know how small the coefficients can be in divisors of
  degree, say, 384? Sufficiently small coefficients would seem to
  open up the ancient NTRU attack strategy from, e.g., 2001 Gentry.
* A multi-target attack can scan for keys that reduce to 0 modulo any
  of your cubic factors. Doesn't this happen more than once in a
  billion keys, allowing a slightly faster lattice attack since g is
  then constrained to a lattice of slightly smaller dimension?
NTTRU  HRSS  Prime  feature
-----  ----  -----  -------------------------------------------
no     yes   yes    avoid decryption failures
yes    yes   yes    irreducible polynomial (over \Z)
no     no    yes    avoid nontrivial number-field automorphisms
yes    no    yes    prime modulus q
no     no    yes    the ring is a field
The last two "no" entries for NTTRU are inherent in this use of NTT,
but the first "no" would be very easy to fix.
---Dan
I think what Dan means is doing Chinese remaindering modulo 2 or 3
different primes.
Cheers,
Gregor
Consequently, I expect your work to end up saving more cycles for
sntrup4591761 and ntrulpr4591761 than it has saved for similarly sized
NTT-friendly rings. This should make the NTRU-Prime-vs.-NTT cycle gap
even less noticeable than it was before, compared to the primary cost
incurred by all these systems, namely bytes transmitted.
Sorting int32[768] takes 6512 cycles with verified code, and it's
trivial to choose the weight so that there are no decryption failures.
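For concreteness, here is a rough Python sketch of sort-based fixed-weight sampling (the idea only; constant-time code built on an int32 sort such as djbsort works on the keyed 32-bit words directly):

    import random

    def fixed_weight_ternary(n, w, rng=None):
        # Sample a length-n vector with exactly w nonzero entries in {-1,+1}:
        # pair each index with a random 32-bit key, sort by key, and give the
        # first w indices (in sorted order) random signs.
        rng = rng or random.SystemRandom()
        keyed = sorted((rng.getrandbits(32), i) for i in range(n))
        v = [0] * n
        for _, i in keyed[:w]:
            v[i] = 1 if rng.getrandbits(1) else -1
        return v

    # placeholder parameters, not a concrete proposal
    r = fixed_weight_ternary(768, 288)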
> I am really confused as to why you draw this solid line between
> 0 decryption error and >0 decryption error.
eprint 2019/043 appeared a few days ago, saying that ss-ntru-pke "is
attacked with complexity below the claimed security level". Recall that
ss-ntru-pke is part of the original NTRUEncrypt submission; it was
claimed to be "extremely conservative" and provably secure. The attack
exploits decryption failures.
How many lines of failure-probability calculations has this competition
been throwing at security reviewers? And how many lines of alleged
proofs that the failures are too rare to cause trouble? How many
mistakes and security losses do we need to see before deciding that this
minor optimization within lattice-based cryptography is a bad idea?
The important metric here is the complexity of a comprehensive review of
security. This varies considerably from one system to another, and it's
directly connected to the risk of a breakable system being standardized
and/or deployed.
Of course there's also a much more recent line of attacks exploiting
automorphisms, subfields, etc., most recently (2018) breaking through
the 2^sqrt(n) approximation-factor barrier. It would hardly be a surprise
if the next few years of developments also exploit divisors. I don't
understand why anyone would recommend giving the attacker an arsenal of
attack tools that have produced attacks at SODA 2016, Eurocrypt 2017,
etc., and that we know how to eliminate at low cost. But, even if you
don't agree, surely you have to analyze whether your system survives the
specific attack strategies that have broken some earlier systems.
> You can't do useful FHE in some rings and we're working on much
> improved zero-knowledge proof in which complete splitting is crucial.
Assume that performance constraints force NIST to standardize a small
lattice-based system. How do you imagine that we're going to get from
the current situation, with security flaws being announced in 2019 in
"provably secure" lattice systems, to a situation where we can actually
be confident in the security of what NIST standardizes? Do you think
that adding extra desiderata such as FHE is going to help this process?
---Dan
* By now quite a few people, including you, have noticed that the
  attacker is allowed to inspect the secret key in the definition of
  failure probability in HHK17. (I mean the final version of HHK17;
  the initial version claimed theorems using a different definition,
  and as far as I know nobody has proven any of those theorems.)
  Often being able to see the secret key makes failing messages
  easier to find. How much have cryptanalysts looked at this?
Even if NIST submissions are calculating failure probabilities
correctly, they're usually (I'm not saying always) doing it with
the wrong definition, so we usually don't have IND-CCA2 security
proofs, even assuming IND-CPA.
Let's take simple rounding with no decryption failures as the baseline.
Mike estimates that aggressive size optimization---take noise instead of
rounding, use an error-correcting code, and aim for "probably low
enough" failure probability---reduces total key+ciphertext size by a
factor between 1.19 and 1.25 compared to the baseline. (See below.)
Nobody has pointed to an application where this minor change in size
suddenly has users saying "Okay, _now_ that's small enough to deploy."
Two years from now, are we going to be able to look at the remaining
lattice submissions and confidently say that any nonzero failure rates
aren't a problem? How much more research is this going to take? Wouldn't
it be smarter to simply eliminate decryption failures and focus the same
research effort on trying to reduce other worrisome lattice risks?
The rest of this message has some comments on Mike's size estimates.
The size changes we're talking about generally accommodate changes in
security level, so the problem of reviewing size comparisons includes
all the problems of reviewing security comparisons. Recall
* one estimate saying that EMBLEM-611 is hundreds of times easier
  to break than uRound2.KEM-500 (2^76 vs. 2^84) and
* another estimate saying that, no, uRound2.KEM-500 is thousands of
  times easier to break than EMBLEM-611 (2^126 vs. 2^142),
where all four of these wildly inconsistent estimates appear on the same
"Estimate all the LWE/NTRU schemes" page. This is a 1.25x swing in
relative security level. This could be the aggregate result of even
larger swings that the estimates end up assigning to specific design
elements. How are we supposed to extract anything meaningful from this?
What's the scientific justification for the claim that any of these
estimates are "good enough for this purpose”?
Here's a concrete example. Attack papers such as
https://www.iacr.org/archive/crypto2007/46220150/46220150.pdf
https://eprint.iacr.org/2011/458.pdf
https://eprint.iacr.org/2014/880.pdf
https://eprint.iacr.org/2015/823.pdf
https://eprint.iacr.org/2016/177.pdf
show how to exploit rotation (multiplication by x) to speed up various
attacks---but the same speedup idea seems somewhat less effective
against NTRU Prime. Fully quantifying this requires more algorithm
analysis. Qualitatively, this effect is in the opposite direction of the
more obvious effect that NTRU Prime has to choose slightly fewer error
positions to handle the extra coefficient of x^n-x-1.
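To illustrate the structural point (a toy sketch, not taken from any of the attack papers above): in Z[x]/(x^n - 1), multiplying a key by x is a pure rotation, so every rotation of a short key is equally short; in Z[x]/(x^n - x - 1), the top coefficient feeds back into two positions:

    def rotate_cyclic(f):
        # Multiply by x in Z[x]/(x^n - 1): coefficients just rotate.
        return [f[-1]] + f[:-1]

    def rotate_ntru_prime(f):
        # Multiply by x in Z[x]/(x^n - x - 1): x^n = x + 1, so the top
        # coefficient is added into the constant and linear terms.
        top = f[-1]
        g = [0] + f[:-1]
        g[0] += top
        g[1] += top
        return g

Repeatedly multiplying by x in the second ring keeps mixing in these feedback terms, so rotations of a short key are generally not equally short.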
As far as I know, the rotation impact is taken into account in _none_ of
the estimates from the "Estimate all the LWE/NTRU schemes" page. This
isn't surprising given the history of these estimates, pursuing simple
formulas at the expense of accuracy. Presumably Mike's new "combination"
of estimates, producing his conclusion that switching to the NTRU Prime
ring produces a 1.11x size increase, also fails to take the rotation
impact into account.
Another source of noise is the switch from a proper two-dimensional
size-and-security graph (as in eprint 2018/1174) to a one-dimensional
size-per-bit-of-security ratio. Size shouldn't be _exactly_ linear in
bits of security, and we have no way to see what effect this has on tiny
claimed ratios such as 1.06 (Mike's "costs about 6% more") since we
haven't been told any of the security numbers.
Finally, let me comment on the "factor between 1.19 and 1.25" mentioned
above. This comes from dividing the following two estimates provided by
Mike, disregarding all of the above caveats:
* Starting from the baseline, switching from rounding to noise is
  estimated to cost something between 1.15x and 1.21x. ("With errors
  instead of rounding, NTRU LPRime would cost about 15% more. Kyber
  costs 21% more than Saber, due in large part to more rounding.")
* Using an error-correcting code to aim for "probably low enough"
  failure probability is then estimated to save 1.44x. ("With perfect
  correctness, ThreeBears costs 44% more".)
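Dividing these gives the range stated above:

    print(round(1.44 / 1.21, 2))   # 1.19
    print(round(1.44 / 1.15, 2))   # 1.25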
I'm unable to figure out why Mike claims 1.33x "overall", rather than
something between 1.19x and 1.25x, as the cost for "perfect
correctness". He estimates 1.33x for NTRU LPRime vs. ThreeBears, but
this includes a cost for the ring switch. He also estimates 1.34x for
NTRU-HRSS vs. ThreeBears, but
https://cr.yp.to/talks/2018.06.27-2/slides-djb-20180627-lattice-a4.pdf
and eprint 2018/1174 already showed NTRU-HRSS parameter improvements
that maintain perfect correctness.
---Dan
Perhaps you mean something special by "math proofs", but the
literature on pure mathematics has many bogus "proofs", as the following
sample illustrates:
https://en.wikipedia.org/wiki/List_of_incomplete_proofs