Leo Ducas writes, in email dated 1 Nov 2021 04:02:37 -0700:
> I am afraid that your counter-counter-analysis is not sufficient to
> dismiss the counter-analysis we brought.
Presumably the wording here is referring to the following publications:
* P0, "analysis": my 20 Aug 2021 talk on S-unit attacks.
* P1, "counter-analysis": the 24 Aug 2021 07:45:56 -0700 message
from Dr. Ducas and Dr. Pellet-Mary, 511 words (centerpiece:
claiming that the attacks have "*ridiculously* small" success
probability, dropping exponentially with #S).
* P2, "counter-counter-analysis": my 23 Oct 2021 21:55:07 +0200
message, 1510 words.
* P3, "counter-counter-counter-analysis"?: this 1 Nov 2021 04:02:37
-0700 message from Dr. Ducas, 267 words.
The core problem is that P1 applies standard lattice heuristics to
S-unit lattices without ever bothering to check whether S-unit lattices
are exceptions to the heuristics. The conclusions of P1 are
qualitatively and quantitatively wrong. An erratum for P1 is required.
Application of the same heuristics had already prevented a 2019 paper,
PHS19, from seeing most of the power of S-unit attacks. The research
project that led to P0 discovered, inter alia, that S-unit lattices have
a special analytic structure that PHS19 had missed. P0 announced this,
and full details now appear in
https://cr.yp.to/papers.html#spherical.
There are other important things that PHS19 missed about S-unit attacks,
but this is the most fundamental.
It's understandable that PHS19 missed this. Overconfident application of
the same heuristics was already standard practice in the literature on
lattice-based cryptography, not something new in PHS19. But it's less
understandable to see P1 continuing to mis-model S-unit attacks _after_
P0 announced that S-unit lattices have special analytic features.
P2, published shortly after
https://cr.yp.to/papers.html#spherical, is
structured as point-by-point responses to P1, identifying the points of
agreement (e.g., "It's correct that _if_ one applies standard lattice
heuristics to S-unit lattices _then_ one has to conclude that increasing
S beyond PHS19 makes S-unit attacks exponentially less effective") and
the points of dispute (e.g., P1 says "With ... standard heuristic, we
infer that the proposed algorithm is no better than existing literature
[PHS19]"; P2 says "
https://cr.yp.to/papers.html#spherical collects
extensive evidence that _for S-unit lattices_ the standard heuristics
are highly inaccurate").
The quickest way to see that P1 is wrong is to consider the case K=\Q,
with the primes in S chosen to be exactly the primes <=y, and with
simple reduction using the logs (and minus logs) of those primes, as in
Section 6 of
https://cr.yp.to/papers.html#spherical:
* These S-unit attacks succeed with probability 1 at finding a
shortest nonzero vector in the input ideal.
* P1 instead says "*ridiculously* small" probability for large #S
(i.e., large y), dropping exponentially with #S. That's wrong.
Of course K=\Q can be handled efficiently in other ways, but this has no
relevance to P1's errors. The idea that one can somehow ignore the K=\Q
counterexample by waving vaguely at class groups is debunked by Section
7 of the same paper: for _each_ number field K, as y->infty, with the
finite part of S chosen as the prime ideals of norm <=y, with simple
reduction modulo the y shortest S-units (the tie-breaking mechanism
doesn't matter), the success probability of finding a shortest nonzero
vector in the input ideal converges to 1. This again directly
contradicts P1's claim of "*ridiculously* small" probability.
Note that finding the _shortest_ nonzero vector is overkill for solving
_approximate_ Ideal-SVP, the problem that matters for cryptography. This
ends up letting the attacker take smaller S. There's much more material
in
https://cr.yp.to/papers.html#spherical _quantifying_ how wrong the
standard heuristics are. But taking y->infty is the easiest way to see
how thoroughly P1 botched the analysis of S-unit attacks.
Does P3 address these glaring contradictions between P1 and reality? No.
I think it's fair to summarize P3 as merely (1) briefly spelling out the
thought process that led to P1 and then (2) briefly claiming to be
disputing something, without pinpointing the text under dispute.
I'll now go point by point through P3, again identifying the points of
agreement and the points of dispute, and showing how various errors in
P3 were already debunked by
https://cr.yp.to/papers.html#spherical.
Errata are now required for both P1 and P3.
> Indeed, the following fact trivially
> generalize to arbitrary x over the sphere, and only random y:
> >> in dimension d, if x,y are uniform on spheres of radii a, b respectively
> >> (with a > b), then the probability that y reduces x (i.e., ||x-y|| <
> >> ||x||) is (1-(b/2a)^2)^{Θ(d)}.
As a preliminary note on the order of Theta quantifiers, I presume the
intention here was to fix a and b as d goes to infinity. Without such
restrictions, the statement is incorrect: for example, if b/a goes
rapidly enough to 0 as d increases then (1-(b/2a)^2)^Theta(d) = 1-o(1).
See Theorems 3.13 and 3.18 of
https://cr.yp.to/papers.html#spherical for
statements that allow a and b (and, more to the point, the ratio b/a) to
vary with d. This is useful for the following analysis.
Qualitatively, (1-(b/2a)^2)^{Θ(d)} is close enough for understanding the
relevant phase change in reduction effectiveness in spherical models as
the distances change. I'll refer back to this below, and a small
numerical check appears after this list:
* If the distance ratio b/a is 1/2 then the probability drops
exponentially with the dimension d. (See the upper bound in Theorem
3.18.)
* If the distance ratio b/a is 1/sqrt(d) then the probability is
>=1/poly(d) as d grows. (See the lower bound in Theorem 3.18.)
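As a sanity check on this phase change, here's a small Monte Carlo
sketch (my own illustration, not taken from P1, P3, or the paper). It
fixes x = e_1 (taking a = 1 by scaling), samples y uniformly on the
sphere of radius b, and estimates the probability that y reduces x, for
the two ratios above:

    import numpy as np

    def reduction_probability(d, ratio, trials=10**6, seed=0):
        # Pr[||x-y|| < ||x||] for fixed x of length a and y uniform on
        # the sphere of radius b = ratio*a. By scaling and rotational
        # symmetry take a = 1 and x = e_1; then ||x-y||^2 < ||x||^2
        # exactly when u_1 > b/2, where u_1 is the first coordinate of
        # a uniform random point on the unit sphere.
        rng = np.random.default_rng(seed)
        g1 = rng.standard_normal(trials)
        rest = rng.chisquare(d - 1, trials)  # squared length of the other d-1 Gaussian coordinates
        u1 = g1 / np.sqrt(g1 * g1 + rest)
        return np.mean(u1 > ratio / 2)

    for d in [16, 64, 256, 1024]:
        print(d,
              reduction_probability(d, 0.5),        # b/a = 1/2: drops exponentially with d
              reduction_probability(d, d ** -0.5))  # b/a = 1/sqrt(d): stays near 0.31

The first column collapses toward 0 as d grows; the second stays near a
constant, matching the phase change above.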
Regarding the trivial generalization: Yes, rotational symmetry means
that the probabilities work for one uniform random sphere point and one
fixed sphere point. See, e.g., Theorem 3.13, where mu is a random
variable while nu is fixed.
> This means, that the set S = {Log(u/v)} can be arbitrary, and only the
> target t = Log(g) needs to have a random direction.
Given the context (see "uniform" above), I _presume_ that "random" here
is intended to mean specifically "uniform random". Please avoid
ambiguous terminology.
"Can be" and "needs" indicate that this sentence isn't claiming that
Log(g) _does_ have a uniform random direction, but is instead saying
that P1+P3's desired conclusion (in a nutshell, that S-unit attacks for
large S don't work) _would follow from_ Log(g) having a uniform random
direction.
The reason that it's important to clarify the epistemological status is
that, in fact, the target _doesn't_ have a uniform random direction.
It's not even close, with regard to the relevant metrics. See below.
Anyway, I agree that the statements above regarding distance ratios work
when one of the two directions is uniform random; again, see Theorem
3.13 of
https://cr.yp.to/papers.html#spherical. However, P3's emphasis
on this point is simply tweaking an irrelevant detail of how the
"*ridiculously* small" analysis was stated in P1.
> The lower-bound on the output length of the iterative slicer stands by
> union bound.
Quantification is missing here. If the claimed "lower-bound" is anything
like P1 then it's predicting that the vector length found by S-unit
attacks becomes longer and longer as y increases; but that's simply
wrong. See Sections 6 and 7 of
https://cr.yp.to/papers.html#spherical.
> The spherical model for lattices is only necessary to prove
> upper-bound on this output. In other word: the lattice being
> non-spherical can only makes this particular algorithm worse, not better.
This conclusion is disproven by Z^d, where the shorter-than-predicted
vectors also make reduction work better than predicted; see Section 4
of
https://cr.yp.to/papers.html#spherical. The paper, with more work,
also shows that S-unit lattices have shorter vectors than predicted and
that S-unit attacks are much more effective than predicted.
> So, for the lower bound we brought up, the randomness of the target
> suffices.
Regarding "suffices", see above regarding "needs".
> Because the algorithm fails to reach the desired length for
> random spherical t, it also fails in the worst case.
Here P3 appears to be claiming that the actual log vectors _are_
distributed like "random spherical t" (whereas earlier P3 was merely
saying that P3 _wants_ them to be distributed that way).
I see no justification for this claim anywhere in P3. Let's look at how
one might attempt to justify this claim, and then look at how
https://cr.yp.to/papers.html#spherical already debunks the claim.
A spherical model of the S-unit lattice predicts more than just vector
lengths: it also says that the vectors are pointing in uniform random
directions. This is rotationally symmetric, so _if_ one uses a spherical
model _then_ for a reduction analysis one can freely apply a uniform
random rotation to the actual target vector being reduced, obtaining a
modeled target vector pointing in a uniform random direction.
What happens if one recognizes that this model of the S-unit lattice has
been debunked, and replaces it with the actual S-unit lattice? Suddenly
there are the actual lattice vectors, the logs of S-units, pointing in
their actual directions. This isn't rotationally symmetric. This breaks
the equivalence between
* the actual target vector and
* a uniform random rotation of that vector.
What, then, is the justification for the "[uniform] random spherical t"
hypothesis supposed to be? Forget that it came from spherical models of
lattices, and treat it as a free-floating axiom?
The lack of justification doesn't by itself imply that the hypothesis
is wrong. Let's move on to what the paper says about this hypothesis.
To review the main paper contents first: There are eight core analyses
in
https://cr.yp.to/papers.html#spherical, addressing four types of
lattices (Z^d as a warmup, and then all three directions in the space of
S-unit lattices) times two questions of
(1) vector lengths and
(2) reduction effectiveness,
in each case showing that the standard heuristics are very wrong.
#1 doesn't care about the directions. #2 does care, and the paper has
various comments about this. It's easiest to see the flaw in the "random
spherical t" hypothesis by considering a simplified version of #2.
This brings me to Section 7.3 of the paper. Section 7.3 (like various
other parts of the paper) considers the actual S-unit lattice, not a
model of it. Section 7.3 explicitly addresses a simplification of the
question of reduction effectiveness, namely how close the point being
reduced is to the S-unit lattice, without regard to the cost of finding
a closest lattice vector, but draws an important distinction between
"two different probability
spaces for the random point". The two specified probability spaces are
two different models of a log vector, where
* the second model accounts for ord_P being an integer and
* the first model fails to do so.
Section 7.3 concludes that the second model has bounded distance from
the S-unit lattice as y->infty, while the first one doesn't. The
analysis of the second model applies to the actual log vectors of any
distribution of elements of K^*, so the first model is simply wrong.
Section 7.3 closes with a paragraph noting that the problems with the
first model apply to spherical models: "For comparison, a spherical
model is invariant under rotation, so the point being reduced is
equivalent to a uniformly distributed point nu on a sphere. ... This
leads to the incorrect conclusion that almost all reduced points
are farther and farther from the S-unit lattice as y->infty. This
conclusion relies critically on the inaccurate model of Log v as having
essentially arbitrary real coordinates, a model that fails to account
for the number-theoretic structure of Log K^*."
This is a 58-page paper. I noted in a separate pqc-forum message that it
was fine if the authors of P1 needed time before issuing an erratum. But
what we're seeing here is P3 attempting to defend P1 by adopting a
"random spherical t" model that's specifically debunked in the paper. P3
doesn't dispute what the paper says here; it simply ignores it.
> And that's what we are typically after when we talk about ideal-SVP.
The target under discussion here is indeed worst-case Ideal-SVP.
> A relevant average-case is also defined in [BDPW20] if you care.
As a historical note, Cohen's 1993 textbook on computational algebraic
number theory is already perfectly clear regarding the relevant
directions of randomization. The applicable number-theoretic notions are
older than this (and older than Arakelov's papers, btw). GRH-based
proofs have some value but are weaker than what experiments show to
work.
> At a higher point of view, it seems that your mistake is a confusion
> between SVP and CVP; or purely geometrically, between the minimal
> distance and the covering radius.
This attack fails to pinpoint any text that it claims to be disputing.
> You brought up Z^n as a valuable
> example. Yes, Z^n has a minimal distance of 1. However, it's covering
> radius is still Θ(sqrt(n)).
As Section 4 of the paper says for Z^d, "iterating the reduction process
reduces any lattice point to 0, and reduces a general vector to the box
[−1/2,1/2]^d, with length at most d^{1/2}/2".
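To make this concrete, here's a small sketch of that reduction process
(my own illustration; "simple reduction" here means greedily adding or
subtracting standard basis vectors while that shortens the target):

    import numpy as np

    def simple_reduce_Zd(t):
        # Greedily subtract sign(t_i)*e_i while some |t_i| > 1/2; each
        # such step shortens t, and the loop ends with t in the box
        # [-1/2,1/2]^d, hence with length at most sqrt(d)/2.
        t = np.array(t, dtype=float)
        while np.any(np.abs(t) > 0.5):
            i = np.argmax(np.abs(t))
            t[i] -= np.sign(t[i])
        return t

    d = 100
    rng = np.random.default_rng(1)
    print(np.linalg.norm(simple_reduce_Zd(rng.integers(-10, 10, d))))  # 0.0: lattice points reach 0
    v = rng.standard_normal(d)
    t = 10 * np.sqrt(d) * v / np.linalg.norm(v)  # generic target of length 10*sqrt(d)
    print(np.linalg.norm(simple_reduce_Zd(t)))   # at most sqrt(d)/2 = 5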
Let's now review how to quantitatively and qualitatively botch the
analysis of reduction for Z^d. See below for the analogy to how P1 and
P3 have botched the analysis of S-unit attacks.
First step: Apply standard heuristics to predict that the minimum length
of nonzero vectors is (1+o(1))(d/C)^(1/2) as d goes to infinity, where C
is the constant 2 pi e.
[This prediction is highly inaccurate. The actual minimum length is 1.]
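To put numbers on the gap (my own quick computation): the heuristic
prediction (d/C)^(1/2) with C = 2 pi e, versus the actual minimum
length 1:

    import math

    C = 2 * math.pi * math.e
    for d in [64, 256, 1024, 4096]:
        print(d, math.sqrt(d / C))  # predicted minimum length; the actual minimum length is 1

The prediction is already off by a factor of nearly 2 at d = 64 and by
a factor of more than 15 at d = 4096.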
Second step: Use these predictions of short-vector lengths to draw
conclusions regarding how effectively those vectors reduce a uniform
random target of a particular length.
Specifically, take a vector v of length (1+o(1))(d/C)^(1/2). What's the
chance that v successfully reduces a uniform random target of length
(10+o(1))d^(1/2)? Answer: "*ridiculously* small", namely exp(-Theta(d)),
since the distance ratio b/a is constant+o(1).
How about a target of length Theta(d^(3/4))? This time the distance
ratio b/a is Theta(d^(-1/4)), so the reduction probability is
exp(-Theta(d^(1/2))). To obtain a high reduction probability one would
need exp(Theta(d^(1/2))) vectors of this (1+o(1))(d/C)^(1/2) length.
Even with that many vectors, reduction wouldn't get much _below_ length
Theta(d^(3/4)); it wouldn't have a >=1/d^O(1) chance of reaching
Theta(d^(0.74)), for example. [At least formally, care is required here
regarding how the distribution changes across multiple reduction steps.
It isn't necessary to get into this for the dispute at hand.]
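For concreteness, here's the arithmetic behind these two plug-ins into
the (1-(b/2a)^2)^{Θ(d)} formula quoted above (a worked sketch, with
constants inside Theta suppressed as in P3):

    b/a = (d/C)^(1/2) / (10 d^(1/2)) = 1/(10 C^(1/2)), so
    (1-(b/2a)^2)^Theta(d) = (1 - 1/(400 C))^Theta(d) = exp(-Theta(d));

    b/a = (d/C)^(1/2) / Theta(d^(3/4)) = Theta(d^(-1/4)), so
    (1-(b/2a)^2)^Theta(d) = (1 - Theta(d^(-1/2)))^Theta(d) = exp(-Theta(d^(1/2))).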
To summarize, the heuristics predict that, with exp(Theta(d^(1/2)))
vectors, reduction will obtain length at best roughly Theta(d^(3/4)).
[This prediction is also highly inaccurate. Simple reduction using the d
standard basis vectors and their negatives produces an output of length
at most d^(1/2)/2.]
Third step: Model the actual target vector as pointing in a uniform
random direction---which, again, is effectively guaranteed if one uses a
spherical model of the lattice. Based on this, translate the above
predictions regarding reduction of a uniform random target into
predictions regarding reduction of the actual target.
[This prediction degrades accuracy even more in contexts where the
actual target vector is, for example, a lattice vector: a lattice vector
is reduced to a length that's bounded as d increases, namely length 0.]
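Here's a toy numerical version of this failure (my own sketch, reusing
the greedy Z^d reduction from above): reduce an actual lattice target,
then reduce a uniformly rotated copy of the same target.

    import numpy as np

    def simple_reduce_Zd(t):
        # Same greedy reduction by the vectors +-e_i as in the earlier sketch.
        t = np.array(t, dtype=float)
        while np.any(np.abs(t) > 0.5):
            i = np.argmax(np.abs(t))
            t[i] -= np.sign(t[i])
        return t

    d = 100
    rng = np.random.default_rng(2)
    target = rng.integers(-20, 20, d).astype(float)  # an actual lattice vector

    # A uniform random rotation of target is a uniform random point on
    # the sphere of radius ||target||; sample one directly.
    v = rng.standard_normal(d)
    rotated = np.linalg.norm(target) * v / np.linalg.norm(v)

    print(np.linalg.norm(simple_reduce_Zd(target)))   # 0.0: bounded as d grows
    print(np.linalg.norm(simple_reduce_Zd(rotated)))  # about sqrt(d/12): grows with d

The rotation changes the answer from 0 to something growing as d^(1/2);
modeling the actual target as uniformly rotated is exactly what throws
away the structure.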
The simplest way to see that this botched analysis of Z^d is analogous
to what we've seen for S-unit lattices is to observe that S-unit attacks
for any particular field produce outputs whose length is bounded as d
increases (see Section 7 of the paper), whereas P1 claims that the
success probability for large S is "*ridiculously* small", dropping
exponentially with d. This analogy focuses on what goes wrong in the
third step; with more work, one can also see that the first and second
steps are wrong for S-unit lattices. See the paper.
> The average distance to Z^n of a random point modulo Z^n is also
> Θ(sqrt(n)).
Yes, uniform random points are equivalent to worst-case points at this
level of detail.
> In fact, the constant hidden in the Θ is even worse for
> Z^n than it is for random lattice.
Given the systematic suppression of constants earlier in P3, it's
puzzling to see P3 highlighting this particular constant. Anyway, the
errors under dispute are asymptotically much larger than any constant.
> I hope this clarifies things.
Looking forward to the errata for P1 and P3.
---Dan