Dear John,
I certainly share the ultimate goal of having refined estimates for
those costs; however, my assessment is that the state of the art is
still not stable, and in that light such an effort may soon be obsolete.
My approach for the last few years has been to take the problem from
the other end: improve those algorithms until we get stuck. And in fact,
your estimate for SVP via BGJ1-sieving is already obsolete and
invalidated by experiments (see paragraph (*) below).
<digression>
I wish to use this opportunity to recall the context and rationale of
the `lower bounds' introduced in NewHope [1], for which I claim no
strong theoretical foundation. They were chosen to be both simple and
convincingly on the safe side, accounting for an educated guess at the
progress to come.
Previous works based their estimates on a table from [2, Table 1]
for enumeration. I have always been a bit frustrated that this table was
very hard to reproduce, and I put a lot of effort into making this state
of the art more easily reproducible (contributing a pruner to FPLLL).
Eventually, I gave up on enumeration because I figured out a number of
potential practical optimizations of sieving that I guessed would
outperform enumeration. I am still working on pushing this line further,
but the latest records [5] should already convince us all that SVP-160
costs quite a bit less than the 100 * 2^79.9 CPU operations given
in [2, Table 1].
Of course one could also rely on the lower bound of [2, Table 4], but
I would consider 100 * 2^28.8 CPU operations for SVP-160 to be too
far on the safe side when compared to practice. Furthermore, the
difficulty of reproduction and extrapolation remains.
</digression>
Getting back to evaluating the cost of sieving, the lower bounds
introduced in NewHope [1] did indeed ignore a number of factors.
The most significant ones are:
A: the cost of inner products
B: the fact that one needs to reduce vectors many times, not just once
C: the gap between exact and asymptotic spherical cap / wedge volumes
Factor A has since been tackled using the XOR-POPCNT trick [7, 4, 5].
Factor B has since been tackled using progressive sieving [8, 4, 5].
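
For readers unfamiliar with the former, here is a minimal sketch of the
idea behind the XOR-POPCNT pre-filter as I understand it; the sketch
length K, the threshold, and all names below are illustrative choices of
mine, not those of [7] or of the G6K code.

# Minimal, illustrative sketch of an XOR-POPCNT pre-filter (factor A):
# each vector carries a K-bit SimHash recording the signs of its inner
# products with K fixed random directions; a cheap XOR + popcount on two
# hashes decides whether the exact inner product is worth computing.
# K and THRESHOLD are arbitrary illustrative values.
import random

K = 256          # sketch length in bits
THRESHOLD = 96   # accept a pair when the Hamming distance is this small

def make_directions(dim, seed=0):
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(K)]

def simhash(v, directions):
    """Pack the signs of <v, d_i> into a single K-bit integer."""
    h = 0
    for i, d in enumerate(directions):
        if sum(x * y for x, y in zip(v, d)) > 0:
            h |= 1 << i
    return h

def worth_full_inner_product(h1, h2):
    """A small Hamming distance between sketches hints at a small angle
    between the vectors (a very large one, at a small angle with -v),
    so the exact inner product is only computed for such pairs."""
    dist = bin(h1 ^ h2).count("1")
    return dist <= THRESHOLD or dist >= K - THRESHOLD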
I can only encourage you to study factor C in more depth. My guess is
the same as Thijs': what is relevant are caps and wedges, not the
average sizes of spherical codes. In fact, the issue may be even more
subtle: so far people have reasoned over the sphere, but it may be more
relevant to consider balls, as the assumption that all vectors have
the same length is convenient for asymptotic analysis, but not accurate
in practice.
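
To give a feel for factor C in the simplest case (caps only; wedges, the
pairwise analogue, and all polynomial factors are ignored), here is a
small numerical comparison of my own between the exact fraction of the
sphere covered by a cap of angle pi/3 and the asymptotic
sin^n(pi/3) = 2^{-0.2075...n}. It only shows the order of magnitude of
the gap, and is not claimed to be the relevant quantity for any
particular sieve.

# Exact vs asymptotic volume of a spherical cap of angle pi/3.
# Standard identity: the fraction of S^{n-1} within angle theta <= pi/2
# of a fixed point is (1/2) * I_{sin^2(theta)}((n-1)/2, 1/2), where I is
# the regularized incomplete beta function.
from math import sin, pi, log2
from scipy.special import betainc

def cap_fraction_exact(n, theta):
    return 0.5 * betainc((n - 1) / 2.0, 0.5, sin(theta) ** 2)

def cap_fraction_asymptotic(n, theta):
    return sin(theta) ** n   # = 2^{-0.2075...*n} for theta = pi/3

for n in (60, 80, 100, 128):
    exact = cap_fraction_exact(n, pi / 3)
    asym = cap_fraction_asymptotic(n, pi / 3)
    print("n = %3d   exact: 2^%6.1f   asymptotic: 2^%6.1f"
          % (n, log2(exact), log2(asym)))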
(*) Lastly, I did not find any trace of the `dimensions for free'
[4, 5] being accounted for in your scripts. This feature may explain why
G6K-BGJ1 [3] solves exact SVP-100 within 2^42 CPU operations, while your
table [6] for BGJ1 predicts 2^56.1 operations of an unspecified unit.
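
For scale, below is my back-of-the-envelope recollection of the
asymptotic `dimensions for free' estimate from [4], namely
d * ln(4/3) / ln(d/(2*pi*e)); please check it against the paper before
relying on it. The constant 0.349 and the absence of lower-order terms
are simplifications of mine, not the model of [4] or of G6K.

# Rough size of the 'dimensions for free' effect, using (my recollection
# of) the asymptotic estimate d * ln(4/3) / ln(d / (2*pi*e)) from [4].
# The exponent saving assumes a sieve running in time 2^{c * dim} with
# c = 0.349 (bgj1-style), purely for illustration.
from math import log, e, pi

def dims_for_free(d):
    return d * log(4.0 / 3.0) / log(d / (2 * pi * e))

for d in (100, 120, 140, 160):
    d4f = dims_for_free(d)
    print("d = %3d   d4f ~ %4.1f   exponent saving ~ %4.1f bits"
          % (d, d4f, 0.349 * d4f))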
I am looking forward to seeing your work eventually lead to an accurate
model that matches the repeatable experiments from the literature. In the
PS below, please find some comments on tuning BDGL for practice, so as
to avoid studying the naive version of this algorithm.
Nevertheless, I would still not recommend using those fixed estimates
directly for choosing the parameters of a to-be-standardized scheme.
Indeed, a desired quality for a cryptographic standard is to hold up to
its security claims for more than 18 months, and I do not think that we
are stuck yet when it comes to practical speed-ups for those algorithms.
For example, an improvement factor of at least O(n) on time and memory
of sieving is known to apply to symmetric lattices of dimension n, but
it remains to tweak BKZ (and G6K) in such a way that it preserves some
subgroup of symmetries in its blocks. This sounds more than plausible to
me, though it is not an open coding project of mine at the moment.
Best of luck with this refined study,
-- Leo Ducas
[1] https://eprint.iacr.org/2015/1092
[2] https://www.iacr.org/archive/asiacrypt2011/70730001/70730001.pdf
[3] https://eprint.iacr.org/2015/1128
[4] https://eprint.iacr.org/2017/999
[5] https://eprint.iacr.org/2019/089
[6] https://github.com/jschanck/sieve-tables/blob/master/b.3496.csv
[7] https://eprint.iacr.org/2014/788
[8] https://eprint.iacr.org/2018/079
[9] https://github.com/jschanck/sieve-tables/blob/master/b.2075.csv

PS:
When applying such reasoning to BDGL [3], I should also warn that this
algorithm lends itself to quite some tuning (and so does BGJ1, to a
lesser extent). When we wrote [3], sieving was so slow in practice that
we focused on the simplest version achieving this asymptotic speed. In
practice, it is for example possible:
- to use fewer filters per trial but retry, so as to keep the memory
complexity at 2^{0.2075n + o(n)}.
- to use weaker filters so that buckets contain more than O(1) vectors,
so as to amortize the cost of bucketing by testing more pairs (a
deliberately naive sketch of this is given below).
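
Here is a deliberately naive sketch of that second point, written by me
for illustration only; it is not the bgj1 or BDGL code, and the
threshold ALPHA is an arbitrary tuning knob. One pass over the list
populates a bucket around a random direction, and that O(N) cost is then
amortized over roughly |bucket|^2 / 2 pair tests, so weakening the
filter trades bucketing cost against the quality of the pairs tested.

# Naive illustration of amortizing bucketing cost: a random direction
# defines a spherical-cap filter; weakening it (smaller ALPHA) puts more
# vectors in the bucket, so the single pass over the list L is amortized
# over ~|bucket|^2 / 2 pair tests.  Not the bgj1/BDGL implementation.
import random

ALPHA = 0.3   # threshold on the normalized inner product (tuning knob)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(v):
    return dot(v, v) ** 0.5

def bucketing_pass(L, dim, rng):
    """One pass: bucket vectors close to a random direction, then try to
    reduce all pairs inside the bucket."""
    d = [rng.gauss(0, 1) for _ in range(dim)]
    dn = norm(d)
    bucket = [v for v in L if abs(dot(v, d)) / (norm(v) * dn) > ALPHA]
    reductions = []
    for i in range(len(bucket)):
        for j in range(i + 1, len(bucket)):
            u, v = bucket[i], bucket[j]
            if dot(u, v) > 0:
                s = [a - b for a, b in zip(u, v)]
            else:
                s = [a + b for a, b in zip(u, v)]
            if dot(s, s) < max(dot(u, u), dot(v, v)):
                reductions.append(s)
    return reductions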