Faster finite fields beyond 2^16?


Kiran Kedlaya

unread,
Jan 31, 2011, 12:19:07 AM1/31/11
to sage-nt
Currently Sage uses Givaro to implement finite fields of order up to
2^16. Beyond that range, it goes to either NTL in characteristic 2, or
PARI in odd characteristic.

The performance breakdown at 2^16 is pretty noticeable. Shouldn't it
be possible to get good performance at least up to 2^32? If so, how?
(E.g., can FLINT help with this?)

Kiran

David Roe

unread,
Jan 31, 2011, 2:13:04 AM1/31/11
to sag...@googlegroups.com
This is task 1 of project 1 among the p-adics project list I made in November.  :-)  See http://wiki.sagemath.org/padics/GeneralExtensions and more generally http://wiki.sagemath.org/padics.

If someone wants to work on this (I can't right now), feel free to get in touch with me.  My main suggestion would be to try to incorporate the model from sage.rings.polynomials.polynomial_template.
David




William Stein

unread,
Jan 31, 2011, 3:10:24 AM1/31/11
to sag...@googlegroups.com
On Sun, Jan 30, 2011 at 11:13 PM, David Roe <ro...@math.harvard.edu> wrote:
> This is task 1 of project 1 among the p-adics project list I made in
> November.  :-)  See http://wiki.sagemath.org/padics/GeneralExtensions and
> more generally http://wiki.sagemath.org/padics.

Are you planning to use log tables for speed? That's why Givaro is so
fast. It doesn't seem like you mention anything like that on your
page.

> If someone wants to work on this (I can't right now), feel free to get in
> touch with me.  My main suggestion would be to try to incorporate the model
> from sage.rings.polynomials.polynomial_template.

When you copy code out to make your new code, please, please think
hard and benchmark each function. The code in
sage.rings.polynomials.polynomial_template has some annoying
inefficiencies due to lack of very careful benchmarking and use of
Cython, which surprised me last summer... (Of course, one can make
this sort of complaint about most code...)

--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

David Roe

unread,
Jan 31, 2011, 3:28:06 AM1/31/11
to sag...@googlegroups.com
> Are you planning to use log tables for speed?  That's why Givaro is so
> fast.  It doesn't seem like you mention anything like that on your
> page.

I wasn't currently planning on using log tables, for two reasons:

*  One of my motivations was to work with rings (Z/NZ)[x]/(f) where N is not prime (though usually a prime power).  In this case there's more than one non-unit, and the group of units is not cyclic.  These are surmountable problems, but they make the implementation more complicated.
* As the size of the ring increases, the memory footprint of log tables increases.  2^16 seems like a reasonable cutoff (especially since we get to use Givaro in this case).

That being said, I would certainly be open to the possibility of adding a log-table implementation beyond 2^16.
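For illustration, the first kind of ring David mentions, (Z/NZ)[x]/(f), can be modeled in a few lines of plain Python. This is a hedged sketch, not Sage code; N = 27 and f = x^3 + 2x + 1 (chosen irreducible mod 3, so the quotient is a Galois ring) are hypothetical example parameters:

```python
# Minimal sketch of arithmetic in (Z/NZ)[x]/(f) for N a prime power.
# Elements are coefficient lists of length deg(f); f is monic and given
# by its lower-degree coefficients f[0], ..., f[d-1].

N = 27                      # 3^3, not prime
f = [1, 2, 0]               # f(x) = x^3 + 2x + 1
d = len(f)                  # deg(f) = 3

def add(a, b):
    """Coefficient-wise addition mod N."""
    return [(x + y) % N for x, y in zip(a, b)]

def mul(a, b):
    """Schoolbook product, then reduction modulo the monic f using
    x^d = -(f[0] + f[1]*x + ... + f[d-1]*x^(d-1))."""
    prod = [0] * (2 * d - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % N
    for k in range(2 * d - 2, d - 1, -1):
        c = prod[k]
        prod[k] = 0
        for i in range(d):
            prod[k - d + i] = (prod[k - d + i] - c * f[i]) % N
    return prod[:d]
```

For example, with x represented as [0, 1, 0], `mul(mul(x, x), x)` reduces x^3 to -(1 + 2x) mod 27, i.e. [26, 25, 0]. The non-cyclic unit group David mentions is visible here: elements whose constant term is divisible by 3 are the non-units.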

> If someone wants to work on this (I can't right now), feel free to get in
> touch with me.  My main suggestion would be to try to incorporate the model
> from sage.rings.polynomials.polynomial_template.

> When you copy code out to make your new code, please, please think
> hard and benchmark each function.  The code in
> sage.rings.polynomials.polynomial_template has some annoying
> inefficiencies due to lack of very careful benchmarking and use of
> Cython, which surprised me last summer...  (Of course, one can make
> this sort of complaint about most code...)

That surprises me as well.  Do you think the inefficiencies are inherent in the general approach, or would a better implementation address the problem?  Once I actually start working on it, I would be interested in talking to you about specific inefficiencies that you've noticed.  And I'll try to benchmark things as I go.
David

William Stein

unread,
Jan 31, 2011, 4:41:13 AM1/31/11
to sag...@googlegroups.com, Jean-Guillaume Dumas, Martin Albrecht
On Mon, Jan 31, 2011 at 12:28 AM, David Roe <ro...@math.harvard.edu> wrote:
>
>
>> Are you planning to use log tables for speed?  That's why Givaro is so
>> fast.  It doesn't seem like you mention anything like that on your
>> page.
>
> I wasn't currently planning on using log tables, for two reasons:
>
> *  One of my motivations was to work with rings (Z/NZ)[x]/(f) where N is not
> prime (though usually a prime power).  In this case there's more than one
> non-unit, and the group of units is not cyclic.  These are surmountable
> problems, but they make the implementation more complicated.
> * As the size of the ring increases, the memory footprint of log tables
> increases.  2^16 seems like a reasonable cutoff (especially since we get to
> use Givaro in this case).
>
> That being said, I would certainly be open to the possibility of adding a
> log-table implementation beyond 2^16.

Going up to 2^20 would only use a few MB, since the memory usage is
O(field cardinality). You just store logs of all elements, but you
still have to add them. It's a 1xn table, not an nxn table.
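The table scheme William describes can be sketched in plain Python (not Givaro's actual code; the prime 257 and generator 3 are illustrative choices): one table of discrete logs and one of powers turns each field multiplication into a single table lookup and add.

```python
# Log-table multiplication in a small prime field, as a minimal sketch
# of the Givaro approach: memory is O(field cardinality), two 1xn tables.
# Assumptions: p is prime and g generates the multiplicative group.

def build_log_tables(p, g):
    """Return (log, exp) tables with exp[i] = g^i mod p and log[exp[i]] = i."""
    exp = [1] * (p - 1)
    log = [0] * p          # log[0] is unused; 0 is handled separately
    for i in range(1, p - 1):
        exp[i] = exp[i - 1] * g % p
    for i, v in enumerate(exp):
        log[v] = i
    return log, exp

def mul(x, y, p, log, exp):
    """Multiply in GF(p): add the logs mod p-1, then look up the power."""
    if x == 0 or y == 0:
        return 0
    return exp[(log[x] + log[y]) % (p - 1)]

p, g = 257, 3              # 3 generates GF(257)^* (it is a non-residue mod 257)
log, exp = build_log_tables(p, g)
assert all(mul(x, y, p, log, exp) == x * y % p
           for x in range(p) for y in range(p))
```

Scaling this sketch to q = 2^20 means two tables of about a million entries each, which matches William's "a few MB" estimate; for non-prime q the exp table holds polynomial representations instead of integers.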

I wrote to the Givaro author just now and he wrote back to say that
Givaro already supports bigger fields, though he's not really
answering Kiran's question (which includes F_q, and emphasizes that
the speed must be optimal, since non-optimal speed is already in Sage
for big fields):

"Dear William,
ZpzDom<Std64> should work with larger primes.
(as Modular<double> from LinBox should work with primes up to roughly 2^26)
Best, Jean-Guillaume Dumas."

>> > If someone wants to work on this (I can't right now), feel free to get
>> > in
>> > touch with me.  My main suggestion would be to try to incorporate the
>> > model
>> > from sage.rings.polynomials.polynomial_template.
>>
>> When you copy code out to make your new code, please, please think
>> hard and benchmark each function.  The code in
>> sage.rings.polynomials.polynomial_template has some annoying
>> inefficiencies due to lack of very careful benchmarking and use of
>> Cython, which surprised me last summer...    (Of course, one can make
>> this sort of complaint about most code...)
>
> That surprises me as well.  Do you think the inefficiencies are inherent in
> the general approach, or would a better implementation address the problem?

The general approach is fine, I think. It's entirely an issue of
taking even more care in the implementation.

> Once I actually start working on it, I would be interested in talking to you
> about specific inefficiencies that you've noticed.  And I'll try to
> benchmark things as I go.

Cool.

> David

William Stein

unread,
Jan 31, 2011, 4:44:22 AM1/31/11
to sag...@googlegroups.com, Jean-Guillaume Dumas, Martin Albrecht
Jean-Guillaume's next email gives many further details (so a possibly
good answer to Kiran's question is that somebody should wrap more of
Givaro for Sage):

"
Also, note that
GFqDom<long long> is limited by the memory (of size 3p) and precomputations
(on most architectures should nowadays go around 2^22).

So ZpzDom<Std64> is the only one to go up to 32 bits primes (except of
course ZpzDom<Integer> which only has GMP as limits).

Note also that if extension fields more than prime fields are required
Extension<> will use the largest possible base field for polynomial
coefficients.

Best,

--
Jean-Guillaume Dumas."

Jeroen Demeyer

unread,
Jan 31, 2011, 6:38:56 AM1/31/11
to sag...@googlegroups.com

PARI now has specialized code to deal with finite fields. This code is
currently not used in Sage because it is a recent addition to PARI. So
wrapping PARI's new finite field code would be a clear improvement.
This would be quite a big project though.

Jeroen Demeyer

unread,
Jan 31, 2011, 6:58:32 AM1/31/11
to sag...@googlegroups.com
On 2011-01-31 06:19, Kiran Kedlaya wrote:
> Currently Sage uses Givaro to implement finite fields of order up to
> 2^16. Beyond that range, it goes to either NTL in characteristic 2, or
> PARI in odd characteristic.
>
> The performance breakdown at 2^16 is pretty noticeable. Shouldn't it
> be possible to get good performance at least up to 2^32? If so, how?

Can you give an example of a (simple) computation which demonstrates this
performance difference? Just to see whether using PARI's finite field
code would make a difference.

Jeroen.

William Stein

unread,
Jan 31, 2011, 11:28:31 AM1/31/11
to sag...@googlegroups.com

It's striking:

sage: k.<a> = GF(3^9); x = k.random_element(); y = k.random_element()
sage: timeit('x*y'); timeit('x+y')
625 loops, best of 3: 219 ns per loop
625 loops, best of 3: 221 ns per loop
sage: k.<a> = GF(3^11); x = k.random_element(); y = k.random_element()
sage: timeit('x*y'); timeit('x+y')
625 loops, best of 3: 14 µs per loop
625 loops, best of 3: 13.5 µs per loop
sage: 14/.219
63.9269406392694
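The jump lands exactly at the Givaro cutoff: 3^10 = 59049 is the last power of 3 below 2^16 = 65536, so GF(3^11) = GF(177147) falls through to the slower generic path. A quick sanity check:

```python
# Verify that the Givaro cutoff (2^16) falls strictly between 3^10 and
# 3^11, which is why the timings above jump by two orders of magnitude.
cutoff = 2**16
assert 3**10 < cutoff < 3**11
print(3**10, cutoff, 3**11)   # 59049 65536 177147
```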

Jeroen Demeyer

unread,
Jan 31, 2011, 4:17:36 PM1/31/11
to sag...@googlegroups.com
Very roughly speaking, I would say that the new PARI code would improve
on the current code by a factor 10, but that would still be a factor 10
slower than Givaro:

# Convert a Sage finite field element to a PARI finite field element
sage: def pari_ffelt(x):
....:     parix = pari(x)
....:     return parix.lift().subst(parix.variable(), "ffgen((%s).mod)" % parix)

# Sage (= old PARI) vs. new PARI
sage: k.<a> = GF(3^11); x = k.random_element(); y = k.random_element()
sage: px = pari_ffelt(x); py = pari_ffelt(y)
sage: timeit('x*y'); timeit('x+y'); timeit('x^1000')
625 loops, best of 3: 23 µs per loop
625 loops, best of 3: 17 µs per loop
625 loops, best of 3: 314 µs per loop
sage: timeit('px*py'); timeit('px+py'); timeit('px^1000')
625 loops, best of 3: 3.16 µs per loop
625 loops, best of 3: 1.06 µs per loop
625 loops, best of 3: 33 µs per loop

# Givaro vs. new PARI
sage: k.<a> = GF(3^10); x = k.random_element(); y = k.random_element()
sage: px = pari_ffelt(x); py = pari_ffelt(y)
sage: timeit('x*y'); timeit('x+y'); timeit('x^1000')
625 loops, best of 3: 217 ns per loop
625 loops, best of 3: 221 ns per loop
625 loops, best of 3: 1.41 µs per loop
sage: timeit('px*py'); timeit('px+py'); timeit('px^1000')
625 loops, best of 3: 3.14 µs per loop
625 loops, best of 3: 1.03 µs per loop
625 loops, best of 3: 30.2 µs per loop

Jeroen.

Maarten Derickx

unread,
Aug 27, 2012, 6:07:48 PM8/27/12
to sag...@googlegroups.com
Has anyone done any work on this in the meantime?
I just tried Jeroen's timings in the current Sage and noticed that there has been no significant change in the last 1.5 years. The only ticket I could find is http://trac.sagemath.org/sage_trac/ticket/12142, on which nothing has happened, so I guess the faster PARI stuff has not been done yet (or does someone already have some private hacks they are working on?). I could not find any tickets related to David Roe's suggestions.

Jeroen Demeyer

unread,
Aug 28, 2012, 4:08:50 AM8/28/12
to sag...@googlegroups.com
On 2012-08-28 00:07, Maarten Derickx wrote:
> Has anyone done any work on this in the meantime?
> I just tried Jeroen's timings in the current Sage and noticed that there
> has been no significant change in the last 1.5 years. The only ticket I
> could find is http://trac.sagemath.org/sage_trac/ticket/12142, on which
> nothing has happened, so I guess the faster PARI stuff has not been done
> yet (or does someone already have some private hacks they are working on?).
As far as I know, it has not been done yet. I also think it's a pretty
big project. One immediate problem is that these PARI finite fields
break the assumption that pari(str(x)) == x (if x is a PARI FFELT object).

Peter Bruin

unread,
Jun 24, 2013, 4:12:26 PM6/24/13
to sag...@googlegroups.com
Hello,

This week I decided to work on an interface to PARI's FFELT type.  I hope to upload a patch soon to http://trac.sagemath.org/sage_trac/ticket/12142.  Pickling turned out to be easy, by using the same infrastructure as for the other finite field types.

Peter


On Tuesday, 28 August 2012 10:08:56 UTC+2, Jeroen Demeyer wrote: