devising a speed comparison test between different types

Pierre

Dec 3, 2016, 6:00:58 PM12/3/16
to sage-devel
Hi,

I hope that this is appropriate for sage-devel, as it is technical.

As some colleagues who are new to SageMath were asking me what I knew, I realized that I was unable to comment very sensibly on the various types of integers and real numbers handled by Sage (no doubt because in my own research, ZZ and QQ are basically fine, as far as numbers go). This question is bound to come up again, and I'm curious, so I'd like to find out.

We are talking about very basic advice to give to a beginner. I knew enough to say that, well, C ints (so I guess numpy.ints) will be fast but limited in size, while elements of ZZ can be as large as your memory allows, etc. (with a similar, but different, discussion for RDF (or numpy.float) versus RR). However, I am unable to write a piece of code that clearly shows the difference in speed!

I tried naive things like setting x and y to be integers of a certain type, and then

sage: %timeit x^y

for example, but I always get "The slowest run took 59.81 times longer than the fastest. This could mean that an intermediate result is being cached."

This makes sense, but I'm not sure what else to try. Individual "%time x^y" statements seem to show no difference between ZZ and numpy.int, for example, which puzzles me (overhead?). I hit exactly the same issues when defining the factorial via

fac = lambda n: 1 if n == 0 else n*fac(n-1)

and trying to time fac(500): about the same with ZZ and numpy.int.

I thought about multiplying large matrices, but I'm afraid that completely different algorithms/libraries would be used depending on the parent ring (say, for numpy.int it would be numpy's matrix multiplication, obviously very fast), and that's no good: I would just like to emphasize the difference in speed for basic operations like + or *.

Here I'm not talking about Cython: sure, I tried 'int' and 'double' in Cython code, and it's blazingly fast, but that's not the question here.

So here is my question: does anybody know of a basic test/piece of code that would illustrate the difference in speed between various types of integers and/or floats?

thanks!

Pierre

Ralf Stephan

Dec 4, 2016, 1:53:01 AM12/4/16
to sage-devel
On Sunday, December 4, 2016 at 12:00:58 AM UTC+1, Pierre wrote:
> I tried naive things like setting x and y to be integers of a certain type, and then
>
> sage: %timeit x^y
>
> for example, but I always get "The slowest run took 59.81 times longer than the fastest. This could mean that an intermediate result is being cached."

Yes, only the first computation is relevant here, so instead of %timeit I would use %time.
 
> This makes sense, but I'm not sure what else to try. Individual "%time x^y" statements seem to show no difference between ZZ and numpy.int, for example, which puzzles me (overhead?). I hit exactly the same issues when defining the factorial via

Both ZZ and numpy use libgmp internally, so if you subtract other factors the times should be similar.
 
> fac = lambda n: 1 if n == 0 else n*fac(n-1)

Note, however, that most of the time for fac is spent in the Python loop.
 
sage: %time _=fac(500)
CPU times: user 397 µs, sys: 0 ns, total: 397 µs
Wall time: 344 µs
sage: %time _=factorial(500)
CPU times: user 14 µs, sys: 6 µs, total: 20 µs
Wall time: 21.9 µs

Also note that the time for converting the internal result to a string for output can be significant as well.
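
For instance, one can separate the two costs like this (an untested sketch; the numbers will vary):

sage: n = factorial(100000)
sage: %time _ = n*n        # the arithmetic alone
sage: %time _ = str(n*n)   # arithmetic plus conversion to a decimal string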

> So here is my question: does anybody know of a basic test/piece of code that would illustrate the difference in speed between various types of integers and/or floats?

As shown, the differences are negligible once you allow these other factors to dominate.

Simon King

Dec 4, 2016, 4:54:11 AM12/4/16
to sage-...@googlegroups.com
Hi Pierre,

On 2016-12-03, Pierre <pierre....@gmail.com> wrote:
> We are talking about very basic advice to give to a beginner, and so, I
> knew enough to say that well, C ints (so I guess numpy.int's) will be fast,
> but limited in size, and elements of ZZ can be as large as your memory
> allows etc (and a similar, but different, discussion with RDF (or
> numpy.float) and RR etc). However, I am unable to write a piece of code
> that would clearly show the difference in speed!

I'd put the test loop into a Cythonized function (in order to reduce
the overhead of Python loops):

sage: cython("""
....: def testfunc(a,b,o):
....: cdef int i
....: for i in range(1000000):
....: x = o(a,b)
....: """)
sage: %timeit testfunc(1234123,1234123, operator.add)
10 loops, best of 3: 58.7 ms per loop
sage: %timeit testfunc(1234123,1234123, operator.mul)
10 loops, best of 3: 60.7 ms per loop
sage: %timeit testfunc(int(1234123),int(1234123), operator.add)
10 loops, best of 3: 49.5 ms per loop
sage: %timeit testfunc(int(1234123),int(1234123), operator.mul)
10 loops, best of 3: 48.9 ms per loop
sage: %timeit testfunc(RR(1234123),RR(1234123), operator.add)
1 loop, best of 3: 233 ms per loop
sage: %timeit testfunc(RR(1234123),RR(1234123), operator.mul)
1 loop, best of 3: 244 ms per loop
sage: %timeit testfunc(QQ(1234123),QQ(1234123), operator.add)
1 loop, best of 3: 430 ms per loop
sage: %timeit testfunc(QQ(1234123),QQ(1234123), operator.mul)
1 loop, best of 3: 686 ms per loop

and for exponentiation I'd use a shorter loop:

sage: cython("""
....: def testfunc(a,b,o):
....: cdef int i
....: for i in range(100):
....: x = o(a,b)
....: """)
sage: %time testfunc(1234123,1234123, operator.pow)
CPU times: user 18.2 s, sys: 0 ns, total: 18.2 s
Wall time: 18.2 s

and, very surprisingly (to me at least), exponentiation takes a lot longer when using int(1234123)**int(1234123). Shouldn't a Python int simply give a wrong result, but at least be a lot faster than an arbitrary-precision integer??

Anyway, with shorter numbers, I get

sage: %timeit testfunc(1234,123, operator.pow)
10000 loops, best of 3: 44.8 µs per loop
sage: %timeit testfunc(int(1234),int(123), operator.pow)
1000 loops, best of 3: 212 µs per loop
sage: int(1234)**int(123)==1234**123
True
sage: type(int(1234)**int(123))
<type 'long'>

Cheers,
Simon


Simon King

Dec 4, 2016, 5:00:12 AM12/4/16
to sage-...@googlegroups.com
PS:

On 2016-12-03, Pierre <pierre....@gmail.com> wrote:
> I thought about multiplying large matrices, but I'm afraid that completely
> different algorithms/libraries will be used depending on the parent ring

Yes, this would measure the efficiency of matrix arithmetic but would
give no real clue about the efficiency of ring arithmetic.

> Here i'm not talking about Cython: sure, I tried 'int' and 'double' in
> Cython code, it's blazingly fast, but that's not the question here.

Note that in my test functions I only used Cython to reduce the overhead
of the loop. I didn't cdef the things I'm doing computations with.

Best regards,
Simon

Pierre

Dec 4, 2016, 8:47:12 AM12/4/16
to sage-devel
Hi all,

Thanks for your answers. There are a few things that I now understand, but I'm still confused. One crucial mistake I made was, as I now see thanks to Ralf's message, to believe that numpy.int meant "C int": no, it means Python int! For C ints, use numpy.int32 (or numpy.int64). In fact:

sage: numpy.int(2) ^ numpy.int(32)
4294967296
sage: numpy.int(2) ^ numpy.int(64)
18446744073709551616L   # notice the L, typical of Python's long ints
sage: numpy.int32(2) ^ numpy.int32(32)
/Applications/SageMath/src/bin/sage-ipython:1: RuntimeWarning: overflow encountered in int_scalars
#!/usr/bin/env python
0


All as expected. And to answer Simon's question: if I'm correct, a Python int is actually arbitrary precision (and it prints with an L when above 2^64), but it's slower than ZZ. It's not a C int.
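
A quick check of this, assuming a 64-bit Python 2 build (sketched, not copied from a session):

sage: type(int(2)^int(62)), type(int(2)^int(100))
(<type 'int'>, <type 'long'>)
sage: int(2)^int(100) == ZZ(2)^100   # the Python long is exact, too
True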

However, I've been playing with Simon's code for testing (nice idea!), and I'm still puzzled:

sage: %time testfunc(ZZ(1234123), ZZ(1234123), operator.add)
CPU times: user 69.3 ms, sys: 787 µs, total: 70.1 ms
Wall time: 70.8 ms

sage: %time testfunc(int(1234123), int(1234123), operator.add)
CPU times: user 42.3 ms, sys: 631 µs, total: 42.9 ms
Wall time: 43.5 ms

sage: %time testfunc(numpy.int(1234123), numpy.int(1234123), operator.add)
CPU times: user 44 ms, sys: 538 µs, total: 44.5 ms
Wall time: 45 ms

sage: %time testfunc(numpy.int32(1234123), numpy.int32(1234123), operator.add)
CPU times: user 74.2 ms, sys: 822 µs, total: 75 ms
Wall time: 77.7 ms

So ZZ is slightly slower than Python int (= numpy.int), which is already weird, since the arbitrary-precision integers of ZZ are meant to be faster than those of Python.

But even more bizarre is the fact that numpy.int32 gives the slowest performance! I'm guessing that operator.add incurs overhead when asking numpy to add its ints.

Question: is there a way of using very fast C integers within Sage, without using Cython? (I exclude Cython because the answer is meant to be understandable by beginners.)

If I figure this out, I guess the analogous question with floats will be answered, too.

Thanks !

Pierre


Pierre

Dec 4, 2016, 9:11:16 AM12/4/16
to sage-devel
PS: actually, I was a little wrong about Python ints being arbitrary precision. In fact there is no such thing as just a Python int: there are "int" and "long", and computations with ints can give you an answer which is a "long". Presumably this mostly means that everything is as slow as if everything were always a "long". Well, I don't really know, and I wish I did.
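
Concretely, I believe the cutoff is sys.maxint; a quick untested check, on a 64-bit machine:

sage: import sys
sage: sys.maxint
9223372036854775807
sage: type(sys.maxint + int(1))
<type 'long'>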

Pierre Guillot

Dec 4, 2016, 11:58:45 AM12/4/16
to sage-...@googlegroups.com
PPS: come to think of it, my last PS explains it all. So in summary:

-- ZZ = always arbitrary precision.
-- int = a C int while the numbers stay below sys.maxint (roughly 2^63), so there is an increase
in speed -- but not nearly as much of an increase as I expected, which
is why I was confused at first: I wasn't seeing this for what
it was. Also, computations silently switch to "long" when the numbers get
big, so you never see an overflow, and you don't always see the
increase in speed.
-- numpy.int32 or numpy.int64: like "int" initially, but works mod 2^32 or
2^64, and gives an overflow warning when that happens. No increase in
speed, for general reasons which I will just call "overhead" for lack
of a better understanding. (Still good for numpy functions,
obviously.)
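
A quick sanity check of all three behaviors (an untested sketch):

sage: ZZ(2)^100                     # arbitrary precision, always exact
1267650600228229401496703205376
sage: int(2)^int(100)               # silently promoted to "long", still exact
1267650600228229401496703205376L
sage: import numpy
sage: numpy.int64(2)^numpy.int64(100)   # wraps mod 2^64 (after a RuntimeWarning)
0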



--
Pierre

Volker Braun

Dec 4, 2016, 12:42:12 PM12/4/16
to sage-devel
On Sunday, December 4, 2016 at 5:58:45 PM UTC+1, Pierre wrote:
> -- numpy.int32 or numpy.int64: like "int" initially, but works mod 2^32 or
> 2^64, and gives an overflow warning when that happens. No increase in
> speed, for general reasons which I will just call "overhead" for lack
> of a better understanding. (Still good for numpy functions,
> obviously.)

Signed overflow is undefined behavior in C/C++, and not necessarily mod 2^x. I'm pretty sure that numpy inherits this from C/C++. You must use unsigned integer types like numpy.uint64 if you rely on overflow behavior.
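
A minimal illustration of the guaranteed unsigned wrap-around (untested; numpy may still emit a RuntimeWarning for scalar overflow):

sage: import numpy
sage: numpy.uint64(2)^numpy.uint64(64)    # 2^64 wraps to 0, as C guarantees for unsigned
0
sage: numpy.uint64(0) - numpy.uint64(1)   # wraps to 2^64 - 1
18446744073709551615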

The overhead is, of course, the wrapping in a Python object. If that is limiting your computation, then you'll have to work in C/Cython with the internal integer (machine int or gmp/mpir).
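
For completeness, the unwrapped version would look something like this (my own sketch, not code from this thread; testfunc_c is a name I made up). With cdef'd machine integers the loop body compiles down to a plain C addition:

sage: cython("""
....: def testfunc_c(long a, long b):
....:     cdef int i
....:     cdef long x
....:     for i in range(1000000):
....:         x = a + b
....: """)
sage: %timeit testfunc_c(1234123, 1234123)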

William Stein

Dec 4, 2016, 1:18:56 PM12/4/16
to sage-devel

On Sat, Dec 3, 2016 at 10:53 PM, Ralf Stephan wrote:

> Both ZZ and numpy use libgmp internally

No: ZZ uses libgmp (actually MPIR, which is a fork of GMP), and numpy here uses Python's ints/longs. Python's int/long type is arbitrary precision, despite the confusing naming, but it only implements relatively naive algorithms (Karatsuba, etc.), not the asymptotically fast Fourier-transform methods that GMP (and MPIR) implement and highly optimize. So Sage's ZZ will start beating the pants off of Python (and numpy) when the numbers get large.

Example – try this with various values of “digits” and you’ll see ZZ being arbitrarily faster than Python’s longs:

def bench(digits):
    print "digits =", digits
    global n, m, n1, m1, n2, m2  # timeit uses global namespace
    n = ZZ.random_element(10^digits)
    m = ZZ.random_element(10^digits)
    %timeit n*m
    n1 = int(n)
    m1 = int(m)
    %timeit n1*m1
    import numpy
    n2 = numpy.int(n)
    m2 = numpy.int(m)
    %timeit n2*m2
bench(10)
bench(100)
bench(1000)
bench(10000)
bench(100000)
bench(1000000)
digits = 10
625 loops, best of 3: 51.1 ns per loop
625 loops, best of 3: 126 ns per loop
625 loops, best of 3: 107 ns per loop
digits = 100
625 loops, best of 3: 445 ns per loop
625 loops, best of 3: 315 ns per loop
625 loops, best of 3: 254 ns per loop
digits = 1000
625 loops, best of 3: 2.16 µs per loop
625 loops, best of 3: 13.8 µs per loop
625 loops, best of 3: 13.5 µs per loop
digits = 10000
625 loops, best of 3: 64.9 µs per loop
625 loops, best of 3: 588 µs per loop
625 loops, best of 3: 618 µs per loop
digits = 100000
625 loops, best of 3: 1.47 ms per loop
25 loops, best of 3: 20.9 ms per loop
25 loops, best of 3: 21.9 ms per loop
digits = 1000000
25 loops, best of 3: 18.7 ms per loop
5 loops, best of 3: 870 ms per loop
5 loops, best of 3: 883 ms per loop
# At 1 million digits (on this test machine): ZZ is 46.5 times faster than Python ints:
870 / 18.7
46.5240641711230