Actually, for a while now Cython has translated "for i in range(...)"
into fast C code. In fact, I believe it's now the recommended syntax,
instead of the older "for 0 <= i < ..." form.
I believe that Cython 0.13 (just released, and making its way into Sage)
does even more to optimize loops intelligently.
> Note that you can get an annotated version of your code. In the
> notebook, this is available via some link; I don't remember what you
> need to do on the command line (try to search the manual).
I usually do something like:
sage -cython -a my-file.pyx
which creates the .html file in the current directory.
Thanks,
Jason
No, you still need to do "cdef int i", I believe. So:

cdef int i
for i in range(10):
    ...

should be fast.
Jason
This is what I would recommend. You can do something like
from sage.rings.real_mpfr cimport *

def square(RealNumber x):
    cdef RealNumber result = x._new()
    mpfr_mul(result.value, x.value, x.value, GMP_RNDN)
    return result
If you really need speed, you can declare, allocate, and free mpfr_t
variables themselves (declared via "cdef mpfr_t var") rather than
using the RealNumber Python objects as wrappers.
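For illustration, a sketch of that raw-mpfr_t approach might look like the
following (Cython; it needs Sage's build environment to compile, the
function name and loop are made up here, and the mpfr_* names and GMP_RNDN
are assumed to be cimported the same way real_mpfr.pyx does it):

```cython
from sage.rings.real_mpfr cimport RealNumber

def square_sum(RealNumber x, int n):
    """Add up x^2 n times, keeping the inner loop on raw mpfr_t values."""
    cdef RealNumber result = x._new()
    cdef mpfr_t acc, tmp
    cdef int i
    mpfr_init2(acc, mpfr_get_prec(x.value))   # allocate at x's precision
    mpfr_init2(tmp, mpfr_get_prec(x.value))
    mpfr_set_ui(acc, 0, GMP_RNDN)
    for i in range(n):
        mpfr_mul(tmp, x.value, x.value, GMP_RNDN)  # tmp = x*x
        mpfr_add(acc, acc, tmp, GMP_RNDN)          # acc += tmp
    mpfr_set(result.value, acc, GMP_RNDN)     # copy into the wrapper
    mpfr_clear(acc)                           # don't leak the raw values
    mpfr_clear(tmp)
    return result
```

The point is that no Python objects are created inside the loop; the
RealNumber wrapper only appears at the end.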
Just out of curiosity, what multi-precision root finding tools are you
using? Also, I don't think numpy supports high-precision floating
point arrays (except as arrays of generic objects).
- Robert
One of the things you might do is use RR (which is based on mpfr), and
when you need to do a sequence of calculations really fast, reach into
the RR number, do stuff using mpfr, and then create a new RR number with
the new multiprecision result. The RR _add method shows how to do this,
for example:
cdef RealNumber x
x = self._new()
mpfr_add(x.value, self.value,
         (<RealNumber>other).value, (<RealField_class>self._parent).rnd)
return x
Notice how it creates a new real number x, manipulates its mpfr value
directly with mpfr_add, and then returns the resulting RR number.
You can find lots more examples in devel/sage/sage/rings/real_mpfr.pyx
Thanks,
Jason
With the exception of lots of special functions or very large
precisions, this probably won't be a speed advantage over Sage's
native RealNumbers, given that they both have Python overhead. (As the
precision grows, the arithmetic itself will be comparable, as they are
both using MPIR/GMP under the hood.)
> It supports the features you need, is quite efficient, and also has
> multi precision vectors and matrices.
> And after the problematic steps, getting back to numpy arrays isn't
> really a problem.
>
> If you really want to squeeze every tiny bit of efficiency out of
> your code, then you have to stick to Cython and mpfr.
> But I'm not sure if this is really necessary.
>
> greez
> maldun
>
What do you mean by symbolic? Are you actually using maxima and the
various calculus modules?
> - fully optimize in Cython by turning to doubles everywhere, although
> it gets very fast here happens what I was already afraid of: the
> algorithm goes belly up after 15 steps :( (I would need a lot more
> steps). In step 14 values like 2.51885860e-34 appear and in
> combination with the numerical noise gathered this turns out to
> produce very wrong values at the next step.
>
> @Robert: I'm trying to implement an algorithm that computes a sequence
> of functions v_k according to the rule v_k(x) = \max \{ K-e^x, \int
> v_{k-1}(y) p(x+y) dy \}, v_0(x)=K-e^x, where p is a prob. density. The
> background is an optimal stopping problem in stochastics. So at each
> step I essentially first compute the integral and then determine where
> the integral and x -> K-e^x intersect, this just by the basic
> find_root function in Sage.
>
> It's not difficult (however very annoying...) to write down a formula
> for the integral (given the specific density p I'm using) and work out
> formulae for all the coefficients that appear in this formula for the
> integral in terms of those appearing in v_{k-1}. The number of
> coefficients (and hence computations) involved in the formula for v_k
> increases very quickly with k, that explains why the algorithm gets
> slow in 'symbolic mode'. However 'double mode' is not good enough as
> mentioned above.
It sounds like you're trying two extremes -- how about using your
"double mode" algorithm, but instead using elements of RealField(N)
where N > 53. This should be much faster than "symbolic" (if I
understand you right, calling find_root and integrate) but have higher
precision than using raw doubles. This may be enough, without getting
your hands on MPFR directly yourself.
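The kind of noise described above can be shown in two lines; here
Python's stdlib decimal stands in for Sage's RealField (the point is only
the working precision, not the library):

```python
from decimal import Decimal, getcontext

# With 53-bit doubles the small term vanishes completely:
x = (1.0 + 1e-20) - 1.0
print(x)          # 0.0 -- the 1e-20 is below double precision

# At 50 significant digits (roughly RealField(166) in Sage terms,
# an assumed conversion) the term survives:
getcontext().prec = 50
y = (Decimal(1) + Decimal("1e-20")) - Decimal(1)
print(y)          # 1E-20
```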
You might also consider looking at how the Dickman rho function is
implemented http://hg.sagemath.org/sage-main/file/b5dab6864f35/sage/functions/transcendental.py#l464
which sounds similar in spirit.
> I will try your suggestion for using mpfr, thanks Robert and Greg!
> (however it looks a bit scary to me ;)). I need to store the huge
> amounts of coefficients somewhere, I guess numpy is my best guess even
> though I have to declare the dtype as object then, no?
What do you mean by huge? Would a list suffice? Or if it's a
polynomial, what about a polynomial in RR? For high precision, the
per-element overhead is constant, so I don't see an advantage to using
NumPy here just as a storage device if the dtype is object.
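On storage: a tiny sketch of what "just a list" would look like, with
stdlib decimal coefficients (the shapes and K here are made up, not taken
from the thread):

```python
from decimal import Decimal, getcontext

getcontext().prec = 60
K = 15  # number of steps; purely illustrative

# coeffs[k][part][term][which] holds the (a, b, c) of a term a*x^b*exp(c*x).
# A plain nested list carries arbitrary-precision objects just as well as
# an object-dtype NumPy array when no vectorized math is done on them.
coeffs = [[[[Decimal(0)] * 3 for _ in range(k + 1)]
           for _ in range(k + 1)] for k in range(K)]

coeffs[3][1][0][2] = Decimal("2.5")   # set the c of one term
print(coeffs[3][1][0][2])             # 2.5
```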
- Robert
I believe the standard find_root uses scipy, which is limited to double
precision.
Jason
Shouldn't be too hard to follow up with Newton's method to increase
the precision if desired.
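A sketch of that two-stage approach, with Python's stdlib decimal standing
in for MPFR/RealField (the target function x^2 - 2, the starting value,
and the helper name are all illustrative):

```python
from decimal import Decimal, getcontext

def refine_root(x0, digits=50, steps=4):
    """Polish a double-precision root of f(x) = x^2 - 2 with
    Newton's method at higher working precision."""
    getcontext().prec = digits + 10          # keep some guard digits
    x = Decimal(x0)
    for _ in range(steps):
        # Newton step for f(x) = x^2 - 2: x <- x - f(x)/f'(x)
        x = (x + Decimal(2) / x) / 2
    return x

rough = 2 ** 0.5          # e.g. what a double-precision find_root returns
fine = refine_root(rough)
print(abs(fine * fine - 2))   # residual far below double precision
```

Each Newton step roughly doubles the number of correct digits, so a
handful of steps from a double-precision seed is enough for 50 digits.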
- Robert
Just out of curiosity, how much did it help?
> If I could further speed it up by a few
> factors that would be great. I will further try if I can figure out
> how to use Cython to speed up parts of the algorithm where a lot of
> looping and computing is done.
That should be feasible -- you probably don't need to change the whole
thing; profile to see which parts are the bottleneck. Excessive coercion
may also impact things.
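A minimal profiling sketch (pure Python, with made-up stand-ins for the
slow and fast parts of the algorithm):

```python
import cProfile
import io
import pstats

def slow_part():
    # Stand-in for the loop-heavy numerical step.
    return sum(i * i for i in range(200000))

def fast_part():
    return 42

def algorithm():
    slow_part()
    fast_part()

pr = cProfile.Profile()
pr.enable()
algorithm()
pr.disable()

buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(8)
report = buf.getvalue()
print(report)   # slow_part dominates the cumulative-time column
```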
>> You might also consider looking at how the Dickman rho function is
>> implemented http://hg.sagemath.org/sage-main/file/b5dab6864f35/sage/functions/tra...
>> which sounds similar in spirit.
>>
>> > I will try your suggestion for using mpfr, thanks Robert and Greg!
>> > (however it looks a bit scary to me ;)). I need to store the huge
>> > amounts of coefficients somewhere, I guess numpy is my best guess even
>> > though I have to declare the dtype as object then, no?
>>
>> What do you mean by huge? Would a list suffice? Or if it's a
>> polynomial, what about a polynomial in RR? For high precision, the
>> per-element overhead is constant, so I don't see an advantage to using
>> NumPy here just as a storage device if the dtype is object.
>
> The function v_k is not a polynomial but a piecewise function
> consisting of linear combinations of terms a*x^b*exp(c*x) (*) in each
> part of its domain. 'Huge' means that the number of coefficients (i.e.
> all the a, b and c's in (*)) to be computed in the k-th step of the
> algorithm roughly grows like k^2. These coefficients are currently
> organized in 4-dimensional Numpy-arrays: the k runs along one axis and
> there are three other 'counters' (one to keep track of the part of the
> domain etc.). I would prefer not to give up this way of representing
> the coefficients, as my notes use this form as well. Would you say I
> would be better off using 4-d Python lists instead?
No, for multi-dimensional stuff, NumPy is better, even if you're using
the object dtype.
- Robert
Hello,
There is no way to work around this:
-> RDF (double-precision numbers) uses the processor's arithmetic.
-> RealField(x) uses the MPFR library.
In the first case, Sage (Python, Cython) passes the operations to the
processor, which performs each operation in 1 (or maybe 2) clock cycles.
In the second case, MPFR calls a (maybe small) routine to perform the
task. Even if MPFR is perfectly optimized, this is much more costly.
Use MPFR when you need high precision, whatever it costs, or a special
rounding mode.
Even worse (or better): the processor also implements pipelining,
parallelism, and so on.
Something I find very spectacular:
def prod(a, b):
    a * b

c = random_matrix(RDF, 1000)
d = random_matrix(RDF, 1000)
prod(c, d)
Try it; count the operations (10^9 multiplications). On my computer
(3 GHz), prod(c,d) takes 0.18 seconds: that is to say 5 gigaflops (or 10
if you count the additions as well): more than 1 operation per clock
cycle. This is because we use here: 1) the processor's FPU, 2) the ATLAS
BLAS, which lets the processor run at full speed (and minimizes cache
misses).
Now, change RDF to RR. It takes 421 seconds to perform prod(c,d) on my
computer: that is to say *2000* times more than with RDF numbers,
because 1) we do not use the processor's FPU, 2) there is no
optimization of the matrix product (and a lot of cache misses as well!).
For "number crunching", use the processor's FPU (RDF) directly!
Yours
t.d.
> Many thanks, Kees
>
> Alright, many thanks for the clear and extensive answer Thierry.
> Bottomline is thus that I'll have to live with it.
>
> On a sidenote, I must admit it surprises me. I'm only an amateur
> programmer, let alone that I know anything about the subtleties of how
> cpus interact exactly with code, but you would somehow expect that it
> should be possible when you add two arbitrary precision numbers say,
> you break them up in smaller chunks (like 2115+3135 =
> (21+31)*100+(15+35)) so that each of the smaller chunks are
> essentially doubles to be added and can hence be performed at optimal
> speed. You would get the overhead (breaking the numbers up in chunks
> and putting them together again) plus the cycles necessary to make the
> additions, but this would be a lot less than a factor 100 I'd have
> guessed.
>
> Well, anyhow, of course the above is layman talk and I will be missing
> essential points, but that's probably why I'd have expected arbitrary
> precision speed to be a lot closer to double speed.
It's more than that. You can see at
http://www.mpfr.org/mpfr-current/mpfr.html#Introduction-to-MPFR that
MPFR guarantees a lot of its behavior (modulo bugs, of course) in a
cross-platform manner -- correctly rounded results, in particular.
To see the sort of thing you mention above, you might look at the
quad-double library (this isn't arbitrary precision, though).
Jason
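For what it's worth, the chunking idea in the quoted message is exactly
how bignum libraries work internally; here is a toy sketch in Python
(base-100 "limbs", function names made up here; GMP/MPIR do the same
thing with base-2^64 limbs and carry propagation in assembly):

```python
def to_limbs(n, base=100):
    """Split a non-negative integer into little-endian base-100 chunks."""
    limbs = []
    while n:
        limbs.append(n % base)
        n //= base
    return limbs or [0]

def add_limbs(x, y, base=100):
    """Add two integers limb by limb, propagating carries."""
    a, b = to_limbs(x, base), to_limbs(y, base)
    out, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        out.append(s % base)
        carry = s // base
    if carry:
        out.append(carry)
    # Reassemble the limbs into an ordinary integer.
    return sum(d * base**i for i, d in enumerate(out))

print(add_limbs(2115, 3135))  # 5250, matching the 2115+3135 example above
```

The per-limb work is cheap; the factor-of-hundreds slowdown for small
precisions comes from everything around it (allocation, dispatch, the
loss of pipelining) rather than the chunked additions themselves.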