speeding up mpmath


Ondrej Certik

May 12, 2008, 7:31:46 AM5/12/08
to mpm...@googlegroups.com
Hi,

mpmath contains some really neat numerical algorithms, be it ODE
solvers or other routines. However, sometimes one only needs a
couple of digits of precision, but a correct result.

What are your views on speeding mpmath up for these cases? I think
using Python floats would be pretty fast. Is it worthwhile?

Another question -- compared to mpfr, what is the biggest bottleneck
in mpmath? Would writing lib.py in C (possibly calling GMP) make it
just as fast? Or are there other problems?

Ondrej

P.S. Shall we use explicit imports in libmpc.py, calculus.py,
intervals.py and mptypes.py?

Fredrik Johansson

May 13, 2008, 5:31:12 PM5/13/08
to mpm...@googlegroups.com
On Mon, May 12, 2008 at 1:31 PM, Ondrej Certik <ond...@certik.cz> wrote:
>
> Hi,
>
> mpmath contains some really neat numerical algorithms, be it ODE
> solvers or other routines. However, sometimes one only needs a
> couple of digits of precision, but a correct result.
>
> What are your views on speeding mpmath up for these cases? I think
> using Python floats would be pretty fast. Is it worthwhile?

You can send a float-valued function to e.g. quadts, but the gain in
speed is usually not large, due to the nature of the algorithms and
the fact that mpfs are used for the algorithms' internal values.
Changing the algorithms to optionally use floats internally would add
significant complexity, and besides, SciPy already provides functions
that are faster and better tested for most low-precision numerical
tasks.

However, as part of my GSoC project, I'm going to look into ways to
use machine-precision arithmetic whenever it provides sufficient
accuracy. So expression evaluation in SymPy should typically be a bit
faster than using mpmath directly.

> Another question -- compared to mpfr, what is the biggest bottleneck
> in mpmath? Would writing lib.py in C (possibly calling GMP) make it
> just as fast? Or are there other problems?

To evaluate, say, a+b, about 50% of the time is spent on checking
types and creating a new instance, and the other 50% on the actual
computation in lib. So just rewriting lib in C would give at best a
2x speedup.

Again, bypassing type checks and the like should allow numerical
expression evaluation in SymPy to become faster.
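The 50/50 split can be illustrated with a toy wrapper class (purely
illustrative: `Wrapped` is a made-up class, not mpmath's actual code).
The type dispatch and object creation around the raw addition account
for overhead that rewriting only the raw layer in C cannot remove.

```python
import timeit

class Wrapped:
    """Toy numeric wrapper: type check + instance creation around a raw add."""
    __slots__ = ("val",)

    def __init__(self, val):
        self.val = val

    def __add__(self, other):
        if not isinstance(other, Wrapped):    # the type-checking half
            other = Wrapped(other)
        return Wrapped(self.val + other.val)  # new instance + raw computation

a, b = Wrapped(3), Wrapped(4)
x, y = 3, 4
t_wrapped = timeit.timeit(lambda: a + b, number=50000)
t_raw = timeit.timeit(lambda: x + y, number=50000)
# t_wrapped is typically several times t_raw: dispatch and object
# creation dominate, which bounds the overall speedup a C rewrite of
# only the arithmetic layer can deliver.
```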

> P.S., let's use explicit imports in libmpc.py, calculus.py,
> intervals.py and mptypes.py?

I don't personally see much point in doing this. At least in the case
of lib imports, the functions can be identified through naming
convention (lib mpf functions are prefixed with an f, which will
perhaps be changed to mpf_ in the future, and libmpc functions are
prefixed with mpc_).

Fredrik

Ondrej Certik

May 19, 2008, 5:47:59 AM5/19/08
to mpm...@googlegroups.com

I see, thanks for the explanation. So you think it is not possible to
get the same speed as gmpy just by rewriting some parts of mpmath in
C?

>
>> P.S., let's use explicit imports in libmpc.py, calculus.py,
>> intervals.py and mptypes.py?
>
> I don't personally see much point in doing this. At least in the case
> of lib imports, the functions can be identified through naming
> convention (lib mpf functions are prefixed with an f, which will
> perhaps be changed to mpf_ in the future, and libmpc functions are
> prefixed with mpc_).


from: http://docs.python.org/tut/node8.html#SECTION008410000000000000000

"
Note that in general the practice of importing * from a module or
package is frowned upon, since it often causes poorly readable code.
However, it is okay to use it to save typing in interactive sessions,
and certain modules are designed to export only names that follow
certain patterns.
"

At least in sympy, after discussion, we converted all imports to
explicit ones. I also think it's better, because then you can easily
find out where each symbol in a Python source file comes from. So I
suggest using what everyone else is using, but if you insist on
implicit imports, we can just make the change in sympy and/or discuss
it again.

Ondrej

Fredrik Johansson

May 19, 2008, 6:34:20 AM5/19/08
to mpm...@googlegroups.com
On Mon, May 19, 2008 at 11:47 AM, Ondrej Certik <ond...@certik.cz> wrote:
> I see, thanks for the explanation. So you think it is not possible to
> get the same speed as gmpy by just rewriting some parts of mpmath to
> C?

I think it is possible to get close by rewriting a significant portion
in C. In the end, though, you can't get around the fact that GMP ints
are at least 2x faster than Python ints.

> At least in sympy after discussion we converted all imports to
> explicit. I also think it's better
> because then you can easily find out where each symbol in the python
> source file comes from. So I suggest to use what everyone is using,
> but if you insist on implicit imports, we can just make the change in
> sympy and/or discuss it again.

This is a good convention generally, but not really necessary to
follow when there's a small number of well-isolated modules. The fact
that mpmath uses import * internally doesn't mean sympy has to. I
think any code in sympy that uses mpmath should import explicitly.

Fredrik

Ondrej Certik

May 19, 2008, 8:50:20 AM5/19/08
to mpm...@googlegroups.com
On Mon, May 19, 2008 at 12:34 PM, Fredrik Johansson
<fredrik....@gmail.com> wrote:
>
> On Mon, May 19, 2008 at 11:47 AM, Ondrej Certik <ond...@certik.cz> wrote:
>> I see, thanks for the explanation. So you think it is not possible to
>> get the same speed as gmpy by just rewriting some parts of mpmath to
>> C?
>
> I think it is possible to get close by rewriting a significant portion
> in C. In the end though, you can't get around the fact that GMP ints
> are at least 2x faster than Python ints.

Even if we used the same algorithms for ints as GMP, would it still
be 2x slower because of the type checking?

>
>> At least in sympy after discussion we converted all imports to
>> explicit. I also think it's better
>> because then you can easily find out where each symbol in the python
>> source file comes from. So I suggest to use what everyone is using,
>> but if you insist on implicit imports, we can just make the change in
>> sympy and/or discuss it again.
>
> This is a good convention generally, but not really necessary to
> follow when there's a small number of well isolated modules. The fact
> that mpmath uses import * internally doesn't mean sympy has to. I
> think any code in sympy that uses mpmath should import explicitly.

That's right.

Ondrej

casevh

May 20, 2008, 10:48:44 AM5/20/08
to mpmath
On May 19, 5:50 am, "Ondrej Certik" <ond...@certik.cz> wrote:
> On Mon, May 19, 2008 at 12:34 PM, Fredrik Johansson
>
> <fredrik.johans...@gmail.com> wrote:
>
> > On Mon, May 19, 2008 at 11:47 AM, Ondrej Certik <ond...@certik.cz> wrote:
> >> I see, thanks for the explanation. So you think it is not possible to
> >> get the same speed as gmpy by just rewriting some parts of mpmath to
> >> C?
>
> > I think it is possible to get close by rewriting a significant portion
> > in C. In the end though, you can't get around the fact that GMP ints
> > are at least 2x faster than Python ints.
>
> If we use the same algorithms for ints as GMP, it will still be 2x
> slower because of the type checking?
>
I've written some code that uses Python longs by default but will
automatically use gmpy if it exists. For operations using
approximately 3000-bit numbers, the gmpy version is about 10x faster.

I created my own constructor that was set to either "long" or
"gmpy.mpz". Coincidentally, I happened to choose "mp". To get
the best performance, I used that constructor everywhere and did
not rely on automatic type conversion. This does make the source
code a little more cluttered. ;-)
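The pattern described above might look something like this (a sketch;
the names `MPBASE` and `make_mantissa` are illustrative, and the
modern gmpy2 stands in for the 2008-era gmpy module):

```python
# Select the integer constructor once at import time, then route every
# value through it so mantissas never bounce between int and mpz.
try:
    from gmpy2 import mpz as MPBASE  # GMP-backed integers, if available
except ImportError:
    MPBASE = int  # pure-Python fallback (this was `long` in Python 2)

def make_mantissa(value):
    # coerce exactly once, at creation time, rather than relying on
    # automatic type conversion inside every arithmetic operation
    return MPBASE(value)
```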

If you are interested, I am willing to try to add gmpy support to
mpmath.

casevh

Ondrej Certik

May 20, 2008, 11:46:03 AM5/20/08
to mpm...@googlegroups.com

Of course Fredrik should say what he thinks, as he is the main
maintainer of mpmath, but for sympy I am very interested in that. I
want to use the best libraries if they are available; if they are not,
then we'll fall back to pure Python. It should also be maintainable,
but imho it could be made that way.

The reason I am asking these questions is because mpmath contains
really nice algorithms for a lot of things, so if it could be made a
little more general, calling gmp or Python floats in the backend while
still working, it'd be very nice.

Ondrej

Fredrik Johansson

May 20, 2008, 1:30:46 PM5/20/08
to mpm...@googlegroups.com
On Tue, May 20, 2008 at 5:46 PM, Ondrej Certik <ond...@certik.cz> wrote:
>
> On Tue, May 20, 2008 at 4:48 PM, casevh <cas...@gmail.com> wrote:
>> I've written some code that uses Python longs by default but will
>> automatically use gmpy if it exists. For operations using
>> approximately 3000-bit numbers, the gmpy version is about 10x faster.
>>
>> I created my own constructor that was set to either "long" or
>> "gmpy.mpz". Coincidentally, I happened to choose "mp". To get
>> the best performance, I used that constructor everywhere and did
>> not rely on automatic type conversion. This does make the source
>> code a little more cluttered. ;-)
>>
>> If you are interested, I am willing to try to add gmpy support to
>> mpmath.

I have written some such code as well. One problem for mpmath is that
it slows things down at low precision (though possibly not by much).

Another problem is that mpmath relies heavily on math.log to calculate
the size of an integer, and that doesn't work with gmpy integers. But
this problem can perhaps be worked around (and it could even make
things faster, by using gmpy's bit-size functions instead).
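The workaround can be sketched as follows (a sketch, not mpmath's
code; note that `int.bit_length()` did not yet exist in the Python of
2008, and gmpy's `numdigits(2)` plays the same role for mpz values):

```python
import math

def bitcount_exact(n):
    # exact bit size of an integer; this is what mpmath needs
    return abs(n).bit_length()

def bitcount_log(n):
    # the math.log-based approach being discussed: fast, but float
    # rounding can make it off by one for very large n, and it does
    # not accept gmpy integers directly
    return int(math.log(abs(n), 2)) + 1 if n else 0
```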

If you can implement this without introducing unreasonable complexity,
be my guest.

Fredrik

casevh

May 21, 2008, 12:41:55 AM5/21/08
to mpmath
>
> If you can implement this without introducing unreasonable complexity,
> be my guest.
>
> Fredrik

I will give it a try. I'll be the first to admit it may not be worth
the complexity.

casevh

Ondrej Certik

May 21, 2008, 5:22:32 AM5/21/08
to mpm...@googlegroups.com

Excellent, go ahead.

Ondrej

casevh

May 21, 2008, 7:27:37 PM5/21/08
to mpmath
> >> If you can implement this without introducing unreasonable complexity,
> >> be my guest.
>
> >> Fredrik
>
> > I will give it a try. I'll be the first to admit it may not be worth
> > the complexity.
>
> Excellent, go ahead.
>
> Ondrej

Some early results:

runtests.py passes all tests (except for the pickle module). The
changes were surprisingly small.

Unfortunately, runtests.py is about 3x slower. But computing sine to
10,000 dps is about 6x faster.

Should I try to switch over to gmpy only when the precision is set
very high?

casevh

Fredrik Johansson

May 22, 2008, 2:17:24 AM5/22/08
to mpm...@googlegroups.com
On Thu, May 22, 2008 at 1:27 AM, casevh <cas...@gmail.com> wrote:
> Some early results:
>
> runtests.py passes all tests (except for the pickle module). The
> changes were surprisingly small.
>
> Unfortunately, runtests.py is about 3x slower. But computing sine to
> 10,000 dps is about 6x faster.

A 3x slowdown in the normal case is not acceptable.

> Should I try to "switch-over" to gmpy only when precision is set very
> high?

Just go ahead and try. We'll see if it works out well :-)

Fredrik

casevh

May 25, 2008, 7:02:44 PM5/25/08
to mpmath
Update #2

I now have the gmpy-based version approximately 10% faster for
runtests.py. I had forgotten to update all the fixed-point functions.

runtests.py uncovered a bug in 64-bit gmpy. It has been fixed in
gmpy's repository, but we would need to test whether that bug exists
before we could use whatever version of gmpy is present on someone's
system.

Overview of the changes:
1) Whenever a mantissa or fixed-point number is created, I force it to
either "long" or "gmpy.mpz". Forcing to "long" is actually very
slightly faster than allowing ints to be promoted to longs when values
become large.
2) There were a lot of "is" comparisons that I needed to change to
"==".
3) An mpz must be coerced to an int before it can be used as a
dictionary key. I will try adding an __index__ method to gmpy to see
if that resolves the issue.
4) I changed the bitcount and sqrt routines to use gmpy-specific
functions.
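Change (2) in the list above comes down to this (an illustrative
sketch; `mantissa_is_zero` is a made-up helper):

```python
# With plain ints, identity tests can appear to work only because
# CPython happens to cache small integers; a gmpy mpz zero is always a
# distinct object, so equality (or truthiness) is the only reliable test.
def mantissa_is_zero(man):
    return man == 0  # correct for int and gmpy mpz mantissas alike
```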

I'll keep working on this.

casevh




Fredrik Johansson

May 26, 2008, 12:14:41 PM5/26/08
to mpm...@googlegroups.com
On Mon, May 26, 2008 at 1:02 AM, casevh <cas...@gmail.com> wrote:
> I now have the gmpy-based version approximately 10% faster for
> runtests.py. I had forgotten to update all the fixed point functions.

So you managed to make it faster at low precision as well? That is interesting.

> I'll keep working on this.

Awesome! I'm looking forward to it.

Fredrik

casevh

May 27, 2008, 3:04:09 AM5/27/08
to mpmath
Update #3

I've done some more testing. Compared to the standard 0.8 release,
and without gmpy, my modified version is 5% slower for runtests.py and
~10% slower for a mix of operations at a dps of 20. The slowdown is
constant. If gmpy is available, runtests.py is 5% faster. The mix of
floating-point operations I used is still ~10% slower at a dps of 20.
The break-even point is at a dps of ~40. It is 2.6x faster at a dps
of 200 and almost 13x faster at a dps of 1000.

Is it worth pursuing even with a 10% penalty for low-precision
operations?

I will keep trying to decrease the penalty, but I don't know if I can.

casevh




Fredrik Johansson

May 27, 2008, 6:13:04 AM5/27/08
to mpm...@googlegroups.com
On Tue, May 27, 2008 at 9:04 AM, casevh <cas...@gmail.com> wrote:

> Is it worth pursuing even with a 10% penalty for low precision
> operations?

If the break-even point is as low as dps = 40, I definitely think so.
This should be configurable, though, right? Instead of just trying to
import gmpy, you could check whether an environment variable
MPMATH_USE_GMPY is set. Or it might even be feasible to turn this
feature on and off at runtime (perhaps with the necessary cache
purges).

Have you tried comparing performance with psyco?

By the way, there have been some changes since the 0.8 release. If
you send in a patch, it would be great if you could make sure it's up
to date with the svn version.

Fredrik

Ondrej Certik

May 27, 2008, 7:52:30 AM5/27/08
to mpm...@googlegroups.com

Maybe you should consider using mercurial, as merging patches in svn is a pain. :)

Ondrej

casevh

May 28, 2008, 3:26:01 AM5/28/08
to mpmath
I haven't tried psyco yet since I'm normally using a 64-bit
environment.

At startup I set MPBASE to either "long" or "gmpy.mpz" and then use
MPBASE() to coerce the mantissa to the appropriate type. I need to do
this each time an mpf is created or each time a function creates a
value that eventually becomes a mantissa. The majority of the changes
are in lib.py, with a few others sprinkled around. The 5 to 10%
performance penalty comes from the overhead of calling MPBASE(m). To
find the locations that need to be changed, I just add an assert to
normalize and check that every incoming mantissa value is of the
correct type.

Another approach would be to move make_mpf into lib.py and have two
versions of lib.py: the standard one and a lib-gmpy.py that is
optimized for gmpy. The drawback is that there would be two versions
of the code, but you could easily take advantage of the sqrt() command
in gmpy. Also, bitcount could be inlined as "m.numdigits(2) if m else
0".

I'll be spending time in airports over the next few days, so I may try
that approach. I'm curious how fast I can make mpmath.

I'll make sure I sync up with the svn version before I create a patch.

casevh

Fredrik Johansson

May 28, 2008, 7:39:24 AM5/28/08
to mpm...@googlegroups.com
On Wed, May 28, 2008 at 9:26 AM, casevh <cas...@gmail.com> wrote:
> I haven't tried psyco yet since I'm normally using a 64-bit environment.

Right. There's probably a significant difference between 32-bit and
64-bit to begin with, which will be interesting to see.

> At startup I set MPBASE to either "long" or "gmpy.mpz" and then use
> MPBASE() to coerce the mantissa to the appropriate type. I need to do
> this each time an mpf is created or each time an function creates a
> value that eventually becomes a mantissa. The majority of the changes
> are in lib.py with a few others sprinkled around. The 5 to 10%
> performance penalty comes from the overhead of using MPBASE(m). To
> find the locations that need to be changed, I just add an assert to
> normalize and check that every incoming mantissa value is the correct
> type.

This is not necessary everywhere, is it? I think it should work fine
to mix long and mpz, at least when gmpy is being used (either should
work fine as input to gmpy functions).

In many cases the mantissa starts out as 1 or 1<<prec. Instead of
calling MPBASE(1) each time, it should be possible to store MPBASE(1)
either as a global or a local variable. You've probably already
thought of this, but I'm pointing it out just in case.
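The caching being suggested is a one-line change (a sketch; `MPBASE`
is the constructor name from earlier in the thread, and `one_shifted`
is a made-up helper):

```python
MPBASE = int  # or gmpy.mpz when gmpy is available
MPBASE_ONE = MPBASE(1)  # coerce once at module load rather than per call

def one_shifted(prec):
    # reuse the cached constant instead of calling MPBASE(1) every time
    return MPBASE_ONE << prec
```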

Presumably the 5 to 10% penalty is with the assert removed?

> Another approach would be move make_mpf into lib.py and have two
> versions of lib.py: the standard one and lib-gmpy.py that is optimized
> for gmpy. The drawback is that there would be two versions of code but
> you could easily take advantage of the the sqrt() command in gmpy.
> Also, bitcount could be inlined as "m.numdigits(2) if m else 0".

Not two libs, please :)

Is there any reason for using mpz methods instead of gmpy functions,
which allow ordinary ints/longs as input?

The documentation for numdigits says "the value returned may sometimes
be 1 more than necessary". It's not clear, but I would assume that
this only applies to nonbinary bases (otherwise the implementation is
doing something horribly wrong :). Mpmath relies rather heavily on the
bitcount being exact.

I think sqrt should really be implemented using sqrtrem, as this
allows correct rounding to be implemented. (This means mpmath should
implement a proper pure-Python sqrtrem function as well.)
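A pure-Python sqrtrem of the kind being proposed might look like this
(a sketch; `math.isqrt` is Python 3.8+, standing in for the
long-division routine a 2008 implementation would have needed):

```python
import math

def sqrtrem(n):
    # integer square root with remainder: returns (s, r) such that
    # s*s + r == n and 0 <= r <= 2*s; the remainder tells the caller
    # whether the root is exact and which way to round otherwise
    s = math.isqrt(n)
    return s, n - s * s
```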

> I'll make sure I sync up with the svn version before I create a patch.

Great, thanks.

Fredrik

casevh

May 28, 2008, 8:43:30 AM5/28/08
to mpmath
> > At startup I set MPBASE to either "long" or "gmpy.mpz" and then use
> > MPBASE() to coerce the mantissa to the appropriate type. I need to do
> > this each time an mpf is created or each time a function creates a
> > value that eventually becomes a mantissa. The majority of the changes
> > are in lib.py with a few others sprinkled around. The 5 to 10%
> > performance penalty comes from the overhead of using MPBASE(m). To
> > find the locations that need to be changed, I just add an assert to
> > normalize and check that every incoming mantissa value is the correct
> > type.
>
> This is not necessary everywhere, is it? I think it should work fine to
> mix long and mpz, at least when gmpy is being used (either should work
> fine as input for gmpy functions).

My first attempt did not use MPBASE() in functions such as machin,
etc. That is when I had the 2.5x slowdown. I think it was doing a
conversion between long and gmpy on each iteration. The challenge is
to minimize the number of conversions.

> In many cases the mantissa starts out as 1 or 1<<prec. Instead of
> calling MPBASE(1) each time, it should be possible to store MPBASE(1)
> either as a global or a local variable. You've probably already
> thought of this, but I'm pointing it out just in case.

Tried that. :)

> Presumably the 5 to 10% penalty is with the assert removed?

Yes.

> > Another approach would be to move make_mpf into lib.py and have two
> > versions of lib.py: the standard one and lib-gmpy.py that is optimized
> > for gmpy. The drawback is that there would be two versions of code but
> > you could easily take advantage of the the sqrt() command in gmpy.
> > Also, bitcount could be inlined as "m.numdigits(2) if m else 0".
>
> Not two libs, please :)
>

I agree.

> Is there any reason for using mpz methods instead of gmpy functions,
> which allow ordinary ints/longs as input?
>
> The documentation for numdigits says "the value returned may sometimes
> be 1 more than necessary" It's not clear, but I would assume that this
> only applies for nonbinary bases (otherwise the implementation is doing
> something horribly wrong :). Mpmath relies rather heavily on the
> bitcount being exact.
>

I checked the GMP documentation, and it is guaranteed to be accurate
for base 2.

> I think sqrt should really be implemented using sqrtrem, as this
> allows correct rounding to be implemented. (This means mpmath should
> implement a proper pure-Python sqrtrem function as well.)
>
> > I'll make sure I sync up with the svn version before I create a patch.
>

casevh

casevh

Jun 4, 2008, 12:45:54 AM6/4/08
to mpmath
I've been a little busy, but I managed to run some more tests. Here
is a summary of the results:

32-bit Python using psyco is the fastest up to a precision of around
50.

32-bit Python using psyco and gmpy is the fastest between precisions
of 50 and 200.

64-bit Python using gmpy is the fastest when the precision exceeds
200.

In all tests, the overhead of adding the ability to use gmpy
increased the running time slightly (~5%) if gmpy is not present.

The significant improvements from gmpy begin when the precision is
around 100.

In addition to the slight slowdown at low precision, the patches
require the svn version of gmpy on 64-bit platforms and version 1.02
on 32-bit platforms. Unfortunately, Ubuntu (for example) only provides
version 1.01.

Given these gotchas, if you would like to add gmpy support, I can
create a set of patches against the svn version of mpmath. If not, I
have a couple more ideas that I might try.

casevh


casevh

Jun 26, 2008, 1:41:06 AM6/26/08
to mpmath
I finally have some time to work on this again. I have submitted at
code.google.com a set of patches against a recent svn version. All
tests pass except test_pickle.py. For runtests.py there is still a
minor performance hit (~1%) if gmpy is not present, but a 10%
improvement if gmpy is present.

The mpmath test suite uncovered some bugs in gmpy, so the recently
released gmpy v1.03 is required.

Please let me know if you'd like me to do anything differently.

casevh