High-Performance Arbitrary precision numerical calculations in Java (Mandelbrot Set!)


Matthew Kerle

unread,
Mar 24, 2010, 11:15:10 PM3/24/10
to java...@googlegroups.com
TL;DR we're writing a Mandelbrot set viewer and need to dive deeper. BigDecimal sucks. help!

So... my friend wrote a Mandelbrot set viewer in Java (with NetBeans =D ) using floats. It's pretty nice and lets you pan around and zoom, and the performance is not too bad, but after a while you hit the precision limit of float (the image pixelates out), so we swapped over to double and got a bit deeper.

The problem is that now we want to go deeper than double lets us (step 6 here). We tried implementing with BigDecimal but the performance absolutely crapped out, like 100x worse; you die waiting for a single frame to render even with precision turned way down (God help you if you don't set a MathContext!!). After googling around it looks like the problem is BigDecimal itself: it's optimised for precision over speed. We don't care very much about precision, we just need enough to get our pretty colours :-) Speed is more of a concern. Obviously we're not realistically expecting float/double-type speed from arbitrary-precision arithmetic, but a slowdown of 2-5x would be fine; 10x+ starts getting depressing though...

Other methods of performance tuning we're looking at:
- progressive rendering (e.g. do 100 iterations, draw an image, then the next 100, and so on up to the limit)
- scale-dependent implementation (e.g. go from float to double to ? as you zoom in)
- multi-threading (split the image up and hand each portion off to a core; not sure of the best way to configure this, thread per core?)
- caching (not sure of the best strategy here: store the iteration count for each point, or the image directly? the cache would get HUGE, but we don't care...)
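
for the multi-threading one, I'm picturing something like this: row-per-task with a pool sized to the core count (rough sketch, completely untested, names made up):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RowRenderer {
    // Split the image into rows, one task per row, pool sized to the core count.
    public static int[][] render(int width, int height, int maxIter) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        int[][] iterations = new int[height][width];
        for (int y = 0; y < height; y++) {
            final int row = y;
            pool.execute(new Runnable() {
                public void run() {
                    for (int x = 0; x < width; x++) {
                        // map pixel to the complex plane box [-2, 2] x [-2, 2]
                        double cr = -2.0 + 4.0 * x / width;
                        double ci = -2.0 + 4.0 * row / height;
                        iterations[row][x] = escapeTime(cr, ci, maxIter);
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return iterations;
    }

    // standard escape-time loop: z -> z^2 + c, bail out when |z|^2 > 4
    static int escapeTime(double cr, double ci, int maxIter) {
        double zr = 0, zi = 0;
        for (int i = 0; i < maxIter; i++) {
            double zr2 = zr * zr, zi2 = zi * zi;
            if (zr2 + zi2 > 4.0) return i;
            zi = 2 * zr * zi + ci;
            zr = zr2 - zi2 + cr;
        }
        return maxIter;
    }
}
```

since each pixel is independent, there's no locking needed beyond the final awaitTermination.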

I've found the Apfloat library, which looks pretty cool, but I haven't had a chance to run it head-to-head against BigDecimal yet (darn day job!!).

Just wondering if anyone else has any experience with this domain, or any pointers? It's been a few years since uni and my number theory is a bit rusty (one of the reasons we're playing with this...). The Mandelbrot iteration is z(n+1) = z(n)^2 + c, where c is the starting point, so maybe we can do some optimisations by operating on integers and storing the scale separately? (obviously this is what BigDecimal does already, but we only want a narrow slice of what it does...)
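
something like this, maybe? (rough sketch of the integers-plus-shared-scale idea, not our actual code):

```java
import java.math.BigInteger;

// Fixed-point Mandelbrot step: every value is an integer mantissa with a
// shared binary scale, value = mantissa / 2^SHIFT. This is the narrow
// slice of BigDecimal we need: multiply, add, subtract, compare.
public class FixedPointStep {
    static final int SHIFT = 200;   // ~60 decimal digits of precision
    static final BigInteger FOUR = BigInteger.valueOf(4).shiftLeft(SHIFT);

    static BigInteger mul(BigInteger a, BigInteger b) {
        return a.multiply(b).shiftRight(SHIFT);   // rescale after multiplying
    }

    static int escapeTime(BigInteger cr, BigInteger ci, int maxIter) {
        BigInteger zr = BigInteger.ZERO, zi = BigInteger.ZERO;
        for (int i = 0; i < maxIter; i++) {
            BigInteger zr2 = mul(zr, zr), zi2 = mul(zi, zi);
            if (zr2.add(zi2).compareTo(FOUR) > 0) return i;   // |z|^2 > 4: escaped
            zi = mul(zr, zi).shiftLeft(1).add(ci);            // 2*zr*zi + ci
            zr = zr2.subtract(zi2).add(cr);
        }
        return maxIter;
    }
}
```

no rounding-mode bookkeeping, no decimal scaling, just shifts, which is all the Mandelbrot loop actually needs.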

Incidentally, I found Fast Fractal, a closed-source Windows-only program which purports to have GPU acceleration for fractal viewing. I tried it out and wasn't impressed, but then it couldn't detect my graphics card so GPU acceleration wasn't enabled (low-rung Toshiba Satellite laptop). Any decent GPU-accelerated Mandelbrot viewers out there for Linux?

cheers!
matt.

mikeb01

unread,
Mar 25, 2010, 2:35:43 AM3/25/10
to The Java Posse
While I don't have first-hand experience, I have heard good things
about the performance of the GMP library for arbitrary-precision
math. It's C-based, so you will need JNI. There is a wrapper here:

http://bitbucket.org/dfdeshom/gmp-java/src/

On Mar 25, 3:15 am, Matthew Kerle <mattke...@gmail.com> wrote:
> [original message quoted in full; snipped]
Matthew Kerle

unread,
Mar 25, 2010, 6:04:49 AM3/25/10
to java...@googlegroups.com, Michael Pascoe
thanks! that looks really cool. Not sure how it'll go with the iterative nature of our calcs, but I'll give it a benchmark this weekend. The only thing I'm worried about is the perf hit of making a native call on every operation (mul, add, compare)...? I thought JNI was really slow? Mind you, it only has to be faster than BigDecimal, and if we get really serious I suppose we can write a C wrapper that handles iterating over the complex array, so there'd be one native call per render...

anyone have experience with JNI performance on Java 6u16+? I remember it used to be painfully slow, but that's like 4 major releases ago, and apparently it's been upgraded since then...


I've written a little perf tester (attached) that runs a sample loop and prints out some times. As expected, double and float kick ass. I'd heard this Apfloat class was pretty good, but it turns out to be even slower than BigDecimal!!! and they call themselves high-performance..

output:
Float: 7ms
double: 8ms
BigDecimal: 1881ms
ApFloat: 4293ms
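
for reference, the double case is timed basically like this (sketch from memory; the real tester runs the same loop shape for each type):

```java
public class LoopTimer {
    // time n iterations of z = z*z + c in plain double
    static long timeDoubleLoop(int n) {
        long start = System.nanoTime();
        double zr = 0.0, zi = 0.0;
        final double cr = -0.1, ci = 0.1;   // a point inside the set, stays bounded
        for (int i = 0; i < n; i++) {
            double zr2 = zr * zr, zi2 = zi * zi;
            zi = 2 * zr * zi + ci;
            zr = zr2 - zi2 + cr;
        }
        // consume the result so HotSpot can't dead-code-eliminate the loop
        if (zr == Math.PI) System.out.println("never");
        return (System.nanoTime() - start) / 1000000;   // elapsed ms
    }
}
```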

now I just need GMP-J in here...




FloatTester.java

Matthew Kerle

unread,
Mar 25, 2010, 6:49:33 AM3/25/10
to java...@googlegroups.com
ah, no Windows binary release of GMP, need to build from source.. MinGW, where are you again...

John Muir

unread,
Mar 25, 2010, 9:51:05 AM3/25/10
to The Java Posse
This may be waaay off topic, but if you could use Python then this
implementation of the Game of Life may have something to teach; it's
really blindingly fast!

http://golly.sourceforge.net/

enjoy,
John

On Mar 25, 4:15 am, Matthew Kerle <mattke...@gmail.com> wrote:
> [original message quoted in full; snipped]

Tom Hawtin

unread,
Mar 25, 2010, 3:37:12 PM3/25/10
to The Java Posse
I guess fixed point makes the most sense. You aren't going to need values
outside of, er, [-2, 2] (it's a long time since my second-year maths
project). There also isn't much point in specially handling much smaller
values, so even long would get you a few more bits than double.
Unfortunately, Java (and most other 3GLs, such as C) aren't much cop at
handling arithmetic split across multiple "integer" types.
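
To illustrate, a sketch of the fixed-point idea in Q7.56 format (values in [-128, 128), 56 fractional bits, a few more than double's 52). Note this assumes Math.multiplyHigh from Java 9+, which didn't exist at the time; without it you'd split each long into 32-bit halves, which is exactly the multi-word fiddliness mentioned above:

```java
public class FixedPoint {
    static final int SHIFT = 56;          // Q7.56: value = raw / 2^56
    static final long ONE = 1L << SHIFT;
    static final long FOUR = 4L << SHIFT;

    // full 128-bit product of two Q7.56 numbers, rescaled back to Q7.56
    static long mul(long a, long b) {
        long hi = Math.multiplyHigh(a, b);   // signed high 64 bits (Java 9+)
        long lo = a * b;                     // wrapped low 64 bits
        return (hi << (64 - SHIFT)) | (lo >>> SHIFT);
    }

    // |z| stays below 8 before escape is detected, so squares stay below
    // 64 and their sum below 128, which Q7.56 holds without overflow
    static int escapeTime(long cr, long ci, int maxIter) {
        long zr = 0, zi = 0;
        for (int i = 0; i < maxIter; i++) {
            long zr2 = mul(zr, zr), zi2 = mul(zi, zi);
            if (zr2 + zi2 > FOUR) return i;
            zi = (mul(zr, zi) << 1) + ci;    // 2*zr*zi + ci
            zr = zr2 - zi2 + cr;
        }
        return maxIter;
    }
}
```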

mikeb01

unread,
Mar 25, 2010, 6:00:38 PM3/25/10
to The Java Posse
> anyone have experience with JNI performance on Java 6u16+? I remember it
> used to be painful slow, but that's like 4 major releases ago, and
> apparently <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5086424> it's
> been upgraded since then...

From my experience, down calls (Java -> native) are very fast now (up
calls, native -> Java, not so much). However, most of my recent
experience is with a C-based messaging library (albeit a high-performance
one), which would have a much lower call frequency than rendering a
Mandelbrot set.

Matthew Kerle

unread,
Mar 25, 2010, 6:05:48 PM3/25/10
to java...@googlegroups.com
well I'll find out later today and post the timing results. Hopefully not too bad!!



Matthew Kerle

unread,
Mar 25, 2010, 6:09:34 PM3/25/10
to java...@googlegroups.com
thanks for the link! Yes, off topic, but it still looks interesting; I haven't heard about the Game of Life in a while, and I'm always a sucker for pretty visualisations of maths ;-)


michaelm

unread,
Apr 5, 2010, 6:23:11 AM4/5/10
to The Java Posse
Hi,
I did some benchmarking on a few open-source Java libraries a few months ago.
The test consists of running two geometric computation methods
involving additions, subtractions and multiplications:
- isCounterClockWise
- isInCircle
For 1,000,000 operations I get:
double 104 ms
DoubleDouble 4334 ms
BigDecimal 29975 ms
Real 36058 ms
Dfp10 61128 ms
Apfloat 98966 ms

This is a very partial test which does not take into account what each
library can really do (floating point vs arbitrary precision,
including or not including transcendental functions...).

http://tsusiatsoftware.net/
http://real-java.sourceforge.net/Real.html
http://dfp.sourceforge.net/
http://www.apfloat.org/apfloat_java/

Michaël

Matthew Kerle

unread,
Apr 6, 2010, 1:58:04 AM4/6/10
to java...@googlegroups.com
thanks Michaël, what ranges were your numbers in for those operations? We need high precision over the box bounded by (-2, -2i) and (2, 2i), with most of the interesting features buried many decimal places deep.

I wrote a quick and dirty benchmarker for BigDecimal vs Apfloat the other day, and it looks like for small precisions (<64 decimal sf) and low iteration counts (<100 iters) BD outperforms Apfloat, but once you get a lot of iterations in and go deeper than that, Apfloat really pulls ahead.

It basically creates a complex number wrapper for BD (why do you have to specify a MathContext on *every* operation, argh!! it should just obey the initial precision you give it without going arbitrary!!), starts with c = (0.5, 0.5i) and iterates z^2 + c at the specified precision for the specified number of iterations.
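
the inner loop is roughly this shape (sketch; the attached/pastebinned code is the real version):

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class MandelBD {
    // z -> z^2 + c on (zr, zi), with a MathContext passed to every operation
    static int iterate(BigDecimal cr, BigDecimal ci, MathContext mc, int maxIter) {
        BigDecimal zr = BigDecimal.ZERO, zi = BigDecimal.ZERO;
        final BigDecimal FOUR = BigDecimal.valueOf(4);
        for (int i = 0; i < maxIter; i++) {
            BigDecimal zr2 = zr.multiply(zr, mc);
            BigDecimal zi2 = zi.multiply(zi, mc);
            if (zr2.add(zi2, mc).compareTo(FOUR) > 0) return i;   // |z|^2 > 4: escaped
            BigDecimal t = zr.multiply(zi, mc);
            zi = t.add(t, mc).add(ci, mc);                        // 2*zr*zi + ci
            zr = zr2.subtract(zi2, mc).add(cr, mc);
        }
        return maxIter;
    }
}
```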

the code for this should be attached; if Google chops it off then it's here: http://pastebin.com/cGEkbE1m


format: {Apfloat runtime, BigDecimal runtime}; rows are precision in decimal sf, columns are iterations

        10             100            500             1000
64  : {147ms, 11ms}  {34ms, 29ms}   {65ms, 85ms}    {78ms, 203ms}
128 : {1ms, 1ms}     {12ms, 33ms}   {64ms, 175ms}   {119ms, 319ms}
256 : {2ms, 3ms}     {35ms, 101ms}  {195ms, 507ms}  {357ms, 1s}
512 : {11ms, 9ms}    {75ms, 343ms}  {365ms, 1s}     {649ms, 3s}
1024: {6ms, 23ms}    {119ms, 1s}    {637ms, 7s}     {1s, 16s}

(nb colouring added by me for emphasis; also there's a bug in the printing that causes precisions greater than 128 to print twice, if you can find the bug please let me know)


FloatTester.java

Matthew Kerle

unread,
Apr 6, 2010, 4:36:15 AM4/6/10
to java...@googlegroups.com
I rolled DoubleDouble into my benchmarking and re-ran. For its precision range it outperforms Apfloat and BD, as expected. Obviously there is a bit of timing wander as HotSpot warms up, but DoubleDouble should get us twice as far into the fractal (34 significant figures) at decent performance before we need Apfloat.

Is anyone aware of an implementation of QuadDouble for Java? That would get us 64sf without Apfloat. Bailey has one for C++, but my maths/bit-twiddling isn't good enough to re-write it in Java. If only Knuth were here...
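
for reference, the core trick in DoubleDouble (and in Bailey's QD) is Knuth's error-free TwoSum: the rounding error of a double addition is itself an exact double, so you can carry it along as a second word:

```java
public class TwoSum {
    // Knuth's TwoSum: s is the rounded double sum, e the exact rounding
    // error, so a + b == s + e exactly. DoubleDouble chains this (plus
    // Dekker/Veltkamp splitting for products) to get ~106 mantissa bits.
    static double[] twoSum(double a, double b) {
        double s = a + b;
        double bVirtual = s - a;
        double aVirtual = s - bVirtual;
        double bRoundoff = b - bVirtual;
        double aRoundoff = a - aVirtual;
        return new double[] { s, aRoundoff + bRoundoff };
    }
}
```

e.g. twoSum(1.0, 1e-17) gives s = 1.0 (the 1e-17 is rounded away) and e = 1e-17, so nothing is lost.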


        100                                    500                                   1000   (iterations)
16sf: {3ms/d, 13ms/dd, 142ms/Apf, 52ms/BD}   {0ms/d, 9ms/dd, 84ms/Apf, 37ms/BD}    {1ms/d, 4ms/dd, 32ms/Apf, 33ms/BD}
32  : {0ms/dd, 3ms/Apf, 5ms/BD}              {1ms/dd, 17ms/Apf, 23ms/BD}           {1ms/dd, 38ms/Apf, 43ms/BD}
64  : {5ms/Apf, 9ms/BD}                      {26ms/Apf, 49ms/BD}                   {54ms/Apf, 92ms/BD}
128 : {11ms/Apf, 25ms/BD}                    {58ms/Apf, 132ms/BD}                  {116ms/Apf, 315ms/BD}
256 : {38ms/Apf, 86ms/BD}                    {193ms/Apf, 413ms/BD}                 {365ms/Apf, 862ms/BD}
512 : {129ms/Apf, 323ms/BD}                  {402ms/Apf, 1s/BD}                    {633ms/Apf, 2s/BD}
1024: {120ms/Apf, 1s/BD}                     {647ms/Apf, 6s/BD}                    {1s/Apf, 12s/BD}
(rows: significant decimal figures)

d=double (limited to 17sf)
dd=DoubleDouble (limited to 34sf)
Apf=Apfloat (unlimited)
BD=BigDecimal (unlimited)

obviously below 18sf we want to use double, below 35sf DoubleDouble, and thereafter Apfloat (unless we find/write a QuadDouble). Each cell above represents a single pixel, so at 1024sf, 1s per pixel is gonna take a while... :-(
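
so the viewer just needs a little dispatcher keyed on the zoom's required precision, something like (sketch; thresholds from the table above, names are just labels):

```java
public class Backend {
    // choose the arithmetic backend from the decimal sf the current zoom needs
    static String backendFor(int sigFigs) {
        if (sigFigs < 18) return "double";
        if (sigFigs < 35) return "DoubleDouble";
        return "Apfloat";   // until we find/write a QuadDouble
    }
}
```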

