Possible performance enhancement


Olivier Brault

Nov 4, 2017, 6:14:46 AM
to marlin-renderer
Hi Laurent,

While profiling my app, I noticed that when drawing a lot of circles, Marlin makes a lot of calls to Math.cbrt(), which is quite slow.
Have you tried using Apache FastMath.cbrt() instead?

In my benchmark, the Apache version is more than twice as fast as the Java version.

Olivier

Olivier Brault

Nov 6, 2017, 7:49:34 AM
to marlin-renderer
I understand your concern about licence issues with Apache Math.
So here is my algorithm, which I give to you: it is based on Newton's method, so there is no licence problem!

In fact, the algorithm depends on both:
  • the accuracy you need
  • the range of numbers to handle
Apache FastMath handles everything with the best precision.
This algorithm is tuned to give at least 10 digits of precision (I think that is quite sufficient for your needs) over the range 10^-6 -> 10^6.
I made a subtle compromise between the initial guess and the number of iterations: the more precise the initial guess, the fewer iterations you need.

    // Cube root by Newton's method; intended for positive x0 only.
    private static double my_cbrt(double x0) {
        double x;

        // Refine the initial guess.
        // The table below is fairly coarse, but gives 10 digits of precision
        // in the range 10^-6 -> 10^6.
        // TODO: check in more detail with a proper, rigorous precision test!
        if (x0 > 1000000) x = 500;
        else if (x0 > 1000) x = 50;
        else if (x0 > 1) x = 5;
        else if (x0 > 0.001) x = 0.5;
        else if (x0 > 0.000001) x = 0.05;
        else x = 0.005;

        // Basic Newton version
        // for (int i = 0; i < 10; i++) {
        //     x = 0.333333333333 * (2*x + x0/(x*x));
        // }

        // Very slightly faster version
        x0 *= 0.333333333333;
        for (int i = 0; i < 10; i++) {
            x = 0.66666666666667 * x + x0 / (x * x);
        }

        return x;
    }

This version is 10 times faster than Apache FastMath, and 30 times faster than Java's Math.cbrt().

Olivier
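
The code comment above asks for a proper precision test. A minimal sketch of one follows, assuming Math.cbrt as the reference and log-spaced samples. The class name CbrtPrecisionCheck, the method maxRelativeError and the sample count are illustrative choices, not part of the original post; myCbrt is a copy of the posted routine.

```java
import java.util.Locale;

final class CbrtPrecisionCheck {

    // Copy of the Newton-based routine posted above (positive inputs only).
    static double myCbrt(double x0) {
        double x;
        if (x0 > 1000000) x = 500;
        else if (x0 > 1000) x = 50;
        else if (x0 > 1) x = 5;
        else if (x0 > 0.001) x = 0.5;
        else if (x0 > 0.000001) x = 0.05;
        else x = 0.005;
        x0 *= 0.333333333333;
        for (int i = 0; i < 10; i++) {
            x = 0.66666666666667 * x + x0 / (x * x);
        }
        return x;
    }

    // Largest relative error against Math.cbrt over [lo, hi],
    // sampled at `samples` (>= 2) points spaced evenly on a log scale.
    static double maxRelativeError(double lo, double hi, int samples) {
        double logLo = Math.log(lo);
        double step = (Math.log(hi) - logLo) / (samples - 1);
        double worst = 0.0;
        for (int i = 0; i < samples; i++) {
            double v = Math.exp(logLo + i * step);
            double ref = Math.cbrt(v);
            worst = Math.max(worst, Math.abs(myCbrt(v) - ref) / ref);
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.printf(Locale.ROOT, "max relative error on [1e-6, 1e6]: %.3e%n",
                maxRelativeError(1e-6, 1e6, 1_000_001));
    }
}
```

Because the initial-guess table stops at 10^6, inputs far outside the tuned range (1e15, for example) no longer converge within ten iterations, so the range passed to maxRelativeError should match the values the caller really produces.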

Olivier Brault

Nov 6, 2017, 8:41:42 AM
to marlin-renderer
Hmm, maybe I was a little too optimistic in my benchmark.
In fact, this algorithm is a bit faster than Apache's, but not 10 times faster...
But it is still faster than the Java one!



Laurent Bourgès

Nov 6, 2017, 1:45:10 PM
to marlin-...@googlegroups.com
Thanks for your cbrt implementation.

I will write a JMH benchmark and extract the function's domain range in Marlin (cubic roots finder).

Stay tuned,
Laurent
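
Until a real JMH benchmark is available, a rough timing loop can give a first impression. The sketch below is only that: a plain System.nanoTime harness, not JMH, and therefore exposed to JIT warm-up and inlining effects that JMH is designed to control. The class CbrtTimingSketch, its method names, the input distribution and the Math.pow comparison candidate are all illustrative assumptions; any cbrt candidate can be passed in as a DoubleUnaryOperator.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

final class CbrtTimingSketch {

    // Inputs drawn log-uniformly from about [1e-6, 1e6], the range discussed in the thread.
    static double[] makeInputs(int n, long seed) {
        Random rnd = new Random(seed);
        double[] in = new double[n];
        for (int i = 0; i < n; i++) {
            in[i] = Math.pow(10.0, -6.0 + 12.0 * rnd.nextDouble());
        }
        return in;
    }

    // Average nanoseconds per call. The accumulated sum is stored in `sink`
    // so the JIT cannot discard the calls as dead code.
    static double nanosPerCall(DoubleUnaryOperator f, double[] in, int rounds, double[] sink) {
        double sum = 0.0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (double v : in) {
                sum += f.applyAsDouble(v);
            }
        }
        long elapsed = System.nanoTime() - start;
        sink[0] += sum;
        return (double) elapsed / ((double) rounds * in.length);
    }

    public static void main(String[] args) {
        double[] in = makeInputs(100_000, 42L);
        double[] sink = new double[1];
        DoubleUnaryOperator jdk = Math::cbrt;
        DoubleUnaryOperator pow = v -> Math.pow(v, 1.0 / 3.0);
        // Warm-up passes so both candidates are JIT-compiled before timing.
        for (int i = 0; i < 20; i++) {
            nanosPerCall(jdk, in, 10, sink);
            nanosPerCall(pow, in, 10, sink);
        }
        System.out.println("Math.cbrt: " + nanosPerCall(jdk, in, 100, sink) + " ns/call");
        System.out.println("Math.pow:  " + nanosPerCall(pow, in, 100, sink) + " ns/call");
        System.out.println("(checksum " + sink[0] + ")");
    }
}
```

Numbers from such a loop vary with the JDK version and hardware, so they are best treated as a sanity check before a proper JMH run.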

--
You received this message because you are subscribed to the Google Groups "marlin-renderer" group.
To unsubscribe from this group and stop receiving emails from it, send an email to marlin-renderer+unsubscribe@googlegroups.com.
To post to this group, send email to marlin-renderer@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Chris Newland

Nov 9, 2017, 3:10:37 AM
to marlin-renderer
Hi Laurent, 

fyi it looks like there may be some performance regressions in JDK9 math functions: https://twitter.com/richardstartin/status/927845209932787712 

Cheers,

Chris
--
@chriswhocodes


Laurent Bourgès

Nov 9, 2017, 7:33:20 AM
to marlin-...@googlegroups.com
Hi Olivier & Chris,

I quickly tested JaFaMa (cbrt and acos functions) within Marlin 0.8.2 but got no performance gain when running my MapBench tool (lots of circles, ellipses...), so I suspect that HotSpot (C2) uses intrinsics instead of calling native code (libfdm).

FYI, I asked the question explicitly on the core-libs list:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-November/049891.html

I will check the JVM code to see whether cbrt / acos or other math functions need more efficient implementations, but I will let the JVM core team do the job and possibly provide a new FastMath API (faster but less accurate).

Bye,
Laurent


Olivier Brault

Nov 10, 2017, 3:33:50 AM
to marlin-renderer
I don't know exactly how Marlin Renderer works.
In fact, I realized that a lot of calls to cbrt() are made when drawing "big" circles: for "small" circles, my profiler didn't show this function in the report.
So maybe Marlin makes many calls to cbrt only in very particular situations that are not really representative of common use.
It may also be that the profiler thinks Java spends a lot of time in cbrt() because of a calibration error or something else I don't know about.

Chris Newland

Nov 10, 2017, 3:53:30 AM
to marlin-renderer
https://gist.github.com/apangin/7a9b7062a4bd0cd41fcc for a list of current HotSpot intrinsics.

Cheers,

Chris

Laurent Bourgès

Nov 10, 2017, 4:13:35 AM
to marlin-...@googlegroups.com
Olivier,


On Nov 10, 2017 9:33 AM, "Olivier Brault" <o.br...@gmail.com> wrote:
> I don't know exactly how Marlin Renderer works.
> In fact, I realized that a lot of calls to cbrt() are made when drawing "big" circles: for "small" circles, my profiler didn't show this function in the report.
> So maybe Marlin makes many calls to cbrt only in very particular situations that are not really representative of common use.

Cbrt / acos / cos are ONLY called by Stroker to find cubic roots.
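
To illustrate why a cubic roots finder needs exactly these three functions, here is a textbook sketch of the Cardano / trigonometric method for x^3 + a*x^2 + b*x + c = 0. It is not Marlin's actual Stroker code; the class CubicRootsSketch and the method realRoots are hypothetical names used only for this example.

```java
import java.util.Arrays;

final class CubicRootsSketch {

    // Real roots of x^3 + a*x^2 + b*x + c = 0, in ascending order.
    static double[] realRoots(double a, double b, double c) {
        // Substituting x = y - a/3 gives the depressed cubic y^3 + 3*p*y + 2*q = 0.
        double p = (b - a * a / 3.0) / 3.0;
        double q = (2.0 * a * a * a / 27.0 - a * b / 3.0 + c) / 2.0;
        double shift = a / 3.0;
        double p3 = p * p * p;
        double d = q * q + p3; // discriminant
        double[] roots;
        if (d < 0.0) {
            // Three distinct real roots: trigonometric form (acos and cos).
            double phi = Math.acos(-q / Math.sqrt(-p3)) / 3.0;
            double t = 2.0 * Math.sqrt(-p);
            roots = new double[] {
                t * Math.cos(phi) - shift,
                -t * Math.cos(phi + Math.PI / 3.0) - shift,
                -t * Math.cos(phi - Math.PI / 3.0) - shift
            };
        } else {
            // Cardano's formula: one real root from two cbrt calls.
            double sqrtD = Math.sqrt(d);
            double y = Math.cbrt(sqrtD - q) - Math.cbrt(sqrtD + q);
            if (d == 0.0 && y != 0.0) {
                // Exactly zero discriminant: a second, repeated root at -y/2.
                roots = new double[] { y - shift, -y / 2.0 - shift };
            } else {
                roots = new double[] { y - shift };
            }
        }
        Arrays.sort(roots);
        return roots;
    }

    public static void main(String[] args) {
        // (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
        System.out.println(Arrays.toString(realRoots(-6.0, 11.0, -6.0)));
    }
}
```

Each solve costs either one acos plus three cos calls or two cbrt calls, depending on the sign of the discriminant, so the input shape can plausibly decide which function a profiler sees.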

I tested last night with JaFaMa 2.2 on JDK 8 and the gain is minor: 1 or 2%, so I may adopt that solution for Marlin releases on GitHub.

For OpenJDK 10, math functions were optimized by Intel in 2015 but some work remains to be done: use intrinsics for ALL math functions. cbrt is Java code now.
I will leave that work to the core-libs experts: read the discussion thread (Paul Sandoz).

I will send you the JaFaMa perf reports on JDK 8 & 9: JDK 9 is really better than 8 but JaFaMa 2.2 is still faster on these cbrt / acos functions (1 bit less accurate).

> It may also be that the profiler thinks Java spends a lot of time in cbrt() because of a calibration error or something else I don't know about.

I used the YourKit profiler on JDK 8:
- CPU sampling shows lots of time spent in cbrt (native overhead)
- CPU tracing never shows cbrt...

I suspect profilers are unusable in such a situation: false positives or negatives...

If I have some time, I will use OProfile (a Linux kernel profiler using CPU counters), which is reliable and can properly estimate CPU costs for both JVM (native) and JIT (Java) code.

PS: please answer the Marlin survey: what kind of application? Which use cases?

Cheers,
Laurent
