Confused by Clojure floating-point differences (compared to other languages)


Glen Fraser

Feb 5, 2014, 8:17:13 AM
to clo...@googlegroups.com
(sorry if you received an earlier mail from me that was half-formed, I hit send by accident)

Hi there, I'm quite new to Clojure, and was trying to do some very simple benchmarking with other languages.  I was surprised by the floating-point results I got, which differed (for the same calculation, using doubles) compared to the other languages I tried (including C++, SuperCollider, Lua, Python).

My benchmark iteratively runs a function 100M times: g(x) <-- sin(2.3x) + cos(3.7x), starting with x of 0.

In the other languages, I always got the result 0.0541718..., but in Clojure I get 0.24788989....  I realize this is a contrived case, but -- doing an identical sequence of 64-bit floating-point operations on the same machine should give the same answer.  Note that if you only run the function for about 110 iterations, you get the same answer in Clojure (or very close), but then it diverges.

I assume my confusion is due to my ignorance of Clojure and/or Java's math library.  I don't think I'm using 32-bit floats or the "BigDecimal" type (I even explicitly converted to double, but got the same results, and if I evaluate the type it tells me java.lang.Double, which seems right).  Maybe Clojure's answer is "better", but I do find it strange that it's different.  Can someone explain this to me?

Here are some results:

Clojure: ~23 seconds
(defn g [x] (+ (Math/sin (* 2.3 x)) (Math/cos (* 3.7 x))))
(loop [i 100000000 x 0] (if (pos? i) (recur (dec i) (g x)) x))
;; final x: 0.24788989279493556 (???)

C++ (g++ -O2): ~4 seconds
#include <cmath>
#include <iostream>

double g(double x) {
    return std::sin(2.3*x) + std::cos(3.7*x);
}
int main() {
    double x = 0;
    for (int i = 0; i < 100000000; ++i) {
        x = g(x);
    }
    std::cout << "final x: " << x << std::endl;
    return 0;
}
// final x: 0.0541718

Lua: ~39 seconds
g = function(x)
    return math.sin(2.3*x) + math.cos(3.7*x)
end

x = 0; for i = 1, 100000000 do x = g(x) end
-- Final x: 0.054171801051906

Python: ~72 seconds
import math

def g(x):
    return math.sin(2.3*x) + math.cos(3.7*x)

x = 0
for i in xrange(100000000):
    x = g(x)

# Final x: 0.05417180105190572

SClang: ~26 seconds
g = { |x| sin(2.3*x) + cos(3.7*x) };
f = { |x| 100000000.do{ x = g.(x) }; x};
bench{ f.(0).postln };
// final x: 0.054171801051906 (same as C++, Lua, Python; different from Clojure)

Thanks,
Glen.

Jon Harrop

Feb 5, 2014, 10:06:31 AM
to clo...@googlegroups.com

 

IIRC, Java provides unusual trigonometric functions which, I’m guessing, Clojure is using. I think the Java ones are actually more accurate (and slower) so you may well find the answer obtained on the JVM is more precise than the others.

 

Cheers,

Jon.

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Glen Fraser

Feb 5, 2014, 10:58:45 AM
to clo...@googlegroups.com, jonathand...@googlemail.com
Thanks for the tip.  After reading your comment, I looked and discovered the Java library called StrictMath, and tried it (replacing Math/cos and Math/sin by the StrictMath versions).  I did indeed get different results than with the regular library, but unfortunately still not the same answer as in other languages.  I guess the Java implementation(s) are indeed different.  It's not a big deal for me, just something I found confusing, wondering if I'd done something wrong.

Thanks,
Glen.

Konrad Hinsen

Feb 5, 2014, 12:22:54 PM
to clo...@googlegroups.com
--On 5 Feb 2014 05:17:13 -0800 Glen Fraser <hola...@gmail.com> wrote:

> My benchmark iteratively runs a function 100M times: g(x) <-- sin(2.3x) +
> cos(3.7x), starting with x of 0.

A quick look at the series you are computing suggests that it has chaotic
behavior. Another quick look shows that neither of the two values that
you see after 100M iterations is a fixed point. I'd need to do a careful
numerical analysis to be sure, but I suspect that you are computing
a close-to-random number: any numerical error at some stage is amplified
in the further computation.
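
A minimal Python sketch of this error amplification, assuming only the g from the original post. It runs the iteration twice, from 0 and from 0 plus a tiny perturbation, and reports the first step at which the two orbits differ noticeably. The perturbation (1e-12), the tolerance (1e-3) and the name first_divergence are arbitrary illustrative choices, not anything from the original benchmark.

```python
import math

def g(x):
    """The map from the original post: g(x) = sin(2.3x) + cos(3.7x)."""
    return math.sin(2.3 * x) + math.cos(3.7 * x)

def first_divergence(x0, eps, tol, max_iter):
    """Iterate g from x0 and from x0 + eps in lockstep.

    Returns the first step at which the two orbits differ by more
    than tol, or None if that never happens within max_iter steps.
    """
    a, b = x0, x0 + eps
    for step in range(1, max_iter + 1):
        a, b = g(a), g(b)
        if abs(a - b) > tol:
            return step
    return None

if __name__ == "__main__":
    # eps and tol are illustrative assumptions, not measured values.
    print(first_divergence(0.0, 1e-12, 1e-3, 10_000))
```

If a perturbation this small grows to a visible difference within a modest number of steps, a one-bit difference between two sin/cos implementations would be enough to explain different final values after 100M iterations.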

If you get identical results from different languages, this suggests that
they all end up using the same numerical code (probably the C math
library). I suggest you try your Python code under Jython, perhaps
that will reproduce the Clojure result by also relying on the JVM
standard library.

> In the other languages, I always got the result 0.0541718..., but in
> Clojure I get 0.24788989.... I realize this is a contrived case, but --
> doing an identical sequence of 64-bit floating-point operations on the
> same machine should give the same answer.

Unfortunately not. Your reasoning would be true if everyone adopted
IEEE float operations, but in practice nobody does because the main
objective is speed, not predictability. The Intel hardware is close
to IEEE, but not fully compatible, and it offers some parameters that
libraries can play with to get different results from the same operations.

Konrad.

Mark Engelberg

Feb 5, 2014, 12:23:13 PM
to clojure
Looks to me like your Clojure loop runs in the opposite direction (counting downwards) versus the other languages.  Since your code only returns the result of the last iteration of the loop, it's not too surprising that they return completely different results -- the last iteration of the Clojure code is a completely different input than in the other languages.


Glen Fraser

Feb 5, 2014, 1:00:56 PM
to clo...@googlegroups.com
Thanks, this is a satisfying answer. You're probably right that the other languages are all using the C standard math library (I naïvely assumed Java would too, but I see that's not the case). And yes, as I said, it is a rather contrived (and chaotic) example.

Glen.

Mark Engelberg

Feb 5, 2014, 1:10:19 PM
to clojure
Ah, I see now that you are doing (g x) in your loop, not (g i), so scratch what I said about the loop running the wrong direction.

r

Feb 5, 2014, 1:28:08 PM
to clo...@googlegroups.com
I'd agree here.

This is actually a very nice example of a system that might be called "chaotic", though
"chaos" is, even mathematically, a very vague term:

1) the iteration will never leave [-2, 2]
2) it won't converge because all 3 fixed points are unstable ( |f'(x_s)|>1 )
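
Both points in the list above can be checked numerically. The following Python sketch is a rough check, not a proof: it scans [-2, 2] for sign changes of g(x) - x, refines each one by bisection, and evaluates the analytic derivative of g at each root. The helper names (fixed_points, dg) and the grid size are assumptions made for this illustration.

```python
import math

def g(x):
    """The map from the original post."""
    return math.sin(2.3 * x) + math.cos(3.7 * x)

def dg(x):
    """Analytic derivative of g."""
    return 2.3 * math.cos(2.3 * x) - 3.7 * math.sin(3.7 * x)

def fixed_points(lo=-2.0, hi=2.0, samples=4000, iters=80):
    """Locate roots of g(x) - x on [lo, hi].

    Scans a uniform grid for sign changes, then bisects each bracket.
    Roots that touch zero without a sign change would be missed.
    """
    def h(x):
        return g(x) - x

    roots = []
    step = (hi - lo) / samples
    for k in range(samples):
        a, b = lo + k * step, lo + (k + 1) * step
        if h(a) * h(b) < 0:
            for _ in range(iters):
                m = 0.5 * (a + b)
                if h(a) * h(m) <= 0:
                    b = m
                else:
                    a = m
            roots.append(0.5 * (a + b))
    return roots

if __name__ == "__main__":
    for r in fixed_points():
        # |g'(r)| > 1 means the fixed point repels nearby orbits.
        print(r, dg(r), abs(dg(r)) > 1)
```

The bound in 1) needs no computation: |sin| and |cos| are each at most 1, so |g(x)| is at most 2 for every x.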

So, your example is really not calculating any particular number. 
Now you could consider it as a calculation of the series itself. The question is
that of repeatability. 

This is not something that can be answered by looking at hardware only. Even if you are
running with the same primitive operations, the results could be different. Floating-point
arithmetic violates the distributivity and associativity laws of real numbers. Thus, the
error of a certain computation, even if algebraically equivalent, depends on the ordering
of operations (if you ever come close to the accuracy limits, roughly 1e-7 for floats and
1e-16 for doubles, or something like that). Since different compilers will order computation
differently, you cannot really expect to match a diverging series ...
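
A small Python illustration of the point about ordering, using ordinary doubles. The specific constants are just convenient examples; any values whose intermediate sums need rounding would do.

```python
import math

# Associativity: the grouping of additions changes the rounded result.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)               # the two groupings are compared bit for bit
print(math.isclose(left, right))   # they agree only up to rounding error

# Distributivity: multiplying before or after adding also differs.
k = 100.0
factored = k * (a + b)
expanded = k * a + k * b
print(factored == expanded)
print(factored - expanded)
```

Each discrepancy is about one unit in the last place, which is harmless for a single expression but is exactly the kind of seed error that a diverging iteration amplifies.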

Standard texts are:

ranko

Alex Miller

Feb 5, 2014, 2:07:44 PM
to clo...@googlegroups.com
Others have answered with many useful bits but I would mention that it would possibly make a significant performance difference if you added this to your code:

(set! *unchecked-math* true)

David Nolen

Feb 5, 2014, 2:13:08 PM
to clojure
Also:

(defn g ^double [^double x] (+ (Math/sin (* 2.3 x)) (Math/cos (* 3.7 x))))


Glen Fraser

Feb 5, 2014, 4:41:51 PM
to clo...@googlegroups.com
Thanks to both of you for these suggestions, they're good to know.  In my specific case, setting the *unchecked-math* flag true did indeed speed things up slightly (by about 6%).  The other change, though, with the double type hints (I assume that's what those are), actually ran notably slower (over 20% slower!).

Glen.

You received this message because you are subscribed to a topic in the Google Groups "Clojure" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/clojure/kFNxGrRPf2k/unsubscribe.
To unsubscribe from this group and all its topics, send an email to clojure+u...@googlegroups.com.

David Nolen

Feb 5, 2014, 4:56:00 PM
to clojure
(set! *unchecked-math* true)
(defn g ^double [^double x] (+ (Math/sin (* 2.3 x)) (Math/cos (* 3.7 x))))
(time (loop [i 100000000 x 0.0] (if (pos? i) (recur (dec i) (g x)) x)))

This is nearly 50% faster than the original version on my machine. Note that x is bound to 0.0 in the loop, which allows the optimized g to be invoked.

Glen Fraser

Feb 5, 2014, 5:30:50 PM
to clo...@googlegroups.com
Thanks, yes, the version starting with 0.0 in the loop (rather than 0) does run faster.  In my case, about 13% faster (19.7 seconds -- for the code you pasted below, with *unchecked-math*, type hints and starting x of 0.0 -- vs 22.7 seconds for my original version).  But if you start with x of 0 (integer), the type-hinted version runs notably slower.  In all cases, though, at least you get the same final answer… (-;

So I don't see that 50% speedup you're seeing, but I do see improvement.  I'm on Clojure 1.5.1, and Java 1.7.0_51 on OS X 10.8.5, running in an nREPL (cider) in Emacs.  Possibly other JDK versions have more optimizations?

Thanks
Glen.

David Nolen

Feb 5, 2014, 5:34:12 PM
to clojure
You need to make sure that you are running with server settings. If you are using lein, it's likely that this is not the case unless you have overridden lein's defaults in your project.clj.


Alex Miller

Feb 5, 2014, 6:05:18 PM
to clo...@googlegroups.com
To override the default tiered compilation, add this to your project.clj:
:jvm-opts ^:replace []
I would also recommend using a newer JDK (preferably 7, but at least 6). 

Colin Yates

Feb 5, 2014, 7:10:59 PM
to clo...@googlegroups.com
Did I see a thread a while ago where doing this caught some people out because it wiped out some other performance switches?  I can't find the thread.

Apologies if I am spreading FUD....

Daniel

Feb 5, 2014, 7:55:03 PM
to clo...@googlegroups.com
He is running 7.

Lee Spector

Feb 5, 2014, 8:28:25 PM
to clo...@googlegroups.com

On Feb 5, 2014, at 6:05 PM, Alex Miller wrote:

> To override the default tiered compilation, add this to your project.clj:
> :jvm-opts ^:replace []

I was under the impression that one can get the same effect by running your program with:

lein trampoline with-profile production run [etc]

True? I *think* the text here implies this too: https://github.com/technomancy/leiningen/blob/master/doc/TUTORIAL.md

FWIW my goal is to be able to run a project with one command, starting with a project that contains only source code (with none of my code pre-compiled), and have it do whatever it has to do to launch (I don't much care how long the launch takes) and then run as fast as possible (often for hours or days, CPU-bound). I launch my current runs with a command line like the one above, and I do also specify :jvm-opts in project.clj, specifically:

:jvm-opts ["-Xmx12g" "-Xms12g" "-XX:+UseParallelGC"]

Except that I have to tweak those 12s manually for different machines... Is there any way to specify "just use whatever's available"?

Since I'm supplying other :jvm-opts I was under the impression that I couldn't do the ^:replace [] thing... So is "with-profile production" going to have the same effect?

BTW I would also love input on the GC option. I'm also not at all sure that one is the best, but I generate lots of garbage across large numbers of cores so it seemed like a good idea. But then I read something here about the G1 GC... is that likely to be better? If so, does anyone know the string to include in :jvm-opts to use it?

Thanks (& sorry to include so many questions!),

-Lee

Bruce Adams

Feb 5, 2014, 8:50:28 PM
to clo...@googlegroups.com
Modern JVMs pick default heap sizes based on the physical memory in
your machine. With more than 1GB of physical memory, initial heap is
1/64 and maximum heap is 1/4 of physical memory.[1]

For OpenJDK and Oracle, this command:
java -XX:+PrintFlagsFinal -version | grep HeapSize
will show the initial and maximum heap sizes (along with a few other
numbers).
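
As a worked example of the rule of thumb above (initial heap 1/64, maximum heap 1/4 of physical memory), here is a small Python sketch. It only does the arithmetic; real JVMs apply additional caps and version-specific ergonomics, so the PrintFlagsFinal output is the authority. The function name default_heap_sizes is made up for this illustration.

```python
GIB = 1024 ** 3

def default_heap_sizes(physical_bytes):
    """Apply the 1/64 (initial) and 1/4 (maximum) rule of thumb.

    This mirrors only the fractions quoted in the thread, not the
    JVM's actual ergonomics code.
    """
    initial = physical_bytes // 64
    maximum = physical_bytes // 4
    return initial, maximum

if __name__ == "__main__":
    # The two machine sizes mentioned later in the thread.
    for gib in (12, 64):
        initial, maximum = default_heap_sizes(gib * GIB)
        print(gib, "GiB physical:", initial / GIB, "GiB initial,",
              maximum / GIB, "GiB maximum")
```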

Also, you may not want to set the initial heap size as large as the
maximum heap size. Oracle[2] says (in part):

> Setting -Xms and -Xmx to the same value increases predictability by removing the most important sizing decision from the virtual machine. However, the virtual machine is then unable to compensate if you make a poor choice.

- Bruce

[1]
http://stackoverflow.com/questions/2915276/what-is-the-default-maximum-heap-size-for-suns-jvm-from-java-se-6
[2]
http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#generation_sizing.total_heap

Lee Spector

Feb 5, 2014, 9:15:52 PM
to clo...@googlegroups.com

On Feb 5, 2014, at 8:50 PM, Bruce Adams wrote:
> Modern JVMs pick default heap sizes based on the physical memory in
> your machine. With more than 1GB of physical memory, initial heap is
> 1/64 and maximum heap is 1/4 of physical memory.[1]
>
> For OpenJDK and Oracle, this command:
> java -XX:+PrintFlagsFinal -version | grep HeapSize
> will show the initial and maximum heap sizes (along with a few other
> numbers).

Thanks Bruce. Do you happen to know if there's a way to specify different fractions? I'd like something more like 3/4 (or 7/8) than 1/4. Nothing else will be running on the machine (aside from the OS and maybe tiny other things), and I want it to take everything it might need. I realize I can hardcode specific sizes, but then I have to change it for every machine configuration it runs on, which is what I was hoping to avoid. I have 64GB on some of my machines, 12GB on others, etc., and I'd like to use most of what's available wherever I run, preferably without changing project.clj every time.

> Also, you may not want to set the initial heap size as large as the
> maximum heap size. Oracle[2] says (in part):
>
>> Setting -Xms and -Xmx to the same value increases predictability by removing the most important sizing decision from the virtual machine. However, the virtual machine is then unable to compensate if you make a poor choice.

Using the same, maximal limit for both -Xms and -Xmx could only be a poor choice in the sense of using more memory than necessary, right? Since I'm happy to use every available byte and am only concerned about speed, it seems like this should be okay.

Thanks,

-Lee

Michał Marczyk

Feb 5, 2014, 11:42:58 PM
to clojure
This returns the total physical memory size, in bytes:

(.getTotalPhysicalMemorySize
  (java.lang.management.ManagementFactory/getOperatingSystemMXBean))

You could use this in your project.clj, perhaps by including

~(str "-Xms" (quot (.getTotalPhysicalMemorySize ...) appropriate-number))

in :jvm-opts.

Also, you can absolutely use your own :jvm-opts with :replace.

Cheers,
Michał


PS. getTotalPhysicalMemorySize is declared by
com.sun.management.OperatingSystemMXBean. getOperatingSystemMXBean's
return type is actually java.lang.management.OperatingSystemMXBean,
but the actual returned value does implement the com.sun interface. I
just tested this on a Linux system, hopefully it'll also work on other
platforms.

Bruno Kim Medeiros Cesar

Feb 6, 2014, 7:07:40 AM
to clo...@googlegroups.com, jonathand...@googlemail.com
Just to add a bit to the thread: the Java compiler treats java.lang.Math differently when more efficient alternatives are available. StrictMath is used only as a fallback.

From the java.lang.Math Javadoc:

> By default many of the Math methods simply call the equivalent method in StrictMath for their implementation. Code generators are encouraged to use platform-specific native libraries or microprocessor instructions, where available, to provide higher-performance implementations of Math methods. Such higher-performance implementations still must conform to the specification for Math.


As most probably all your versions use the same native libraries or hardware instructions, the differences must come from either floating-point configuration parameters, such as rounding modes, or the order of operations.

Bruno Kim Medeiros Cesar

Feb 6, 2014, 7:18:07 AM
to clo...@googlegroups.com, jonathand...@googlemail.com
Also, I've made a test for this function for all float values in C: https://gist.github.com/brunokim/8843039

Unfortunately it doesn't work on my system, as it does not have other rounding modes available besides the default. If anyone succeeds in running it, please report.

Glen Fraser

Feb 6, 2014, 9:38:11 AM
to clo...@googlegroups.com
Probably because you're using 0 through 3 as the arguments to fesetround(), rather than the proper #defined values:

(e.g. on my Mac, from fenv.h)

#define FE_TONEAREST        0x0000
#define FE_DOWNWARD         0x0400
#define FE_UPWARD           0x0800
#define FE_TOWARDZERO       0x0c00

You should use the defines.

Glen.


Lee Spector

Feb 7, 2014, 11:22:06 AM
to clo...@googlegroups.com

On Feb 5, 2014, at 11:42 PM, Michał Marczyk wrote:

> This returns
>
> (.getTotalPhysicalMemorySize
> (java.lang.management.ManagementFactory/getOperatingSystemMXBean))
>
> You could use this in your project.clj, perhaps by including
>
> ~(str "-Xms" (quot (.getTotalPhysicalMemorySize ...) appropriate-number))
>
> in :jvm-opts.


Very cool. I had no idea I could do computation in project.clj. The following seems to work to allocate 80% of a machine's RAM to my process (launched with "lein trampoline with-profile production run"):

:jvm-opts [~(str "-Xmx"
                 (long (* (.getTotalPhysicalMemorySize
                            (java.lang.management.ManagementFactory/getOperatingSystemMXBean))
                          0.8)))
           ~(str "-Xms"
                 (long (* (.getTotalPhysicalMemorySize
                            (java.lang.management.ManagementFactory/getOperatingSystemMXBean))
                          0.8)))
           "-XX:+UseParallelGC"]

I'll ask more about the GC part in another thread.

> Also, you can absolutely use your own :jvm-opts with :replace.

How do I combine them? Does the big vector above just replace the [] in ":jvm-opts ^:replace []"?

Also, does this (the :replace part) in fact do the same thing as putting "with-profile production" on the command line? So if I do this I can simplify my command line to "lein trampoline run"?

Thanks!

-Lee

Andy Fingerhut

Feb 7, 2014, 11:45:59 AM
to clo...@googlegroups.com
You may also use a let form wrapped around your entire defproject if you want to avoid the duplication of code present in your example.

Andy




Lee Spector

Feb 7, 2014, 11:51:45 AM
to clo...@googlegroups.com
On Feb 7, 2014, at 11:45 AM, Andy Fingerhut wrote:

> You may also use a let form wrapped around your entire defproject if you want to avoid the duplication of code present in your example.

Thanks -- I actually noticed that after I posted. I don't know why, but I never thought of project.clj as containing code that gets executed before. Opens up lots of possibilities, I think.

-Lee