performnace and accuracy of big numbers

136 views
Skip to first unread message

Alex Tatumizer

unread,
Feb 12, 2014, 1:00:54 PM2/12/14
to mi...@dartlang.org
Somebody just published benchmark for V8: http://jxcore.com/document/v8vsllvm-numbers-test/
Obviously, I couldn't miss a chance to compare with dart.
Results are alarming.
Dart performs by 100 times slower than js (I really ran their code in Chrome to make sure this is not an optical illusion or something).
Sure, "fibi" returns enormous numbers, but somehow javascript can cope with them, and dart can't.

Also deserves mentioning that dart, in my opinion, takes too much freedom while dealing with ints - it can round stuff to whatever precision it sees fit.
E.g., fibi(100) returns 218922995834555169026 in javascript and 218922995834555200000 in dart. Close, but not the same.
WTF?

here is my translation of their program
fibi (int x){
   
   
int a = 1, b = 1, c;
   
for (int i = 3; i < x; i++){
        c
= a + b;
        a
= b;
        b
= c;
   
}
   
return b;
}
 
test
() {
 
int t = 0;
   
 
var w = new Stopwatch()..start();
   
 
for(int x=0;x<500;x++){
     
for(int z=0;z<500;z++){
          t
+= fibi(z + 100) % 991;
     
}
 
}
   
 
print("total ${w.elapsedMilliseconds} $t");
}
main
() {
 
print(fibi(100));
  test
();

}


Alex Tatumizer

unread,
Feb 12, 2014, 1:56:12 PM2/12/14
to mi...@dartlang.org
Not all is bad though: they have another test for strings http://jxcore.com/document/v8vsllvm-string-test/
In this test,  js program completes in 558ms (in my Chrome), and dart runs with alacrity at 74 ms.
And surprisingly, the computed value is the same in both experiments!



Vyacheslav Egorov

unread,
Feb 12, 2014, 1:59:35 PM2/12/14
to General Dart Discussion
Alex, there are no big integers in JavaScript. To be completely precise there are no small integers either, there are only doubles :-) Here is more accurate translation of their code to Dart

fibi (x) {
    double a = 1.0, b = 1.0, c;
    for (int i = 3; i < x; i++){
        c = a + b;
        a = b;
        b = c;
    }
    return b;
}

test () {
  double t = 0.0;

  var w = new Stopwatch()..start();

  for(int x=0;x<500;x++){
      for(int z=0;z<500;z++){
          t += fibi(z + 100) % 991;
      }
  }

  print("total ${w.elapsedMilliseconds} $t");
}

main() {
  print(fibi(100));
  test();
}

The question "what on earth are we trying to compute" I leave to curious minds.


// Vyacheslav Egorov


--
For other discussions, see https://groups.google.com/a/dartlang.org/
 
For HOWTO questions, visit http://stackoverflow.com/tags/dart
 
To file a bug report or feature request, go to http://www.dartbug.com/new

To unsubscribe from this group and stop receiving emails from it, send an email to misc+uns...@dartlang.org.

Alex Tatumizer

unread,
Feb 12, 2014, 2:08:27 PM2/12/14
to mi...@dartlang.org
OMG! I was fooled by a scam! Now I recall that I really knew about this js quirk, but forgot.
With doubles, our timing is exactly the same as theirs. (Is it good enough?)

Jon Valdes

unread,
Feb 13, 2014, 2:02:30 AM2/13/14
to mi...@dartlang.org


On Wednesday, February 12, 2014 8:08:27 PM UTC+1, Alex Tatumizer wrote:
OMG! I was fooled by a scam! Now I recall that I really knew about this js quirk, but forgot.
With doubles, our timing is exactly the same as theirs. (Is it good enough?)


Here's a C++11 version of the same test. Should give you a rough estimate of what's the upper limit for performance in this case.


#include <stdio.h>
#include <chrono>
#include <cmath>
using namespace std::chrono;

double fibi (double x) {
    double a = 1.0;
    double b = 1.0;
    double c;
    for (int i = 3; i < x; i++){
        c = a + b;
        a = b;
        b = c;
    }
    return b;
}

void test () {
    double t = 0.0;
    high_resolution_clock timer;
    auto start = timer.now();

    for(int x=0;x<500;x++){
        for(int z=0;z<500;z++){
            t += fmod(fibi(z + 100), 991);
        }
    }
    auto end = timer.now();
    auto elapsed = duration_cast<milliseconds>(end-start).count();
    printf("Elapsed milliseconds: %lld\n", elapsed);
}

int main() {
    printf("Value: %f\n", fibi(100));
    test();

Jos Hirth

unread,
Feb 13, 2014, 2:38:23 AM2/13/14
to mi...@dartlang.org
With JavaScript (or doubles for that matter), the biggest integer you can represent is 2^53 or 9007199254740992 (9 quadrillion, a 9 followed by 15 zeros).

9007199254740992
218922995834555169026

Just by looking at the number of digits, you can easily tell that the result must be completely bogus.

They should have used Octane or any of the other benchmark suites.

By the way, the Benchmarks Game does include one big int benchmark:


Currently, Dart is at 15x which isn't too hot. The Python implementation it's based on is over 10 times faster.

Lasse R.H. Nielsen

unread,
Feb 13, 2014, 3:34:22 AM2/13/14
to mi...@dartlang.org
On Thu, Feb 13, 2014 at 8:38 AM, Jos Hirth <google...@kaioa.com> wrote:
With JavaScript (or doubles for that matter), the biggest integer you can represent is 2^53 or 9007199254740992 (9 quadrillion, a 9 followed by 15 zeros).

Needless pedantry (isn't that an oxymoron?): 

The largest number representable by a double is: 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.0 
(aka 1.7976931348623157e+308). It's also an integer, so *that's* the biggest integer you can represent :)

Now, the number 2^53+1 is the smallest integer value that you *cannot* represent with a double. That means that any result above this is (increasingly) likely to be an approximation of the real value. A double can have at most 53 significant binary digits (and any integer with at most 53 significant binary digits can be a double, up to the max above).

So with that pedantry aside, your argument is absolutely correct:


9007199254740992
218922995834555169026

Just by looking at the number of digits, you can easily tell that the result must be completely bogus.

True. And 218922995834555169026 has 67 significant binary digits, so it's definitely not representable as a double.

I also think 218922995834555169026 is the Dart result, not the JS result, and It's actually fib(98) or fib(99), not fib(100). Which on it is depends on whether fib(1) == 1 or fib(1) == 0.

/L 'Benchmarking is hard, let's go shopping'
--
Lasse R.H. Nielsen - l...@google.com  
'Faith without judgement merely degrades the spirit divine'
Google Denmark ApS - Frederiksborggade 20B, 1 sal - 1360 København K - Denmark - CVR nr. 28 86 69 84

Jos Hirth

unread,
Feb 13, 2014, 3:58:06 AM2/13/14
to mi...@dartlang.org
Touché. :)

Alex Tatumizer

unread,
Feb 13, 2014, 9:40:49 AM2/13/14
to mi...@dartlang.org
Let's switch topic a bit.
Consider the following etude in C sharp (see *):

  print(identical(#C, #C)); // true
 
print(identical((#C).toString(), (#C).toString()));//surprise:false

What follows from this gedanken experiment?
1) symbols are interned, like constant strings
2) symbol doesn't have a string representation inside
3) symbol doesn't even cache string representation when it becomes known.

Another observation: when you use symbols as keys to the map, the performance is horrible!
When you use #C as a key vs "C" as a key, the former is by 6 times slower than the latter.

Since the whole mirror system revolves around symbols, one needs symbols as keys very often. 
Actually, I don't understand what's going on here. Fine, symbol doesn't contain string. But it certainly contains *something*. Why not compute hash code for this "something" and cache it?

In another experiment, I compared performance of Symbols as keys with performance of Symbol.toString() as key. The numbers are very close, so my working hypothesis is that Symbol (internally) converts itself to String each time when hashCode is requested.  I have no other explanation for effect. Vyacheslav, Lasse: please disprove. 

(*) I experimented also with the runs in F sharp - it made no difference whatever.

Lasse R.H. Nielsen

unread,
Feb 13, 2014, 10:12:51 AM2/13/14
to mi...@dartlang.org
That would be hard to disprove, considering this code taken from the internal symbol.dart file:

  final String _name;
  ...
  int get hashCode {
    const arbitraryPrime = 664597;
    return 0x1fffffff & (arbitraryPrime * _name.hashCode);
  }

  toString() => 'Symbol("$_name")';

So, the symbol does have a string inside it. 
The hashCode does use that string's hashCode.
The toString method doesn't return that string, but a new string created by wrapping "Symbol(...)" around it, which is why the results of toString() are not identical, it's a new string each time.

I don't know if the VM is able to detect that the multiplication in hashCode can be done as a smi operation, otherwise it might be more expensive than necessary.
 
In complete generality: don't use toString for anything but debugging, please :)
For anything but numbers, the "toString" result is created for readability, not efficiency.


This is the internal representation of a (non private) symbol. It's not exposed, so we may change it any time (for instance for performance reasons).
We can't cache the hashCode, though, since Symbol is a const class - all fields must be final.

And for the record, I think it is a mistake that we return the internal string in the toString, but not make it available in any other way (q.v., Josh Bloch's Effective Java, last paragraph of item 9).

/L

Alex Tatumizer

unread,
Feb 13, 2014, 12:10:59 PM2/13/14
to mi...@dartlang.org
@Lasse: what problem is addressed by multiplying String's hashCode by arbitrary prime? Why don't you return the same hashCode as _name.hashCode (which is cached)?



Alan Knight

unread,
Feb 13, 2014, 12:32:16 PM2/13/14
to General Dart Discussion
...
 
And for the record, I think it is a mistake that we return the internal string in the toString, but not make it available in any other way (q.v., Josh Bloch's Effective Java, last paragraph of item 9).

It's available via MirrorSystem.getName, just not as a method on Symbol. My understanding is that this is to ensure that if you're trying to get the name you know statically exactly when and where, because in a minified system doing this operation can require you to use a lot of additional space.

But the hash operation seems very peculiar to me. But perhaps less so if we look at 

identical(new Symbol("C"), new Symbol("C"))  => false

It seems to me that part of the point of having symbols is the interning, so then you can just use identity hashing.

 

Bob Nystrom

unread,
Feb 13, 2014, 12:59:05 PM2/13/14
to General Dart Discussion

On Thu, Feb 13, 2014 at 9:10 AM, Alex Tatumizer <tatu...@gmail.com> wrote:
@Lasse: what problem is addressed by multiplying String's hashCode by arbitrary prime? Why don't you return the same hashCode as _name.hashCode (which is cached)?

I assume it's to help spread out keys when you have a mixture of symbols and strings with the same name.

- bob

Alex Tatumizer

unread,
Feb 13, 2014, 1:03:38 PM2/13/14
to mi...@dartlang.org
> I assume it's to help spread out keys when you have a mixture of symbols and strings with the same name.
in the same Map??? Is it a sufficient ground to degrade performance by a factor of 6?
Wouldn't it be more fair if whoever mixes #C with "C" in the same map should pay a price, not those who don't?

Reply all
Reply to author
Forward
0 new messages