I've got the following Scala code:

    object Foo {
      def main(argv: Array[String]) {
        var bits = 0L
        val start = System.currentTimeMillis()
        var n = 2000000001
        // var n = 2000000001L // makes things very slow
        while (n > 0) {
          bits = bits ^ (1 << 5)
          n = n - 1
        }
        System.out.println(bits)
        val end = System.currentTimeMillis()
        System.out.println(end-start)
      }
    }

I'm enclosing the source and the bytecode.
There are 2B iterations.
On my Core 2 Quad running JDK 1.6 (32 bit), the code takes 2 ms to run.
On my Mac Book Pro (Core Duo, JDK 1.5) it takes 6,600 ms.
The run time on the Mac seems more "reasonable". What's going on?
David,

For me 1.5 and 1.6 perform about the same... 1.6 is a little faster. However, -server yields the surprisingly fast results that you are seeing under 1.6, and -client yields the slow results that you are seeing under 1.5.
It still doesn't explain why it's so fast...but it's another data point. I'm using an ancient computer with Linux.
-Erik
On Nov 30, 2007 5:49 PM, David Pollak <feeder.of.the.be...@gmail.com> wrote:
> The run time on the Mac seems more "reasonable". What's going on?
For what it's worth, the obvious Java translation does the same thing:

    class Foo{
      public static void main(String[] args){
        long bits = 0L;
        long start = System.currentTimeMillis();
        int n = 2000000001;
        while (n > 0) {
          bits ^= (1 << 5);
          n--;
        }
        System.out.println(bits);
        long end = System.currentTimeMillis();
        System.out.println(end-start);
      }
    }

So it isn't a quirk of the bytecode Scala produces.
On Nov 30, 2007 11:18 PM, Erik Engbrecht <erik.engbre...@gmail.com> wrote:
> David,
> For me 1.5 and 1.6 perform about the same...1.6 is a little faster.
> However, -server yields the surprisingly fast results that you are seeing
> under 1.6 and -client yields the slow results that you are seeing under 1.5.
> It still doesn't explain why it's so fast...but it's another data point.
> I'm using an ancient computer with Linux.
> -Erik
Yeah, the Scala code was a port of Java code that does similar things. With the Scala command line and some other Scala tools, it's much faster to make a change and test the change with Scala than with Java.
On 11/30/07, David MacIver <david.maci...@gmail.com> wrote:
> So it isn't a quirk of the bytecode scala produces
David Pollak wrote:
> On my Core 2 Quad running JDK 1.6 (32 bit), the code takes 2 ms to run.
> On my Mac Book Pro (Core Duo, JDK 1.5) it takes 6,600 ms.
> The run time on the Mac seems more "reasonable". What's going on?
If the Core 2 is a server-class machine, it will run the server VM by default, which may be able to optimize much of the code away. Do you get the same results when you force the server or the client VM on both machines?
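
One way to check which VM a given run actually used is to print the standard java.vm.name system property. The sketch below is only illustrative: the class name WhichVm and the helper describeVm are made up for this example, and the exact string reported depends on the vendor and version (on HotSpot builds it typically contains "Client VM" or "Server VM").

```java
class WhichVm {
    // Builds a short description from two standard system properties.
    // WhichVm and describeVm are hypothetical names used only for this sketch.
    static String describeVm() {
        return System.getProperty("java.vm.name")
                + " (Java " + System.getProperty("java.version") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describeVm());
    }
}
```

Running it once with -client and once with -server on each machine should show whether the flag actually changed the VM that was started.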
On a 64-bit machine running Linux there's no difference between the client and server times. I don't know whether this is because the 64-bit client and server VM optimisations aren't as different as they are in 32-bit mode or whether there's some advantage to 64-bit here (if there is, gcc isn't taking advantage of it, see below).

The obvious C translation

    #include <stdio.h>

    int main(){
      long bits = 0L;
      int n = 2000000001;
      while (n > 0) {
        bits ^= (1 << 5);
        n--;
      }
      printf("%ld\n", bits);  /* %ld, because bits is a long */
      return 0;
    }

seems to take longer than the client time, regardless of whether it's in 64- or 32-bit mode.
On Dec 1, 2007 3:18 AM, David Pollak <feeder.of.the.be...@gmail.com> wrote:
> Yeah, the Scala code was a port of Java code that does similar things. With
> the Scala command line and some other Scala tools, it's much faster to make
> a change and test the change with Scala than with Java.
"Scala faster than C" post to reddit, anyone? ;-)

On 1 Dec 2007, at 10:53, David MacIver wrote:
> More interesting data points:
I think I've figured out what's going on. I can't figure out how to get enough debug information out of the server HotSpot VM to verify this, but it seems to make sense, and performing the translation manually produces comparable speedups in the client VM.

    class Foo{
      public static void main(String[] args){
        long bits = 0L;
        long start = System.currentTimeMillis();
        int n = 2000000001;
        while (n > 0) {
          bits ^= (1 << 5);
          n--;
        }
        System.out.println(bits);
        long end = System.currentTimeMillis();
        System.out.println(end-start);
      }
    }

So we can remove that calculation from the loop entirely.
So basically the reason the server VM is so freaking fast is that the loop body is practically empty. :-) It's just the end result of a few simple "throwing away unnecessary work" optimisations, nothing magic.
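
To make the argument concrete, here is a sketch of what applying such simplifications by hand might look like. It rests on two facts: 1 << 5 is the constant 32, and XORing with the same value twice restores the original value, so each pair of iterations cancels. The class name UnrolledLoop and the method names original and simplified are hypothetical, and this is only a guess at the kind of transformation involved, not a record of what the server VM actually emits.

```java
class UnrolledLoop {
    // The loop exactly as posted in the thread.
    static long original(int n) {
        long bits = 0L;
        while (n > 0) {
            bits ^= (1 << 5);
            n--;
        }
        return bits;
    }

    // Hand-simplified version (an assumption about what an optimizer could do).
    static long simplified(int n) {
        long bits = 0L;
        // Two iterations at a time: "bits ^= 32; bits ^= 32;" cancels out,
        // so only the counter update is left in the loop body.
        while (n > 1) {
            n -= 2;
        }
        // At most one iteration remains, when the original count was odd.
        if (n > 0) {
            bits ^= 32L; // 1 << 5 folded to a constant
        }
        return bits;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        long bits = simplified(2000000001);
        long end = System.currentTimeMillis();
        System.out.println(bits);
        System.out.println(end - start);
    }
}
```

Because 2000000001 is odd, both methods return 32 for that count, but the second loop has nothing left in its body except the counter update.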
On Dec 1, 2007 1:13 AM, David MacIver <david.maci...@gmail.com> wrote:
> So it isn't a quirk of the bytecode scala produces
> On a 64-bit machine running linux there's no difference between the
> client and server times. I don't know whether this is because the
> 64-bit client and server VM optimisations aren't as different as they
> are in 32-bit mode or whether there's some advantage to 64-bit here
> (if there is, gcc isn't taking advantage of it, see below).
I believe that the 64-bit Sun JVM defaults to server mode. The expectation is that 64-bit things would be used on a server.
On Dec 1, 2007 4:17 PM, Richard Warburton <richard.warbur...@gmail.com> wrote:
> I believe that the 64bit SUN JVM defaults to server mode. The
> expectation is that 64bit things would be used on a server.
I tried it with the -client flag as well. So it may not be a case of default so much as "Always runs in server mode".
It would definitely explain some of the problems I've seen with Swing on 64-bit VMs.
> I tried it with the -client flag as well. So it may not be a case of
> default so much as "Always runs in server mode".
And if I'd spent five minutes googling I would have discovered that this is indeed the case. Doh.