nbody code - any suggestions for improvement?
 
Ian Ozsvald  
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Wed, 1 Jun 2011 18:08:55 +0100
Local: Wed, Jun 1 2011 1:08 pm
Subject: nbody code - any suggestions for improvement?
Here's the nbody code I'm using for the EuroPython tutorial. I'm
comparing CPython, PyPy, Cython, ShedSkin and a few other tools (and
looking at profiling first); the goal is to teach people about their
options for making CPU-bound code faster.

The nbody code is based on the language shootout here:
http://shootout.alioth.debian.org/u32/performance.php?test=nbody

I'm not worried about whether the test is really 'apples vs apples', I
just care about the rough orders of magnitude. E.g. CPython on their
Intel box takes 20 mins, JavaScript V8 takes 71 seconds, Fortran, C
and Java take about 20 seconds each. Their site seems to be undergoing an
upgrade right now - the graphs don't work and you can't seem to get to
the source code (but I have a copy of the nbody code).

I had to make a minor change to the code so that it would compile in
ShedSkin (the main dictionary had lists and floats, now it just has
lists *of* floats and I dereference mass where required). Here's the
code:
http://dl.dropbox.com/u/1314015/nbody_shedskin.py
- it is just a couple of functions, 'advance' is the monster that eats
all the time.
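
A rough sketch of the kind of change described above, assuming the dictionary layout of the shootout's Python n-body program (a position list, a velocity list and a mass per body). The names and values are illustrative assumptions, not the contents of nbody_shedskin.py:

```python
# Hypothetical illustration; names and layout are assumed, not copied from
# nbody_shedskin.py.
SOLAR_MASS = 4 * 3.14159265358979323 ** 2

# Mixed value types: two lists of floats plus a bare float per body.
bodies_mixed = {
    "sun": ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], SOLAR_MASS),
}

# Uniform value types: every element is a list of floats, which is the
# shape the post says ShedSkin accepted.
bodies_uniform = {
    "sun": ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [SOLAR_MASS]),
}


def mass_of(bodies, name):
    """Dereference the one-element mass list where a plain float is needed."""
    position, velocity, mass = bodies[name]
    return mass[0]


print(mass_of(bodies_uniform, "sun") == SOLAR_MASS)
```

The cost of the uniform layout is the extra `[0]` wherever the mass is used.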

Roughly speaking, for 50,000,000 iterations (e.g. 'python2.7
nbody_shedskin.py 50000000') on my 2GHz MacBook:
CPython 35 mins
PyPy 1.5 JIT 5mins11sec
ShedSkin0.8 -l 2min28sec
ShedSkin0.8 -l -b -w 1min56sec
(e.g. shedskin -l -b -w nbody_shedskin.py; make; ./nbody_shedskin 50000000)
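
To make the rough orders of magnitude explicit, the timings quoted above can be converted to seconds and expressed as speedups over CPython. This is only arithmetic on the figures in this post; the variable names are made up for the example:

```python
# Timings quoted above, converted to seconds.
timings = [
    ("CPython", 35 * 60),
    ("PyPy 1.5 JIT", 5 * 60 + 11),
    ("ShedSkin0.8 -l", 2 * 60 + 28),
    ("ShedSkin0.8 -l -b -w", 1 * 60 + 56),
]

# Speedup of each tool relative to the CPython run.
baseline = float(timings[0][1])
speedups = dict((name, baseline / seconds) for name, seconds in timings)

for name, seconds in timings:
    print("%-22s %5ds  %5.1fx" % (name, seconds, speedups[name]))
```

On these numbers PyPy comes out at roughly 7x CPython and the two ShedSkin builds at roughly 14x and 18x.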

Ultimately this means that the compiled and 'best' version of ShedSkin
that I can make (and I'm hoping you can spot any flaws I've made...)
is still beaten by JavaScript V8! I'd love to be able to announce
better figures during my tutorial at EuroPython. However - ShedSkin
does beat PyPy (and PyPy nicely beats CPython). These are great
results, I'd just like to know if I've missed anything obvious in the
benchmark.

In each of the above 4 test cases I confirm that only 1 CPU (50% of my
dual-core MacBook) is used. I'm using CPython 2.7 32bit.

Any feedback gratefully received,
Ian.

--
Ian Ozsvald (A.I. researcher, screencaster)
i...@IanOzsvald.com

http://IanOzsvald.com
http://SocialTiesApp.com/
http://MorConsulting.com/
http://blog.AICookbook.com/
http://TheScreencastingHandbook.com
http://FivePoundApp.com/
http://twitter.com/IanOzsvald


 
Mark Dufour
From: Mark Dufour <mark.duf...@gmail.com>
Date: Wed, 1 Jun 2011 20:21:55 +0200
Local: Wed, Jun 1 2011 2:21 pm
Subject: Re: nbody code - any suggestions for improvement?

hi ian,

thanks for all your mails :) I will try to reply to all of them soon. for
now, please see attachment for a version of your code that is almost twice
as fast after compilation.

the main difference is that I added a 'body' class, and use attributes
instead of dicts and tuples to hold the physical constants. dict and tuple
indexing is very slow compared to simple attribute accesses in C/C++.
attributes are slower in cpython I guess, and that's probably the reason
this shootout implementation was coded up as it was, but there you have it.
imnsho, this version is also more readable.. ;)
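
A minimal sketch of the attribute-based layout described above. The class name, field names and the simple integration step are assumptions made for illustration; this is not the attached nbody2.py:

```python
# Hypothetical sketch: class and field names are assumptions, not the
# contents of the attached nbody2.py.
class Body(object):
    def __init__(self, x, y, z, vx, vy, vz, mass):
        self.x, self.y, self.z = x, y, z
        self.vx, self.vy, self.vz = vx, vy, vz
        self.mass = mass


def advance(bodies, dt):
    """Advance all bodies by one time step using plain attribute access."""
    n = len(bodies)
    for i in range(n):
        a = bodies[i]
        for j in range(i + 1, n):
            b = bodies[j]
            dx = a.x - b.x
            dy = a.y - b.y
            dz = a.z - b.z
            dist2 = dx * dx + dy * dy + dz * dz
            mag = dt / (dist2 * dist2 ** 0.5)  # dt / distance**3
            a.vx -= dx * b.mass * mag
            a.vy -= dy * b.mass * mag
            a.vz -= dz * b.mass * mag
            b.vx += dx * a.mass * mag
            b.vy += dy * a.mass * mag
            b.vz += dz * a.mass * mag
    for body in bodies:
        body.x += dt * body.vx
        body.y += dt * body.vy
        body.z += dt * body.vz


# Two equal masses at rest on the x axis should start moving toward each other.
left = Body(-1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
right = Body(1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
advance([left, right], 0.01)
print(left.vx > 0.0 and right.vx < 0.0)
```

Every field access here is a fixed attribute lookup rather than a dict or tuple index, which is the difference the post credits for the speedup after compilation.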

thanks!
mark.

--
http://www.youtube.com/watch?v=E6LsfnBmdnk

Attachment: nbody2.py (4K)

 
Brent Pedersen
From: Brent Pedersen <bpede...@gmail.com>
Date: Wed, 1 Jun 2011 12:24:05 -0600
Local: Wed, Jun 1 2011 2:24 pm
Subject: Re: nbody code - any suggestions for improvement?

Heh, I was just writing to explain that I did something almost identical here:
http://paste.pocoo.org/show/399045/


 
Mark Dufour
From: Mark Dufour <mark.duf...@gmail.com>
Date: Wed, 1 Jun 2011 20:32:58 +0200
Local: Wed, Jun 1 2011 2:32 pm
Subject: Re: nbody code - any suggestions for improvement?

> Heh, I was just writing to explain that I did almost identical here:
> http://paste.pocoo.org/show/399045/

ah, I was probably faster because I practiced this.. :S

http://gitorious.org/shedskin/mainline/blobs/master/examples/nbody.py

note that here the algorithm has also been improved (by douglas mcneill
iirc).

thanks!!
mark.

Brent Pedersen
From: Brent Pedersen <bpede...@gmail.com>
Date: Wed, 1 Jun 2011 12:39:48 -0600
Local: Wed, Jun 1 2011 2:39 pm
Subject: Re: nbody code - any suggestions for improvement?

On Wed, Jun 1, 2011 at 12:32 PM, Mark Dufour <mark.duf...@gmail.com> wrote:

>> Heh, I was just writing to explain that I did almost identical here:
>> http://paste.pocoo.org/show/399045/

> ah, I was probably faster because I practiced this.. :S

> http://gitorious.org/shedskin/mainline/blobs/master/examples/nbody.py

cool! I didn't know you could use an empty class like that and just
assign the attributes after.
That (and yours) is also faster since it's not indexing into the
global MASS thing, which seems weird even for a normal python
implementation.

Also, Ian, you can make the code shorter using itertools.combinations:

PAIRS = list(combinations(SYSTEM, 2))

not sure of the effect on speed.
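
A small sketch of the itertools.combinations suggestion. The SYSTEM list below is a hypothetical stand-in (plain strings rather than body records), used only to show that the result matches hand-written nested loops:

```python
from itertools import combinations

# Hypothetical stand-in for the SYSTEM list of bodies mentioned above.
SYSTEM = ["sun", "jupiter", "saturn", "uranus", "neptune"]

# Hand-written nested loops that build every unordered pair once.
pairs_loop = []
for i in range(len(SYSTEM) - 1):
    for j in range(i + 1, len(SYSTEM)):
        pairs_loop.append((SYSTEM[i], SYSTEM[j]))

# The same pairs, in the same order, from the standard library.
PAIRS = list(combinations(SYSTEM, 2))

print(PAIRS == pairs_loop)
print(len(PAIRS))
```

If the pair list is built once before the main loop, as it is in this sketch, the choice probably matters more for readability than for run time.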


 
Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Wed, 1 Jun 2011 22:09:38 +0100
Local: Wed, Jun 1 2011 5:09 pm
Subject: Re: nbody code - any suggestions for improvement?
Gentlemen - many thanks :-)

Mark - I'm using your version. Brent, cheers for your version too.

Re. the version in gitorious, it evolves differently so I'll ignore it
(trying not to investigate too many things at once!), I hadn't
realised that someone had tried this already :-) If I'd looked in
the examples/ directory I'd have known!

My goal is to try to keep the programs mostly the same (I only changed
the shedskin version to make it compile) and to try various tools to
make the code faster. I'm being pragmatic and trying to teach how I
make code faster+maintainable for clients (and often - clients who
don't want to learn new things or change the way they support
things!).

More tomorrow I guess,
i.

On 1 June 2011 19:39, Brent Pedersen <bpede...@gmail.com> wrote:

Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Wed, 1 Jun 2011 22:11:47 +0100
Local: Wed, Jun 1 2011 5:11 pm
Subject: Re: nbody code - any suggestions for improvement?
Interestingly - I think I can claim that the Mark/Brent version would
beat the V8 Javascript benchmark. In terms of squeezing a lot of
performance out of a piece of code with little work, it is quite
impressive.

Brent - I note your point about the odd indexing approach in the
original code. I agree, I found it quite tricky to read. However, that
often occurs in client HPC projects and if I go changing their code
too much, they reject the alterations in favour of what they know.
That'll be part of the story I tell.

i.

On 1 June 2011 22:09, Ian Ozsvald <i...@ianozsvald.com> wrote:

Thomas Spura
From: Thomas Spura <toms...@fedoraproject.org>
Date: Thu, 2 Jun 2011 02:02:06 +0200
Local: Wed, Jun 1 2011 8:02 pm
Subject: Re: nbody code - any suggestions for improvement?
On Wed, 1 Jun 2011 20:32:58 +0200

Mark Dufour wrote:
> > Heh, I was just writing to explain that I did almost identical here:
> > http://paste.pocoo.org/show/399045/

> ah, I was probably faster because I practiced this.. :S

> http://gitorious.org/shedskin/mainline/blobs/master/examples/nbody.py

> note that here the algorithm has also been improved (by douglas
> mcneill iirc).

> thanks!!
> mark.

When I change the division by distance**3 to a division by
(distance*distance*distance), the run time drops to 55-60% of what it was.

Maybe the power function could still need some love to be optimized...
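
A quick way to compare the two forms, as a sketch only. It runs under the Python interpreter, so the timings it prints say nothing about the compiled ShedSkin output that the 55-60% figure refers to; the value of `distance` is arbitrary:

```python
import timeit

distance = 1.2345  # arbitrary test value

cubed_pow = distance ** 3
cubed_mul = distance * distance * distance

# The two forms may differ in the last bits, so compare with a tolerance.
print(abs(cubed_pow - cubed_mul) <= 1e-12 * abs(cubed_mul))

# Rough interpreter-level timings for each form.
t_pow = timeit.timeit("d ** 3", setup="d = 1.2345", number=200000)
t_mul = timeit.timeit("d * d * d", setup="d = 1.2345", number=200000)
print("d ** 3: %.4fs   d * d * d: %.4fs" % (t_pow, t_mul))
```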

        Thomas


 
Isaac Gouy
From: Isaac Gouy <igo...@yahoo.com>
Date: Wed, 1 Jun 2011 15:15:09 -0700 (PDT)
Local: Wed, Jun 1 2011 6:15 pm
Subject: Re: nbody code - any suggestions for improvement?

On Jun 1, 10:08 am, Ian Ozsvald <i...@ianozsvald.com> wrote:
-snip-

> Their site seems to be undergoing an upgrade right now - the graphs don't work and you
> can't seem to get to the source code (but I have a copy of the nbody code).

1) The benchmarks game website is OK now - please try refreshing your
browser cache.

2) You can web browse CVS to see Python 2 and Python 3 implementations
of n-body

http://anonscm.debian.org/viewvc/shootout/shootout/bench/nbody/?hidea...

3) And there's an implementation specifically for PyPy

http://anonscm.debian.org/viewvc/shootout/shootout/bench/nbody/nbody....


 
Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Thu, 2 Jun 2011 09:10:46 +0100
Local: Thurs, Jun 2 2011 4:10 am
Subject: Re: nbody code - any suggestions for improvement?
Thanks Thomas, that's on my list to try (but I'm switching to a
mandelbrot solver for now). Hopefully I can return to this tomorrow.
i.

On 2 June 2011 01:02, Thomas Spura <toms...@fedoraproject.org> wrote:

Mark Dufour
From: Mark Dufour <mark.duf...@gmail.com>
Date: Thu, 2 Jun 2011 10:20:37 +0200
Local: Thurs, Jun 2 2011 4:20 am
Subject: Re: nbody code - any suggestions for improvement?

On Wed, Jun 1, 2011 at 11:11 PM, Ian Ozsvald <i...@ianozsvald.com> wrote:
> Interestingly - I think I can claim that the Mark/Brent version would
> beat the V8 Javascript benchmark. In terms of squeezing a lot of
> performance out of a piece of code with little work, it is quite
> impressive.

I played a bit with GCC flags, and tried something suggested to me a while
ago.. -ffast-math. IIUC, we basically tell the compiler here that it shouldn't
care about the precise IEEE specification in weird corner cases, such as
dividing infinity by -1, things you may never want to occur in your code
anyway.

CCFLAGS=-O2 -march=native

srepmub@akemi:~/shedskin$ ./nbody2 10000000
-0.169075164
-0.169077842
Took: 0:00:12.129899

CCFLAGS=-O3 -ffast-math (so without -march=native!)

srepmub@akemi:~/shedskin$ ./nbody2 10000000
-0.169075164
-0.169077842
Took: 0:00:02.625101

I wasn't sure why the difference is this big, until I read games usually
don't care about IEEE preciseness....

so I should probably add this to the 'performance tips' section of the
tutorial.. :-)

thanks,
mark.

Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Thu, 2 Jun 2011 09:41:52 +0100
Local: Thurs, Jun 2 2011 4:41 am
Subject: Re: nbody code - any suggestions for improvement?
The website is better - last night I saw the graphs again but the
source code links still failed, this morning I see that the src code
pages work again too. Cool.

Re. the pypy version - maybe I'm staring at it too much but it looks
very much like the cpython version. What's different?

i.

On 1 June 2011 23:15, Isaac Gouy <igo...@yahoo.com> wrote:

Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Thu, 2 Jun 2011 09:58:09 +0100
Local: Thurs, Jun 2 2011 4:58 am
Subject: Re: nbody code - any suggestions for improvement?
IIRC -ffast-math will use register-based short-cut arithmetic which can
have lower precision than IEEE based arithmetic. From memory it'll
give you more rounding errors but should run faster. Normally my goal
is to preserve absolute precision (e.g. for physics - double precision
is already imprecise enough!). Definitely one for the tips section
though, many apps don't need all that precision.

E.g.
http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Optimize-Options.html
"This option is not turned on by any -O option since it can result in
incorrect output for programs which depend on an exact implementation
of IEEE or ISO rules/specifications for math functions. It may,
however, yield faster code for programs that do not require the
guarantees of these specifications. "
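
To illustrate why relaxing the IEEE rules can change results: floating-point addition is not associative, so a compiler that is allowed to reorder operations (which, as far as the GCC documentation describes it, is one of the things -ffast-math permits) may produce slightly different output. A minimal sketch in plain Python, with values chosen only for the demonstration:

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # evaluated left to right
right = a + (b + c)  # same sum, regrouped

print(left == right)  # prints False
print(repr(left), repr(right))
print(abs(left - right))
```

The difference is tiny for a single sum, but in a long-running simulation such differences can accumulate, which is why checking the numerical output after changing flags is worthwhile.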

I'm checking it here:
shedskin0.8 on MacOS X (no native flag anyhow), -O2 takes 1min16 using
your class-based code from yesterday.
Adding -ffast-math - no change in speed (result exactly the same)
-O3 and -ffast-math - no change in speed (result exactly the same)

So my GCC on a Core2 Duo Macbook doesn't show any improvement (boo),
but that's probably because GCC is already using my hardware fairly
efficiently (yay). Anyhow, I've got to move on to the next task! Maybe
for you the difference was more to do with the native flag (you didn't
say if you'd tried that independently?)?

i.

On 2 June 2011 09:20, Mark Dufour <mark.duf...@gmail.com> wrote:

Mark Dufour
From: Mark Dufour <mark.duf...@gmail.com>
Date: Thu, 2 Jun 2011 11:10:16 +0200
Local: Thurs, Jun 2 2011 5:10 am
Subject: Re: nbody code - any suggestions for improvement?

> So my GCC on a Core2 Duo Macbook doesn't show any improvement (boo),
> but that's probably because GCC is already using my hardware fairly
> efficiently (yay). Anyhow, I've got to move on to the next task! Maybe
> for you the difference was more to do with the native flag (you didn't
> say if you'd tried that independently?)?

I tried it independently just now, and it's definitely -ffast-math.

thanks,
mark.

Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Thu, 2 Jun 2011 10:14:51 +0100
Local: Thurs, Jun 2 2011 5:14 am
Subject: Re: nbody code - any suggestions for improvement?
Interesting...what's your CPU? My Core2 Duo is old, it might be that
mine isn't that clever and yours is smarter?

On 2 June 2011 10:10, Mark Dufour <mark.duf...@gmail.com> wrote:

Mark Dufour
From: Mark Dufour <mark.duf...@gmail.com>
Date: Thu, 2 Jun 2011 11:23:01 +0200
Local: Thurs, Jun 2 2011 5:23 am
Subject: Re: nbody code - any suggestions for improvement?

On Thu, Jun 2, 2011 at 11:14 AM, Ian Ozsvald <i...@ianozsvald.com> wrote:
> Interesting...what's your CPU? My Core2 Duo is old, it might be that
> mine isn't that clever and yours is smarter?

mine's a bit newer I guess, but not that much:

model name    : Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc
arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2
ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dts tpr_shadow vnmi flexpriority

Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Thu, 2 Jun 2011 10:41:32 +0100
Local: Thurs, Jun 2 2011 5:41 am
Subject: Re: nbody code - any suggestions for improvement?
Certainly my Snow Leopard's g++ is older than most (4.2.1 - a common
complaint from Mac users). Since -ffast-math is potentially unsafe I'm
not so worried but it is nice to know that the option can do something
useful (it was the only real 'trick' I had back as a Senior Programmer
if IEEE precision wasn't required!).

On 2 June 2011 10:23, Mark Dufour <mark.duf...@gmail.com> wrote:

Mark Dufour
From: Mark Dufour <mark.duf...@gmail.com>
Date: Thu, 2 Jun 2011 11:44:42 +0200
Local: Thurs, Jun 2 2011 5:44 am
Subject: Re: nbody code - any suggestions for improvement?

On Thu, Jun 2, 2011 at 11:41 AM, Ian Ozsvald <i...@ianozsvald.com> wrote:
> Certainly my Snow Leopard's g++ is older than most (4.2.1 - a common
> complaint from Mac users). Since -ffast-math is potentially unsafe I'm
> not so worried but it is nice to know that the option can do something
> useful (it was the only real 'trick' I had back as a Senior Programmer
> if IEEE precision wasn't required!).

it would be interesting to see what happens when you install a recent linux
distro on there, with GCC 4.5 or 4.6.. ;-)

mark.

Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Thu, 2 Jun 2011 10:46:43 +0100
Local: Thurs, Jun 2 2011 5:46 am
Subject: Re: nbody code - any suggestions for improvement?
Sadly that's a right pain on MacOS and/or might get in the way of
system libs. I know people do upgrade GCC but I'd frankly be a bit
scared! I've nuked this machine once, I'm not losing a day again like
that :-) I hope to give the timings another go on my bigger
physics-office machine in a few weeks (but that's Windows - does
ShedSkin work with MSVC?).
i.

On 2 June 2011 10:44, Mark Dufour <mark.duf...@gmail.com> wrote:

Mark Dufour
From: Mark Dufour <mark.duf...@gmail.com>
Date: Thu, 2 Jun 2011 11:52:58 +0200
Local: Thurs, Jun 2 2011 5:52 am
Subject: Re: nbody code - any suggestions for improvement?

On Thu, Jun 2, 2011 at 11:46 AM, Ian Ozsvald <i...@ianozsvald.com> wrote:
> Sadly that's a right pain on MacOS and/or might get in the way of
> system libs. I know people do upgrade GCC but I'd frankly be a bit

I didn't mean to upgrade GCC, but to install a nice linux distro on a
separate partition.. :-) this distribution will probably upgrade your GCC
every 6 months or so for you.

> scared! I've nuked this machine once, I'm not losing a day again like
> that :-) I hope to give the timings another go on my bigger
> physics-office machine in a few weeks (but that's Windows - does
> ShedSkin work with MSVC?).

well, it doesn't really have to, because the windows version comes with GCC
(4.5.. :-)).. but there is a hidden (-v) flag, to generate more or less MSVC
compatible output (including makefile). I haven't heard of anyone trying
this recently, so I actually just made the option hidden for 0.8..
there's also a hidden 'pypy' compatibility
(iirc, -p) mode that someone sent a patch for at one point.

mark.

Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Thu, 2 Jun 2011 12:02:21 +0100
Local: Thurs, Jun 2 2011 7:02 am
Subject: Re: nbody code - any suggestions for improvement?
Mark suggested I try -march=native as it'll enable SSE2 (I confess I'd
forgotten that - the native architecture switches did give me small
benefits in the past on e.g. Pentium, AMD64 specific platforms).

Annoyingly this switch doesn't work in my g++ (4.2.1), the suggestion
online is to use:
-m64 -mtune=core2
in its place. This doesn't make it run any faster. I also added
-ffast-math but the speed didn't change.

Can someone else confirm Mark's -ffast-math switch improves
performance without changing the numerical output?

Ian.

On 2 June 2011 10:52, Mark Dufour <mark.duf...@gmail.com> wrote:

Mark Dewing
From: Mark Dewing <markdew...@gmail.com>
Date: Thu, 2 Jun 2011 21:37:28 -0700 (PDT)
Local: Fri, Jun 3 2011 12:37 am
Subject: Re: nbody code - any suggestions for improvement?
I tried it using gcc 4.5.0 (on OpenSuse 11.3) and the version of
nbody.py in the shedskin examples (0.7.1.3).
Running with the default parameters on a Core 2 Duo, I get

4.0 seconds for -march=native
0.4 seconds for -ffast-math

One potential issue is the code raises values to the power of 0.5
rather than calling sqrt.  When I change the power to sqrt (either in
the python file or in the cpp file), the -march=native time drops to
2.8s.   (The -ffast-math time is unaffected).
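
The change from a power of 0.5 to sqrt can be sketched as follows. This is interpreter-level Python with an arbitrary test value, so the timings it prints are not the compiled figures quoted above:

```python
import math
import timeit

x = 2.0  # arbitrary test value

root_pow = x ** 0.5
root_sqrt = math.sqrt(x)

# Both forms should agree to within rounding for ordinary positive inputs.
print(abs(root_pow - root_sqrt) <= 1e-15)

# Rough interpreter-level timings for each form.
t_pow = timeit.timeit("x ** 0.5", setup="x = 2.0", number=200000)
t_sqrt = timeit.timeit(
    "sqrt(x)", setup="from math import sqrt; x = 2.0", number=200000
)
print("x ** 0.5: %.4fs   sqrt(x): %.4fs" % (t_pow, t_sqrt))
```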

Mark

On Jun 2, 6:02 am, Ian Ozsvald <i...@ianozsvald.com> wrote:


 
Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Fri, 3 Jun 2011 09:11:14 +0100
Local: Fri, Jun 3 2011 4:11 am
Subject: Re: nbody code - any suggestions for improvement?
I'm definitely missing something at this end it seems :-) I'll try
changing **0.5 -> sqrt, right now I'm arguing with a set of mandelbrot
solvers (I'll probably submit the shedskin version here shortly for
more suggestions!).

Cheers,
Ian.

On 3 June 2011 05:37, Mark Dewing <markdew...@gmail.com> wrote:

Mark Dufour
From: Mark Dufour <mark.duf...@gmail.com>
Date: Fri, 3 Jun 2011 10:22:04 +0200
Local: Fri, Jun 3 2011 4:22 am
Subject: Re: nbody code - any suggestions for improvement?

On Fri, Jun 3, 2011 at 10:11 AM, Ian Ozsvald <i...@ianozsvald.com> wrote:
> I'm definitely missing something at this end it seems :-) I'll try
> changing **0.5 -> sqrt, right now I'm arguing with a set of mandelbrot
> solvers (I'll probably submit the shedskin version here shortly for
> more suggestions!).

I also see the speedup with just -O2 -ffast-math (without -march=native), so
it's starting to look like GCC 4.2 is missing something.. :-)

a speedup of 10 times seems absurd though! (thanks mark, btw!)

mark.

Ian Ozsvald
From: Ian Ozsvald <i...@ianozsvald.com>
Date: Fri, 3 Jun 2011 10:05:12 +0100
Local: Fri, Jun 3 2011 5:05 am
Subject: Re: nbody code - any suggestions for improvement?
Yeah, I've just tried a bunch of optimisation flags on my mandelbrot
problem and they barely change the speed at all. I think my gcc might
be out of date. I might have found a drag/drop newer version of gcc
for Snow Leopard (but I'm wary about conflicts).

I might just use your timing results in my talk!

i.

On 3 June 2011 09:22, Mark Dufour <mark.duf...@gmail.com> wrote:
