Google Groups Home
Help | Sign in
Message from discussion Performance characteristics of mutable static primitives?
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
hlovatt  
View profile
 More options Apr 7, 6:03 am
From: hlovatt <howard.lov...@gmail.com>
Date: Mon, 7 Apr 2008 03:03:56 -0700 (PDT)
Local: Mon, Apr 7 2008 6:03 am
Subject: Re: Performance characteristics of mutable static primitives?
This is a very interesting discussion and reveals a lot about the
difficulties of programming for multiple cores.

I was trying to understand what was going on and was 'messing about'
with the code and I noticed that almost any change I made slowed the
code down considerably - more than the 1 or 2 thread options in Java 6
does. Therefore I am not sure how realistic the code is. If the code
did more than simply increment a variable or two than the problem
might go away (because contention would be less and because the
uncontested operations would be a bigger percentage).

Going back to the original code. If it is OK to miss a few increments
(hence non-volitile statics) then the following should be OK:

          threads[ j ] =
            new Thread() {
              public void run() {
                final int total = totalSize / threadCount;
                for ( int k = 0; k < total; k++ ) {
                  if ( ( k & 0xFF ) == 0 ) {  // New
                    i += 256;                     // New
                    firedCount++;              // New
                  }                                   // New
//                  if ( ( i++ & 0xFF ) == 0 ) { firedCount++; } //
Original
                }
              }
            };

The above version doesn't show the problem on Java 6 and is quicker
than the original.

On Apr 2, 6:48 pm, Charles Oliver Nutter <charles.nut...@sun.com>
wrote:

> I ran into a very strange effect when some Sun folks tried to benchmark
> JRuby's multi-thread scalability. In short, adding more threads actually
> caused the benchmarks to take longer.

> The source of the problem (at least the source that, when fixed, allowed
> normal thread scaling), was an increment, mask, and test of a static int
> field. The code in question looked like this:

> private static int count = 0;

> public void pollEvents(ThreadContext context) {
>    if ((count++ & 0xFF) == 0) context.poll();

> }

> So the basic idea was that this would call poll() every 256 hits,
> incrementing a counter all the while. My first attempt to improve
> performance was to comment out the body of poll() in case it was causing
> a threading bottleneck (it does some locking and such), but that had no
> effect. Then, as a total shot in the dark, I commented out the entire
> line above. Thread scaling went to normal.

> So I'm rather confused here. Is a ++ operation on a static int doing
> some kind of atomic update that causes multiple threads to contend? I
> never would have expected this, so I wrote up a small Java benchmark:

> http://pastie.org/173993

> The benchmark does basically the same thing, with a single main counter
> and another "fired" counter to prevent hotspot from optimizing things
> completely away. I've been running this on a dual-core MacBook Pro with
> both Apple's Java 5 and the soylatte Java 6 release. The results are
> very confusing:

> First on Apple's Java 5

> ~/NetBeansProjects/jruby ➔ java -server Trouble 1
> time: 3924
> fired: 3906250
> time: 3945
> fired: 3906250
> time: 1841
> fired: 3906250
> time: 1882
> fired: 3906250
> time: 1896
> fired: 3906250
> ~/NetBeansProjects/jruby ➔ java -server Trouble 2
> time: 3243
> fired: 4090645
> time: 3245
> fired: 4100505
> time: 1173
> fired: 3906049
> time: 1233
> fired: 3906188
> time: 1173
> fired: 3906134

> Normal scaling here...1 thread on my system uses about 60-65% CPU, so
> the extra thread uses up the remaining 35-40% and the numbers show it.
> Then there's soylatte Java 6:

> ~/NetBeansProjects/jruby ➔ java -server Trouble 1
> time: 1772
> fired: 3906250
> time: 1973
> fired: 3906250
> time: 2748
> fired: 3906250
> time: 2114
> fired: 3906250
> time: 2294
> fired: 3906250
> ~/NetBeansProjects/jruby ➔ java -server Trouble 2
> time: 3402
> fired: 3848648
> time: 3805
> fired: 3885471
> time: 4145
> fired: 3866850
> time: 4140
> fired: 3839130
> time: 3658
> fired: 3880202

> Don't compare the times directly, since these are two pretty different
> codebases and they each have different general performance
> characteristics. Instead pay attention to the trend...the soylatte Java
> 6 run with two threads is significantly slower than the run with a
> single thread. This mirrors the results with JRuby when there was a
> single static counter being incremented.

> So what's up here?

> - Charlie


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.

Create a group - Google Groups - Google Home - Terms of Service - Privacy Policy
©2008 Google