Newsgroups: comp.lang.fortran
From: glen herrmannsfeldt <g...@ugcs.caltech.edu>
Date: Tue, 17 Jun 2008 16:15:46 -0800
Local: Tues, Jun 17 2008 8:15 pm
Subject: Re: Global array operations: a performance hit?
James Van Buskirk wrote:

(snip of example with DO loops)

>>r(0,:)    = rhub
>>r(nr+1,:) = rmax
>>dt(0,:)   = 0.0
>>dft(0,:)  = 0.0
>>dfr(0,:)  = 0.0
>>dt(nr+1,:)   = 0.0
>>dft(nr+1,:)  = 0.0
>>dfr(nr+1,:)  = 0.0
>>I found the execution time of the latter to be higher than the former,
>>as if many DO loops were executed instead of just one. Why use
>>global array operations then? Isn't it better to stick to plain old DO
>>loops? Thanks,
> Normally an initialization loop like this one would be faster as
> separate loops than one fused loop because it's faster to access
> memory consecutively rather than jumping around as implied by the
> fused loop.  However in this case the loops appear to be setting
> boundary values so they are traversing rows rather than columns of
> the arrays.  As a consequence the code jumps around in memory no
> matter what the compiler does and loop fusion can win out because
> it implies less loop overhead which otherwise would be of negligible
> importance compared to memory access considerations (assuming that
> the data set is too large to fit in cache).

The cache effect can be complicated in cases like this.
If the different statements operate on the same elements of
the same array, then a single loop helps those elements stay
in cache.
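
A minimal sketch of that point, not taken from the snipped
example: the array names a and b and the size n are made up for
illustration. Two array statements on the same array imply two
passes over it, while the single loop applies both updates to
each element in one pass. A compiler may well fuse the two array
statements itself, so any difference has to be measured.

```fortran
program fuse_same_array
  implicit none
  integer, parameter :: n = 100000   ! hypothetical size
  real, allocatable :: a(:), b(:)
  integer :: i

  allocate(a(n), b(n))
  a = 1.0
  b = 1.0

  ! Two array statements: conceptually two full passes over a.
  a = 2.0*a
  a = a + 1.0

  ! One loop: both updates are applied to b(i) in a single pass,
  ! while the element is still in cache.
  do i = 1, n
     b(i) = 2.0*b(i)
     b(i) = b(i) + 1.0
  end do

  ! Both forms compute the same values.
  print *, all(a == b)
end program fuse_same_array
```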

If speed is that important, you might try reversing the
subscript order (in the whole program).  The general rule
is to arrange the subscripts so that the leftmost subscript
changes fastest in array operations.  That is the order in
which the elements are stored in memory, the order in which
array operations will typically process them, and the order
used for I/O if just an array name is specified.
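
As a rough illustration of reversing the subscripts: r, nr, rhub
and rmax come from the quoted code, but the second extent nt, the
sizes, the values, and the reversed array rt are assumptions made
here. With the original order the boundary is a row section,
which is strided in memory. With the reversed order it is a
column section, which is contiguous.

```fortran
program boundary_order
  implicit none
  integer, parameter :: nr = 4, nt = 3          ! hypothetical sizes
  real, parameter :: rhub = 0.1, rmax = 1.0     ! hypothetical values
  real :: r(0:nr+1, nt)    ! original order: boundary is a row (strided)
  real :: rt(nt, 0:nr+1)   ! reversed order: boundary is a column (contiguous)

  r  = 0.5
  rt = 0.5

  ! Original form: each assignment steps through memory with
  ! a stride of nr+2 elements.
  r(0,:)    = rhub
  r(nr+1,:) = rmax

  ! Reversed form: each assignment writes nt consecutive elements.
  rt(:,0)    = rhub
  rt(:,nr+1) = rmax

  ! The two layouts hold the same data, transposed.
  print *, all(r == transpose(rt))
end program boundary_order
```

The reversal only pays off if the rest of the program also
accesses the arrays with the new leftmost subscript varying
fastest, which is why it has to be done in the whole program.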

> One thing to investigate is whether the r(i,j), dt(i,j), dft(i,j),
> and dfr(i,j) always get accessed together.  If so, you could group
> them as a derived type and the above loop could go 4X as fast as
> the structure of arrays code listed above.

The old structure-of-arrays vs. array-of-structures trick.

http://coding.derkeiler.com/Archive/Fortran/comp.lang.fortran/2005-06...
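
A sketch of the array-of-structures form, assuming the four
quantities really are always used together. The type name cell_t,
the array name c, the extent nt, and the sizes and values are
hypothetical. The component names follow the quoted code.

```fortran
program aos_boundary
  implicit none

  ! Hypothetical derived type grouping the four quantities so that
  ! they sit next to each other in memory for a given (i,j).
  type :: cell_t
     real :: r, dt, dft, dfr
  end type cell_t

  integer, parameter :: nr = 4, nt = 3          ! hypothetical sizes
  real, parameter :: rhub = 0.1, rmax = 1.0     ! hypothetical values
  type(cell_t) :: c(0:nr+1, nt)
  integer :: j

  ! One loop sets all four boundary values per element; each
  ! assignment touches one small contiguous block instead of
  ! four separate arrays.
  do j = 1, nt
     c(0,j)    = cell_t(rhub, 0.0, 0.0, 0.0)
     c(nr+1,j) = cell_t(rmax, 0.0, 0.0, 0.0)
  end do

  print *, c(0,1)%r, c(nr+1,nt)%r
end program aos_boundary
```

Whether this gets anywhere near the 4X mentioned above depends
on the access pattern in the rest of the code. Loops that use
only one of the four components would drag the other three
through the cache as well.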

-- glen


 