skipping frames in xtc files


Tyler Reddy

Apr 18, 2013, 12:51:59 PM
to mdnalysis-...@googlegroups.com
Hi,

After discussing this with a colleague, I was a bit surprised to find that skipping frames in a trajectory using [::skip_value] notation performs worse than an explicit modulus test. My understanding was that we (Oli, I think) implemented this functionality as a modulus test on skip_value under the hood (I'm pretty sure I remember when this was patched about 3 years ago).

My test code and the results (similar over 10 replicates and on two different machines) are below. A quick look at the source code (the iterDCD() function in MDAnalysis/coordinates/xdrfile/core.py) suggests that there's a bit more than a simple modulus check there: I do see the modulus check, but iterDCD() also calls self._check_slice_indices(start, stop, step). Maybe I'll have to look at this in more detail, but I can't imagine that the baked-in 'frame skipping' should be slower than performing an external modulus check if done right.

#test code--------------------------------------------------------------
#April 2013: MDA test code to see if using a frame skip value performs differently from an explicit modulus check when parsing through an .xtc trajectory

import MDAnalysis, time
from MDAnalysis.tests.datafiles import GRO,XTC
U = MDAnalysis.Universe(GRO,XTC)

#start the skip value test:
start = time.time()
for ts in U.trajectory[::2]: #every other frame
    print 'Skip test ::2 frame number:', ts.frame
print 'Skip test time (s):', time.time() - start

#start the modulus-based test:
start = time.time()
for ts in U.trajectory:
    if ts.frame%2 != 0: #every other frame
        print 'Modulus test frame number:', ts.frame
print 'Modulus test time (s):', time.time() - start
#end test code--------------------------------------------------------------


Representative Results:

Skip test ::2 frame number: 1
Skip test ::2 frame number: 3
Skip test ::2 frame number: 5
Skip test ::2 frame number: 7
Skip test ::2 frame number: 9
Skip test time (s): 0.070916891098
Modulus test frame number: 1
Modulus test frame number: 3
Modulus test frame number: 5
Modulus test frame number: 7
Modulus test frame number: 9
Modulus test time (s): 0.0331590175629

Cheers,
Tyler



 

Oliver Beckstein

Apr 18, 2013, 5:12:58 PM
to mdnalysis-...@googlegroups.com
Hi Tyler,

On 18 Apr, 2013, at 09:51, Tyler Reddy wrote:

> Hi,
>
> After discussing this with a colleague, I was a bit surprised to find that skipping frames in a trajectory using [::skip_value] notation performs worse than an explicit modulus test, as my understanding was that we (Oli, I think) implemented this functionality as a modulus test for the skip_value under the hood (I'm pretty sure I remember when this was patched about 3 years ago).

I think that the code below pretty much does that:

In coordinates.xdrfile.core.TrjReader.__getitem__():

def iterDCD(start=frame.start, stop=frame.stop, step=frame.step):
    start, stop, step = self._check_slice_indices(start, stop, step)
    if step < 0:
        raise NotImplementedError("XTC/TRR do not support reverse iterating for performance reasons")
    for ts in self:
        # simple sequential plodding through trajectory --- SLOW!
        frameindex = ts.frame - 1
        if frameindex < start: continue
        if frameindex >= stop: break
        if (frameindex - start) % step != 0: continue
        yield self.ts
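
To make the equivalence concrete, here is a minimal stand-alone sketch of the same filtering logic. It does not use MDAnalysis: the Timestep namedtuple, the iter_slice() name, and the ten-frame list are hypothetical stand-ins, and frames are assumed to be 1-based as in the output printed in this thread.

```python
from collections import namedtuple

# Hypothetical stand-in for a trajectory timestep (assumption: 1-based frames,
# matching the ts.frame values printed in this thread).
Timestep = namedtuple("Timestep", "frame")


def iter_slice(timesteps, start, stop, step):
    """Yield timesteps whose 0-based index falls in range(start, stop, step)."""
    if step < 0:
        raise NotImplementedError("reverse iteration is not supported")
    for ts in timesteps:
        frameindex = ts.frame - 1  # convert the 1-based frame to a 0-based index
        if frameindex < start:
            continue  # not yet at the start of the slice
        if frameindex >= stop:
            break  # past the end of the slice, stop reading
        if (frameindex - start) % step != 0:
            continue  # frame is skipped by the step
        yield ts


trajectory = [Timestep(frame=i) for i in range(1, 11)]  # frames 1..10

# Frames selected by the slice-style generator with step 2 ...
sliced_frames = [ts.frame for ts in iter_slice(trajectory, 0, 10, 2)]
# ... and by the explicit modulus test from the original post.
modulus_frames = [ts.frame for ts in trajectory if ts.frame % 2 != 0]
```

In this sketch both approaches visit every frame once and select the same ones, so any timing difference would have to come from work done outside the loop body.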


> My test code and the results (similar over 10 replicates and on two different machines) are below. A quick look at the source code (iterDCD() function in MDAnalysis/coordinates/xdrfile/core.py) suggests that there's a bit more than a simple modulus check there--I do see the modulus check, but there's also self._check_slice_indices(start, stop, step) in iterDCD(),

_check_slice_indices() is not expensive, and in any case, once we're in the 'for ts in self' loop (i.e. __iter__()), we only evaluate the if statements and iterate through the frames sequentially.

I don't understand where the factor of 2 in execution time comes from. Did you try running it through a profiler (line_profiler or cProfile)?
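
A minimal sketch of what such a profiling run could look like with cProfile from the standard library. The skip_loop() and modulus_loop() functions and the profile() helper are hypothetical stand-ins that work on a plain list of frame numbers; for the real test one would put the U.trajectory loops inside the two functions instead.

```python
import cProfile
import io
import pstats

frames = list(range(1, 11))  # hypothetical stand-in for 10 trajectory frames


def skip_loop(frames):
    # Stand-in for: for ts in U.trajectory[::2]
    return [f for f in frames[::2]]


def modulus_loop(frames):
    # Stand-in for: for ts in U.trajectory, with an explicit modulus test
    return [f for f in frames if f % 2 != 0]


def profile(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)  # top 10 by cumulative time
    return result, stream.getvalue()


skip_result, skip_report = profile(skip_loop, frames)
modulus_result, modulus_report = profile(modulus_loop, frames)
print(skip_report)
print(modulus_report)
```

Comparing the two reports side by side should show which calls account for the extra time in the slower variant.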


> and maybe I'll have to look at this in more detail, but I can't imagine that the baked-in 'frame skipping' should be slower than performing an external modulus if done right.

It should take pretty much the same amount of time if done by skipping forward.

If you can shed some light on this I would be really curious as to what's going on.

By the way, you might be interested in following the discussion on Issue 127 (https://groups.google.com/forum/#!msg/mdnalysis-devel/szaMMcvY3Ik/i90t5l2Mth0J).

Oliver
--
Oliver Beckstein * orbe...@gmx.net
skype: orbeckst * orbe...@gmail.com

Tyler Reddy

Nov 20, 2013, 6:52:43 AM
to mdnalysis-...@googlegroups.com
By the way, this behaviour seems to have changed now. I suspect this may relate to the improvements to xtc read performance in MDA. A quick check in IPython:

#perform initial test Universe setup:
import MDAnalysis, MDAnalysisTests
from MDAnalysis.tests.datafiles import GRO,XTC
U = MDAnalysis.Universe(GRO,XTC)

%%timeit -n 200
for ts in U.trajectory[::2]: #every other frame
    print 'Skip test ::2 frame number:', ts.frame
...
200 loops, best of 3: 13.4 ms per loop

%%timeit -n 200

for ts in U.trajectory:
    if ts.frame%2 != 0: #every other frame
        print 'Modulus test frame number:', ts.frame
...
200 loops, best of 3: 26.5 ms per loop
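
For anyone repeating the comparison outside IPython, a rough sketch using the standard timeit module. The loop bodies collect frame numbers instead of printing, since terminal output adds its own overhead to each loop; skip_loop(), modulus_loop() and the ten-frame list are hypothetical stand-ins for the U.trajectory loops above.

```python
import timeit

frames = list(range(1, 11))  # hypothetical stand-in for 10 trajectory frames


def skip_loop():
    # Stand-in for: for ts in U.trajectory[::2]
    return [f for f in frames[::2]]


def modulus_loop():
    # Stand-in for: for ts in U.trajectory, with an explicit modulus test
    return [f for f in frames if f % 2 != 0]


# repeat() returns one total time per repeat; take the best of 3 runs of
# 200 loops each, mirroring the %%timeit -n 200 settings used above.
skip_best = min(timeit.repeat(skip_loop, number=200, repeat=3)) / 200
modulus_best = min(timeit.repeat(modulus_loop, number=200, repeat=3)) / 200

print("skip: %.3g s per loop" % skip_best)
print("modulus: %.3g s per loop" % modulus_best)
```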


