On 4 Jun, 2012, at 14:03, francesco oteri wrote:
> I am working on a format that permits reading it randomly.
> The basic idea is to create a sort of frame index that, for each frame, stores its position
> in the .xtc file. It can be loaded optionally, permitting MDAnalysis and other tools
> to randomly access the files.
LOOS from Alan Grossfield's lab
http://loos.sourceforge.net/ (which is a rather nice project, by the way) already uses a frame cache for XTC. Basically, when the XTC is read for the first time, an index of frames and file offsets is built, which then allows random access; see
http://loos.sourceforge.net/classloos_1_1_x_t_c.html for a start.
I would love to have something similar in MDAnalysis, but I don't know if there are inherent problems with, e.g., big files (>2 GB). I haven't had the time to make a serious attempt.
My idea would have been to store the frame list in memory and also as an additional file on disk (also using the XDR file format that is used for TRR and XTC), together with a checksum that the C library can use to decide if the frame list on disk still corresponds to the XTC on disk. In pseudo code:
# first read the trajectory once:
# 1. build the frame list if needed
# 2. tells us how many frames are in the trajectory
scanXTC("traj.xtc"):
    framecache = xdr_read(".traj.framecache") or None
    cs = calculate_checksum(XTC)
    if not framecache or framecache.checksum != cs:
        # build a new cache (slow: reads the whole trajectory)
        framecache = new empty frame list
        for frame in xtc:
            # entry i holds the file offset at which frame i starts
            framecache.append(file_offset)
        xdr_write(".traj.framecache", framecache, cs)
    return framecache.numframes
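To make the caching step a bit more concrete, here is a minimal Python sketch of the same idea. It is only an illustration under loud assumptions: it does NOT parse real XTC frames but a made-up toy format (each frame is a 4-byte big-endian length followed by that many payload bytes), and it stores the cache as JSON rather than XDR. All names (HEADER, file_checksum, build_offsets, scan, the ".framecache.json" suffix) are hypothetical.

```python
import hashlib
import json
import os
import struct

# Assumption: toy frame header, a 4-byte big-endian payload length (not the XTC layout)
HEADER = struct.Struct(">I")


def file_checksum(path, chunk_size=1 << 20):
    """SHA-1 of the whole file, read in chunks to keep memory use flat."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_offsets(path):
    """Walk the file once and record the byte offset at which each frame starts."""
    size = os.path.getsize(path)
    offsets = []
    with open(path, "rb") as fh:
        while True:
            start = fh.tell()
            header = fh.read(HEADER.size)
            if len(header) < HEADER.size:
                break  # end of file (or a truncated header)
            (length,) = HEADER.unpack(header)
            if start + HEADER.size + length > size:
                break  # last frame is truncated, do not index it
            offsets.append(start)
            fh.seek(length, os.SEEK_CUR)  # skip the payload without decoding it
    return offsets


def scan(path, cache_path=None):
    """Return the frame offsets, reusing the on-disk cache if its checksum still matches."""
    cache_path = cache_path or path + ".framecache.json"
    checksum = file_checksum(path)
    try:
        with open(cache_path) as fh:
            cache = json.load(fh)
        if isinstance(cache, dict) and cache.get("checksum") == checksum:
            return cache["offsets"]
    except (OSError, ValueError, KeyError):
        pass  # missing or unreadable cache: fall through and rebuild
    offsets = build_offsets(path)  # slow path, done once per trajectory
    with open(cache_path, "w") as fh:
        json.dump({"checksum": checksum, "offsets": offsets}, fh)
    return offsets
```

The number of frames is then just len(scan("toy.traj")). Note that hashing the whole file means reading it in full on every open; whether that is acceptable, or whether a cheaper fingerprint would do, is a design choice this sketch leaves open.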
# reading a frame is done by looking up the file offset for the
# frame in the frame list, seeking to the frame, and then reading
# the frame as usual
read_frame_XTC(xtcfile, framecache, frame):
    xtcfile.fseek(framecache[frame])
    return xtcfile.read_frame()
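The lookup-and-seek step might look roughly like this in Python. Again this is a sketch on a made-up toy format (4-byte big-endian length plus payload), not real XTC decoding, and write_frames/read_frame are hypothetical helper names; an in-memory io.BytesIO stands in for the trajectory file.

```python
import io
import struct

# Assumption: toy frame header, a 4-byte big-endian payload length (not the XTC layout)
HEADER = struct.Struct(">I")


def write_frames(fh, payloads):
    """Write toy frames and return the offset at which each one starts."""
    offsets = []
    for payload in payloads:
        offsets.append(fh.tell())
        fh.write(HEADER.pack(len(payload)))
        fh.write(payload)
    return offsets


def read_frame(fh, offsets, index):
    """Seek straight to frame `index` and read it, without reading earlier frames."""
    fh.seek(offsets[index])
    (length,) = HEADER.unpack(fh.read(HEADER.size))
    payload = fh.read(length)
    if len(payload) != length:
        raise EOFError("frame %d is truncated" % index)
    return payload


traj = io.BytesIO()  # stand-in for an open trajectory file
offsets = write_frames(traj, [b"frame-0", b"second frame", b"f2"])

# Frames can be requested in any order once the offsets are known.
print(read_frame(traj, offsets, 2))  # b'f2'
print(read_frame(traj, offsets, 0))  # b'frame-0'
```

The cost of fetching a frame no longer depends on its position in the file, which is the whole point of keeping the offset list around.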
I think that this could all be implemented in the C code of the xtc/trr library ("libxdr"), even though I have written the pseudo code in an object-oriented manner.
If anyone starts working on this in earnest then please open an issue in the issue tracker to coordinate development. I would assign the issue to whoever seems most eager to work on it :-).
Oliver
--
Oliver Beckstein * orbe...@gmx.net
skype: orbeckst  * orbe...@gmail.com