w_ipa successful trajectories

Yao Fu

Nov 22, 2024, 1:57:34 PM

to westpa-users

Hi WESTPA Community,

I'm having trouble retrieving the segment IDs of recycled trajectories during post-simulation analysis with w_ipa. I see a discrepancy between the number of successful trajectories reported in different datasets. (Since I changed the target state of the analysis scheme compared to the tstate file, w_succ can't accurately provide information on successful walkers.)
Using the 'arrivals' dataset in direct.h5, I find 2171 successful trajectories after summing them up from each iteration. However, using the 'assignments' dataset in assign.h5 and checking each segment for occurrences of the bin index 0 at any timepoint, I get a total of 4016 successful trajectories.
So I'm confused about the correct way to track recycled trajectories. Attached are my west.cfg, direct.h5, and assign.h5 files for reference. This simulation uses 2D pcoords, and the target state is the bound state with pcoords (4.0, 4.0).

Thank you!
Yao

direct.h5

assign.h5

west.cfg

Leung, Jeremy

Nov 25, 2024, 2:14:45 PM

to westpa...@googlegroups.com

Hi Yao,

I'll first point you to this page, which should lay out the differences between the datasets: https://github.com/westpa/westpa/wiki/HDF5-File-Organization-of-Simulation-Data#user-content-Overall_structure_of_assignh5

What you probably care about is the `conditional_arrivals` and/or the `conditional_fluxes` datasets. `conditional_arrivals` should be equivalent to the `conditional_fluxes` dataset, normalized with the summed weight from the `labeled_populations` dataset from all bins. The normalization should be very close to 1, since you ran with recycling. If you did cumulative averaging, you need to do that to the `labeled_populations` dataset yourself (though since it's close to 1 due to recycling, it shouldn't matter too much).

Now looking at the included drawing, to get the "successful" trajectories, you need to get all the (1)s, but not (2-4). `assignments` will give you extra points because it might have included points like (2). To easily find all the successful trajectories, I suggest using LPATH's extract step. (https://github.com/chonglab-pitt/lpath contains the paper and installation instructions)

Try running something like the following (assuming you're going from unbound to bound; if not, swap the 0 and 1):

lpath extract -we -W west.h5 -A ANALYSIS/TEST/assign.h5 --source-state 0 \
    --target-state 1 --extract-output output.pickle --out-dir succ_traj

lpath.readthedocs.io

The `output.pickle` file should be a nested list of lists containing all the successful trajectories. The first dimension indexes the successful pathways, and each pathway is a list of lists whose entries are `iter_id/seg_id/state_id/pcoord_or_auxdata/frame#/weight`. The last frame of each pathway is the frame where it hits the target. The first frame is the last time it exited the source.
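As a rough sketch (not part of LPATH itself), the arriving segment IDs could then be pulled out of the pickle along these lines. This assumes the frame layout described above, with `iter_id` and `seg_id` as the first two entries of every frame. The helper names `load_pathways`, `arrival_segments`, and `arrivals_per_iteration` are made up for illustration.

```python
import pickle
from collections import Counter


def load_pathways(pickle_path):
    """Load the nested list written by the extract step.

    Only unpickle files you generated yourself; pickle can execute code.
    """
    with open(pickle_path, "rb") as f:
        return pickle.load(f)


def arrival_segments(pathways):
    """Return (iter_id, seg_id) for the last frame of each pathway.

    Assumption: frame[0] is iter_id and frame[1] is seg_id, and the last
    frame of a pathway is the one that hits the target state.
    """
    return [(int(p[-1][0]), int(p[-1][1])) for p in pathways if len(p) > 0]


def arrivals_per_iteration(pathways):
    """Count how many pathways arrive at the target in each iteration."""
    return Counter(iter_id for iter_id, _ in arrival_segments(pathways))


# Synthetic stand-in for output.pickle (values are invented):
# each frame is [iter_id, seg_id, state_id, pcoord, frame#, weight].
demo_pathways = [
    [[10, 3, 0, 12.0, 0, 1e-3], [11, 5, 1, 4.0, 2, 1e-3]],
    [[20, 7, 0, 11.5, 1, 2e-4], [21, 1, 2, 8.0, 0, 2e-4], [22, 4, 1, 3.9, 3, 2e-4]],
]

print(arrival_segments(demo_pathways))
print(dict(arrivals_per_iteration(demo_pathways)))
```

With the real file, replace `demo_pathways` with `load_pathways("succ_traj/output.pickle")`, adjusting the path to wherever the extract step wrote its output. The per-iteration counts can then be compared against what you summed from `arrivals`.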

Best,

Jeremy L.

---
Jeremy M. G. Leung
PhD Candidate, Chemistry
Graduate Student Researcher, Chemistry (Chong Lab)
University of Pittsburgh | 219 Parkman Avenue, Pittsburgh, PA 15260
jml...@pitt.edu | [He, Him, His]

Yao Fu

Nov 26, 2024, 3:07:31 PM

to westpa-users

Hi Jeremy,

Thank you for your reply. It works!

Best,

Yao
