Hello,
I have a technical question (and possibly a bug report) about how basis states are selected during recycling when using recycling boundary conditions (i.e., steady-state mode) without istate generation.
I ran a production simulation of protein-protein binding with 960 basis states (1 initial segment per bstate). The basis states were randomly oriented pairs of proteins separated by a large distance. When examining my west.h5 data, I noticed something unexpected. I looked at the basis state ids that were selected during recycling events with the following code:
    from westpa.analysis import Run

    westh5 = "west.h5"  # path to the WESTPA HDF5 file

    with Run(westh5) as run:
        for iter_idx in range(51, 201):
            # Get the basis state ids selected for the recycled segments
            recycled_basis_ids = run.h5file[f"iterations/iter_{iter_idx+1:08d}/new_weights/index"]["initial_state_id"]
            print(recycled_basis_ids)
and I noticed the following pattern:
[ 1 2 6 16 17 22 30 32 38 43 47 49 50 54 56 59 68 70 74]
[ 1 2 6 16 17 22 30 32 38 43 47 49 50 54 56 59 68 70 74]
[ 1 2 6 16 17 22 30 32 38 43 47 49 50 54 56 59 68 70]
[ 1 2 6 16 17 22 30 32 38 43 47 49 50 54 56 59 68 70]
[ 1 2 6 16 17 22 30 32 38 43 47 49 50]
[ 1 2 6 16 17 22 30 32 38 43 47]
[ 1 2 6 16 17 22 30 32 38]
[ 1 2 6 16 17 22 30 32 38 43 47 49 50 54 56 59 68 70 74 75 78 82 90 95]
[ 1 2 6 16 17 22 30 32 38 43 47 49 50 54 56 59 68]...
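To make the pattern explicit, here is a small sketch that checks only the nine rows shown above (the rest of the output is truncated). The helper names `tally_bstate_ids` and `is_prefix` are mine and not part of WESTPA. It confirms that every row is a prefix of the longest row, i.e. the ids always come out in the same order:

```python
from collections import Counter

# The nine rows printed above; the remaining output is not reproduced here.
full = [1, 2, 6, 16, 17, 22, 30, 32, 38, 43, 47, 49,
        50, 54, 56, 59, 68, 70, 74, 75, 78, 82, 90, 95]
rows = [full[:19], full[:19], full[:18], full[:18], full[:13],
        full[:11], full[:9], full[:24], full[:17]]


def tally_bstate_ids(rows):
    """Count in how many rows (iterations) each basis state id appears."""
    counts = Counter()
    for row in rows:
        counts.update(set(row))
    return counts


def is_prefix(row, reference):
    """True if `row` equals the first len(row) entries of `reference`."""
    return list(row) == list(reference[:len(row)])


counts = tally_bstate_ids(rows)
longest = max(rows, key=len)
all_prefixes = all(is_prefix(row, longest) for row in rows)

print("distinct bstate ids seen:", len(counts))
print("rows containing bstate 1:", counts[1], "of", len(rows))
print("every row is a prefix of the longest row:", all_prefixes)
```

So across these rows only 24 of the 960 bstates are ever used, and bstate 1 appears in every one of them.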
I expected basis state ids to be drawn at random from the pool of available basis states, but that does not appear to be the case. The bstates are selected in the same order at every iteration, so bstate 1 is always chosen whenever recycling occurs. Where in WESTPA is the code that determines how bstate ids are selected during recycling (without istates)? I think it should be updated to randomize the selection at each iteration where recycling occurs, sampling bstates according to their user-defined weights.
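To be concrete about the behavior I have in mind, here is a minimal sketch using only the Python standard library. This is not WESTPA's actual code; `pick_bstates` and its arguments are hypothetical names for illustration, and I am assuming one independent weighted draw per recycled walker:

```python
import random


def pick_bstates(bstate_weights, n_recycled, rng=None):
    """Draw one basis state index per recycled walker.

    Hypothetical helper (not part of WESTPA). Each draw is independent and
    weighted by the user-defined bstate weights, which need not be normalized.
    """
    rng = rng if rng is not None else random.Random()
    indices = range(len(bstate_weights))
    return rng.choices(indices, weights=bstate_weights, k=n_recycled)


# Example: 960 equally weighted basis states, 19 walkers recycled per iteration.
rng = random.Random(0)  # seeded only to make the example reproducible
weights = [1.0] * 960
ids_iter_a = pick_bstates(weights, 19, rng)
ids_iter_b = pick_bstates(weights, 19, rng)
print(sorted(ids_iter_a))
print(sorted(ids_iter_b))
```

With something like this, the selected ids would differ from one iteration to the next instead of always starting from bstate 1, and a bstate with zero weight would never be picked.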
Cheers,
Hayden