Data passing for multiple pcoords

Shashank Shastry

Jan 28, 2026, 10:58:15 AM
to westpa-users
Hello westpa-users,

I am trying to set up a 2D steady-state WESTPA simulation to capture protein dimer complexes in membranes. Currently, I am running into issues with reading pcoords. Here is what I have tried so far:
- I have tested my pcoord script, no issues outputting data to dist.txt and cangle.txt
- There seem to be no MD related issues, GROMACS runs fine and segment 1 completes with no errors.
- I tinkered a bit with variable naming in the westpa_scripts/runseg.sh file. Perhaps I am doing this step incorrectly?
- I modified the pcoord return line to this:

cat $WEST_CURRENT_SEG_DATA_REF/dist.txt | awk '{print $1;}' > $WEST_DIST_RETURN
cat $WEST_CURRENT_SEG_DATA_REF/cangle.txt | awk '{print $1;}' > $WEST_CANGLE_RETURN

My output log is below:

WESTPA environment activating ...
Updating system with the options from the configuration file
Creating HDF5 file '/raid/.../west.h5'
1 target state(s) present
Calculating progress coordinate values for basis states.
1 basis state(s) present
Calculating progress coordinate values for start states.
0 start state(s) present
Preparing initial states

        Total bins:            135
        Initial replicas:      5 in 1 bins, total weight = 1
        Total target replicas: 675
       
1-prob: 0.0000e+00
Simulation prepared.
1 of 135 (0.740741%) active bins are populated
per-bin minimum non-zero probability:       1
per-bin maximum probability:                1
per-bin probability dynamic range (kT):     0
per-segment minimum non-zero probability:   0.2
per-segment maximum non-zero probability:   0.2
per-segment probability dynamic range (kT): 0
norm = 1, error in norm = 0 (0*epsilon)
WESTPA environment activating ...
WESTPA environment active! :)
Updating system with the options from the configuration file
Maximum wallclock time: 3 days, 0:00:00

Wed Jan 28 09:39:43 2026
Iteration 1 (200 requested)
Beginning iteration 1
5 segments remain in iteration 1 (5 total)
1 of 135 (0.740741%) active bins are populated
per-bin minimum non-zero probability:       1
per-bin maximum probability:                1
per-bin probability dynamic range (kT):     0
per-segment minimum non-zero probability:   0.2
per-segment maximum non-zero probability:   0.2
per-segment probability dynamic range (kT): 0
norm = 1, error in norm = 0 (0*epsilon)
Waiting for segments to complete...
-- ERROR    [westpa.core.propagators.executable] -- could not read pcoord for Segment 3 from '/tmp/tmpgu4ggtbt': ValueError('cannot reshape array of size 0 into shape (101,2)')
-- ERROR    [westpa.core.propagators.executable] -- could not read pcoord for Segment 0 from '/tmp/tmpmw1w647w': ValueError('cannot reshape array of size 0 into shape (101,2)')
-- ERROR    [westpa.core.propagators.executable] -- could not read pcoord for Segment 1 from '/tmp/tmpu37nc8dw': ValueError('cannot reshape array of size 0 into shape (101,2)')
-- ERROR    [westpa.core.propagators.executable] -- could not read pcoord for Segment 4 from '/tmp/tmpqg64f98w': ValueError('cannot reshape array of size 0 into shape (101,2)')
-- ERROR    [westpa.core.propagators.executable] -- could not read pcoord for Segment 2 from '/tmp/tmpgl7c2954': ValueError('cannot reshape array of size 0 into shape (101,2)')
-- ERROR    [westpa.core.sim_manager] -- propagation failed for 5 segment(s):
0  
1  
2  
3  
4
exception caught; shutting down
-- ERROR    [w_run] -- error message: propagation failed for 5 segments
-- ERROR    [w_run] -- Traceback (most recent call last):
  File "/home/.../micromamba/envs/westpa/lib/python3.11/site-packages/westpa/cli/core/w_run.py", line 61, in run_simulation
    sim_manager.run()
  File "/home/.../micromamba/envs/westpa/lib/python3.11/site-packages/westpa/core/sim_manager.py", line 769, in run
    self.check_propagation()
  File "/home/.../micromamba/envs/westpa/lib/python3.11/site-packages/westpa/core/sim_manager.py", line 667, in check_propagation
    raise PropagationError('propagation failed for {:d} segments'.format(len(failed_segments)))
westpa.core.sim_manager.PropagationError: propagation failed for 5 segments

I hope this is not a repeat thread (I didn't find anything that quite answered my question); please refer me to an existing thread if there is one. Any pointers would be appreciated, thank you!

Leung, Jeremy

Jan 28, 2026, 11:14:03 AM
to westpa...@googlegroups.com
Hi Shashank,

Welcome!

You are indeed implementing it incorrectly. When you pass data to $WEST_XXXX_RETURN, you are saving it as auxdata (this also assumes you have correctly configured the dataset in west.cfg: https://github.com/westpa/westpa/wiki/Configuration-File#data-manager). It's good practice to keep a copy there, but more importantly, you need to send both columns to $WEST_PCOORD_RETURN to be used as the pcoord. Tutorial 5.2 (the p53 one) should be a good example to follow:

```
paste <(tail -n +2 ca-rmsd-p53.dat | awk '{print $2}') <(tail -n +2 dist-end-to-end.dat | awk '{print $2}') > $WEST_PCOORD_RETURN
```
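Adapted to the file names from the original post, the same idea might look like the sketch below. It assumes `dist.txt` and `cangle.txt` each hold one value per frame in their first column, and that auxiliary datasets named `dist` and `cangle` are configured in west.cfg if the optional copies are kept. The scratch directory and sample numbers are placeholders so the snippet can run outside WESTPA.

```bash
# Fall back to a scratch directory with made-up values when run outside WESTPA.
demo=$(mktemp -d)
: "${WEST_CURRENT_SEG_DATA_REF:=$demo}"
: "${WEST_PCOORD_RETURN:=$demo/pcoord_return}"
: "${WEST_DIST_RETURN:=$demo/dist_return}"
: "${WEST_CANGLE_RETURN:=$demo/cangle_return}"
if [ ! -f "$WEST_CURRENT_SEG_DATA_REF/dist.txt" ]; then
    printf '1.00\n1.10\n1.20\n' > "$WEST_CURRENT_SEG_DATA_REF/dist.txt"    # placeholder distances
    printf '30.0\n31.5\n29.8\n' > "$WEST_CURRENT_SEG_DATA_REF/cangle.txt"  # placeholder angles
fi

# Optional: keep each coordinate as its own auxiliary dataset.
awk '{print $1}' "$WEST_CURRENT_SEG_DATA_REF/dist.txt"   > "$WEST_DIST_RETURN"
awk '{print $1}' "$WEST_CURRENT_SEG_DATA_REF/cangle.txt" > "$WEST_CANGLE_RETURN"

# Required: one row per frame, one column per progress coordinate.
paste "$WEST_DIST_RETURN" "$WEST_CANGLE_RETURN" > "$WEST_PCOORD_RETURN"
```

The key point is that `$WEST_PCOORD_RETURN` receives both columns side by side; the per-coordinate return files alone do not populate the pcoord.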


I'm assuming your `get_pcoord.sh` is formatted a bit differently (and may be correct), since you're getting through `w_init`. Please double-check in your west.h5 that the correct pcoord is passed to WESTPA, e.g. from the command line: `h5ls -d west.h5/ibstates/0/bstate_pcoord` (for other options, see https://github.com/westpa/westpa/wiki/Navigating-the-west.h5-File).
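The error in the log (`cannot reshape array of size 0 into shape (101,2)`) indicates that WESTPA expected 101 rows and 2 columns but received an empty file. A quick shape check like the sketch below, run on the file before it is returned, could catch this early. The `check_pcoord_shape` function name is hypothetical, and the sample file is fabricated only so the snippet runs on its own.

```bash
# Hypothetical helper: succeed only if FILE has exactly ROWS lines of COLS fields each.
check_pcoord_shape() {
    awk -v rows="$2" -v cols="$3" '
        NF != cols { bad = 1 }
        END { if (bad || NR != rows) exit 1 }
    ' "$1"
}

# Demo with a fabricated 101 x 2 file; in runseg.sh the first argument
# would be the file about to be written to $WEST_PCOORD_RETURN.
demo=$(mktemp)
awk 'BEGIN { for (i = 0; i <= 100; i++) print i * 0.1, 30 + i * 0.01 }' > "$demo"

if check_pcoord_shape "$demo" 101 2; then
    echo "pcoord shape OK"
else
    echo "pcoord shape mismatch" >&2
fi
```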

Best,

Jeremy L.


---
Jeremy M. G. Leung, PhD
Research Assistant Professor, Chemistry (Chong Lab)
University of Pittsburgh | 219 Parkman Avenue, Pittsburgh, PA 15260
jml...@pitt.edu | [He, Him, His]


Shashank Shastry

Feb 5, 2026, 11:59:41 AM
to westpa-users
The conversation continued off-forum, and I was finally able to set up my WESTPA simulation with 2 pcoords and multiple bstates. I am still working on implementing the HDF5 style of data management. Hopefully this is helpful to anyone else who comes looking for a similar solution.

Shashank:

Hi Jeremy,

That worked splendidly, thank you!

On a related note, I have also been trying to add multiple bstates, going off of the 7.5 HDF5 tutorial. I understand the setup for the bstates.txt file and the pcoord.init file, but I'm a little lost on how this information is passed around internally in WESTPA. When I look at init.sh, there's no change to how BSTATE_ARGS is defined or passed. Could you explain how the folders are descended into, and whether all bstates must be named the same, have the same topology, etc.? I assume even the waters and ions must be the same between bstates.

Jeremy:

Glad that worked out.

When WESTPA is informed about bstates during w_init using the `--bstate-file` flag (each line in bstates.txt being a new bstate), it will spawn a worker for each one to calculate its pcoord. In each worker, the $WEST_STRUCT_DATA_REF environment variable will be set to '$WEST_SIM_ROOT/bstates/{basis_state.auxref}', the last part being the corresponding item in the third column of bstates.txt. The format of this path is defined by the `west.data.data_refs.basis_states` key in west.cfg.
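To make the substitution concrete, the sketch below mimics how each line of bstates.txt would map to a `$WEST_STRUCT_DATA_REF` value, assuming the `basis_states` reference uses the '$WEST_SIM_ROOT/bstates/{basis_state.auxref}' pattern quoted above. It is only an illustration, not WESTPA's actual code; the state labels, probabilities, directory names, and the fallback `WEST_SIM_ROOT` are made up.

```bash
: "${WEST_SIM_ROOT:=/path/to/sim}"   # placeholder when run outside WESTPA

# Made-up bstates.txt: label, probability, auxref (directory under bstates/).
bstates_file=$(mktemp)
cat > "$bstates_file" <<'EOF'
dimer_a 0.5 01
dimer_b 0.5 02
EOF

# Each basis state gets its own worker; here we just print the path it would see.
while read -r label prob auxref; do
    struct_data_ref="$WEST_SIM_ROOT/bstates/$auxref"
    echo "$label (p=$prob): WEST_STRUCT_DATA_REF=$struct_data_ref"
done < "$bstates_file"
```

Because only the auxref changes between lines, the files inside each bstate directory can share the same names.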

Then, each worker will launch `get_pcoord.sh` (also specified in west.cfg). You can see that each worker uses $WEST_STRUCT_DATA_REF to pass the corresponding pcoord (and other files, if using the HDF5 Framework) to WESTPA.

Ideally, each bstate should use the same topology (i.e. same number of ions, atoms, etc.). Otherwise, as you propagate forward, it becomes increasingly difficult to keep track of which topology to use.

Shashank:

I think I only partially understand. I have set it up so that my topology is shared; however, I'm still confused about the setup for data passing. As you mentioned, my bstates file now reflects only the directory names, and I see where $WEST_STRUCT_DATA_REF and $WEST_SIM_ROOT/bstates/{basis_state.auxref} are used in west.cfg and runseg.sh. Could you clarify how the bstate coordinate file and trajectory files get transferred? I'm looking at a mixture of the GROMACS NaCl tutorial and the HDF5 tutorial as I try to frankenstein my WESTPA setup together.

Jeremy:

`$WEST_STRUCT_DATA_REF` is only used by `get_pcoord.sh` during initialization (w_init). runseg.sh / w_run works through a different mechanism, which varies depending on whether you use the HDF5 Trajectory Storage Framework. It should be fairly clear from `get_pcoord.sh` that the coordinate files (read by mdtraj by default) are passed to `$WEST_TRAJECTORY_RETURN` and the restart files (tarred up for the next iteration) are passed to `$WEST_RESTART_RETURN`. For the former, you need to pass in the topology and coordinate files so mdtraj can load the trajectory. For the latter, you can pack in anything necessary to restart a simulation. Large files that are identical across iterations/segments (topologies and the like) should instead be softlinked later on to save space (i.e. don't pass them to `$WEST_RESTART_RETURN`).
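A minimal sketch of that part of `get_pcoord.sh` under the HDF5 Framework is shown below. It assumes `$WEST_TRAJECTORY_RETURN` and `$WEST_RESTART_RETURN` are directories that WESTPA provides; the file name `bstate.gro` is hypothetical, and the scratch fallbacks with an empty placeholder file exist only so the snippet runs standalone.

```bash
# Scratch fallbacks with an empty placeholder file when run outside WESTPA.
demo=$(mktemp -d)
: "${WEST_STRUCT_DATA_REF:=$demo/bstates/01}"
: "${WEST_TRAJECTORY_RETURN:=$demo/traj_return}"
: "${WEST_RESTART_RETURN:=$demo/restart_return}"
mkdir -p "$WEST_STRUCT_DATA_REF" "$WEST_TRAJECTORY_RETURN" "$WEST_RESTART_RETURN"
[ -f "$WEST_STRUCT_DATA_REF/bstate.gro" ] || : > "$WEST_STRUCT_DATA_REF/bstate.gro"

# Structure for mdtraj to load (a .gro file carries both atom names and coordinates).
cp "$WEST_STRUCT_DATA_REF/bstate.gro" "$WEST_TRAJECTORY_RETURN/"

# Whatever the first segment needs to start from; WESTPA packs this up.
cp "$WEST_STRUCT_DATA_REF/bstate.gro" "$WEST_RESTART_RETURN/"

# Large shared files (e.g. the topology) are left out here and linked in runseg.sh.
```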


During the simulation/w_run:
If you're using the HDF5 Framework (i.e. `west.data.data_refs.iteration` is set in `west.cfg`), then whatever you copy into `$WEST_RESTART_RETURN` within `get_pcoord.sh` is tarred up inside `traj_segs/iter_00000000.h5`, and its contents are automatically untarred into the correct offspring directory (e.g. the relevant segment directory in iteration 1). If the parent is a segment from the previous iteration, the same thing happens, but with the files passed to `$WEST_RESTART_RETURN` in the parent segment's `runseg.sh`.

Otherwise, you will need something like this block of code, which uses the `$WEST_CURRENT_SEG_INITPOINT_TYPE` environment variable to determine whether the parent is a bstate/istate (iteration 1 or a recycled walker) or a continuing segment (everything else). In the former case, `$WEST_PARENT_DATA_REF` will point to exactly the same thing as `$WEST_STRUCT_DATA_REF` (with the correct parent auxref substituted). In the latter case, it will point to what's listed under `west.data.data_refs.segment` in west.cfg, with the iteration number and segment number of the parent segment substituted. You can also softlink any necessary files (topology, etc.) here.
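The linked block is not reproduced in this thread. A sketch of the pattern described, as it commonly appears in tutorial-style `runseg.sh` scripts, is below. `SEG_INITPOINT_CONTINUES` and `SEG_INITPOINT_NEWTRAJ` are the values WESTPA sets for this variable; the file names (`seg.gro`, `bstate.gro`, `parent.gro`) and the scratch fallbacks are assumptions for illustration.

```bash
# Scratch fallbacks so the snippet runs outside WESTPA (pretends to be a new trajectory).
demo=$(mktemp -d)
: "${WEST_CURRENT_SEG_INITPOINT_TYPE:=SEG_INITPOINT_NEWTRAJ}"
: "${WEST_PARENT_DATA_REF:=$demo/parent}"
: "${WEST_CURRENT_SEG_DATA_REF:=$demo/seg}"
if [ ! -d "$WEST_PARENT_DATA_REF" ]; then
    mkdir -p "$WEST_PARENT_DATA_REF"
    : > "$WEST_PARENT_DATA_REF/bstate.gro"   # empty placeholder structure
fi

mkdir -p "$WEST_CURRENT_SEG_DATA_REF"
cd "$WEST_CURRENT_SEG_DATA_REF" || exit 1

case "$WEST_CURRENT_SEG_INITPOINT_TYPE" in
    SEG_INITPOINT_CONTINUES)
        # Parent is a segment from the previous iteration.
        ln -sf "$WEST_PARENT_DATA_REF/seg.gro" ./parent.gro
        ;;
    SEG_INITPOINT_NEWTRAJ)
        # Parent is a basis/initial state (iteration 1 or a recycled walker).
        ln -sf "$WEST_PARENT_DATA_REF/bstate.gro" ./parent.gro
        ;;
esac
```

Either branch leaves a `parent.gro` link in the segment directory, so the MD launch command that follows does not need to know which case applied.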


A lot of this should be explained in the WESTPA wiki and in sections 7.1/7.2/7.5/7.6 of the tutorial manuscript. Here is another GROMACS simulation setup you could reference. Its get_pcoord.sh creates temp files so that you don't hit race conditions when calculating your pcoord on the fly, if you're doing that.
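The referenced setup is not included in this thread, but the temp-file idea can be sketched as follows: each `get_pcoord.sh` invocation writes to its own `mktemp` files, so concurrent workers never share an intermediate file. The two `echo` values stand in for a real pcoord calculation, and the fallback for `$WEST_PCOORD_RETURN` is only there so the snippet runs outside WESTPA.

```bash
: "${WEST_PCOORD_RETURN:=$(mktemp)}"   # placeholder when run outside WESTPA

# Unique scratch files per invocation, so parallel workers cannot collide.
dist_tmp=$(mktemp)
cangle_tmp=$(mktemp)

# Stand-ins for the real analysis commands (values are made up).
echo "1.25" > "$dist_tmp"
echo "42.0" > "$cangle_tmp"

# One row, two columns: the pcoord of this basis state.
paste "$dist_tmp" "$cangle_tmp" > "$WEST_PCOORD_RETURN"

# Clean up the intermediates once the result has been handed back.
rm -f "$dist_tmp" "$cangle_tmp"
```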


