Help with assign.h5 (from w_assign/w_ipa)

Akshay

Oct 28, 2022, 9:47:53 AM

To: westpa-users

Hello WESTPA community,

I would like to seek clarification on the assign.h5 file generated by the w_assign/w_ipa command in WESTPA-2022.02.

1) Why do all the iterations in the 'trajlabels' key have the same number of segments?

The format for accessing data as per the manual (https://github.com/westpa/westpa/wiki/man:w_assign#Output_format) is /trajlabels [iteration][segment][timepoint]. The number of segments is correctly listed using the 'nsegs' key.

2) What does 'statelabels' mean and how is it different from 'trajlabels'?

3) Since 'trajlabels' contains the label of the last visited macrostate by each trajectory, can this information be readily used as an input to the msm_we module, where the input is the west.h5 file? Is there a better way than editing west.h5 to include this information (from assign.h5), with minor modifications to the source code?

Thanks in advance for the help,

Akshay

PhD student, IIT D

Jeremy Leung

Nov 15, 2022, 3:07:38 PM

To: westpa-users

Hi Akshay,

Sorry it took us a while to answer this.

1) The 'trajlabels' dataset records the state that each of your segments in west.h5 last visited. It has the same number of segments in every iteration because of a limitation of NumPy arrays: every dimension has to be of fixed size (https://numpy.org/doc/stable/user/basics.creation.html#arrays-creation). The rows for non-existent trajectories are always labeled as not occupying any state (so if you have 15 states defined (0-14), they'll be labeled 15 in every frame).

2) The 'statelabels' dataset indicates which state each frame is currently in, whereas 'trajlabels' gives the state the trajectory last visited. So if you have a trajectory that went from State 0 --> unlabeled region --> State 1, `trajlabels` will read "0,0,1", whereas `statelabels` will read "0,15,1", assuming you have 15 states defined (0-14 being the labeled ones).

3) Regarding passing this to msm_we: theoretically yes, but I believe msm_we requires much finer state definitions for the microbins, which are probably really tedious to define in a west.cfg. I would suggest using those definitions as your "stratified" bins in msm_we and letting msm_we cluster them further. John Russo would probably be a better person to answer this.
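To make points 1) and 2) concrete, here is a minimal Python sketch on toy data. It assumes /trajlabels is laid out as [iteration][segment][timepoint] and that /nsegs gives the real segment count per iteration, as described in the question. The helper names, the `initial` parameter, and the toy arrays are made up for illustration and are not part of WESTPA. `load_trimmed` reads the full dataset into memory, so it may need slicing for large files.

```python
# Sketch only: toy data stands in for a real assign.h5.

def trim_padding(trajlabels, nsegs):
    """Keep only the rows of each iteration that belong to real segments."""
    return [labels[:int(n)] for labels, n in zip(trajlabels, nsegs)]


def last_visited(statelabels, unknown, initial=None):
    """Turn per-frame state labels into last-visited labels for one segment.

    `initial` is the label carried over from the parent segment, if any
    (a hypothetical parameter for this sketch).
    """
    current = unknown if initial is None else initial
    labels = []
    for state in statelabels:
        if state != unknown:      # frame sits inside a defined state
            current = state
        labels.append(current)    # otherwise keep the last state seen
    return labels


def load_trimmed(path="assign.h5"):
    """Read /trajlabels and /nsegs from a real file (requires h5py)."""
    import h5py  # imported lazily so the toy example runs without it
    with h5py.File(path, "r") as f:
        return trim_padding(f["trajlabels"][:], f["nsegs"][:])


UNKNOWN = 15  # 15 states (0-14), so 15 means "not in any state"
toy_trajlabels = [
    [[0, 0, 1], [UNKNOWN] * 3],   # 1 real segment plus 1 padding row
    [[0, 0, 0], [1, 1, 1]],       # 2 real segments
]
toy_nsegs = [1, 2]

print(trim_padding(toy_trajlabels, toy_nsegs))
print(last_visited([0, UNKNOWN, 1], UNKNOWN))
```

The second call reproduces the example from point 2): per-frame labels "0,15,1" become last-visited labels "0,0,1".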

As for augmenting west.h5 with the 'trajlabels' information, we suggest adding it as an auxiliary dataset. It should be quite straightforward to write code in runseg.sh that 1) reads all the pcoord values, 2) assigns them to states, and 3) passes that dataset to the auxdata. If you want WESTPA to do it, here's an example of assigning a "color" to all trajectories using the bin mapper.
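The three runseg.sh steps could look roughly like the Python helper below, called at the end of runseg.sh. This is a sketch under several assumptions: a one-dimensional pcoord written one value per line to a file named pcoord.dat, two made-up state ranges, and an auxiliary dataset named state_label declared in west.cfg so that WESTPA provides a WEST_STATE_LABEL_RETURN path. All of these names are hypothetical. Note that it writes the per-frame state (like 'statelabels'), not the last-visited state.

```python
import os

# Hypothetical 1-D state definitions: label -> [lower, upper) pcoord range.
STATES = {0: (0.0, 2.0), 1: (8.0, float("inf"))}
UNKNOWN = len(STATES)  # label used for frames outside every state


def assign_state(value):
    """Return the label of the state containing `value`, else UNKNOWN."""
    for label, (lower, upper) in STATES.items():
        if lower <= value < upper:
            return label
    return UNKNOWN


def label_pcoords(pcoords):
    """Assign every pcoord value in a segment to a state."""
    return [assign_state(value) for value in pcoords]


def write_labels(pcoord_path, out_path):
    """Read pcoord values (first column), write one label per line."""
    with open(pcoord_path) as src:
        values = [float(line.split()[0]) for line in src if line.strip()]
    with open(out_path, "w") as dst:
        for label in label_pcoords(values):
            dst.write(f"{label}\n")


if __name__ == "__main__":
    # Assumed file and variable names; adjust to your own setup.
    out_path = os.environ.get("WEST_STATE_LABEL_RETURN")
    if out_path:
        write_labels("pcoord.dat", out_path)
```

With these made-up boundaries, `label_pcoords([1.0, 5.0, 9.0])` gives `[0, 2, 1]`: the middle frame lies in neither state, so it gets the "unknown" label.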

More information here about assign.h5: https://github.com/westpa/westpa/wiki/HDF5-File-Organization-of-Simulation-Data#Overall_structure_of_assignh5_STUB

Best,

Jeremy L.

Akshay

Nov 17, 2022, 12:29:38 AM

To: westpa-users

Hello Jeremy,

Thanks for addressing my queries and for all the tips. The 'coloring' scheme is very analogous to my case. I will try it out and get back in case of difficulties.

Thanks,

Akshay
