LPATH, Linguistics Pathway Analysis of Trajectories using Hierarchical clustering


Yanxiao Han

Aug 25, 2023, 6:55:24 PM
to westpa-users
Dear westpa developers,

I am wondering if there is an example of how to use LPATH to cluster trajectories from a WE simulation.

Please let me know.

Thanks,

Yanxiao

Leung, Jeremy

Aug 27, 2023, 3:12:30 PM
to westpa...@googlegroups.com
Hi Yanxiao,

You can follow the examples in this branch: https://github.com/atbogetti/LPATH/tree/examples/examples/WE

-- JL

---
Jeremy M. G. Leung
PhD Candidate, Chemistry
Graduate Student Researcher, Chemistry (Chong Lab)
University of Pittsburgh | 219 Parkman Avenue, Pittsburgh, PA 15260
jml...@pitt.edu | [He, Him, His]


Yanxiao Han

Aug 28, 2023, 2:03:57 PM
to westpa...@googlegroups.com
Hi Jeremy,
Thanks so much for sharing the example. 

1. Regarding the source/target states, how did you define the boundaries and states as shown below in the west.cfg file?
How is this part related to step 1 (clustering the conformation) ?
2. I would also like to visualize the final trajectories. Does 'representative_segments.txt' contain the iteration and segment numbers that I can use to trace back and retrieve some representative trajectories?

   

   C7_EQ:
        enabled: True
        bins:
          - type: RectilinearBinMapper
            boundaries:
              - [0, 15, 25, 90, 165, 180, 190, 250, 305, 315, 330, 361]
              - [0, 10, 30, 40, 90, 100, 110, 125, 180, 205, 230, 305, 361]
        states:
          - label: C # C7_eq
            coords:
              - [195, 60]
              - [195, 95]
              - [255, 60]
              - [255, 95]
          - label: A # C7_ax
            coords:
              - [30,310]


Thanks so much!

Yanxiao

Leung, Jeremy

Aug 28, 2023, 2:37:02 PM
to westpa...@googlegroups.com
Hi Yanxiao,

1. This is exactly how bins and target states are defined for a WESTPA simulation and for w_assign/w_ipa. `boundaries` defines the bin edges along each dimension, and under `coords` you provide any point that falls inside a desired bin to "mark" that bin as part of the state. The C7_eq state spans 4 different bins in our binning scheme, so I had to provide four different points. See WESTPA Tutorial 7.4 or the Binning page for more information.

In our alanine dipeptide example, running `lpath.discretize` does exactly what Step 1 of the LPATH paper described (i.e., running w_assign in the backend). In our example, this state definition is only used to select successful trajectories from C7_eq --> C7_ax in `lpath.extract`, but you can assign more states than source/sink and use it for `lpath.match`.


2. `representative_segments.txt` contains the last-frame segments corresponding to the highest weight in each pathway class. They should be in the form:

   Iteration number / Segment number / State ID / auxdata or pcoord / frame number / weight
   # "auxdata or pcoord" is variable length; depending on your specifications, there could be none.

      You can use westpa.analysis and/or w_trace to retrieve the trajectory using the given iteration number and segment number.
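As a rough sketch of how the iteration and segment numbers could be pulled out of such a file, the snippet below reads only the first two columns and the last one, since the middle columns vary in length. It assumes the columns are whitespace-delimited, which the thread does not confirm; the sample line and the `RepSegment` and `parse_rep_line` names are hypothetical.

```python
from typing import NamedTuple


class RepSegment(NamedTuple):
    n_iter: int    # WE iteration number (first column)
    seg_id: int    # segment number within that iteration (second column)
    weight: float  # statistical weight (last column)


def parse_rep_line(line: str) -> RepSegment:
    """Parse one data line, assuming whitespace-delimited columns."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns, got {len(fields)}")
    # int(float(...)) tolerates values that were written as e.g. "250.0".
    return RepSegment(int(float(fields[0])), int(float(fields[1])), float(fields[-1]))


# Made-up line for illustration only: iteration, segment, state ID, frame, weight.
sample = "250 12 1 9 3.2e-05"
seg = parse_rep_line(sample)
trace_arg = f"{seg.n_iter}:{seg.seg_id}"
print(trace_arg)
```

The `N_ITER:SEG_ID` string built at the end is the positional form that `w_trace` is documented to accept; whether the segment numbering in the file matches it one-to-one should be checked against your own output.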

See our wiki for more help re: any of the concepts above.
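To make point 1 concrete, here is a minimal NumPy-only sketch of how each `coords` entry lands in one bin of the quoted scheme. It is not WESTPA's own `RectilinearBinMapper` code, and `bin_indices` is a hypothetical helper; it just looks up, per dimension, which pair of boundaries each value falls between.

```python
import numpy as np

# Bin boundaries copied from the C7_EQ scheme quoted earlier in the thread.
BOUNDARIES = [
    [0, 15, 25, 90, 165, 180, 190, 250, 305, 315, 330, 361],
    [0, 10, 30, 40, 90, 100, 110, 125, 180, 205, 230, 305, 361],
]


def bin_indices(coord, boundaries=BOUNDARIES):
    """Return the 0-based bin index of `coord` along each dimension."""
    return tuple(
        int(np.digitize(value, edges)) - 1
        for value, edges in zip(coord, boundaries)
    )


# The four points listed for state C (C7_eq) and the one for state A (C7_ax).
c7_eq_coords = [[195, 60], [195, 95], [255, 60], [255, 95]]
c7_eq_bins = {bin_indices(c) for c in c7_eq_coords}
c7_ax_bin = bin_indices([30, 310])

print(sorted(c7_eq_bins))  # four distinct (dim 0, dim 1) bin index pairs
print(c7_ax_bin)
```

Each of the four C7_eq points falls in a different bin, which is why four entries are needed, while the single C7_ax point marks one bin.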

-- JL


Yanxiao Han

Mar 14, 2024, 5:09:58 PM
to westpa...@googlegroups.com
Hi Jeremy,

I have tried LPATH on my system with two-dimensional boundaries (two reaction coordinates), like the two angles in your example.
Can LPATH also be used when the boundaries are one-dimensional?

Thanks,

Yanxiao



Leung, Jeremy

Mar 15, 2024, 10:52:42 AM
to westpa...@googlegroups.com
Hi Yanxiao,

WESTPA's w_assign (and by extension LPATH) is generalizable, so you can use however many dimensions you like to define your states. So yes, a single dimension will work, as will 3, 4, or 5 dimensions.

You can follow the example at the bottom of this page where we load in a single dimension from auxdata (or pcoord) and pass that function to w_assign.
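A function of that kind might look like the sketch below. According to the w_assign help text, a `--construct-dataset` function is called as `function(n_iter, iter_group)` and must return data indexable as `[seg_id][timepoint][dimension]`. The dataset name `auxdata/distance` is a hypothetical placeholder, and the plain dictionary in the demo merely stands in for the HDF5 iteration group.

```python
import numpy as np

# Hypothetical dataset path inside each iteration group; replace with your own.
AUX_NAME = "auxdata/distance"


def load_distances(n_iter, iter_group):
    """Return one iteration's data shaped (n_segments, n_timepoints, n_dims)."""
    data = iter_group[AUX_NAME][...]
    # A scalar-per-frame dataset is stored as (n_segments, n_timepoints);
    # add a trailing axis only in that case so the result is always 3D.
    if data.ndim == 2:
        data = np.expand_dims(data, axis=2)
    return data


# Stand-in for an HDF5 group: 4 segments, 11 time points, one value per frame.
fake_group = {AUX_NAME: np.zeros((4, 11))}
print(load_distances(1, fake_group).shape)
```

If the function lives in a file named `module.py`, it would be passed as `--construct-dataset module.load_distances`. Checking the number of dimensions before expanding avoids adding an extra axis to data that already has the expected shape.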



Yanxiao Han

Mar 15, 2024, 5:27:40 PM
to westpa-users
Hi Jeremy,

I tried different ways to obtain the assign.h5 file for a one-dimensional boundary case and then run the clustering, but none of them worked:
1. Running `w_assign -W west.h5 --config-from-file --scheme DEFAULT --construct-dataset module.load_distances --serial` gives
   TypeError: coords must be 2-dimensional
2. Running `lpath discretize -we -W ./west.h5 --assign-arguments "--config-from-file --scheme DEFAULT"` gives
   KeyError: 'labels'
3. Using the assign.h5 from w_ipa and then running `lpath extract` also gives
   KeyError: 'labels'

I uploaded the west.h5, cfg, and other files at the link below. The boundary is one-dimensional: a center-to-center distance between a protein and a drug. The workflow works for two dimensions but not here.

I'd appreciate it if you could help me out.

Thank you!

Yanxiao

Jeremy Leung

Mar 18, 2024, 12:17:34 PM
to westpa-users
Hi Yanxiao,

These are two completely different problems.

1) If you are just working on your pcoord, then you don't really need a custom construct-dataset function (in fact, the `expand_dims(dataset, axis=2)` will actually turn the already-OK 2D pcoord array into 3D). Simply calling `w_assign -W west.h5 --config-from-file --scheme DEFAULT` would do it. It'll grab the pcoord by default and run using the DEFAULT scheme specified in west.cfg.

2) This is caused by the fact that you didn't comment out the `lpath extract` line in `run_lpath.sh`. See 3) for the solution. Running the following line is exactly the same as running the `w_assign` command I wrote above in 1):
`lpath discretize -we -W ./west.h5 --assign-arguments "-W west.h5 --config-from-file --scheme DEFAULT"`

3) It's throwing an error because you specified `-a labels` as an option. This attempts to pull the 'labels' dataset from the auxdata in west.h5 and save it as part of the `lpath.extract` output objects. You don't have that dataset in your west.h5 file. I'm assuming you copied this from the LPATH tutorial, where we made that dataset from k-means clustering.

This is further down the line, but since you only have two states in the state definition, the `lpath.match` pathway comparison will perform very poorly. Make sure you have some extra states in between your source and target states.
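The effect of adding intermediate states can be pictured with a toy labeling of a one-dimensional trajectory. This is only an illustrative sketch, not LPATH's actual algorithm: the bin boundaries, the distance values, and the state labels (`B`, `I`, `J`, `U`) are all hypothetical, and frames that fall in no state bin are written as `-`.

```python
import numpy as np


def state_string(distances, boundaries, states):
    """Label each frame by the state whose marked bin it falls in, else '-'."""
    edges = np.asarray(boundaries, dtype=float)
    # Each coordinate listed for a state marks the bin that contains it.
    bin_to_label = {}
    for label, coords in states.items():
        for c in coords:
            bin_to_label[int(np.digitize(c, edges)) - 1] = label
    frame_bins = np.digitize(distances, edges) - 1
    return "".join(bin_to_label.get(int(b), "-") for b in frame_bins)


# Hypothetical 1D bin boundaries and one hypothetical distance trajectory.
boundaries = [0, 5, 10, 15, 20, 100]
path = [2.0, 7.0, 12.0, 8.0, 17.0, 30.0]

two_states = {"B": [2.5], "U": [50.0]}
four_states = {"B": [2.5], "I": [7.5], "J": [12.5], "U": [50.0]}

two_state_labels = state_string(path, boundaries, two_states)
four_state_labels = state_string(path, boundaries, four_states)
print(two_state_labels)
print(four_state_labels)
```

With only a source and a target state, every frame in between is unlabeled, so there is little to distinguish one pathway from another; the intermediate states give each pathway a more distinctive label sequence to compare.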

Best,

Jeremy L.

Yanxiao Han

Mar 20, 2024, 4:29:58 PM
to westpa...@googlegroups.com
Hi Jeremy,

Thank you for your help. The problems are resolved.
Thank you also for the suggestion to define more states; it works. Without them, all of the trajectories ended up in a single cluster.

Best,

Yanxiao
