error while running MAB

moutos...@gmail.com

Apr 26, 2021, 7:47:53 PM
to westpa-users
Hi, 
I am trying to run MAB. I downloaded the files available at https://github.com/westpa/user_submitted_scripts/tree/main/Adaptive_Binning
I followed my usual procedure for running the simulation, calling env.sh, init.sh and run.sh.
However, I got the following error:

Traceback (most recent call last):
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/cmds/w_run.py", line 24, in <module>
    sim_manager = westpa.rc.get_sim_manager()
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/west_tools/westpa/_rc.py", line 242, in get_sim_manager
    self._sim_manager = self.new_sim_manager()
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/west_tools/westpa/_rc.py", line 236, in new_sim_manager
    sim_manager = extloader.get_object(drivername)(rc=self)
  File "/home/msaha/westpa/nacl_amb_spce-hp/glycine/2ps/B2UB/adaptive-binning/manager.py", line 50, in __init__
    self.we_driver = self.rc.get_we_driver()
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/west_tools/westpa/_rc.py", line 272, in get_we_driver
    self._we_driver = self.new_we_driver()
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/west_tools/westpa/_rc.py", line 266, in new_we_driver
    we_driver = extloader.get_object(drivername)(rc=self)
  File "/home/msaha/westpa/nacl_amb_spce-hp/glycine/2ps/B2UB/adaptive-binning/driver.py", line 63, in __init__
    self.system = system or self.rc.get_system_driver()
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/west_tools/westpa/_rc.py", line 488, in get_system_driver
    self._system = self.new_system_driver()
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/west_tools/westpa/_rc.py", line 335, in new_system_driver
    system = extloader.get_object(sysdrivername)(rc=self)
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/west_tools/westpa/extloader.py", line 62, in get_object
    module = load_module(modspec, path)
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/west_tools/westpa/extloader.py", line 29, in load_module
    (fp, pathname, desc) = imp.find_module(next_component, path)
  File "/home/msaha/.conda/envs/westpa-2020.02/lib/python3.7/imp.py", line 296, in find_module
    raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named 'system'


When I tried the files from https://github.com/westpa/user_submitted_scripts/tree/main/Adaptive_Binning/adaptive_2.0 instead, I got "ModuleNotFoundError: No module named 'westpa.core'".

How can I run adaptive binning?

Thanks.
Moutoshi


Anthony Bogetti

Apr 26, 2021, 7:51:51 PM
to westpa...@googlegroups.com
Hi Moutoshi,

It looks like you are using WESTPA 1.0 (from the conda install), correct?  In this case, please use the non-2.0 adaptive binning files here.  The import statements are slightly different going from WESTPA-1.0 to WESTPA-2.0.  Once WESTPA-2.0 is officially released we will clean up all of these files, sorry for the confusion.
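
For anyone who hits the same pair of errors, one quick way to see which package layout an environment provides is to probe for the modules named in the two messages. This is only a sketch: it assumes that a 1.0 install exposes a top-level `west` package (the MAB scripts import from `west.systems`) and that `westpa.core` exists only in 2.0. The helper names `has_module` and `detect_westpa_layout` are made up for illustration.

```python
import importlib.util


def has_module(name, find_spec=importlib.util.find_spec):
    """Return True if `name` can be located (parent packages get imported)."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is missing
        return False


def detect_westpa_layout(find_spec=importlib.util.find_spec):
    """Guess the installed WESTPA layout from the modules that are present."""
    # Assumption: `westpa.core` only exists in the 2.0 layout.
    if has_module("westpa.core", find_spec):
        return "2.x layout (use the adaptive_2.0 files)"
    # Assumption: a top-level `west` package only exists in the 1.0 layout.
    if has_module("west", find_spec):
        return "1.0 layout (use the non-2.0 files)"
    return "WESTPA not found on sys.path"


print(detect_westpa_layout())
```

Run it inside the same conda environment that w_run uses, otherwise the answer describes a different install.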

Let me know if you have any other questions or errors.

Best,
Anthony

Anthony Bogetti

Apr 26, 2021, 7:55:30 PM
to westpa...@googlegroups.com
In addition, make sure that your west.cfg file matches this one, especially when specifying the driver, sim_manager and we_driver. (It's okay to use this file in this case even though it is in the 2.0 folder)

Best,
Anthony


MOUTOSHI SAHA

Apr 26, 2021, 7:58:42 PM
to westpa...@googlegroups.com
Hi Anthony,

Thanks for your reply. I am using westpa-2020.02 (downloaded from https://github.com/westpa/westpa/wiki/WESTPA-Quick-Installation). Is that WESTPA-1.0?

Regards-
Moutoshi




Anthony Bogetti

Apr 26, 2021, 8:02:14 PM
to westpa...@googlegroups.com
Yes, that is WESTPA-1.0.  In that case, copy the adaptive.py, driver.py and manager.py files from the directory I linked in my previous email into your $WEST_SIM_ROOT, and make sure that your west.cfg file has the same sections as the west.cfg file I linked in my previous email.  After that, you should be good to initialize and run your WESTPA simulation (of course, please adjust the parameters in your adaptive.py to suit your system and goals).
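
A small pre-flight check along these lines can confirm that the files are in place before initializing. It is a sketch only: `missing_mab_files` is a hypothetical helper, and it checks only that the files exist, not that the west.cfg sections are correct.

```python
import os
import tempfile

# Files named in this thread that should sit in $WEST_SIM_ROOT for a MAB run.
REQUIRED = ("adaptive.py", "driver.py", "manager.py", "west.cfg")


def missing_mab_files(sim_root):
    """Return the required file names that are absent from sim_root."""
    return [name for name in REQUIRED
            if not os.path.isfile(os.path.join(sim_root, name))]


# Demonstration on a throwaway directory that has only two of the files.
with tempfile.TemporaryDirectory() as demo_root:
    for name in ("adaptive.py", "west.cfg"):
        open(os.path.join(demo_root, name), "w").close()
    demo_missing = missing_mab_files(demo_root)
    print(demo_missing)  # ['driver.py', 'manager.py']
```

In a real setup you would call `missing_mab_files(os.environ["WEST_SIM_ROOT"])` after sourcing env.sh and expect an empty list.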

Best,
Anthony

MOUTOSHI SAHA

Apr 26, 2021, 8:42:18 PM
to westpa...@googlegroups.com
It's running now. Thanks for your help Anthony. 

Regards-
Moutoshi




moutos...@gmail.com

May 5, 2021, 4:44:41 PM
to westpa-users
Hi Anthony,

While running MAB, I am getting the following error.

"Wed May  5 14:31:33 2021
Iteration 359 (1000 requested)
Beginning iteration 359
48 segments remain in iteration 359 (48 total)
12 of 20 (60.000000%) active bins are populated
per-bin minimum non-zero probability:       3.52653e-06
per-bin maximum probability:                0.582973
per-bin probability dynamic range (kT):     12.0156
per-segment minimum non-zero probability:   8.81633e-07
per-segment maximum non-zero probability:   0.147089
per-segment probability dynamic range (kT): 12.0248
norm = 1, error in norm = 0 (0*epsilon)
exception caught; shutting down
-- ERROR    [w_run] -- Traceback (most recent call last):
  File "/home/msaha/.conda/envs/westpa-2020.02/westpa-2020.03/lib/cmds/w_run.py", line 49, in <module>
    sim_manager.run()
  File "/home/msaha/westpa/adaptive-binning/left2right-5ps/manager.py", line 638, in run
    self.run_we()
  File "/home/msaha/westpa/adaptive-binning/left2right-5ps/manager.py", line 559, in run_we
    self.we_driver.construct_next()
  File "/home/msaha/westpa/adaptive-binning/left2right-5ps/driver.py", line 698, in construct_next
    self._run_we()
  File "/home/msaha/westpa/adaptive-binning/left2right-5ps/driver.py", line 538, in _run_we
    self._recycle_walkers()
  File "/home/msaha/westpa/adaptive-binning/left2right-5ps/driver.py", line 329, in _recycle_walkers
    .format(n_recycled_walkers,len(self.avail_initial_states)))
driver.ConsistencyError: need 4 initial states for recycling, but only 2 present'"

Could you please tell me why this is happening?

Thanks.
Moutoshi

Anthony Bogetti

May 7, 2021, 1:24:40 PM
to westpa...@googlegroups.com
Hi Moutoshi,

Thanks for posting your question.  I'll have to check with another lab member, and we will get back to you soon with a solution or a description of what is causing this error.

Best,
Anthony

Anthony Bogetti

May 10, 2021, 10:48:33 AM
to westpa...@googlegroups.com
Hi Moutoshi,

Was this error obtained after stopping and restarting a WESTPA simulation?  Or was your simulation interrupted with the error during a continuous run that started from iteration one?  One thing to try would be to w_truncate your simulation back 1 or 2 iterations (also removing the binbounds.txt file from your WEST_SIM_ROOT if it exists), resubmit your simulation and see if that fixes the problem.  Note that you will also need to remove corresponding seg_logs and traj_segs after w_truncating since w_truncate only removes information from the west.h5 file.
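
Since w_truncate leaves the on-disk files alone, it can help to list what is left over before deleting anything. The sketch below only lists candidates. It assumes that entries in seg_logs and traj_segs start with a six-digit iteration number (the attachment names in this thread, such as 004862-000023.log and 004861.tar, follow that pattern; the traj_segs layout is an assumption). `stale_paths` is a hypothetical helper, and which iteration counts as the first removed one should be checked against the w_truncate help for your version.

```python
import os
import re
import tempfile

LEADING_ITER = re.compile(r"^(\d{6})")  # e.g. 004862-000023.log, 004861.tar


def stale_paths(sim_root, first_removed_iter):
    """List files/dirs left behind for iterations >= first_removed_iter."""
    found = []
    for sub in ("seg_logs", "traj_segs"):
        folder = os.path.join(sim_root, sub)
        if not os.path.isdir(folder):
            continue
        for entry in sorted(os.listdir(folder)):
            match = LEADING_ITER.match(entry)
            if match and int(match.group(1)) >= first_removed_iter:
                found.append(os.path.join(sub, entry))
    # The MAB scripts also read binbounds.txt from the simulation root.
    if os.path.isfile(os.path.join(sim_root, "binbounds.txt")):
        found.append("binbounds.txt")
    return found


# Dry run on a throwaway directory tree; nothing is deleted.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "seg_logs"))
    for it in ("004861", "004862"):
        os.makedirs(os.path.join(root, "traj_segs", it))
    for name in ("004861.tar", "004862-000000.log"):
        open(os.path.join(root, "seg_logs", name), "w").close()
    open(os.path.join(root, "binbounds.txt"), "w").close()
    demo_stale = stale_paths(root, 4862)
    print(demo_stale)
```

Review the printed list by hand and remove the entries yourself once you are sure they belong to truncated iterations.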

If that doesn't work, please let me know and we can troubleshoot further.  Feel free to send along any other log files (like seg_logs and slurm.logs etc.) as those will help to more accurately diagnose what is going on.

Best,
Anthony

moutos...@gmail.com

May 11, 2021, 1:34:12 PM
to westpa-users
Hi Anthony,

It failed during a continuous run. I tried w_truncate and reran, but it failed again. Attached are the west.log and seg_logs files for the rerun.
The files are for a run that failed at iteration 4859 the first time and at iteration 4862 during the rerun.

The run failed when the first recycling event (from the target state to the unbound state) occurred. I would appreciate your help on this.

Separately, I ran MAB on NaCl and obtained a PMF, but the west.log showed that no recycling happened. With manual binning, recycling started at iteration 8 for NaCl according to the west.log. Why does the west.log for the MAB NaCl run show no recycling, even though the PMFs from the MAB and manual binning runs are almost the same?

Thanks.
Moutoshi  
004861.tar
west.log
004862-000023.log
004854.tar
004856.tar
004862-000010.log
004855.tar
004862-000000.log

Anthony Bogetti

May 11, 2021, 2:23:43 PM
to westpa...@googlegroups.com
Hi Moutoshi,

Thanks for sending these files; I will take a look at them.  Would you also be able to send your adaptive.py file?

Thanks,
Anthony

moutos...@gmail.com

May 11, 2021, 11:39:31 PM
to westpa-users
Hi Anthony,

Here is the adaptive.py file. 
Thanks for your help.

Regards-
Moutoshi
adaptive.py

Anthony Bogetti

May 12, 2021, 10:08:42 AM
to westpa...@googlegroups.com
Thanks Moutoshi.  We've taken a look and your setup looks fine.  I have never seen this error before and am a little unsure about what might be causing it.

To help isolate the issue, please try the following:
  1. w_truncate back a few iterations like you did before.  Be sure to delete the corresponding seg_logs and traj_segs from their respective folders.
  2. Continue the truncated simulation with a manual binning scheme (this might take a bit of adjustment from your current setup) and see if you are still seeing a crash or if it keeps going.  This will help us isolate the issue to either the MAB scheme or WESTPA in general.
We will send you a script a bit later today or tomorrow that will add some helpful output to your west.log that can help us even further, but for now try the suggestion above and let me know what happens.

Your simulation has been running for a long time, did you notice any errors before this?  It's a little surprising it would run so long and fail after so many iterations with no prior indications.  It seems from the west.log that you have recycling occurring as it should and your probabilities are not too low.  Have you run the simulation this long with a manual binning scheme?  If so, what was the result?
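
As a rough guide to reading those numbers: the "dynamic range (kT)" lines in west.log appear to be the natural log of the ratio between the largest and smallest non-zero probability. That reading is inferred from the numbers in the iteration 359 excerpt earlier in this thread rather than taken from the WESTPA source, and `dynamic_range_kT` is a made-up helper name.

```python
import math


def dynamic_range_kT(p_min, p_max):
    """Natural log of the ratio between the largest and smallest probability."""
    return math.log(p_max / p_min)


# Values copied from the iteration 359 west.log excerpt in this thread.
per_bin = dynamic_range_kT(3.52653e-06, 0.582973)
per_segment = dynamic_range_kT(8.81633e-07, 0.147089)
print(per_bin, per_segment)  # compare with 12.0156 and 12.0248 in the log
```

Both values agree with the log to the printed precision, which supports the interpretation above without proving it.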

Best,
Anthony

Anthony Bogetti

May 12, 2021, 10:21:05 AM
to westpa...@googlegroups.com
Hi Moutoshi,

Here is the script I was referring to in my previous email regarding troubleshooting.  Go ahead and w_truncate then run your simulation again but with this adaptive_verbose.py script instead, which is the same as your previous one but will print out more information into your west.log file about what is happening.  Please send the log file after the crash occurs so we can take a look.

Best,
Anthony
adaptive_verbose.py

moutos...@gmail.com

May 12, 2021, 11:06:50 AM
to westpa-users

Thanks Anthony. I will do what you suggested and let you know the update. 
I noticed that the error occurred at the iteration where recycling started for the first time. Everything was fine before recycling started. I have not run manual binning for this system yet, so I will try a manual binning simulation as well.

Also, can you please tell me why I got a PMF from MAB for a steady-state (non-equilibrium) NaCl run when the west.log shows that no recycling happened? Attached are the adaptive.py and west.log for NaCl. I thought that in a steady-state run, recycling would happen from bound to unbound, but it did not happen for NaCl and I still got the PMF.

Thanks.
Moutoshi
west-10251033.log
init.sh
adaptive.py

moutos...@gmail.com

May 12, 2021, 12:00:53 PM
to westpa-users
Hi Anthony,
The run failed again. Here is the west.log file. I used the new script. 
Thanks.
Moutoshi

west.log

Anthony Bogetti

May 12, 2021, 12:50:09 PM
to westpa...@googlegroups.com
Hi Moutoshi,

In the case of NaCl, you will probably need to set the target state to be greater than 1 Angstrom in order to see recycling events.  For the tutorial example, we used <2.6 Angstroms as the target state based on the Na-Cl bond distance from an energy-minimized structure with one of the Amber force fields.  I think the ions probably won't get much closer than that, meaning you won't see any recycling unless you increase your target state definition.  Without recycling, you are essentially just running an equilibrium WE simulation which would explain the similarities you are seeing there.  Try increasing the target state definition in your adaptive.py file and let me know how that goes; you should see some recycling before 100 iterations have completed. 
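
To make the effect of the target definition concrete, here is a minimal sketch of the `targetstate` / `targetstatedirection` convention as the comments in the MAB adaptive.py describe it (inner lists are AND-ed, outer lists are OR-ed, direction -1 means "at or below the bound"). It illustrates the intended semantics and is not a copy of the script's own loop. `in_target` and the distances are hypothetical.

```python
def in_target(pcoord, targetstate, targetstatedirection):
    """True if pcoord satisfies every condition of at least one inner list."""
    for bounds, directions in zip(targetstate, targetstatedirection):
        conditions = [
            value * sign >= bound * sign
            for value, bound, sign in zip(pcoord, bounds, directions)
            if bound is not None  # None means "no target in this dimension"
        ]
        if conditions and all(conditions):
            return True
    return False


# Hypothetical Na-Cl distance of 2.5 Angstroms with two target definitions.
print(in_target([2.5], [[2.6]], [[-1]]))  # True
print(in_target([2.5], [[1.0]], [[-1]]))  # False
```

With a bound of 1.0 a walker at 2.5 Angstroms never counts as being in the target, so no recycling would be triggered; with 2.6 it does.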

If you are using a different potential/force field, you can test this out by searching your h5 file's pcoord data set to see how close the two ions get when bound and use that as your target state definition, or maybe slightly above that value.
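
A sketch of that search is below. It assumes the usual west.h5 layout of `iterations/iter_XXXXXXXX/pcoord` with shape (segments, frames, dimensions); please verify this against your own file. `closest_approach` is a hypothetical helper written in plain Python so it also runs on the small stand-in data shown here; on a large file the nested loops will be slow.

```python
def closest_approach(iterations, dim=0):
    """Smallest pcoord value along `dim` over all iterations, segments, frames."""
    best = None
    for name in sorted(iterations):
        for segment in iterations[name]["pcoord"]:
            for frame in segment:
                value = float(frame[dim])
                if best is None or value < best:
                    best = value
    return best


# Stand-in with the assumed layout: iteration -> "pcoord" -> [segment][frame][dim]
fake = {
    "iter_00000001": {"pcoord": [[[9.8], [9.1]], [[10.2], [8.7]]]},
    "iter_00000002": {"pcoord": [[[8.7], [3.0]], [[9.1], [2.7]]]},
}
print(closest_approach(fake))  # 2.7
```

With h5py installed, the same function could be pointed at a real file via `with h5py.File("west.h5", "r") as f: print(closest_approach(f["iterations"]))`.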

We are currently looking into the error from the other log you sent and will get back to you soon regarding that.

Best,
Anthony

Gabriel Monteiro da Silva

Jul 31, 2021, 1:02:31 PM
to westpa-users
Hi Anthony,

Just wanted to report that I have encountered a similar error in my steady-state simulations using adaptive sampling: the simulation will crash with the following error message: 

Sat Jul 31 00:46:19 2021
Iteration 332 (350 requested)
Beginning iteration 332
424 segments remain in iteration 332 (424 total)
53 of 72 (73.611111%) active bins are populated
per-bin minimum non-zero probability:       1.36748e-76
per-bin maximum probability:                0.718155
per-bin probability dynamic range (kT):     174.352
per-segment minimum non-zero probability:   1.70936e-77
per-segment maximum non-zero probability:   0.110655
per-segment probability dynamic range (kT): 174.562
norm = 1, error in norm = 2.22045e-16 (1*epsilon)
exception caught; shutting down
-- ERROR    [w_run] -- Traceback (most recent call last):
  File "/users/gmontei2/.conda/envs/wpa/westpa-2020.03/lib/cmds/w_run.py", line 49, in <module>
    sim_manager.run()
  File "/gpfs/data/brubenst/gmontei2/tk/abl1_rmsd2/manager.py", line 637, in run
    self.run_we()
  File "/gpfs/data/brubenst/gmontei2/tk/abl1_rmsd2/manager.py", line 558, in run_we
    self.we_driver.construct_next()
  File "/gpfs/data/brubenst/gmontei2/tk/abl1_rmsd2/driver.py", line 697, in construct_next
    self._run_we()
  File "/gpfs/data/brubenst/gmontei2/tk/abl1_rmsd2/driver.py", line 537, in _run_we
    self._recycle_walkers()
  File "/gpfs/data/brubenst/gmontei2/tk/abl1_rmsd2/driver.py", line 328, in _recycle_walkers
    .format(n_recycled_walkers,len(self.avail_initial_states)))
driver.ConsistencyError: need 2 initial states for recycling, but only 1 present


But then if I continue the simulation (even without truncating, just deleting binbounds.txt) it works and properly recycles. It's not really a big deal since it doesn't break anything but rather just mildly interrupts it. It seems that pcoords reaching recycling values is what triggers this.

Here is my adaptive.py file:

from __future__ import print_function, division

import numpy
from west.propagators import WESTPropagator
from west.systems import WESTSystem
from westpa.binning import RectilinearBinMapper
from westpa.binning import FuncBinMapper
from westpa.binning import RecursiveBinMapper
import logging

log = logging.getLogger('westpa.rc')
PI = numpy.pi

from numpy import *  #note: this rebinds `log` to numpy.log, which the code below relies on

pcoord_dtype = numpy.float32

#THESE ARE THE PARAMETERS YOU CAN CHANGE
bintargetcount=8 #number of walkers per bin
numberofdim=2  # number of dimensions
binsperdim=[8,8]   # You will have prod(binsperdim)+numberofdim*(2+2*splitIsolated)+activetarget bins total
pcoordlength=6 # length of the pcoord
maxcap=[45,45] #for each dimension enter the maximum number at which binning can occur, if you do not wish to have a cap use inf
mincap=[9,9]  #for each dimension enter the minimum number at which binning can occur, if you do not wish to have a cap use -inf

#How "and"/"or" target states work: both targetstate and targetstatedirection are arrays of arrays. Inner arrays act as "and" statements while outer arrays impose "or" conditions. Inner arrays follow the indexing of the dimensions.
targetstate=[[36,13]]    #enter boundaries for target state or None if there is no target state in that dimension
targetstatedirection=[[1,-1]]  #if your target state is meant to be greater than the starting pcoord use 1 or else use -1. This will be done for each dimension in your simulation
activetarget=1 #if there is no target state make this zero
splitIsolated=1     #choose 0 to disable the use of bottleneck walkers (not recommended)
#########

def function_map(coords, mask, output):
    splittingrelevant=True #This is to make sure splitting is relevant (not relevant for binner after recycling for example)
    originalcoords=copy(coords) #It is a good idea to keep an original array
    maxlist=[] #Preparing array to contain maximum pcoords in each dimension
    minlist=[] #Preparing array to contain minimum pcoords in each dimension
    difflist=[] #Preparing array to contain "bottleneck" values in positive direction for each dimension
    flipdifflist=[] #Preparing array to contain "bottleneck" values in negative direction for each dimension

    for n in range(numberofdim): #going through each dimension
        try:    #because binning should be handled different for recycled trajectories we load in a binbounds.txt created after an iteration completes
            extremabounds=loadtxt('binbounds.txt')
            currentmax=amax(extremabounds[:,n])
            currentmin=amin(extremabounds[:,n])
        except: #during initialization this may not exist so use current coords for extrema
            currentmax=amax(coords[:,n])
            currentmin=amin(coords[:,n])
        if maxcap[n]<currentmax: #Checking the maxcap in each dimension
            currentmax=maxcap[n]
        if mincap[n]>currentmin: #Checking the mincap in each dimension
            currentmin=mincap[n]
        maxlist.append(currentmax) #Need arrays for our extrema since there may be multiple depending on number of dimension
        minlist.append(currentmin)

        try: #Recycled trajectories should not be tagged and will throw exception to be handled by except statement
            temp=column_stack((coords[:,n],coords[:,numberofdim])) #Create an array containing progress coordinates of one dimension and associated probabilities
            temp=temp[temp[:,0].argsort()] #Sort this by progress coordinate
            for p in range(len(temp)):  #This just deals with the fact that currently received probabilities are in float32 (it is probably best to disregard probabilities smaller than E-39 anyway for tagging)
                if temp[p][1]==0:
                    temp[p][1]=10**-39
            fliptemp=flipud(temp) #Received sorted array in opposite direction
            difflist.append(0) #Provide starting minimum of 0 (in very unlikely case of pcoord 0 and no tagged this could cause arbitrary tag (very minor impact), work to fix)
            flipdifflist.append(0)
            maxdiff=0
            flipmaxdiff=0
            for i in range(1,len(temp)-1): #calculating of the "bottleneck" values, we need to sum all of the probability past a potential "bottleneck"
                comprob=0
                flipcomprob=0
                j=i+1
                while j<len(temp): #calculating the cumulative probability past each walker in each direction
                    comprob=comprob+temp[j][1]
                    flipcomprob=flipcomprob+fliptemp[j][1]
                    j=j+1
                if temp[i][0]<maxcap[n] and temp[i][0]>mincap[n]:
                    if (-log(comprob)+log(temp[i][1]))>maxdiff: #we want to find the point where the difference between the walker and the cumulative probability past it is at a maximum, we use logarithms to compare differences
                        difflist[n]=temp[i][0]
                        maxdiff=-log(comprob)+log(temp[i][1])
                if fliptemp[i][0]<maxcap[n] and fliptemp[i][0]>mincap[n]:
                    if (-log(flipcomprob)+log(fliptemp[i][1]))>flipmaxdiff:
                        flipdifflist[n]=fliptemp[i][0]
                        flipmaxdiff=-log(flipcomprob)+log(fliptemp[i][1])
        except:
            splittingrelevant=False  #if an error is thrown tagging of bottleneck walkers is not needed

    for i in range(len(output)): #this section deals with proper assignment of walkers to bins
        binnumber=2*numberofdim #essentially the bin number
        intarget=False
        for k in range(len(targetstate)):
            for l in range(len(targetstate[k])):
                if (activetarget==1) and targetstate[k][l] is not None:
                    if (originalcoords[i,l]*targetstatedirection[k][l]) >= (targetstate[k][l]*targetstatedirection[k][l]): #if the target state has been reached assign to following bin
                        intarget=True
                    else:
                        l=len(targetstate[k])
                        intarget=False
            if intarget:
                k=len(targetstate)
                binnumber=prod(binsperdim)+numberofdim*2

        for n in range(numberofdim):
            if (binnumber==prod(binsperdim)+numberofdim*2): #this ends the loop if binned in target state, n= numberofdim should not go in above line because of elif statements
                n=numberofdim
            elif coords[i,n]>=maxlist[n] or originalcoords[i,n]>=maxcap[n]: #assign maxima or those over max cap to own bin
                binnumber= 2*n
                n=numberofdim
            elif coords[i,n]<=minlist[n] or originalcoords[i,n]<=mincap[n]: #assign minima or those under minima to own bin
                binnumber =2*n+1
                n=numberofdim
            elif splittingrelevant and coords[i,n]==difflist[n] and splitIsolated==1: #assign bottleneck walker in one direction to bin
                binnumber=prod(binsperdim)+numberofdim*2+2*n+activetarget
                n=numberofdim
            elif splittingrelevant and coords[i,n]==flipdifflist[n] and splitIsolated==1: #assign bottleneck walker in other direction to bin
                binnumber=prod(binsperdim)+numberofdim*2+2*n+activetarget+1
                n=numberofdim

        if binnumber==2*numberofdim: #calculate binning for evenly spaced bins
            for j in range(numberofdim):
                binnumber = binnumber + (digitize(coords[i][j],linspace(minlist[j],maxlist[j],binsperdim[j]+1))-1)*prod(binsperdim[0:j])
        output[i]=binnumber
    return output

#######
class System(WESTSystem): #class initialization
    def initialize(self):
        self.pcoord_ndim = numberofdim
        self.pcoord_len = pcoordlength
        self.pcoord_dtype = numpy.float32
        self.bin_mapper = FuncBinMapper(function_map, prod(binsperdim)+numberofdim*(2+2*splitIsolated)+activetarget) #Changed binsperbin to binsperdim
        self.bin_target_counts = numpy.empty((self.bin_mapper.nbins,), numpy.int_)
        self.bin_target_counts[...] = bintargetcount

...
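
As a cross-check of the bin-count formula in the comment next to `binsperdim`, here is a small worked example with the parameters above. `total_bins` is a hypothetical helper and not part of the MAB scripts.

```python
def total_bins(binsperdim, splitIsolated, activetarget):
    """prod(binsperdim) + numberofdim*(2 + 2*splitIsolated) + activetarget"""
    numberofdim = len(binsperdim)
    grid = 1
    for count in binsperdim:  # plain loop so this also runs on Python 3.7
        grid *= count
    return grid + numberofdim * (2 + 2 * splitIsolated) + activetarget


print(total_bins([8, 8], splitIsolated=1, activetarget=1))  # 73
```

That is 64 grid bins, 8 extrema and bottleneck bins, and 1 target bin. The log above reports 72 active bins, which would be consistent if the target bin is not counted as active (an assumption, not something confirmed in this thread).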

And my tstates file

a_loop_in    36    13


Thanks!

Anthony Bogetti

Jul 31, 2021, 2:17:07 PM
to westpa...@googlegroups.com
Hi Gabriel,

Thanks for reporting this error that you encountered and I am glad you were able to proceed with your simulation after some troubleshooting.  This is interesting, was there by any chance a binbounds.txt file present in your $WEST_SIM_ROOT from a previous simulation when the error first occurred?  Or maybe that would have been overwritten automatically.  I will take a closer look and see how to fix this and prevent any future inconveniences.

Best,
Anthony

Gabriel Monteiro da Silva

Jul 31, 2021, 2:49:48 PM
to westpa-users
Thanks Anthony for the very fast reply. My job submission script removes the binbounds.txt file before starting WESTPA so the file was fresh from that part of the run.  