Problems with Docker Image and ipy parallel computing


Miriam Gade

Sep 12, 2021, 1:23:00 PM9/12/21
to hddm-users

Dear HDDM users,
I run HDDM 0.8 in the docker image under windows 10 and encounter the following problem. When I run this model with a single chain, every works nice and finishes in due time:

def run_model(id):
    print('running model %i' % id)

    import hddm
    import random
    #import os

    exp_name = 'Flanker'
    print('running model %i' % id, 'for exp', exp_name)

    # USE absolute paths in Docker.
    dbname = '/home/jovyan/example/df_' + exp_name + '_chain_vaz_test_%i.db' % id  # define the database name, which uses pickle format
    mname  = '/home/jovyan/example/df_' + exp_name + '_chain_vaz_test_%i' % id     # define the name for the model
    fname  = '/home/jovyan/example/df_' + exp_name + '.csv'
    df = hddm.load_csv('/home/jovyan/hddm/FlankerHDDM.csv')
    df_subj = df['subj_idx'].unique()
    df_test = df[df['subj_idx'].isin(df_subj)]

    m = hddm.HDDM(df_test, depends_on={'v': ['congruency', 'block'], 'a': ['block'], 't': ['congruency', 'block']}, include=('sv', 'st'), group_only_nodes=('sv', 'st'), p_outlier=.05)
    m.sample(10000, burn=1000, dbname='db%i' % id, db='pickle')

    m.save(mname)

    return m

However, when I try to run several chains in parallel to check the Gelman-Rubin statistic with the following code

from ipyparallel import Client

v = Client()[:]

import os
import time

start_time = time.time()  # the start time of the processing
cur_dir = os.getcwd()
print('The current working directory is', cur_dir)
jobs = v.map(run_model, range(4)) # 4 is the number of CPUs
models = jobs.get()

print("\nRunning 4 chains used: %f seconds." % (time.time() - start_time))



I get the following error message.

Exception in callback BaseAsyncIOLoop._handle_events(53, 1)
handle: <Handle BaseAsyncIOLoop._handle_events(53, 1)>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/opt/conda/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 139, in _handle_events
    handler_func(fileobj, events)
  File "/opt/conda/lib/python3.7/site-packages/zmq/eventloop/zmqstream.py", line 456, in _handle_events
    self._handle_recv()
  File "/opt/conda/lib/python3.7/site-packages/zmq/eventloop/zmqstream.py", line 486, in _handle_recv
    self._run_callback(callback, msg)
  File "/opt/conda/lib/python3.7/site-packages/zmq/eventloop/zmqstream.py", line 438, in _run_callback
    callback(*args, **kwargs)
  File "<decorator-gen-153>", line 2, in _dispatch_reply
  File "/opt/conda/lib/python3.7/site-packages/ipyparallel/client/client.py", line 75, in unpack_message
    return f(self, msg)
  File "/opt/conda/lib/python3.7/site-packages/ipyparallel/client/client.py", line 937, in _dispatch_reply
    handler(msg)
  File "/opt/conda/lib/python3.7/site-packages/ipyparallel/client/client.py", line 830, in _handle_apply_reply
    self.results[msg_id] = serialize.deserialize_object(msg['buffers'])[0]
  File "/opt/conda/lib/python3.7/site-packages/ipyparallel/serialize/serialize.py", line 145, in deserialize_object
    canned = pickle.loads(pobj)
  File "/opt/conda/lib/python3.7/site-packages/hddm/models/base.py", line 739, in __setstate__
    super(HDDMBase, self).__setstate__(d)
  File "/opt/conda/lib/python3.7/site-packages/kabuki/hierarchical.py", line 396, in __setstate__
    self.load_db(d['dbname'], db=d['db'])
  File "/opt/conda/lib/python3.7/site-packages/kabuki/hierarchical.py", line 822, in load_db
    db = db_loader(dbname)
  File "/opt/conda/lib/python3.7/site-packages/pymc/database/pickle.py", line 82, in load
    file = open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'db3'

Why can it create db0 when running a single chain, but only that one and no more when running in parallel?

I updated the ipykernel via Docker and reduced the model, but nothing changed. Ideas and suggestions are highly welcome, thanks! Miriam

hcp...@gmail.com

Sep 13, 2021, 8:57:34 PM9/13/21
to hddm-users
Hi, Miriam,

To debug this, the first thing I would try is to test the multi-chain setup with the example data inside the Docker image. If the example code runs without error, I would then examine my own code. Below are some comments on the code you posted: problematic lines are annotated with inline comments, and suggested replacements appear as commented-out lines.

```python
def run_model(id):
    print('running model %i' % id)

    import hddm
    import random  # this module was used in the example to randomly select a few participants, so that we do not need to wait too long; it seems you did not select participants

    #import os

    exp_name = 'Flanker'
    print('running model %i' % id, 'for exp', exp_name)

    # USE absolute paths in Docker.
    dbname = '/home/jovyan/example/df_' + exp_name + '_chain_vaz_test_%i.db' % id  # define the database name, which uses pickle format
    mname  = '/home/jovyan/example/df_' + exp_name + '_chain_vaz_test_%i' % id     # define the name for the model
    fname  = '/home/jovyan/example/df_' + exp_name + '.csv'          # this line and the two above mean that data will be saved in `/home/jovyan/example/`
    df = hddm.load_csv('/home/jovyan/hddm/FlankerHDDM.csv')  # this line shows that the data is in `/home/jovyan/hddm/`; please make sure both folders, `example` and `hddm`, exist under `/home/jovyan/`
    # df = hddm.load_csv(fname)
    df_subj = df['subj_idx'].unique()
    df_test = df[df['subj_idx'].isin(df_subj)]  # you may not need this step to create a new data frame called "df_test"; you can use "df" directly

    m = hddm.HDDM(df_test, depends_on={'v': ['congruency', 'block'], 'a': ['block'], 't': ['congruency', 'block']}, include=('sv', 'st'), group_only_nodes=('sv', 'st'), p_outlier=.05)
    m.sample(10000, burn=1000, dbname='db%i' % id, db='pickle')  # here, `dbname='db%i' % id` means that you did not use the `dbname` defined above, i.e. you did not use the absolute path
    # m = hddm.HDDM(df, depends_on={'v': ['congruency', 'block'], 'a': ['block'], 't': ['congruency', 'block']}, include=('sv', 'st'), group_only_nodes=('sv', 'st'), p_outlier=.05)
    # m.sample(10000, burn=1000, dbname=dbname, db='pickle')

    m.save(mname)

    return m
```
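To make the path issue concrete, here is a small standalone sketch (the base directory and helper name are just illustrations based on the paths in your code): each engine resolves a relative name like 'db3' against its own working directory, so building an absolute path avoids the mismatch.

```python
import os

def db_path(exp_name, chain_id, base='/home/jovyan/example'):
    # ipyparallel engines do not necessarily share the notebook's working
    # directory, so a relative name like 'db3' may be written in one place
    # and looked up in another. An absolute path is unambiguous.
    return os.path.join(base, 'df_%s_chain_%i.db' % (exp_name, chain_id))

print(db_path('Flanker', 3))  # /home/jovyan/example/df_Flanker_chain_3.db
```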

Hope the above helps.
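Once all chains finish, you can run the Gelman-Rubin check on the returned models. Here is a minimal numpy sketch of the statistic itself, just my own illustration assuming equal-length chains, not HDDM's implementation:

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for one parameter.

    chains: array of shape (n_chains, n_samples).
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return float(np.sqrt(var_hat / W))

# Four well-mixed chains from the same distribution: R-hat should be near 1.
rng = np.random.default_rng(0)
print(gelman_rubin(rng.normal(size=(4, 1000))))
```

If I remember correctly, HDDM's tutorials use a ready-made helper for this (`from kabuki.analyze import gelman_rubin`), which takes the list of fitted models directly.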

Best,
Chuan-Peng

Miriam Gade

unread,
Sep 14, 2021, 1:21:28 AM9/14/21
to hddm-...@googlegroups.com
Dear Chuan-Peng,
Thanks for your suggestions. The example runs without errors, and I will adjust the code accordingly and check whether the folders exist. I ran the code a couple of weeks ago with a simpler model and only ten subjects, got no error when going for parallel chains, and could extract all values from the models. I do not get that far now; I tried fewer subjects and simpler models, but the error message persists. I will keep you updated. Thanks for coming back to this post so quickly, regards Miriam

Miriam Gade

Sep 19, 2021, 4:50:55 AM9/19/21
to hddm-...@googlegroups.com
Hi, just to let you know: it works! Thanks for pointing me to the right issues. Best, Miriam