HDDM 0.8 on Windows 10 Docker image, unable to load .csv (file not found error)


Miriam Gade

Aug 16, 2021, 11:18:07 AM
to hddm-users
Hi,
I'm trying to load a .csv file (Unix line endings, comma separated) in the hcp4715/hddm_docker container. I worked through the example code (with the example data set) and everything worked out well. I can access my file with the following code:

import csv
with open('SimonHDDMKomma.csv','r') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

which gives:
['subj_idx', 'rt', 'response', 'block', 'congruency']
['695', '0.716', '1', '1', '0']
['695', '0.774', '1', '1', '0']
['695', '0.555', '1', '1', '0']
['695', '0.595', '1', '1', '0']
['695', '0.621', '1', '1', '0']
['695', '0.672', '1', '1', '0']
['695', '0.426', '1', '1', '0']
['695', '0.518', '1', '1', '0']
['695', '0.574', '1', '1', '0']
['695', '0.597', '1', '1', '0']
['695', '0.466', '1', '1', '0']
['695', '0.436', '1', '1', '0']
['695', '0.566', '1', '1', '0']
['695', '0.564', '1', '1', '0']
['695', '0.562', '1', '1', '0']
['695', '0.4', '1', '1', '0']

However, I consistently get error messages when I run the following parallel code on my data set:

import time
from ipyparallel import Client

v = Client()[:]  # direct view on all running engines

start_time = time.time()  # the start time of the processing
jobs = v.map(run_model, range(4))  # 4 is the number of CPUs; run_model is defined in an earlier cell
models = jobs.get()

print("\nRunning 4 chains used: %f seconds." % (time.time() - start_time))


Error message:
FileNotFoundError                         Traceback (most recent call last)
<string> in <module>

/opt/conda/lib/python3.7/site-packages/ipyparallel/client/remotefunction.py in <lambda>(f, *sequences)
    141
    142         if sys.version_info[0] >= 3:
--> 143             _map = lambda f, *sequences: list(map(f, *sequences))
    144         else:
    145             _map = map

<ipython-input-22-ee259895968a> in run_model(id)

/opt/conda/lib/python3.7/site-packages/kabuki/utils.py in load_csv(*args, **kwargs)
    132     :SeeAlso: save_csv, pandas.read_csv()
    133     """
--> 134     return pd.read_csv(*args, **kwargs)
    135
    136

/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
    674         )
    675
--> 676     return _read(filepath_or_buffer, kwds)
    677
    678 parser_f.__name__ = name

/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    446
    447     # Create the parser.
--> 448     parser = TextFileReader(fp_or_buf, **kwds)
    449
    450     if chunksize or iterator:

/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
    878             self.options["has_index_names"] = kwds["has_index_names"]
    879
--> 880         self._make_engine(self.engine)
    881
    882     def close(self):

/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
   1112     def _make_engine(self, engine="c"):
   1113         if engine == "c":
-> 1114             self._engine = CParserWrapper(self.f, **self.options)
   1115         else:
   1116             if engine == "python":

/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
   1889         kwds["usecols"] = self.usecols
   1890
-> 1891         self._reader = parsers.TextReader(src, **kwds)
   1892         self.unnamed_cols = self._reader.unnamed_cols
   1893

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()

FileNotFoundError: [Errno 2] File SimonHDDMKomma.csv does not exist: 'SimonHDDMKomma.csv'

The single-chain model works fine with this data set. Any help or hint would be really appreciated - best Miriam

hcp...@gmail.com

Aug 16, 2021, 10:23:20 PM
to hddm-users
Hi, Miriam,

I've run into this issue when using `ipyparallel` too.
One solution is to use full (absolute) paths instead of relative paths when defining `run_model`. For example, in the example notebook here (https://github.com/hcp4715/hddm_docker/blob/master/example/HDDM_official_tutorial_reproduced.ipynb), I defined `run_model` like this:

```
def run_model(id):
    print('running model%i' % id)
    import hddm
    exp_name = 'cavanagh'
    model_tag = 'm1'

    #### USE absolute paths in Docker.
    # define the database name, which uses the pickle format
    dbname = '/home/jovyan/example/df_' + exp_name + '_' + model_tag + '_chain_%i.db' % id
    # define the name for the model
    mname = '/home/jovyan/example/df_' + exp_name + '_' + model_tag + '_chain_%i' % id
    fname = '/opt/conda/lib/python3.7/site-packages/hddm/examples/cavanagh_theta_nn.csv'
    data = hddm.load_csv(fname)

    m2 = hddm.HDDM(data)
    m2.find_starting_values()
    m2.sample(5000, burn=20, dbname=dbname, db='pickle')  # it's necessary to save the model data (traces) to a database
    m2.save(mname)

    return m2
```

As you can see from above, all paths are full paths.
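
For your own data, a minimal sketch might look like the following (the `/home/jovyan/work/` location and the file names below are hypothetical - replace them with wherever SimonHDDMKomma.csv actually lives inside the container):

```
def run_model(id):
    import hddm
    # Hypothetical absolute paths - adjust to your own setup inside the container.
    fname = '/home/jovyan/work/SimonHDDMKomma.csv'
    dbname = '/home/jovyan/work/simon_chain_%i.db' % id
    mname = '/home/jovyan/work/simon_chain_%i' % id

    data = hddm.load_csv(fname)   # load the data from the absolute path
    m = hddm.HDDM(data)
    m.find_starting_values()
    m.sample(5000, burn=20, dbname=dbname, db='pickle')  # save the traces
    m.save(mname)
    return m
```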

PS: this is a bit odd, and I am trying other ways of doing the parallel processing. I will update the example notebook once I am satisfied with a new solution.
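
For what it's worth, if you want to see why the relative path fails, one quick check (a sketch, assuming the engines are already running) is to compare the working directory of the notebook kernel with that of each engine - the `ipyparallel` engines are separate processes and may resolve relative paths against a different directory:

```
import os
from ipyparallel import Client

v = Client()[:]
print(os.getcwd())               # working directory of the notebook kernel
print(v.apply_sync(os.getcwd))   # working directory of each engine
```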

Hope it helps.

Best,
Chuan-Peng

Miriam Gade

Aug 17, 2021, 8:42:12 AM
to hddm-users
Great, thanks so much - it now works smoothly - have a nice day, best Miriam