Saving and loading models


Akos Szekely

Jun 7, 2016, 5:52:41 PM
to hddm-users
Hello all,

I'm trying to construct a model, then estimate and sample it, for a dataset of 124800 values, so saving and loading sampled models is essential.  Unfortunately, neither is cooperating.  I finally got this to work for saving:
self.model.sample(10000,burn=1000,thin=5,db='pickle', dbname=os.path.join(self.pth,'model_'+self.File_end+'.db'))
self.model.save(os.path.join(self.pth,'model_'+self.File_end+'.db'))

but neither of these two options works for loading:
with open(os.path.join(self.pth,'model_'+self.File_end+'.db'),'r') as f:
    self.model=pickle.load(f)
AttributeError: 'dict' object has no attribute 'gen_stats'

self.model=hddm.load(os.path.join(self.pth,'model_'+self.File_end+'.db'))
Traceback (most recent call last):
  File "C:\Users\mohantylab\Experiments_and_Data\Projects\MEmoS\MEmoS\HDDM_MEmoS.py", line 219, in Model
    self.model=hddm.load(os.path.join(self.pth,'model_'+self.File_end+'.db'))
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\site-packages\kabuki\utils.py", line 24, in load
    pickle.load(f)
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\pickle.py", line 1217, in load_build
    setstate(state)
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\site-packages\hddm\models\base.py", line 699, in __setstate__
    super(HDDMBase, self).__setstate__(d)
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\site-packages\kabuki\hierarchical.py", line 396, in __setstate__
    self.load_db(d['dbname'], db=d['db'])
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\site-packages\kabuki\hierarchical.py", line 799, in load_db
    db = db_loader(dbname)
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\site-packages\pymc\database\pickle.py", line 83, in load
    container = std_pickle.load(file)
ImportError: No module named copy_reg


The first error (the AttributeError) is raised when self.model.gen_stats() is run on the loaded object.
I also tried to patch the first variant using a "with" statement and hit the maximum recursion depth.
Any suggestions?

Best,
Akos Szekely
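
One thing that stands out in the saving snippet above is that sample() and save() are given the same path, so the pickled model may overwrite the trace database. The second traceback shows __setstate__ calling load_db(d['dbname'], db=d['db']), which means that loading the model reopens whatever file dbname points to. Below is a minimal sketch of keeping the two files apart and opening pickles in binary mode. It uses only the standard library; model_paths, save_pickle, and load_pickle are hypothetical helpers, and a plain dict stands in for the model. Whether this resolves the errors above is an assumption, not something confirmed in this thread.

```python
import os
import pickle
import tempfile


def model_paths(folder, file_end):
    """Return two distinct paths: one for the traces, one for the pickled model.

    Hypothetical helper; the naming scheme is only an illustration.
    """
    traces = os.path.join(folder, 'model_' + file_end + '_traces.db')
    model = os.path.join(folder, 'model_' + file_end + '.pickle')
    return traces, model


def save_pickle(obj, path):
    """Write a pickle in binary mode ('wb'), which matters on Windows."""
    with open(path, 'wb') as f:
        pickle.dump(obj, f)


def load_pickle(path):
    """Read a pickle in binary mode ('rb'), matching how it was written."""
    with open(path, 'rb') as f:
        return pickle.load(f)


folder = tempfile.mkdtemp()
traces_path, model_path = model_paths(folder, 'demo')

# A dict stands in for the model; it records where its traces live,
# much like the d['dbname'] / d['db'] entries seen in the traceback.
stand_in_model = {'dbname': traces_path, 'db': 'pickle'}
save_pickle(stand_in_model, model_path)
restored = load_pickle(model_path)
```

With HDDM itself, the same idea would be to pass traces_path as dbname to sample(), pass model_path to save(), and later call hddm.load(model_path), so that the file named by dbname still holds the traces when the model is unpickled.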

Thomas Wiecki

Jun 8, 2016, 3:57:47 AM
to hddm-...@googlegroups.com
Hi Akos,

Have you tried model.load_db()?

Best,
Thomas


Akos Szekely

Jun 8, 2016, 12:59:31 PM
to hddm-users
Hello Thomas,

Thanks for your quick reply.
I just tried that: I created the model in the usual way, then called load_db() on it, and got the following. I tried it with the .db file produced by sampling, the .db file produced by model.save, and the .pickle file produced by model.save:

Traceback (most recent call last):
  File "C:\Users\mohantylab\Experiments_and_Data\Projects\MEmoS\MEmoS\HDDM_MEmoS.py", line 225, in Model
    self.model.load_db(os.path.join(self.pth,'model_'+self.File_end+'_2.pickle'))

  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\site-packages\kabuki\hierarchical.py", line 799, in load_db
    db = db_loader(dbname)
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\site-packages\pymc\database\sqlite.py", line 238, in load
    db = Database(dbname)
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\site-packages\pymc\database\sqlite.py", line 189, in __init__
    existing_tables = get_table_list(self.cur)
  File "C:\Users\mohantylab\WinPython-32bit-2.7.10.1\python-2.7.10\lib\site-packages\pymc\database\sqlite.py", line 266, in get_table_list
    ORDER BY name""")
DatabaseError: file is encrypted or is not a database


Best,
Akos Szekely
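
The traceback above passes through pymc\database\sqlite.py, which suggests the file was handed to an SQLite loader even though the traces were sampled with db='pickle'. Below is a minimal standard-library sketch of how to check which kind of file a path actually holds, and of how SQLite reacts to a pickle file. The helper names is_sqlite_file and sqlite_error_for are hypothetical, and the demo files are stand-ins rather than real HDDM output.

```python
import os
import pickle
import sqlite3
import tempfile

# Every SQLite 3 database file begins with this 16-byte header.
SQLITE_MAGIC = b'SQLite format 3\x00'


def is_sqlite_file(path):
    """Return True if the file starts with the SQLite 3 header."""
    with open(path, 'rb') as f:
        return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


def sqlite_error_for(path):
    """List tables the way an SQLite loader would; return the error text, or None."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
        return None
    except sqlite3.DatabaseError as exc:
        return str(exc)
    finally:
        conn.close()


folder = tempfile.mkdtemp()

# A genuine SQLite database (one table, so the header gets written to disk).
sqlite_path = os.path.join(folder, 'traces_sqlite.db')
conn = sqlite3.connect(sqlite_path)
conn.execute('CREATE TABLE demo (x INTEGER)')
conn.commit()
conn.close()

# A pickle file that merely carries a .db extension.
pickle_path = os.path.join(folder, 'traces_pickle.db')
with open(pickle_path, 'wb') as f:
    pickle.dump({'trace': list(range(1000))}, f)

print(is_sqlite_file(sqlite_path), sqlite_error_for(sqlite_path))
print(is_sqlite_file(pickle_path), sqlite_error_for(pickle_path))
```

If that is what is happening here, it may be worth naming the backend explicitly, in the keyword form visible in kabuki's own call in the earlier traceback, load_db(d['dbname'], db=d['db']), with db='pickle' and the trace file written during sampling rather than the file written by model.save. This is an untested assumption.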

Thomas Wiecki

Jun 11, 2016, 5:57:50 AM
to hddm-...@googlegroups.com
Are you sure you saved it with the same Python and HDDM versions that you're using to load it?

Akos Szekely

Jun 13, 2016, 12:13:39 PM
to hddm-users
Hello Thomas,

Yes, it's all with the same Python and HDDM versions.  Here is my code:
        self.data=hddm.load_csv(os.path.join(self.pth,'Master.csv'))
        self.LogFile.write('loaded master data\n')
        try:
            self.LogFile.write('trying to load model...\n')
            print('trying to load model...')
#            self.model=hddm.load(os.path.join(self.pth,'model_'+self.File_end+'.db'))
            if self.Info['Info']=='False':
                self.model=hddm.HDDM(self.data,depends_on=self.dependancies,p_outlier=0.05,include=RemovableKeys,informative=False,trace_subjs=False)
            elif self.Info['Info']!='False':
                self.model=hddm.HDDM(self.data,depends_on=self.dependancies,p_outlier=0.05,include=RemovableKeys,informative=True,trace_subjs=False)

            self.model.load_db(os.path.join(self.pth,'model_'+self.File_end+'_2.pickle'))
#            with open(os.path.join(self.pth,'model_'+self.File_end+'.pickle'),'r') as f:
#                self.model=pickle.load(f)
#            try:
#                self.model.gen_stats()
#            except:
#                print traceback.format_exc()
            try:
                self.model.gen_stats()
            except:
                print traceback.format_exc()
            self.LogFile.write('got model\n')
            print('got model')
        except:
            self.LogFile.write('%s\n' %(traceback.format_exc()))
            print('%s\n' %(traceback.format_exc()))
            self.LogFile.write('new model\n')
            print('new model')
            if self.Info['Info']=='False':
                self.model=hddm.HDDM(self.data,depends_on=self.dependancies,p_outlier=0.05,include=RemovableKeys,informative=False,trace_subjs=False)
            elif self.Info['Info']!='False':
                self.model=hddm.HDDM(self.data,depends_on=self.dependancies,p_outlier=0.05,include=RemovableKeys,informative=True,trace_subjs=False)
            self.LogFile.write('Made the model\n')
            print('Made the model')
#            self.model.find_starting_values()
#            print 'got starting vals',timer.getTime()
            if self.Info['Debug?'] in ['no','No']:
                self.model.sample(10000,burn=1000,thin=5,db='pickle', dbname=os.path.join(self.pth,'model_'+self.File_end+'.db')) #was 5000, need reference
            else:
                self.model.sample(200,burn=10,db='pickle', dbname=os.path.join(self.pth,'model_'+self.File_end+'.db'))
            self.LogFile.write('Sampled model fully\n')
            print('Sampled model fully')
            try:
                self.LogFile.write('Saving model to pickle...\n')
                print('Saving model to pickle...')
                self.model.save(os.path.join(self.pth,'model_'+self.File_end+'_2.pickle'))
#                self.model.db.commit()
#                with open(os.path.join(self.pth,'model_'+self.File_end+'.pickle'),'wb') as f:
#                    pickle.dump(self.model,f)
                self.LogFile.write('Saving complete!\n')
                print('Saving complete!')
            except:
                self.LogFile.write('%s\n' %(traceback.format_exc()))
                print('%s' %(traceback.format_exc()))
        self.s=self.model.gen_stats()
        self.LogFile.write('Generated stats\n')
        print('Generated stats')
        self.s.to_csv(os.path.join(self.pth,'Everything_'+self.File_end+'.csv'))
        self.LogFile.write('Wrote stats to file\n')
        print('Wrote stats to file')
        self.DIC=self.model.dic
        self.LogFile.write('Obtained DIC\n')
        print('Obtained DIC')

Note that this is all part of a method within a class, hence the use of self.  It may not run as-is, since it depends on several other parts of the code, but pared down a bit and given a random data set it will run.

Best,
Akos Szekely