Hi Arthur,
wow, this is a lot of models ... Just curious, how long does it take to run? Sorry to hear that you eventually ran into the memory error. I have actually never seen that before, so I don't know whether it is
a) your system running out of memory (this could also be a Windows-specific thing; on Linux/macOS, the OS would start swapping when you run out of main memory), or
b) something to do with Python's size limit for dictionaries plus the deepcopy call itself.
In any case, the first thing I would do after fitting is to dump the contents of efs.subsets_ to a JSON or YAML file. In case of another crash, you can at least read the results back from the file for the analysis, so that you don't have to rerun everything.
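For JSON, a minimal sketch could look like the following (this assumes the cv_scores entries are numpy arrays; to_serializable is just a hypothetical helper name I made up for the converter):

import json
import numpy as np

def to_serializable(obj):
    # json.dump calls this for anything it can't handle natively,
    # e.g., the numpy arrays/scalars stored in efs.subsets_
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, (np.integer, np.floating)):
        return obj.item()
    raise TypeError('Cannot serialize object of type %s' % type(obj))

with open('subsets.json', 'w') as outfile:
    json.dump(efs.subsets_, outfile, default=to_serializable, indent=2)

(Note that JSON will turn the integer keys into strings and the feature_idx tuples into lists, which should be fine for a post-crash analysis.)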
Other than that, the get_metric_dict code is pretty simple; you could try running it directly. For reference, this is essentially what it does internally:
# make a full copy of the fitted subsets dict, then attach the statistics to the copy
fdict = deepcopy(self.subsets_)
for k in fdict:
    std_dev = np.std(self.subsets_[k]['cv_scores'])
    bound, std_err = self._calc_confidence(
        self.subsets_[k]['cv_scores'],
        confidence=confidence_interval)
    fdict[k]['ci_bound'] = bound
    fdict[k]['std_dev'] = std_dev
    fdict[k]['std_err'] = std_err
I.e., you can remove the deepcopy call and modify the subsets_ dictionary in place. Run outside the class, with your fitted efs object in place of self, that would be:
import numpy as np

for k in efs.subsets_:
    std_dev = np.std(efs.subsets_[k]['cv_scores'])
    # 0.95 is the default confidence_interval in get_metric_dict
    bound, std_err = efs._calc_confidence(
        efs.subsets_[k]['cv_scores'],
        confidence=0.95)
    efs.subsets_[k]['ci_bound'] = bound
    efs.subsets_[k]['std_dev'] = std_dev
    efs.subsets_[k]['std_err'] = std_err
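The in-place version never holds two full copies of subsets_ in memory at the same time, so it should cut the peak memory usage roughly in half compared to the deepcopy route (at the cost of mutating the fitted object).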
I would then save the contents to YAML, for example:
import yaml

with open('subsets.yml', 'w') as outfile:
    yaml.dump(efs.subsets_, outfile, default_flow_style=False)
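One caveat (again assuming the cv_scores entries are numpy arrays): PyYAML's default Dumper typically cannot represent numpy objects, so it may be safer to convert them to plain lists first, e.g.:

import yaml

# convert numpy values (e.g., cv_scores) to plain lists/scalars first;
# anything without a .tolist() method (floats, tuples, ...) is kept as-is
serializable = {
    k: {name: (val.tolist() if hasattr(val, 'tolist') else val)
        for name, val in subset.items()}
    for k, subset in efs.subsets_.items()
}

with open('subsets.yml', 'w') as outfile:
    yaml.dump(serializable, outfile, default_flow_style=False)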
Best,
Sebastian