This is more of a joblib/Parallel matter than gensim-specific.
`calc_wmdistance` (`word2vec_model.wmdistance`) is an instance-bound method, which can't be pickled (to be passed to other processes). It might work for you if you:
(1) Define a module-level function to do your operation, which takes the model as a parameter, e.g.:

    def calc_wmdistance(model, doc1, doc2):
        return model.wmdistance(doc1, doc2)

(2) Pass this function to `delayed()`, and add the model as the first argument, e.g.: `delayed(calc_wmdistance)(word2vec_model, cands_descr[0], descr) ...`
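
Put together, the two steps might look roughly like the sketch below. The helper name `distances_to_candidates` and the `n_jobs=2` default are just illustrative choices, not anything from your code; `cands_descr` and `descr` are assumed to be your list of candidate documents and the single document you compare against.

```python
def calc_wmdistance(model, doc1, doc2):
    """Module-level wrapper around the model's wmdistance()."""
    return model.wmdistance(doc1, doc2)


def distances_to_candidates(model, cands_descr, descr, n_jobs=2):
    """Distance from `descr` to every candidate, computed via joblib.

    Hypothetical helper for illustration only.
    """
    # Imported inside the function so the sketch can be read (and the
    # wrapper above reused) even where joblib is not installed.
    from joblib import Parallel, delayed

    # The model travels as an ordinary argument, so joblib has to
    # serialize it for the worker processes.
    return Parallel(n_jobs=n_jobs)(
        delayed(calc_wmdistance)(model, cand, descr) for cand in cands_descr
    )
```

You'd then call something like `distances_to_candidates(word2vec_model, cands_descr, descr)`.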
While this may solve the error, you may not get the desired speedup: the model is (likely) quite large, and pickling and sending it to the child processes for each calculation (each of which then holds a full copy of the model) might dominate the runtime.
It might be better to restructure the code so that each subprocess loads the model once itself (likely also using the optional `mmap` argument to `load()`, so that the bulk of the model's data is shared between processes), and then calculates distances for an equal-sized batch of the candidate documents. That'd best minimize duplicate effort/memory.
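
A rough sketch of that restructuring, using the standard library's `multiprocessing.Pool` with an initializer rather than joblib (a substitution on my part, just because it makes the load-once step explicit). All the function names here are made up for illustration, as are `model_path` and `n_procs`. The default loader assumes the model was saved with gensim's `save()`; depending on your gensim version, `wmdistance()` may live on `model.wv` rather than on the model itself, in which case the loader should return that instead.

```python
from multiprocessing import Pool

MODEL = None  # per-process global, filled in once by init_worker()


def load_model_mmap(path):
    """Assumed loader: memory-map the saved model's big arrays read-only."""
    from gensim.models import Word2Vec  # only needed inside the workers
    return Word2Vec.load(path, mmap="r")


def init_worker(loader, path):
    """Runs once in each worker process: load the model a single time."""
    global MODEL
    MODEL = loader(path)


def batch_distances(args):
    """Compute distances from `descr` to every candidate in one batch."""
    batch, descr = args
    return [MODEL.wmdistance(cand, descr) for cand in batch]


def split_batches(items, n):
    """Split items into n contiguous batches of near-equal size."""
    k, r = divmod(len(items), n)
    batches, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        batches.append(items[start:end])
        start = end
    return batches


def parallel_distances(model_path, cands_descr, descr, n_procs=2,
                       loader=load_model_mmap):
    """One batch per process; only the (small) batches get pickled."""
    batches = split_batches(cands_descr, n_procs)
    with Pool(n_procs, initializer=init_worker,
              initargs=(loader, model_path)) as pool:
        results = pool.map(batch_distances, [(b, descr) for b in batches])
    # Flatten back into one list, in the original candidate order.
    return [d for batch in results for d in batch]
```

Only the file path and the document batches cross the process boundary here, never the model. On platforms that start workers by spawning a fresh interpreter (e.g. Windows), the call to `parallel_distances(...)` needs to sit under an `if __name__ == "__main__":` guard.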
- Gordon