Word2Vec/Doc2Vec `save()` reuses the `save()` from `gensim.utils.SaveLoad`, which has an optional `sep_limit` parameter specifying how large a numpy array must be before it is saved as a separate file. See:
and
A numpy array whose `.size` is greater than this `sep_limit` value will be stored as a separate file. You could set this value to be much larger to force everything into the single Python-pickled model file... but note that pickling breaks at some member size (I think 2GB), so larger models will need to use separate storage.
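For example, a minimal sketch of passing a larger `sep_limit` through `save()` (assuming a gensim 4.x install; the toy corpus and filenames here are just placeholders):

```python
from gensim.models import Word2Vec

# toy corpus; stands in for your real tokenized sentences
sentences = [["hello", "world"], ["gensim", "word2vec", "example"]]
model = Word2Vec(sentences, vector_size=100, min_count=1)

# default save(): arrays above the default sep_limit are written
# as separate .npy files alongside the main pickle
model.save("model_default")

# a much larger sep_limit keeps everything in the single pickled file,
# subject to pickle's own limits on very large members
model.save("model_single_file", sep_limit=2**34)
```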
The right-sized machine depends mostly on your model specifics – vocabulary size in the Word2Vec case, and, for Doc2Vec, also the count of training documents. You mainly want to be sure there's enough RAM to hold the full model (absolutely no swapping). With logging enabled, you can see estimates of the memory needs printed during the `build_vocab()` step.
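For reference, a typical way to turn on that logging before calling `build_vocab()` is standard Python INFO-level logging (the exact messages printed vary by gensim version):

```python
import logging

# INFO-level logging makes gensim print its memory-usage estimates
# during build_vocab(), along with general training progress
logging.basicConfig(
    format="%(asctime)s : %(levelname)s : %(message)s",
    level=logging.INFO,
)
```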
- Gordon