There is a warning generated if you call `train()` again with an effective `alpha` higher than any it has previously reached – often indicative of a mistake. We could add other warnings.
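To make the see-saw concrete, here is a minimal stdlib sketch – not gensim's actual code; `alpha_schedule` is an illustrative stand-in – of the linear learning-rate decay and the condition such a warning guards against:

```python
def alpha_schedule(alpha, min_alpha, passes):
    """Yield the effective alpha at the start of each pass, decayed
    linearly from `alpha` down toward `min_alpha` (word2vec.c-style)."""
    for p in range(passes):
        yield max(min_alpha, alpha - (alpha - min_alpha) * p / passes)

# One train() call with 5 passes decays smoothly downward:
first_run = list(alpha_schedule(0.025, 0.0001, 5))

# A second train() call with the same parameters restarts the decay at the
# full starting alpha – higher than the lowest rate the first call had
# already reached, which is the pattern a warning can flag:
second_run = list(alpha_schedule(0.025, 0.0001, 5))
assert second_run[0] > min(first_run)
```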
Some of the confusion arises from the dual paths offered by the model – "there's more than one way to do it!" – where you can either trigger all training by supplying a corpus at construction, or do the vocabulary-building and training explicitly later. In the 1-liner, all-in-initialization approach, the case for matching the word2vec.c expectations (and the common need for multiple iterations) is strongest. On the other hand, if you call `train()` yourself, you might not expect a parameter remembered from earlier initialization to have such an effect, and some example code from the era when the default was `iter=1` has led people astray.
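The two paths, and the remembered-parameter behavior, can be illustrated with a hypothetical miniature – `ToyModel` and its internals are illustrative stand-ins, not gensim itself:

```python
class ToyModel:
    """Toy stand-in mirroring the dual-path constructor pattern."""

    def __init__(self, corpus=None, iter=5, alpha=0.025, min_alpha=0.0001):
        self.iter = iter            # remembered for any later train() call
        self.alpha = alpha
        self.min_alpha = min_alpha
        self.trained_passes = 0
        if corpus is not None:      # path 1: the 1-liner does everything
            self.build_vocab(corpus)
            self.train(corpus)

    def build_vocab(self, corpus):
        self.vocab = {w for sentence in corpus for w in sentence}

    def train(self, corpus):
        # path 2 surprise: the `iter` remembered from __init__ still
        # controls how many passes this explicit call makes
        for _ in range(self.iter):
            self.trained_passes += 1

corpus = [["hello", "world"], ["foo", "bar"]]

one_liner = ToyModel(corpus)    # constructor trains all iter=5 passes at once
explicit = ToyModel(iter=3)     # no corpus: nothing trained yet
explicit.build_vocab(corpus)
explicit.train(corpus)          # silently uses the remembered iter=3
```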
Perhaps `train()` should require an explicit (non-default) `passes` parameter. In the 1-liner/initialization-trains case, it would be called internally with the `iter` value. Anyone calling it themselves would have to make an explicit choice of `passes`. And after the change, any old code without the parameter would (by design) break, forcing a change to explicit specification.
Perhaps even `alpha` and `min_alpha` should be explicit required parameters to `train()`, to ensure those calling it directly aren't unintentionally see-sawing the values with each call. Calling `train()` directly is a somewhat advanced approach, so requiring this level of choice may be appropriate.
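A sketch of what that proposed signature could look like – hypothetical, not gensim's current API; the `train` function and its decay arithmetic here are illustrative only. Making the parameters keyword-only with no defaults forces every direct caller to choose them, and old code that omits them breaks loudly rather than silently reusing remembered values:

```python
def train(corpus, *, passes, alpha, min_alpha):
    """Hypothetical train() requiring explicit passes/alpha/min_alpha."""
    # effective alpha at the start of each pass, decayed linearly
    rates = [alpha - (alpha - min_alpha) * p / passes for p in range(passes)]
    return rates

# An explicit call works:
rates = train([["a", "b"]], passes=2, alpha=0.025, min_alpha=0.0001)

# An old-style call without the now-required parameters fails by design:
try:
    train([["a", "b"]])
except TypeError:
    broke_as_designed = True
```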
- Gordon