Ultimately, you'll have to start adding some human-judged, gold-standard topic labels to existing articles, both to have a training set and to be able to evaluate any other ad hoc methods you devise.
Using either the inferred vector for ['machine', 'learning'], or (if you have compatible co-trained word-vectors) the average of wv('machine') & wv('learning'), as a sort of bootstrapped starting-point for the topic 'machine learning' might be better than nothing. But, such tiny-phrase vectors are likely to be idiosyncratic, perhaps not matching human senses of the "centroid" for that topic. You'd probably want the anchor for the topic to move from its initial point, or grow to include multiple points, as you actually confirm/reject, with human expertise, that certain docs truly fit under that topic.
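For example, a minimal sketch of both starting-point options, not a definitive recipe, assuming a gensim 4.x Doc2Vec model already trained with dbow_words=1 or a PV-DM mode (so word-vectors and doc-vectors share the same space), and that model names your trained model:

import numpy as np

topic_tokens = ['machine', 'learning']

# Option 1: treat the tiny phrase as a pseudo-document & infer a doc-vector
anchor_inferred = model.infer_vector(topic_tokens, epochs=50)

# Option 2: average the co-trained word-vectors for the phrase's words
anchor_averaged = np.mean([model.wv[w] for w in topic_tokens if w in model.wv], axis=0)

# either anchor can then be ranked against all known doc-vectors
candidate_docs = model.dv.most_similar([anchor_inferred], topn=10)

Either way, treat the result as a rough first anchor to be refined with human review, not a final definition of the topic.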
To the extent that you have predetermined topics, is the sole description of those topics their short text names? Or do you have (or could you find/create) groups of docs that are definitely representative of those topics? EG's suggestion of a web search for those topic-names would help leverage the search engine's encoded understanding of each topic to flesh out the training set. For your domain, and if the topics come from something already established, perhaps there are even better sets of representative docs for each topic. (You could mix these 'reference docs' into your corpus during training, or just infer doc-vecs for them post-training.) Even something like the text of the Wikipedia article 'Machine Learning', as a whole or by section, might, when fed to your model, give a better point (or points) than the tiny phrase 'Machine Learning' itself.
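As a concrete, purely illustrative sketch of the infer-post-training option (wikipedia_machine_learning.txt is a hypothetical local copy of the reference text, and the tokenization should mirror whatever you used on your training corpus):

from gensim.utils import simple_preprocess

reference_text = open('wikipedia_machine_learning.txt').read()  # hypothetical local copy
reference_tokens = simple_preprocess(reference_text)

# one anchor point for the whole article...
topic_anchor = model.infer_vector(reference_tokens, epochs=50)

# ...or one anchor per section (crudely split on blank lines here),
# if you want multiple points per topic
section_anchors = [model.infer_vector(simple_preprocess(section), epochs=50)
                   for section in reference_text.split('\n\n')]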
You may want to review a follow-up paper about the Paragraph Vector algorithm (aka gensim's Doc2Vec) that applied it to both Wikipedia articles and Arxiv papers:
"Document Embedding with Paragraph Vector"
They used the existing (human-assigned) categories of those corpora to auto-score the quality of the doc-vectors under different metaparameters, based on how often pairs of docs known to share a category had doc-vecs closer to each other than to some 3rd, randomly-picked document.
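A rough sketch of that kind of triplet scoring, assuming labels is a dict mapping doc-tags to their known category and model is your trained Doc2Vec (this is my paraphrase of the paper's idea, not their exact code):

import random
import numpy as np
from collections import defaultdict

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def triplet_score(model, labels, n_trials=10000):
    by_cat = defaultdict(list)
    for tag, cat in labels.items():
        by_cat[cat].append(tag)
    usable_cats = [c for c, tags in by_cat.items() if len(tags) >= 2]
    all_tags = list(labels)
    wins = 0
    for _ in range(n_trials):
        cat = random.choice(usable_cats)
        a, b = random.sample(by_cat[cat], 2)   # two docs known to share a category
        c = random.choice(all_tags)            # a random 3rd doc...
        while labels[c] == cat:                # ...from some other category
            c = random.choice(all_tags)
        if cosine(model.dv[a], model.dv[b]) > cosine(model.dv[a], model.dv[c]):
            wins += 1
    return wins / n_trials  # higher = same-topic docs more reliably closer

Higher scores under one set of metaparameters than another suggest doc-vectors that better respect the known topical groupings.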
Perhaps there's other structure in your docs (subheads, footnotes, capitalized phrases) that could serve as an extra hint about which docs "oughtta be" close to each other, and thus help metaoptimize the Doc2Vec model for topical purposes.
And if you can generate a seed set of "known points and their strongly-associated topics", perhaps from reference docs borrowed/hand-selected from elsewhere, a K-Nearest-Neighbors report of predicted topics for unlabeled docs could work pretty well. And where it doesn't, each time you hand-correct the labeled topics for a doc/doc-vector, the predictions would improve for all other docs. That is, use an iterative process: once you've got at least one 'seed point' for every topic, check all your docs' distances from all seed points. Manually review the document with the "most confused" topics, giving it definitive labels. Then repeat. (And, for all hand-labeled docs, constantly re-score whether they'd be properly classified, by their nearest known neighbors, if their own labels were ignored.)
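A sketch of that loop, using scikit-learn's KNeighborsClassifier over the Doc2Vec doc-vectors; seed_vecs/seed_labels (the hand-labeled seed points) and unlabeled (a dict of doc-tag to doc-vector) are illustrative names of my own, not any established API, and it assumes at least two distinct topics have seed points:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def predict_and_rank(seed_vecs, seed_labels, unlabeled, k=3):
    knn = KNeighborsClassifier(n_neighbors=k, metric='cosine')
    knn.fit(np.vstack(seed_vecs), seed_labels)

    tags = list(unlabeled)
    vecs = np.vstack([unlabeled[t] for t in tags])
    probs = knn.predict_proba(vecs)                # per-topic vote shares
    preds = knn.classes_[probs.argmax(axis=1)]     # predicted topic per doc

    # "most confused" = smallest margin between the top two topics' vote shares;
    # those docs are the best candidates for the next round of manual labeling
    top_two = np.sort(probs, axis=1)[:, -2:]
    margins = top_two[:, 1] - top_two[:, 0]
    review_order = [tags[i] for i in np.argsort(margins)]
    return dict(zip(tags, preds)), review_order

The same classifier, re-fit with each hand-labeled doc held out in turn, also gives the leave-one-out re-scoring of the labeled docs mentioned above.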
- Gordon