myVec = model.infer_vector(doc_words, alpha=0.1, min_alpha=0.0001, steps=5)

infer_vector infers a vector for a given document after bulk training. The document should be a list of (word) tokens.
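For example, a raw string can be turned into the expected token list with a minimal lowercase-and-split step (a sketch only; in practice the preprocessing should match whatever was used when the model was trained, e.g. gensim's simple_preprocess):

```python
raw = "Machine learning makes document inference easy."

# Minimal tokenization: lowercase, split on whitespace, strip punctuation.
# Real code should mirror the preprocessing used during training.
doc_words = [w.strip(".,!?") for w in raw.lower().split()]
print(doc_words)
```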
Then you can look up the most similar docvecs; note that the inferred vector should be passed inside a list:

d2v_model.docvecs.most_similar(positive=[myVec])
most_similar(positive=[], negative=[], topn=10, clip_start=0, clip_end=None)

Find the top-N most similar docvecs known from training. Positive docs contribute positively towards the similarity, negative docs negatively.
This method computes cosine similarity between a simple mean of the projection weight vectors of the given docs and the vectors of the trained docvecs. Docs may be specified as vectors, as integer indexes of trained docvecs, or, if the documents were originally presented with string tags, by their corresponding tags.
The ‘clip_start’ and ‘clip_end’ parameters allow limiting results to a particular contiguous range of the underlying doctag_syn0norm vectors. (This may be useful if the ordering there was chosen to be significant, such as more popular tag IDs in lower indexes.)
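To make the ranking mechanics concrete, here is an illustrative pure-Python sketch of the behavior the docstring describes: combine the positive vectors (weight +1) and negative vectors (weight -1) into a mean query vector, then rank stored docvecs by cosine similarity. The function and tag names here are hypothetical, and gensim's real implementation works on pre-normalized doctag_syn0norm arrays instead:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def most_similar(docvecs, positive=(), negative=(), topn=10):
    # Mean of the given vectors: positives weighted +1, negatives -1.
    dim = len(next(iter(docvecs.values())))
    query = [0.0] * dim
    weighted = [(v, 1.0) for v in positive] + [(v, -1.0) for v in negative]
    for vec, weight in weighted:
        query = [q + weight * x for q, x in zip(query, vec)]
    query = [q / len(weighted) for q in query]
    # Rank every stored docvec by cosine similarity to the query.
    sims = [(tag, cosine(query, vec)) for tag, vec in docvecs.items()]
    return sorted(sims, key=lambda t: t[1], reverse=True)[:topn]

# Toy 2-D "docvecs" keyed by string tags (hypothetical data).
docvecs = {"doc0": [1.0, 0.0], "doc1": [0.9, 0.1], "doc2": [0.0, 1.0]}
print(most_similar(docvecs, positive=[[1.0, 0.05]], topn=2))
```

Running this returns doc0 and doc1 first, since their directions are closest to the query vector; doc2 is nearly orthogonal and ranks last.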