Is there a way to get from a VectorSpaceModel a vector that holds the count (or, after a transform, the tf*idf score) of each term in a new or existing document, and if so, what is the most efficient way to do it? What DocumentVectorBuilder.buildVector creates is not really a document vector in that sense, but a linear (or some other) combination of the term vectors of all the terms that occur in a document.
What I would like instead is something that, for an existing document, gives me the tf*idf score of every term that occurs in that document, or that, for a new document, creates such a vector from a string of terms.
So the number of elements in the vector I want would be the total number of distinct terms in the corpus for a dense vector, and the number of distinct terms in the document for a sparse vector. The number of elements the DocumentVectorBuilder.buildVector method gives me is, if I understood correctly, the number of documents in the corpus for the dense vector and the number of documents in which any of the terms occurs for the sparse vector.
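To make concrete what kind of vector I mean, here is a small sketch in plain Python that does not use the library at all. The function names and the toy corpus are made up for illustration, and the weighting is simply count * log(N / df), which may well differ from whatever transform the library applies.

```python
import math
from collections import Counter

def idf_table(corpus):
    """Map each term to log(N / df), where df is the number of documents containing it."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))  # count each term at most once per document
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}

def sparse_tfidf(tokens, idf):
    """Sparse document vector: one entry per distinct known term in the document."""
    counts = Counter(tokens)
    # Terms never seen in the corpus have no idf and are dropped here.
    return {term: count * idf[term] for term, count in counts.items() if term in idf}

def dense_tfidf(tokens, vocabulary, idf):
    """Dense document vector: one element per distinct term in the corpus."""
    sparse = sparse_tfidf(tokens, idf)
    return [sparse.get(term, 0.0) for term in vocabulary]

# Toy corpus of already tokenized documents (hypothetical data).
corpus = [["cat", "sat", "mat"], ["cat", "ate", "fish"], ["dog", "sat"]]
idf = idf_table(corpus)
vocabulary = sorted(idf)  # fixed term order for the dense vector

# A new document given as a string of terms.
new_doc = "cat cat fish bird".split()
sparse = sparse_tfidf(new_doc, idf)
dense = dense_tfidf(new_doc, vocabulary, idf)
```

Here the dense vector has one element per distinct corpus term (six in this toy corpus), and the sparse vector has one entry per distinct term of the new document that is known to the corpus.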
Thanks and sorry if I am missing something blatantly obvious here,
Johann