Hmm... spoke too soon. When I tried to modify the ESA example to work from my Mallet embeddings file, I discovered that the path you pass to the VectorIndexReader is actually a directory containing .jdb files for a DB that indexes the word embedding vectors. But all I am able to generate is a single ASCII file containing the vectors, one vector per line.
After much snooping around, I think I found a bunch of classes that I can put together to write a small app for converting the ASCII vector file to a BerkeleyDB dump. But I get the feeling that this app must already exist and I just haven't found it.
Basically, what I have in mind is to write an app that:
* Creates a VectorIndexWriter
* Reads each line of the ASCII file
* Creates a SparseVector object from that line and uses the VectorIndexWriter's put() method to add the vector to the index
Does that sound about right? And also, does something like that already exist? I don't want to go re-inventing the wheel.
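To make the plan concrete, here is a rough sketch of the reading and parsing half in plain Java. It rests on assumptions, not on the real API: I am guessing that each line looks like "term v1 v2 ... vn" separated by whitespace, and VectorSink is a hypothetical stand-in for wherever the VectorIndexWriter's put() call would go (I have not checked its actual signature or how a SparseVector is constructed).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

class AsciiVectorConverter {

    /** Hypothetical stand-in for the index writer; NOT the library's API. */
    public interface VectorSink {
        void put(String term, double[] vector) throws IOException;
    }

    /** Parses the numeric fields of an assumed "term v1 v2 ... vn" line. */
    static double[] parseValues(String[] fields) {
        double[] values = new double[fields.length - 1];
        for (int i = 1; i < fields.length; i++) {
            values[i - 1] = Double.parseDouble(fields[i]);
        }
        return values;
    }

    /** Reads one vector per line and hands each to the sink; returns the count. */
    public static int convert(BufferedReader in, VectorSink sink) throws IOException {
        int count = 0;
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split("\\s+");
            if (fields.length < 2) {
                throw new IOException("No vector values on line: " + line);
            }
            // In the real app this is where the SparseVector would be built
            // and passed to the VectorIndexWriter (assumption, not verified).
            sink.put(fields[0], parseValues(fields));
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory map plays the role of the index for this demo.
        Map<String, double[]> index = new LinkedHashMap<>();
        String sample = "cat 0.5 0.0 1.5\ndog 1.0 2.0 3.0\n";
        int n = convert(new BufferedReader(new StringReader(sample)), index::put);
        System.out.println(n + " vectors, cat has " + index.get("cat").length + " dimensions");
    }
}
```

Keeping the writer behind a small interface means the parsing can be tested without a DB, and only the sink needs to change once the real put() signature is confirmed.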