numberbatch-en-17.04.txt.gz or some modified form?

I'm traveling and can't get you the exact details right now, but:
The example and paper you're referring to are from the 16.04 version (April 2016). There should be pointers on how to reproduce that version in the conceptnet-numberbatch Git repository.
However, you might as well keep using 17.04. That one example (Cumberbatch to actor) broke, but it was a silly example, and this data was never focused on representing facts about specific people. The vectors perform even better on all evaluations now.
The paper you're referring to was our first attempt to publish; two later papers were published.
The similarity metric is cosine similarity - the dot product of normalized vectors.
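To make that concrete, here is a minimal sketch of cosine similarity in plain Python. The function name `cosine_similarity` and the zero-vector handling are my own choices for illustration, not something taken from the conceptnet5 code:

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity: the dot product of the two vectors after
    each one is scaled to unit length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")

    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))

    # Assumption for this sketch: a zero vector has no direction,
    # so we report a similarity of 0.0 instead of dividing by zero.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0

    return sum((a / norm_u) * (b / norm_v) for a, b in zip(u, v))
```

If the rows of your matrix are already normalized to unit length, the division is unnecessary and a plain dot product gives the same number.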
The current conceptnet5 repository has a script to reproduce our most recently published results. You need the conceptnet5 code, not just the vectors, because you need to be able to produce vectors for terms that are in the vocabulary of ConceptNet but don't correspond to a row in the matrix. (We could have included all these additional vectors, but it's an unreasonable waste of RAM when they can just be inferred.)
By the way: I just realized from someone else's report that the en-17.04 vectors were corrupted. That would explain the problem. You should go to the site to download en-17.04b.