Is there a standard for handling strings like "Albert Einstein" when retrieving vectors using pre-trained models?

Michael Simpson
Dec 13, 2019, 2:54:57 PM
to fastText library
Is there a standard way to retrieve vectors for inputs such as "New York" and "Albert Einstein" when using the pre-trained models provided by fastText (e.g. cc.en.300.bin)?

Is it standard to replace spaces with underscores or dashes when calling getVector, e.g. "new_york" or "new-york"?

Alternatively, should such entities be treated like sentences and passed to the getSentenceVector method instead?
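
To make the options concrete, here is a rough sketch of the three lookups being compared, written against the Python bindings (get_word_id, get_word_vector, get_sentence_vector). The helper name phrase_lookups is hypothetical and only for illustration. It does not say which option is standard, it only puts the candidates side by side.

```python
def phrase_lookups(model, phrase):
    """Collect candidate vectors for a multi-word phrase.

    `model` is assumed to expose the fastText Python methods
    get_word_id, get_word_vector and get_sentence_vector.
    phrase_lookups itself is a hypothetical helper, not part of fastText.
    """
    tokens = phrase.split()
    results = {}

    # Options 1 and 2: join the tokens into a single "word".
    for sep in ("_", "-"):
        joined = sep.join(tokens)
        results[joined] = {
            # get_word_id returns -1 when the token is not in the vocabulary;
            # a .bin model then builds the vector from character n-grams only.
            "in_vocab": model.get_word_id(joined) != -1,
            "vector": model.get_word_vector(joined),
        }

    # Option 3: treat the phrase as a short sentence.
    spaced = " ".join(tokens)
    results[spaced] = {
        "in_vocab": None,  # not meaningful for a sentence-level lookup
        "vector": model.get_sentence_vector(spaced),
    }
    return results


# Usage with a real model (requires the fasttext package and the model file):
#   import fasttext
#   model = fasttext.load_model("cc.en.300.bin")
#   for key, info in phrase_lookups(model, "new york").items():
#       print(key, info["in_vocab"])
```

The in_vocab flag is there because get_word_vector returns a vector for a .bin model even when the joined token was never seen in training, so a non-empty result alone does not show that "new_york" or "new-york" is a real vocabulary entry.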

I have not been able to find an authoritative answer yet.

Thanks.