Negative Score Values from CompareTerms

Michael Ruepp

Aug 7, 2015, 7:43:55 AM
to Semantic Vectors
Hi, I also get a lot of negative values from the CompareTerms algorithm, which somehow interferes with the compareTo method of SearchResult.

I have a large ArrayList of SearchResult objects from a CompareTerms run, some of them with negative scores, but I am not able to sort it with Collections.sort(result).

The code just jumps to the end...

Why are the results negative?

Regards,

Michael

Dominic Widdows

Aug 7, 2015, 11:57:43 AM
to semanti...@googlegroups.com
Hi Michael,

Negative cosine similarities occur whenever the angle between two vectors is more than a right angle. Algebraically, with random projection they occur because coordinates are initialized to a mixture of 0, +1 and -1; with singular value decomposition, they occur when the projection onto one of the principal axes gives a negative component. (With purely positive-valued factorization techniques, they don't occur.)
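
To make the point above concrete, here is a minimal sketch in plain Java. It is not code from the Semantic Vectors package: the class name `CosineSignDemo` and the example vectors are made up for illustration, and the vectors merely imitate the 0, +1, -1 coordinates mentioned above.

```java
public class CosineSignDemo {

    /**
     * Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|).
     * Returns NaN if either vector has zero length (all coordinates 0).
     */
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical ternary vectors with coordinates drawn from {0, +1, -1}.
        double[] u = {1, 0, -1, 0};
        double[] v = {-1, 0, 1, 1};
        double[] w = {1, 1, 0, 0};

        // u and v point in roughly opposite directions: the score is negative.
        System.out.println("cosine(u, v) = " + cosine(u, v));
        // u and w share one positive coordinate: the score is positive.
        System.out.println("cosine(u, w) = " + cosine(u, w));
    }
}
```

The first pair gives a score of roughly -0.82 and the second roughly 0.5, so a negative value simply means the two vectors are more than 90 degrees apart.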

Best wishes,
Dominic

Michael Ruepp

Aug 7, 2015, 12:07:51 PM
to semanti...@googlegroups.com
Thanks Dominic,

I thought the score returned by the runCompareTerms method was always a value between 0 and 1, representing similarity as something like a percentage?

But is it correct to assume that the higher the value, the more similar the two vectors?

So a score of 0.9 means more similar than a score of -0.5, right?

Regards,

Dominic Widdows

Aug 7, 2015, 12:20:15 PM
to semanti...@googlegroups.com
Yes, higher scores give greater similarities.

People frequently assume that similarities lie between 0 and 1, which of course is not how the cosine function / scalar product behaves in general (cosine ranges from -1 to 1), but often this mathematical difference only becomes apparent when we see the results in practice.
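
Since higher scores mean greater similarity regardless of sign, a result list can be ranked by comparing the raw scores directly. The following is only a sketch under stated assumptions: `ScoredTerm` is a hypothetical stand-in class, not the package's own SearchResult (whose compareTo implementation is not shown in this thread), and the terms and scores are invented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ScoreSortDemo {

    /** Hypothetical scored result; NOT the Semantic Vectors SearchResult class. */
    public static final class ScoredTerm {
        public final String term;
        public final double score;

        public ScoredTerm(String term, double score) {
            this.term = term;
            this.score = score;
        }
    }

    /**
     * Returns a new list ordered from highest to lowest score.
     * Comparator.comparingDouble uses Double.compare internally, so
     * negative scores are ordered correctly below positive ones.
     */
    public static List<ScoredTerm> sortByScoreDescending(List<ScoredTerm> results) {
        List<ScoredTerm> sorted = new ArrayList<>(results);
        sorted.sort(Comparator.comparingDouble((ScoredTerm r) -> r.score).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<ScoredTerm> results = new ArrayList<>();
        results.add(new ScoredTerm("alpha", -0.5)); // made-up scores
        results.add(new ScoredTerm("beta", 0.9));
        results.add(new ScoredTerm("gamma", 0.1));

        for (ScoredTerm r : sortByScoreDescending(results)) {
            System.out.println(r.term + " " + r.score);
        }
    }
}
```

With these made-up scores the order is beta (0.9), gamma (0.1), alpha (-0.5): the negative score sorts last because it is the least similar, not because it is invalid.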

Best wishes,
Dominic