Entity Extraction API - Top Entities - Returned score values


Aaron Phillips

Nov 11, 2019, 1:42:49 PM
to Dandelion Support Forum
Hi,

From the documentation for the score field of top entities, I understand that the score is not absolute and cannot be reused across different texts.

However, I'm interested in any ideas for how to make use of the field.

I often run entity detection, asking for up to 10 top entities, but I cannot make much use of the score field.

  • Typically, all the scores that come back are less than 0.001.
  • They do not sum to a round or obvious number.
  • I cannot figure out a way to rescale them, because that would require more information than the response gives me.

I currently use the score only to sort the entities, and then apply my own scale to the top entities, independent of the score values, to boost the importance of those keywords.
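As a rough illustration of that rank-based approach, here is a minimal sketch. It assumes each top entity is a dict with "uri" and "score" keys; the function name rank_boosts, the boost range, and the sample URIs are hypothetical and not part of the Dandelion API.

```python
def rank_boosts(top_entities, max_boost=2.0, min_boost=1.0):
    """Map each top entity URI to a boost that depends only on its rank.

    The raw score is used for ordering and nothing else, since it is not
    an absolute value. The boost range is an arbitrary assumption.
    """
    ranked = sorted(top_entities, key=lambda e: e["score"], reverse=True)
    n = len(ranked)
    boosts = {}
    for rank, entity in enumerate(ranked):
        # Interpolate linearly from max_boost (best rank) to min_boost (last rank).
        frac = rank / (n - 1) if n > 1 else 0.0
        boosts[entity["uri"]] = max_boost - frac * (max_boost - min_boost)
    return boosts


# Made-up sample data in the assumed shape.
sample_top = [
    {"uri": "http://example.org/A", "score": 0.0004},
    {"uri": "http://example.org/B", "score": 0.0009},
    {"uri": "http://example.org/C", "score": 0.0001},
]
boosts = rank_boosts(sample_top)
```

With this scheme the size of the gap between two scores is ignored; only their order matters.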

I had hoped to use the score values themselves to be a bit more accurate, but I have concluded that this is not possible.

Do you think my conclusion is correct?

Cheers,
Aaron

Giacomo Berardi

Nov 14, 2019, 8:11:20 AM
to Dandelion Support Forum
Hi Aaron,
The top entities ranking algorithm works better with longer and richer texts.
On texts with few entities it is difficult to tell which ones are the most significant, so they will probably all have low scores.
Maybe you need a different ranking approach, one that takes into account the confidence of the annotations and combines them differently?
I suggest you experiment with different methods that combine all the annotations with the top entity scores.
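
One possible way to experiment with such a combination is sketched below. It is only an illustration under stated assumptions: annotations are dicts with "uri" and "confidence" keys, top entities are dicts with "uri" and "score" keys, and the function name combine_scores, the weight alpha, and the per-text normalization by the maximum value are hypothetical choices, not a method recommended by Dandelion.

```python
from collections import defaultdict


def combine_scores(annotations, top_entities, alpha=0.5):
    """Rank entities by mixing summed annotation confidence with top entity score.

    Both parts are divided by their maximum within this one text, so the
    result is only meaningful relative to other entities in the same text.
    """
    # Sum confidence over all mentions of the same entity.
    conf_sum = defaultdict(float)
    for ann in annotations:
        conf_sum[ann["uri"]] += ann["confidence"]
    max_conf = max(conf_sum.values(), default=0.0)

    top = {e["uri"]: e["score"] for e in top_entities}
    max_top = max(top.values(), default=0.0)

    combined = {}
    for uri, conf in conf_sum.items():
        conf_part = conf / max_conf if max_conf > 0 else 0.0
        # Entities that are not among the top entities contribute 0 here.
        top_part = top.get(uri, 0.0) / max_top if max_top > 0 else 0.0
        combined[uri] = alpha * conf_part + (1 - alpha) * top_part
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample data in the assumed shape.
sample_annotations = [
    {"uri": "http://example.org/A", "confidence": 0.8},
    {"uri": "http://example.org/A", "confidence": 0.8},
    {"uri": "http://example.org/B", "confidence": 0.6},
]
sample_top_entities = [
    {"uri": "http://example.org/B", "score": 0.0009},
    {"uri": "http://example.org/A", "score": 0.0003},
]
ranked = combine_scores(sample_annotations, sample_top_entities, alpha=0.5)
```

Setting alpha closer to 1 favours entities that are mentioned often and with high confidence, while a value closer to 0 favours the top entity ranking; which setting works better would have to be checked against your own texts.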

Cheers

Giacomo Berardi
Dandelion team