Calculating the scores

peter....@nrc-cnrc.gc.ca

Apr 6, 2007, 10:06:33 AM

to SemanticRelations

Hi,

Here is more information about how the scores were calculated:

Lewis, D.D. Evaluating text categorization.
Proceedings of the Speech and Natural Language Workshop,
Asilomar, 312-318, 1991.

http://acl.ldc.upenn.edu/H/H91/H91-1061.pdf

We use the Macroaveraged F score. The F score is calculated separately
for each of the seven relations, and then the seven F scores are
averaged. Macroaveraging gives equal weight to each of the seven
relations.
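
As a rough illustration of the macroaveraging described above, here is a
minimal Python sketch. It is not the official scorer. The function names
are hypothetical, and it assumes the balanced F score (the harmonic mean
of precision and recall) and an F of 0 when both precision and recall are
0, neither of which is stated in this message.

```python
def f_score(precision, recall):
    """Balanced F score (assumption: F1, the harmonic mean of P and R)."""
    if precision + recall == 0:
        # Assumption: treat F as 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f(per_relation):
    """Average the per-relation F scores, weighting each relation equally.

    per_relation: list of (precision, recall) pairs, one pair per relation.
    """
    scores = [f_score(p, r) for p, r in per_relation]
    return sum(scores) / len(scores)
```

For the task described here, per_relation would hold seven pairs, one for
each relation, so a relation with few test examples counts as much as a
relation with many.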

The formula for Precision can be problematic when a system classifies
everything as "false": with no positive predictions, both the numerator
and the denominator of Precision are zero. This is discussed in the above
paper under "Arithmetic Anomalies". We use the convention of treating 0/0
as 1.0. This convention can be justified by observing that, in most
precision-recall plots, precision tends towards approximately 1.0 as
recall drops.
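
The 0/0 convention can be sketched in Python as follows. This is only an
illustration, not the scoring script itself, and the function and
parameter names are hypothetical.

```python
def precision(true_positives, false_positives):
    """Precision, with 0/0 treated as 1.0 as described in this message."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        # The system labelled nothing as "true", so precision is 0/0.
        # Convention from this message: treat 0/0 as 1.0.
        return 1.0
    return true_positives / predicted_positives
```

For example, precision(0, 0) gives 1.0, while precision(3, 1) gives 0.75.
A system that classifies everything as "false" still has a recall of 0
whenever the relation has at least one true example, so its F score for
that relation would be 0 if the balanced F score is assumed.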