Hello Annemieke!
As you probably know, Annif suggestions always come with a score value
between 0 and 1. But the interpretation of that value varies by
algorithm and usually it doesn't have any specific meaning except that
higher values represent more confident suggestions.
You can apply a threshold when using the suggest command (or API
method), but this simply drops the suggestions below that threshold.
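As a sketch (the project ID and input file here are placeholders, and this assumes a working Annif installation with a trained project):

```shell
# Suggest subjects for a document, dropping suggestions that score
# below 0.3 and keeping at most 10 of them. "my-project" is a
# placeholder for your own project ID; input is read from stdin.
cat document.txt | annif suggest --threshold 0.3 --limit 10 my-project
```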
If you want a better picture of the relationship between
suggestion scores and the actual likelihood of a topic being correct,
you need some sort of gold standard evaluation set: a
document corpus with verified subjects, with documents from the subject
area you are interested in (not necessarily the same as the
training set). Then you can do one or more of the following:
1. Run "annif eval" on the evaluation corpus: this will give you e.g. F1
score and nDCG (by default for 10 suggestions per document), which give
a picture of the overall quality of results
2. Run "annif optimize" on the evaluation corpus: this will give you
suggested limit and threshold values that maximize the F1 score
3. Configure a PAV ensemble project around the project(s) you are using
and then train that ensemble on the gold standard set. For this to
work, you need a sufficiently large set (thousands of documents). The
PAV ensemble will internally, for each concept/subject, fit an
isotonic regression model that estimates the relationship between the
score supplied by the algorithm and the likelihood that the suggestion
is correct. After training, you can use the PAV ensemble project to
suggest subjects, and the scores it gives are likelihood estimates
(e.g. 0.5 means an estimated 50% chance of being correct). This only
applies to subjects that were common enough in the training data for a
regression model to be formed; by default, this means that at least 10
documents in the training set had that subject (you can adjust this
with the min-docs setting).
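For step 3, a PAV ensemble is defined as its own project in projects.cfg, wrapping one or more existing projects as sources. A minimal sketch, where the project IDs and vocabulary name are made-up placeholders:

```ini
# projects.cfg fragment (placeholder IDs and vocabulary)
[pav-ensemble-en]
name=PAV ensemble
language=en
backend=pav
vocab=my-vocab
sources=my-project-a,my-project-b
min-docs=10
```

After training it with something like `annif train pav-ensemble-en /path/to/gold-corpus/`, the ensemble's scores can be read as estimated probabilities of correctness.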
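Steps 1 and 2 above look roughly like this on the command line (project ID and corpus path are placeholders; the corpus can be e.g. a directory of full-text documents with accompanying subject files, or a TSV corpus):

```shell
# Evaluate overall quality (F1@10, nDCG etc.) against the gold standard
annif eval my-project /path/to/eval-corpus/

# Search for the limit and threshold values that maximize F1
annif optimize my-project /path/to/eval-corpus/
```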
These are the facilities built into Annif; it doesn't really have a
mechanism for saying "I don't know what this document is about", except
very indirectly, e.g. by applying a high threshold. I see that Anna
already answered with a pointer to Qualle; that tool is perhaps a more
appropriate solution for answering the question "is the quality of
automated suggestions good enough, or should this be checked manually?".
-Osma
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.s...@helsinki.fi
http://www.nationallibrary.fi