Hi Sandro!
Those are excellent questions.
First, to find good weights automatically, define a simple ensemble
(backend "ensemble", not "nn_ensemble") with the same sources. Then run
annif hyperopt on that project against a validation set (not the final
test set, that would be cheating!) with a sufficient number of trials
(at least 100 or so). It will report the best weights it found, and you
can then use the same weights in the NN ensemble.
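
As a rough illustration of that setup, here is a minimal Python sketch
that writes a simple ensemble section in projects.cfg syntax and
assembles the hyperopt command line. The project IDs (tfidf-en,
fasttext-en, omikuji-en, ensemble-en), the vocabulary yso and the path
data/validate/ are placeholders I made up, not your actual setup, and
you should check the exact option names with annif hyperopt --help.

```python
import configparser
import io
import shlex

# Hypothetical source project IDs; use the same sources as your NN ensemble.
SOURCES = ["tfidf-en", "fasttext-en", "omikuji-en"]


def simple_ensemble_section(project_id, sources, vocab, language):
    """Return projects.cfg text for a simple (non-NN) ensemble project."""
    cfg = configparser.ConfigParser()
    cfg[project_id] = {
        "name": "Simple ensemble for weight tuning",
        "language": language,
        "backend": "ensemble",  # simple ensemble, not nn_ensemble
        "vocab": vocab,
        # No explicit weights here; hyperopt will search for them.
        "sources": ",".join(sources),
    }
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


def hyperopt_command(project_id, validation_path, trials=100):
    """Build the command line for tuning weights on a validation set."""
    return shlex.join(
        ["annif", "hyperopt", "--trials", str(trials), project_id, validation_path]
    )


# Placeholder vocabulary, language and validation path (assumptions).
section = simple_ensemble_section("ensemble-en", SOURCES, vocab="yso", language="en")
command = hyperopt_command("ensemble-en", "data/validate/")

print(section)  # paste into projects.cfg
print(command)  # run in the shell once the source projects are trained
```

The validation path should point to documents that are not part of the
final test set.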
If you want to experiment with changing the weights in the NN ensemble
directly, you unfortunately cannot use the --cached option; you will
need to retrain from scratch every time you change the weights. In
principle this wouldn't be strictly necessary, but the defined weights
are already incorporated into the preprocessed training data that is
stored on disk and reused with the --cached option, so cached data
still reflects the old weights. I edited the wiki page of the NN
ensemble backend to make this a little clearer.
Anyway, I recommend that you use the first option, running hyperopt on
the simple ensemble. Those weights should be good enough for the NN
ensemble too.
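
To carry the tuned weights over, the sources setting of the NN ensemble
project takes the same project:weight form as the simple ensemble. A
small Python sketch of formatting such a line follows; the weights and
project IDs below are invented for illustration and are not real
hyperopt output.

```python
def sources_line(weights):
    """Format a {project_id: weight} mapping as an Annif sources= setting."""
    pairs = [f"{project_id}:{weight:.4f}" for project_id, weight in weights.items()]
    return "sources=" + ",".join(pairs)


# Hypothetical weights, standing in for whatever hyperopt reports.
best_weights = {"tfidf-en": 0.1, "fasttext-en": 0.3, "omikuji-en": 0.6}

line = sources_line(best_weights)
print(line)  # put this in the NN ensemble section of projects.cfg
```

After changing the weights, retrain the NN ensemble without --cached so
that the stored training data is rebuilt with the new weights.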
Best,
Osma
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.s...@helsinki.fi
http://www.nationallibrary.fi