error in eval

Aurélie Thébault

Jul 5, 2023, 11:23:45 AM
to Annif Users
Hi all, 
When evaluating an Annif model, I keep getting the same error and I cannot see where it comes from. Do you have any idea?

# Train project
! annif train {project} --jobs {njobs} {input_file}

/bin/bash: /home/aurelie/anaconda3/lib/libtinfo.so.6: no version information available (required by /bin/bash)
Backend mllm: starting train
Backend mllm: preparing training data
/home/aurelie/anaconda3/envs/abes_index/lib/python3.10/site-packages/sklearn/feature_extraction/text.py:528: UserWarning: The parameter 'token_pattern' will not be used since 'tokenizer' is not None'
  warnings.warn(
Backend mllm: training model
Backend mllm: saving model

# Evaluate project
! annif eval --limit {max_nb_concepts}  --metrics-file {metric_file_path} --results-file {result_file_path} --jobs {njobs} -v "DEBUG" {project} {test_file}

/bin/bash: /home/aurelie/anaconda3/lib/libtinfo.so.6: no version information available (required by /bin/bash)
debug: creating app with configuration annif.default_config.Config
debug: Reading configuration file projects.cfg in CFG format
debug: loading subjects from data/vocabs/rameau/subjects.csv
Writing per subject evaluation results to /home/aurelie/ABES/labo-indexation-ai/ANNIF/reports/rameau-mllm-snowball-fr.csv
debug: Initializing project 'rameau-mllm-snowball-fr'
debug: Project 'rameau-mllm-snowball-fr': initialized analyzer: <annif.analyzer.snowball.SnowballAnalyzer object at 0x7f896a25e560>
debug: Project 'rameau-mllm-snowball-fr': initialized subjects: <annif.corpus.subject.SubjectIndex object at 0x7f89951bdfc0>
debug: Project 'rameau-mllm-snowball-fr': initializing backend
debug: Backend mllm: loading model from data/projects/rameau-mllm-snowball-fr/mllm-model.gz
Traceback (most recent call last):
  File "/home/aurelie/anaconda3/envs/abes_index/bin/annif", line 8, in <module>
    sys.exit(cli())
  File "/home/aurelie/.local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/aurelie/.local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/aurelie/.local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/aurelie/.local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/aurelie/.local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/aurelie/.local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
...
    [subj.labels[language] for subj in self._subject_index], # Label
  File "/home/aurelie/anaconda3/envs/abes_index/lib/python3.10/site-packages/annif/eval.py", line 199, in <listcomp>
    [subj.labels[language] for subj in self._subject_index], # Label
TypeError: 'NoneType' object is not subscriptable

Best regards, 

Aurélie

juho.i...@helsinki.fi

Jul 6, 2023, 3:41:25 AM
to Annif Users
Hi Aurélie,

I think this has something to do with the loaded vocabulary. On my first try I could actually reproduce the same error message you are seeing, but no longer after trying with some earlier Annif versions.

Try reloading the vocabulary into Annif with the "load-vocab" command. Also try the "--force" option, which overwrites the previously loaded vocabulary instead of just updating it. Retraining the project may be needed as well.

Maybe you have recently updated to Annif v0.59 or newer? Annif v0.59 (https://github.com/NatLibFi/Annif/releases/tag/v0.59.0) included some significant changes in vocabulary handling, which require reloading previously loaded vocabularies and retraining existing models.
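
As an illustration of what the traceback reports, here is a minimal Python sketch. It is not Annif's actual code: the `Subject` tuple, the example URIs and both helper functions are hypothetical. It only shows that an expression of the form `subj.labels[language]` raises this TypeError as soon as one entry in the index has `labels` set to None. The remedy suggested in this thread remains reloading the vocabulary and retraining; the guarded variant is shown only for contrast.

```python
from collections import namedtuple

# Hypothetical stand-in for a vocabulary entry; NOT Annif's real class.
Subject = namedtuple("Subject", ["uri", "labels"])

# Hypothetical index in which one entry carries no labels at all.
subject_index = [
    Subject("http://example.org/c1", {"fr": "Chats"}),
    Subject("http://example.org/c2", None),
]


def labels_unsafe(subjects, language):
    # Same shape as the expression shown in the traceback:
    # fails if any subj.labels is None.
    return [subj.labels[language] for subj in subjects]


def labels_safe(subjects, language):
    # Guarded variant (illustration only): yields None for entries without labels.
    return [
        subj.labels[language] if subj.labels is not None else None
        for subj in subjects
    ]


try:
    labels_unsafe(subject_index, "fr")
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
print(labels_safe(subject_index, "fr"))
```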

If the problem remains, please post
  • project configuration
  • Annif version (output of "annif --version")
  • output of "annif list-vocabs"
  • format of the vocabulary file you load (TSV, CSV or some SKOS format like ttl or rdf)

-Juho

Aurélie Thébault

Jul 13, 2023, 11:36:22 AM
to Annif Users
Thanks a lot for your answer, Juho, it was exactly that!
I have a question regarding the use of URIs by Annif. In the vocab file, we must provide functional URIs, and I am wondering what Annif does with them. Do you have any insights to share?
Thanks a lot to everyone in this group for being so efficient!

Regards, 

Aurélie

Osma Suominen

Jul 25, 2023, 4:14:50 AM
to annif...@googlegroups.com
Hello Aurélie,

Annif at least currently doesn't care much what the URIs are; mostly
they are considered opaque identifiers for the concepts/subjects. There
are a few ways the URIs are important though:

1. Annif represents the vocabulary internally using SKOS/RDF, where the
URIs are used as identifiers for concepts. Malformed URIs (for example
containing whitespace) would probably not work. Annif uses rdflib for
RDF handling and it is quite strict about URI syntax.

2. Annif uses the URIs in corpus data files.

3. When Annif gives suggestion results on the command line (suggest
operation) or via the REST API, it will output the URIs alongside labels
and scores.

4. In the Annif web UI, the URIs for suggested subjects are displayed as
clickable links.

Annif never directly accesses (resolves) the URIs via HTTP or other
protocols. So you could even use non-resolvable URIs such as mailto: or
URNs. In fact the "20 newsgroups" example corpus uses news: URIs which
refer to historical Usenet newsgroups from the early days of the Internet.
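
A rough way to pre-check vocabulary URIs before loading them could look like the following Python sketch. It uses only the standard library and does not call Annif or rdflib. The function name `uri_problems` and the set of suspect characters are assumptions made for illustration, so the result is only an approximation of what a strict RDF library would accept. Note that the check never tries to resolve a URI, so schemes such as mailto:, urn: or news: pass as long as the syntax is clean.

```python
import re

# Characters assumed to be problematic inside a URI. This set is an
# assumption for illustration, not taken from Annif or rdflib documentation.
SUSPECT_CHARS = set('<>"{}|\\^`')

# A URI scheme: a letter followed by letters, digits, "+", "." or "-", then ":".
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def uri_problems(uri: str) -> list[str]:
    """Return a list of human-readable problems found in the URI (empty if none)."""
    problems = []
    if any(ch.isspace() for ch in uri):
        problems.append("contains whitespace")
    bad = sorted(SUSPECT_CHARS.intersection(uri))
    if bad:
        problems.append("contains suspect characters: " + "".join(bad))
    if not SCHEME_RE.match(uri):
        problems.append("has no scheme")
    return problems


for candidate in [
    "http://example.org/concept/1",
    "news:comp.graphics",
    "http://example.org/my concept",
]:
    print(candidate, "->", uri_problems(candidate) or "looks fine")
```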

Cheers,
Osma

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.s...@helsinki.fi
http://www.nationallibrary.fi