error when testing annif


Andrea Brandstätter

Aug 26, 2023, 5:35:25 AM
to Annif Users
Hi everyone,
I installed Annif version 0.59.0 and created a project with the following details:
[thwildau-tfidf-de]
name=THWILDAU TFIDF project
language=de
backend=tfidf
vocab=rvkthwildau
analyzer=snowball(german)

Then I loaded the vocabulary and trained Annif without any error messages, but when testing with echo "Technologie" | annif suggest thwildau-tfidf-de I get tons of errors which I don't understand:
  File "/home/andrea/annif-venv/bin/annif", line 8, in <module>
    sys.exit(cli())
  File "/home/andrea/annif-venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/andrea/annif-venv/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/andrea/annif-venv/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/andrea/annif-venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/andrea/annif-venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/andrea/annif-venv/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/andrea/annif-venv/lib/python3.10/site-packages/flask/cli.py", line 357, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/home/andrea/annif-venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/andrea/annif-venv/lib/python3.10/site-packages/annif/cli.py", line 383, in run_suggest
    (subj.labels[project.vocab_lang],
KeyError: 'de'

What does this mean? What can I do to prevent this error?
Any suggestion for a newbie like me?

Additionally, I would like to know the best way to upgrade to a new version. Should I delete the previous version and start fresh?

Thx in advance!
Best Andrea


juho.i...@helsinki.fi

Aug 26, 2023, 7:30:36 AM
to Annif Users
Hi Andrea!

Annif v0.59 is quite old, and Annif v1.0 was just released, so it is a good idea to upgrade to it now. However, note that already trained MLLM, STWFSA and NN ensemble projects will stop working and need to be retrained after the upgrade.

You can upgrade to the newest version, 1.0, by running

    pip install --upgrade annif
   
(After that you can enable the tab key completion, which can be handy, see here: https://github.com/NatLibFi/Annif/tree/v1.0.0#shell-compeletions)


As for the error you got: if it remains after the upgrade to v1.0, something may have gone wrong in the vocabulary loading. Please run

    annif list-vocabs

The output shows the vocabularies and the language codes each vocabulary supports in the Languages field. In your case it should contain the "de" code.

Where did you load the vocabulary from, a TSV or a SKOS file? When loading from TSV files, the language code should be given with the --language option.
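
To illustrate the kind of failure behind the KeyError, here is a minimal Python sketch. It assumes, based only on the traceback line `subj.labels[project.vocab_lang]`, that each subject keeps its labels in a mapping keyed by language code. The `Subject` class and the `label_for` helper are hypothetical and are not Annif's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    """Hypothetical stand-in for a vocabulary subject (not Annif's class)."""
    uri: str
    # Labels keyed by language code, e.g. {"de": "Westeuropa"}
    labels: dict[str, str] = field(default_factory=dict)


def label_for(subject: Subject, lang: str) -> str:
    """Look up a label, raising a readable error instead of a bare KeyError."""
    try:
        return subject.labels[lang]
    except KeyError:
        available = ", ".join(sorted(subject.labels)) or "none"
        raise ValueError(
            f"{subject.uri} has no label for language {lang!r} "
            f"(available: {available})"
        ) from None


# If the labels were stored under another language code, a project
# configured with language=de cannot find any label for the subject.
loaded_as_en = Subject("th-wildau:ZE68030", {"en": "Großbritannien und Nordirland"})
loaded_as_de = Subject("th-wildau:ZE68030", {"de": "Großbritannien und Nordirland"})

print(label_for(loaded_as_de, "de"))
try:
    label_for(loaded_as_en, "de")
except ValueError as err:
    print(err)
```

In this simplified model, the lookup only succeeds when the language code stored with the vocabulary matches the project's language setting, which is why the Languages field of list-vocabs is worth checking first.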

In any case, you can try reloading the vocabulary with

    annif load-vocab <vocab-id> --force [--language <lang-code>]
 
and then retraining the project.

Hope this helps!

-Juho

Andrea Brandstätter

Aug 27, 2023, 5:43:15 AM
to juho.i...@helsinki.fi, Annif Users

Hi Juho!

Thanks for your support! I upgraded annif, changed language in projects.cfg and retrained.

No error message any more!

But if I run the echo command for testing, I get no suggestions. So I'm wondering what's better: an error message or zero response? ;-)

Any ideas?

Best Andrea


-----------------------------------------------

Mag. Andrea Brandstätter

Team Bibliothekssysteme

Vienna University Library

Teinfaltstraße 8

A 1010 Vienna

AUSTRIA

Tel. +43-1-4277-150 69

Mobile +43-664-602 77-150 69





Osma Suominen

Aug 28, 2023, 6:58:32 AM
to annif...@googlegroups.com
Hi Andrea!

Can you post more details of the commands you are running and the output
of "annif list-vocabs" and "annif list-projects"?

Is your training data properly formatted? How does it look?

-Osma

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.s...@helsinki.fi
http://www.nationallibrary.fi

Andrea Brandstätter

Aug 29, 2023, 6:23:14 AM
to Osma Suominen, annif...@googlegroups.com

Hi Osma,

The list-vocabs command shows this:


Vocabulary ID  Languages  Size    Loaded
----------------------------------------
stw            de,en      6243    True  
rvkthwildau    de         783147  True  

The result of the list-projects command is:

Project ID         Project Name            Vocabulary ID  Language  Trained  Modification time  
------------------------------------------------------------------------------------------------
stw-tfidf-en       STW TFIDF project       stw            en        True     2022-11-27 15:26:32
thwildau-tfidf-de  THWILDAU TFIDF project  rvkthwildau    de        True     2023-08-27 11:27:35

I tried to run the command echo "Technologie" | annif suggest thwildau-tfidf-de, which might be a stupid idea since the loaded vocab is only a part of a much bigger TSV file. But when I tried to load the complete TSV file with approx. 700000 rows, the transformation into subjects.ttl got killed.


The training data looks like this:

vocab:

th-wildau:ZE68028-ZE68035 Westeuropa
th-wildau:ZE68028 Westeuropa allgemein
th-wildau:ZE68030 Großbritannien und Nordirland


text file:

Wirtschaft / Recht Die interne Revision als Internal Consultant - Analyse am Beispiel der Allgemeinen Deutschen Direktbank AG Hochschulschrift th-wildau:PN216
Europäisches Management Der Krieg um die Talente geht weiter: Organisationsentwicklung auf Basis des McKinsey 7S-Framework durch Etablierung von Wertschätzung als Bindemittel gegen einen internen Fachkräftemangel deutscher Dienstleistungsunternehmen. Organisationsentwicklung Dienstleistungsbetrieb Hochschulschrift Organisationsentwicklung Dienstleistungsbetrieb th-wildau:QP120
Europäisches Management Der zentralamerikanischen Zollunion: Fortschritte und Hindernisse in Bezug auf die Etablierung des freien Warenverkehrs Zollunion Freier Warenverkehr Hochschulschrift Zollunion Freier Warenverkehr th-wildau:QP120
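
A quick way to sanity-check a training file like this is sketched below in Python. It assumes a tab-separated layout (document text, a tab, then one or more subject identifiers), which is what the samples above appear to follow; the function name `check_corpus_lines` is hypothetical and not part of Annif.

```python
def check_corpus_lines(lines):
    """Return (line_number, problem) pairs for lines that do not look like
    'text<TAB>subject-id [subject-id ...]'."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        # Split at the first tab: text on the left, subjects on the right
        text, sep, subjects = line.partition("\t")
        if not sep:
            problems.append((number, "no tab between text and subjects"))
        elif not text.strip():
            problems.append((number, "empty text column"))
        elif not subjects.split():
            problems.append((number, "no subject identifiers"))
    return problems


sample = [
    "Europäisches Management Zollunion Freier Warenverkehr\tth-wildau:QP120\n",
    "Wirtschaft / Recht Hochschulschrift th-wildau:PN216\n",  # tab replaced by a space
]
print(check_corpus_lines(sample))
# → [(2, 'no tab between text and subjects')]
```

To check a real file, the same function can be called on an open file object, for example `check_corpus_lines(open("train.tsv", encoding="utf-8"))`, where the file name is only a placeholder.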


Best

Andrea



Osma Suominen

Sep 1, 2023, 3:06:01 AM
to annif...@googlegroups.com
Hi Andrea!

Thanks for the details. I see that you've loaded both STW and your own
vocabulary. Am I right that you used STW for testing and things are
working when you use that vocabulary, but not with your own?

It would be useful to see the output of loading the vocabulary, which
you said is getting killed. Is there a traceback that could provide hints?

I noticed that you are not using real URIs to identify concepts, but
strings like "th-wildau:ZE68030". I am not sure if these will work,
since Annif uses RDF and SKOS internally and the rdflib library which is
used for RDF processing is quite strict about URIs. So it might be that
the conversion to SKOS/RDF/ttl is failing because your concept
identifiers are not valid URIs. But it's hard to tell without seeing
some kind of error message or traceback.
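
If you want to try full URIs, a conversion could be sketched in Python as follows. This is only an illustration under stated assumptions: the namespace `http://example.org/th-wildau/` is a made-up placeholder, the helper names are hypothetical, and the check deliberately accepts only http(s) URIs with a host. That is stricter than the generic URI syntax, under which "th-wildau:ZE68030" would parse as a scheme followed by a path.

```python
from urllib.parse import quote, urlsplit

# Hypothetical namespace; replace it with one you control.
BASE = "http://example.org/th-wildau/"
PREFIX = "th-wildau:"


def looks_like_http_uri(identifier: str) -> bool:
    """True only for http(s) identifiers that include a host part."""
    parts = urlsplit(identifier)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def expand(identifier: str) -> str:
    """Turn a prefixed identifier such as 'th-wildau:ZE68030' into a full URI."""
    if looks_like_http_uri(identifier):
        return identifier  # already a full URI, leave it alone
    if identifier.startswith(PREFIX):
        local_name = identifier[len(PREFIX):]
        # Percent-encode anything that is not safe inside a URI path segment
        return BASE + quote(local_name, safe="")
    raise ValueError(f"unrecognized identifier: {identifier!r}")


print(looks_like_http_uri("th-wildau:ZE68030"))
print(expand("th-wildau:ZE68028-ZE68035"))
```

Whatever scheme is chosen, the same identifiers would have to be used in both the vocabulary file and the training data, so both files would need to go through the same conversion.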

Can you show the commands you've run and their complete output?

-Osma


Andrea Brandstätter

Sep 2, 2023, 12:30:41 PM
to Osma Suominen, annif...@googlegroups.com

Hi Osma!


Thx for your support and patience! 


I have no idea what is different from my last try, besides the fact that I'm no longer on holiday in Greece! But today everything works as expected:



But if I understood you correctly, it is better to use URIs instead of an individual normalized vocab?


Best Andrea




Osma Suominen

Sep 4, 2023, 7:40:01 AM
to annif...@googlegroups.com
Hi Andrea!

Good to hear that you got it working!

Regarding URIs, please see this previous post on a similar topic:
https://groups.google.com/g/annif-users/c/tOdhz4s-bbs/m/6_bHP7-MBwAJ

-Osma
