Using spaCy analyzer and language model


Enrico Laloli

Oct 27, 2022, 8:21:45 AM
to Annif Users
Hello, I am trying to use spaCy as an analyzer, but cannot get it working.

In my project configuration I have this:

analyzer=spacy('nl_core_news_lg',lowercase=1)

I installed spaCy in the virtual environment:

pip install .[spacy]

Then I downloaded the language model:

python -m spacy download nl_core_news_lg

It then tells me: You can now load the package via spacy.load('nl_core_news_lg'). I suppose this is done in the Annif Python code.
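A quick way to rule out an installation problem is to check that the model package is visible from the same virtual environment that runs Annif. The sketch below uses only the standard library and assumes the model was installed as a regular Python package under the name nl_core_news_lg, which is what the download command does; the helper name is_importable is made up for this example.

```python
import importlib.util


def is_importable(package_name):
    """Return True if a package with exactly this name can be found
    by the interpreter that runs this script."""
    try:
        return importlib.util.find_spec(package_name) is not None
    except (ImportError, ValueError):
        # Raised for some malformed or partially initialised module names
        return False


if __name__ == "__main__":
    # Run this inside the virtual environment that Annif uses
    name = "nl_core_news_lg"
    print(f"{name}: importable={is_importable(name)}")
```

If this reports that the package is importable, the model itself is installed and the problem is more likely in how the model name reaches spaCy.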

When using it with annif train, I get the following errors:

Error: Operation failed: Loading spaCy model ''nl_core_news_lg'' failed - please download the model.

[E050] Can't find model ''nl_core_news_lg''. It doesn't seem to be a Python package or a valid path to a data directory.

Enrico


Osma Suominen

Oct 27, 2022, 9:00:51 AM
to annif...@googlegroups.com
Hi Enrico,

try configuring the analyzer without using quotes, like this:

analyzer=spacy(nl_core_news_lg,lowercase=1)
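The doubled quotes in the error message (''nl_core_news_lg'') suggest that the quote characters were passed on as part of the model name. The following is a simplified sketch of how a specification string of this shape might be split into a name and arguments. It is not Annif's actual implementation, and the function name parse_analyzer_spec is hypothetical. It only illustrates that a plain comma split does not strip quotes.

```python
import re


def parse_analyzer_spec(spec):
    """Split 'name(arg1,key=value)' into (name, positional args, keyword args).

    Illustrative only: values are kept as raw strings, so any quote
    characters in the specification stay part of the value.
    """
    match = re.fullmatch(r"(\w+)(?:\((.*)\))?", spec.strip())
    if match is None:
        raise ValueError(f"Invalid analyzer specification: {spec!r}")
    name, arg_string = match.group(1), match.group(2)
    posargs, kwargs = [], {}
    if arg_string:
        for part in arg_string.split(","):
            part = part.strip()
            if "=" in part:
                key, value = part.split("=", 1)
                kwargs[key.strip()] = value.strip()
            else:
                posargs.append(part)
    return name, posargs, kwargs


quoted = parse_analyzer_spec("spacy('nl_core_news_lg',lowercase=1)")
unquoted = parse_analyzer_spec("spacy(nl_core_news_lg,lowercase=1)")
print(quoted[1])    # model name still wrapped in quote characters
print(unquoted[1])  # bare model name
```

With the quoted form, the model name handed to the loader would be 'nl_core_news_lg' including the quote characters, and no installed package has that name.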

Hope this helps,
Osma
> --
> You received this message because you are subscribed to the Google
> Groups "Annif Users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to annif-users...@googlegroups.com
> <mailto:annif-users...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/annif-users/d7d920d8-09c9-4c93-887a-a7df700300fdn%40googlegroups.com <https://groups.google.com/d/msgid/annif-users/d7d920d8-09c9-4c93-887a-a7df700300fdn%40googlegroups.com?utm_medium=email&utm_source=footer>.

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.s...@helsinki.fi
http://www.nationallibrary.fi

cbs bibliotheek

Oct 27, 2022, 10:13:26 AM
to Osma Suominen, annif...@googlegroups.com
Thanks, Osma. You are quite right about the quotes.
I doubt that the large spaCy language model is an improvement over the simplemma analyzer. The latter is much faster, and the results with the spaCy model give me no indication that the quality improves. This pertains to Dutch text, but I have not analysed this in full.
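For anyone who wants to put numbers on the speed difference, here is a minimal timing sketch. It assumes each analyzer can be wrapped as a function that takes a text and returns tokens. The two stub functions are placeholders, not the real simplemma or spaCy analyzers, and the name time_analyzer is invented for this example.

```python
import re
import time


def time_analyzer(tokenize, texts, repeats=3):
    """Return the best wall-clock time in seconds, over `repeats` runs,
    for applying `tokenize` to every text."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for text in texts:
            tokenize(text)
        best = min(best, time.perf_counter() - start)
    return best


# Placeholder analyzers: replace these with wrappers around the real ones
def whitespace_stub(text):
    return text.lower().split()


def regex_stub(text):
    return re.findall(r"\w+", text.lower())


if __name__ == "__main__":
    sample_texts = ["Dit is een korte Nederlandse voorbeeldzin."] * 1000
    for label, func in (("whitespace", whitespace_stub), ("regex", regex_stub)):
        print(f"{label}: {time_analyzer(func, sample_texts):.4f} s")
```

Timing alone says nothing about quality. That would need an evaluation of the suggested subjects against a test set for each analyzer.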

Cheers,

Enrico
On Thu, Oct 27, 2022 at 15:00, Osma Suominen <osma.s...@helsinki.fi> wrote: