Re: Error with fasttext


Osma Suominen

Jul 25, 2023, 4:39:41 AM
to annif...@googlegroups.com
Hello Elisabeth,

it appears that you've found a bug: there seems to be a problem when
combining the fastText backend with the spaCy analyzer. Other Annif
analyzers will convert newlines to regular spaces, but apparently spaCy
doesn't always do that, so at least some newlines are passed on to
fastText which then gives the "predict processes one line at a time"
error. At least that is my interpretation of the situation.
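
To illustrate the suspected mechanism, here is a minimal sketch in plain Python. It is not Annif's or spaCy's actual code: the token list is made up and normalize_line is a hypothetical helper. It only shows how newline tokens that survive analysis end up in the string handed to fastText, and how collapsing whitespace would avoid that.

```python
def normalize_line(text: str) -> str:
    """Collapse every run of whitespace, including newlines, into one space."""
    # str.split() with no arguments splits on any whitespace and drops empty pieces
    return " ".join(text.split())


# Made-up analyzer output in which newline tokens were not filtered out
tokens = ["erste", "zeile", "\n", "zweite", "zeile", "\n\n"]

joined = " ".join(tokens)
# A string like `joined` still contains "\n", which is what fastText's
# predict rejects with "predict processes one line at a time (remove '\n')"
has_newline_before = "\n" in joined

cleaned = normalize_line(joined)
has_newline_after = "\n" in cleaned
```

Normalizing right before prediction is only one possible workaround; filtering whitespace-only tokens in the analyzer would be another.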

I tried to reproduce this locally using a similar configuration, but
wasn't successful - all the input files I tried seemed to work. Could
you check if there is a particular document in your test set that
triggers this error, for example by running "annif suggest" on
individual files until you find one that gives the same error? Could you
then send me that file?
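
If the test set is large, a small script can do the searching. The following is a rough sketch, assuming the documents are .txt files in a single folder and that "annif suggest PROJECT_ID" reads the document text from standard input. The function name find_failing_files and the example folder path are hypothetical; the project ID is the one from your configuration.

```python
import subprocess
from pathlib import Path

# Fragment of the error message reported in this thread
ERROR_MARKER = "predict processes one line at a time"


def find_failing_files(doc_dir, project_id, run=subprocess.run):
    """Return the .txt files in doc_dir for which `annif suggest` reports the error."""
    failing = []
    for path in sorted(Path(doc_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Feed one document at a time to `annif suggest` via standard input
        result = run(
            ["annif", "suggest", project_id],
            input=text,
            capture_output=True,
            text=True,
        )
        # The Python traceback, if any, ends up on standard error
        if ERROR_MARKER in result.stderr:
            failing.append(path)
    return failing


# Example call (the folder path is a placeholder):
# for path in find_failing_files("path/to/test-docs", "rvk_s-fasttext-ger.spacy"):
#     print(path)
```

The run parameter exists only so the loop can be tried out without Annif installed; in normal use the default subprocess.run is what actually calls the command.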

-Osma

PS. I see that in your configuration you have set "language=german".
This is wrong; you should be using the BCP 47 language code "de"
instead. But this is not the cause of the problem you're currently seeing.


On 12/07/2023 12:07, 'Elisabeth Mecking' via Annif Users wrote:
> Hi,
> I have trained a model with this configuration
> [rvk_s-fasttext-ger.spacy]
> name=RVK_S fasttext German spacy
> language=german
> backend=fasttext
> analyzer=spacy(de_core_news_sm,lowercase=1)
> dim=100
> lr=0.25
> epoch=5
> loss=hs
> limit=1
> chunksize=24
> vocab=rvk_s
>
> For testing, I have a folder with text documents and corresponding key
> files, which worked fine when I did the same with backend tfidf and
> omikuji. In this one, I get an error message when doing annif eval
> "ValueError: predict processes one line at a time (remove '\n')"
> What can I do? Suggestions seem to work, when I do echo and annif
> suggest I get an output.
> Thanks for your help
> Elisabeth

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.s...@helsinki.fi
http://www.nationallibrary.fi