Using Annif in Python Code

Antony

Nov 15, 2019, 10:55:43 PM
to Annif Users
Hello,

Is it possible to import annif and call the suggest method directly from Python code? I'd like to use it in a Python script or notebook instead of from the command line.

I would like to do this in order to store the results together with the file metadata. If this is possible, do you have an example somewhere?

Thanks!

Osma Suominen

Nov 18, 2019, 5:36:08 AM
to annif...@googlegroups.com
Hi Antony,

That's a good question!

Currently the Annif codebase is quite tightly coupled to the Flask and
Connexion frameworks, which might make it a bit difficult to use as a
Python library. I'd suggest that you take a look at Annif-client [1,2]
instead - it's a simple Python wrapper around the Annif REST API and
allows you to interact with a running Annif instance (local or remote).
So you could start up Annif from the CLI using "annif run" and then use
annif-client in your notebook.
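
A minimal sketch of that workflow, for storing suggestions together with file metadata. It uses plain HTTP calls from the Python standard library rather than annif-client itself, so it runs without extra packages. The base URL (Flask's default port after "annif run"), the /projects/<id>/suggest endpoint, the "results" key in the response and the project ID "tfidf-en" are all assumptions here; check them against your own Annif instance. The helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: "annif run" serves the REST API here (Flask's default port).
API_BASE = "http://localhost:5000/v1/"


def build_suggest_request(project_id, text, limit=10, api_base=API_BASE):
    """Build a form-encoded POST request for one project's suggest endpoint."""
    url = urllib.parse.urljoin(api_base, f"projects/{project_id}/suggest")
    data = urllib.parse.urlencode({"text": text, "limit": limit}).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def parse_suggestions(body):
    """Extract (label, uri, score) tuples from a JSON response body.

    Assumes the response looks like {"results": [{"uri", "label", "score"}, ...]}.
    """
    return [(r["label"], r["uri"], r["score"]) for r in json.loads(body)["results"]]


def suggest_for_file(metadata, text, project_id="tfidf-en"):
    """Return a copy of the file metadata dict with the suggestions attached."""
    request = build_suggest_request(project_id, text)
    with urllib.request.urlopen(request) as response:
        subjects = parse_suggestions(response.read().decode("utf-8"))
    return {**metadata, "subjects": subjects}
```

With an Annif instance running, suggest_for_file({"filename": "report.txt"}, text) would return the metadata dict with a "subjects" list added. annif-client wraps the same kind of calls, so it is probably the more convenient choice in a notebook.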

-Osma

[1] https://github.com/NatLibFi/Annif-client

[2] https://pypi.org/project/annif-client/

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.s...@helsinki.fi
http://www.nationallibrary.fi

Antony Owino

Nov 19, 2019, 12:41:00 PM
to Annif Users
Ah, good to know. I wish I had found out about Annif-client earlier. I'll give it a try soon and compare the two, especially on speed. I managed to integrate annif into my script/notebook after looking at the tests. Here it is in case anyone else needs it.

from pprint import pprint

import annif
import annif.analyzer
import annif.backend
import annif.project
from annif.suggestion import (SuggestionResult, LazySuggestionResult,
                              ListSuggestionResult)
from annif.corpus import SubjectIndex

text = """SOME TEXT HERE AND THERE YOU CAN TRY"""


# Bare stand-in for an Annif project: it carries only the attributes set below
class Object(object):
    pass


project = Object()
project.language = 'en'
project.analyzer = annif.analyzer.get_analyzer('snowball(english)')
project.vocab = 'yso-en'
# Load the subject index of a vocabulary that has already been loaded into Annif
project.subjects = SubjectIndex.load('./data/vocabs/yso-en/subjects')

# Look up the backend class by name and point it at an already trained project
tfidf_type = annif.backend.get_backend("tfidf")
tfidf = tfidf_type(
    backend_id='tfidf-en',
    config_params={'limit': 10},
    datadir='./data/projects/tfidf-en'
)

results = tfidf.suggest(text, project)
print(len(results))

# Print each suggested subject's label, followed by the full result
for result in results:
    print(result.label)
    pprint(result, indent=2)


Cheers!

Janne

Aug 31, 2022, 12:55:52 AM
to Annif Users
I'm also interested in using Annif directly in Python code. Have there been any changes in the codebase since 2019 to make this easier?
I tried the solution above, but it doesn't seem to work anymore: it fails with the error "yield Subject(uri=annif.util.cleanup_uri(row['uri']), KeyError: 'uri'".
Apparently either the CSV library or the way Annif reads files has changed.

Osma Suominen

Aug 31, 2022, 2:46:21 AM
to annif...@googlegroups.com
Hi Janne,

I'm guessing you are using the latest code on the master branch? There
have been a number of changes since the last release 0.58 in how
vocabularies are stored and accessed. In particular, the on-disk
vocabulary format was just switched from TSV (a file called "subjects")
to a new CSV format (a file called "subjects.csv"). The code you are
using is attempting to read from the old file "subjects", but the
current Annif code expects a CSV file so reading it fails. See also the
message I just posted here about semantic versioning, which touches on
the same topics - backwards compatibility and changes to the vocabulary
support.
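
The KeyError can be reproduced with the standard csv module alone. This is only a sketch: it assumes that the old "subjects" file has tab-separated lines with no header row and that the new reader uses csv.DictReader keyed on a "uri" column, which the traceback suggests but does not confirm. The file contents below are made up.

```python
import csv
import io

# Hypothetical contents of an old-style "subjects" file: tab-separated, no header.
old_tsv = "<http://example.org/s1>\tcats\n<http://example.org/s2>\tdogs\n"

reader = csv.DictReader(io.StringIO(old_tsv))
# DictReader takes the first line as the header row, so the "column names"
# are really the first subject's data and there is no "uri" column.
fieldnames = reader.fieldnames

try:
    uri = next(reader)["uri"]
    outcome = "ok"
except KeyError as err:
    outcome = f"KeyError: {err}"

print(fieldnames)
print(outcome)
```

Under those assumptions, the vocabulary needs to be in the new "subjects.csv" format before code written against the current master branch can read it.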

This code is using quite low level Annif internals, which tend to change
frequently. I think it would be better to access projects at a slightly
higher level, through AnnifRegistry. But this is all uncharted territory
as Annif doesn't really try to be a software library with a stable
Python API. If you (or others) have such needs, could you please open an
issue on GitHub and explain what you would like to do with Annif through
Python code?

-Osma
