annif index text files in one file?


Enrico Laloli

Nov 1, 2022, 07:41:09
to Annif Users
The "annif index" command takes a directory containing a set of text files. It would be helpful if these texts, each with its own identifier, could instead be processed as a single .tsv or .csv file. This would be especially handy for short bits of text.
What do you think?

Enrico

Osma Suominen

Nov 1, 2022, 10:27:55
to annif...@googlegroups.com
Hi Enrico,

I think that's a good feature request. Maybe you could open an issue on
GitHub?

It would also be helpful to have some examples of what the input and
output files could look like. For example, the output format needs to be
able to provide many subjects (often 10) per document and squeezing that
into the columns of a tsv/csv file could be done in a number of ways,
some perhaps more elegant and/or easier to process than others,
depending also on other tools you want to include in the workflow.
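
To make the trade-off concrete, here is a minimal sketch of two possible layouts. It is not a proposal for what Annif should emit: the field names (uri, label, score), the document ID and the example.org URIs are all assumptions made up for illustration. The "long" layout writes one row per document/subject pair; the "wide" layout keeps one row per document and packs the subject URIs into a single delimited column.

```python
import csv
import io

# Hypothetical suggestions for one document; keys and values are assumed
# for illustration and are not a fixed Annif output format.
suggestions = [
    {"uri": "http://example.org/subj/1", "label": "libraries", "score": 0.82},
    {"uri": "http://example.org/subj/2", "label": "indexing", "score": 0.41},
]


def long_rows(doc_id, suggestions):
    """Long layout: one row per (document, subject) pair."""
    return [
        [doc_id, s["uri"], s["label"], "{:.4f}".format(s["score"])]
        for s in suggestions
    ]


def wide_row(doc_id, suggestions, sep="|"):
    """Wide layout: one row per document, URIs joined into one column."""
    return [doc_id, sep.join(s["uri"] for s in suggestions)]


def to_tsv(rows):
    """Serialize a list of rows as tab-separated text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


print(to_tsv(long_rows("doc1", suggestions)), end="")
print(to_tsv([wide_row("doc1", suggestions)]), end="")
```

The long layout is easy to filter or join with standard command-line and spreadsheet tools, while the wide layout keeps a one-to-one correspondence between input and output lines but needs a second separator that must not occur in the values.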

It also shouldn't be too hard to implement this outside Annif, for
example with the annif-client library for Python:
https://github.com/NatLibFi/Annif-client
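
A minimal sketch of that approach, assuming a TSV input with one id<TAB>text pair per line. The suggest argument is a placeholder for any callable that maps a text to a list of dicts with uri, label and score keys; with annif-client this could be a thin wrapper around its suggest call for a project of your choice. The file layout, the column order and the names index_tsv and fake_suggest are assumptions for illustration, not part of Annif.

```python
import csv
import io


def index_tsv(infile, outfile, suggest, limit=10):
    """Read id<TAB>text rows and write one output row per suggested subject."""
    reader = csv.reader(infile, delimiter="\t")
    writer = csv.writer(outfile, delimiter="\t", lineterminator="\n")
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        doc_id, text = row[0], row[1]
        # Keep at most `limit` suggestions per document.
        for s in suggest(text)[:limit]:
            writer.writerow([doc_id, s["uri"], s["label"], s["score"]])


def fake_suggest(text):
    """Stand-in for a real backend (hypothetical): one dummy subject per text."""
    return [{"uri": "http://example.org/subj/1", "label": "example", "score": 0.5}]


if __name__ == "__main__":
    src = io.StringIO("doc1\tA short text about libraries.\ndoc2\tAnother short text.\n")
    out = io.StringIO()
    index_tsv(src, out, fake_suggest)
    print(out.getvalue(), end="")
```

To run this against a live service, fake_suggest would be replaced by a function that calls the annif-client library; the exact call and the project ID to use should be checked against the annif-client README. Note that the default csv quoting rules apply here, so texts containing double quotes may need the reader's quoting options adjusted.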

-Osma
> --
> You received this message because you are subscribed to the Google
> Groups "Annif Users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to annif-users...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/annif-users/b17be074-bcdf-47a4-bc3b-0faeec093836n%40googlegroups.com.

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.s...@helsinki.fi
http://www.nationallibrary.fi

cbs bibliotheek

Nov 1, 2022, 11:56:07
to Osma Suominen, annif...@googlegroups.com

On Tue, 1 Nov 2022 at 15:27, Osma Suominen <osma.s...@helsinki.fi> wrote: