Running Omikuji


christelann...@gmail.com

Jul 5, 2023, 5:51:55 AM
to Annif Users

Dear Annif-team,
Yesterday I worked with Sara and things went really smoothly, obviously because she knows how to work with Annif.
Now I am on my own, trying to get Omikuji to run and I am having an error that I do not know how to solve.

What I did:
- Installed Omikuji (pip install omikuji). That went ok.
- I prepared my vocabulary (polmat.ttl). That went ok.
- I have the cfg file: 
[omikuji-parabel-de]
name=Omikuji Parabel German
language=de
backend=omikuji
analyzer=snowball(german)
vocab=yso-de

My files are in the subfolder annif_2023_traintestvalidate (it contains three subfolders: test_set, train_set and validate_set).

>> Now I want to go to the next step, which is the training. The example command
(annif train tfidf-en /path/to/Annif-corpora/training/yso-finna-en.tsv.gz) obviously needs to be altered for Omikuji and my German data. But something goes wrong, and this is what I get:

C:\Users\AnnemiekeR\Python\Policey_Bern\annif>annif train omikuji-parabel-de /annif_2023_traintestvalidate/train_set/yso-de.tsv.gz
Usage: annif train [OPTIONS] PROJECT_ID [PATHS]...
Try 'annif train --help' for help.

Error: Invalid value for '[PATHS]...': Path '/annif_2023_traintestvalidate/train_set/yso-de.tsv.gz' does not exist.

C:\Users\AnnemiekeR\Python\Policey_Bern\annif>annif train omikuji-parabel-de /annif_2023_traintestvalidate/
Usage: annif train [OPTIONS] PROJECT_ID [PATHS]...
Try 'annif train --help' for help.

Error: Invalid value for '[PATHS]...': Path '/annif_2023_traintestvalidate/' does not exist.

C:\Users\AnnemiekeR\Python\Policey_Bern\annif>annif train omikuji-parabel-de annif_2023_traintestvalidate\train_set
warning: Could not create backend omikuji, make sure you've installed optional dependencies
Traceback (most recent call last):
  File "C:\Users\AnnemiekeR\Anaconda3\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\AnnemiekeR\Anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\AnnemiekeR\Anaconda3\Scripts\annif.exe\__main__.py", line 7, in <module>
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\click\core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\flask\cli.py", line 586, in main
    return super(FlaskGroup, self).main(*args, **kwargs)
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\click\core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\click\core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\click\core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\click\core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\flask\cli.py", line 426, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\click\core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\annif\cli.py", line 191, in run_train
    proj.train(documents, backend_params, jobs)
  File "C:\Users\AnnemiekeR\Anaconda3\lib\site-packages\annif\project.py", line 225, in train
    beparams = backend_params.get(self.backend.backend_id, {})
AttributeError: 'NoneType' object has no attribute 'backend_id'

C:\Users\AnnemiekeR\Python\Policey_Bern\annif>annif train omikuji_parabel_de \annif_2023_traintestvalidate\train_set
Usage: annif train [OPTIONS] PROJECT_ID [PATHS]...
Try 'annif train --help' for help.

Error: Invalid value for '[PATHS]...': Path '\\annif_2023_traintestvalidate\\train_set' does not exist.

C:\Users\AnnemiekeR\Python\Policey_Bern\annif>annif train omikuji-parabel-de /annif_2023_traintestvalidate/train_set/yso-de.tsv.gz
Usage: annif train [OPTIONS] PROJECT_ID [PATHS]...
Try 'annif train --help' for help.

Error: Invalid value for '[PATHS]...': Path '/annif_2023_traintestvalidate/train_set/yso-de.tsv.gz' does not exist.

C:\Users\AnnemiekeR\Python\Policey_Bern\annif>
Please help me solve this issue! 

Best,
Annemieke

christelann...@gmail.com

Jul 5, 2023, 6:03:47 AM
to Annif Users
Dear all,

So basically, what I do not get is where the arguments in

"annif train yso-tfidf-en data-sets/yso-nlf/yso-finna-small.tsv.gz" (or in my case "annif train omikuji-parabel-de ...") come from. Where do I get the .tsv.gz file from, and what should the command look like in my case? I am trying to get my poster ready for DH2023, so I am on a bit of a tight schedule.

Best,
Annemieke


juho.i...@helsinki.fi

Jul 5, 2023, 7:02:57 AM
to Annif Users
Hi Annemieke!

The first two error messages in your email show that the paths given to the "annif train" command do not exist. You already solved this by giving the correct one, i.e. annif_2023_traintestvalidate\train_set. I think this command should work in your case.

But then, the error message of this command indicates that the Omikuji dependency is not working:

C:\Users\AnnemiekeR\Python\Policey_Bern\annif>annif train omikuji-parabel-de annif_2023_traintestvalidate\train_set
warning: Could not create backend omikuji, make sure you've installed optional dependencies
Traceback (most recent call last):
...

To verify that Omikuji is really installed, could you try running this (in the same terminal as you are using Annif):

    pip list | grep omikuji

It should show the version of the installed Omikuji, currently "omikuji 0.5.0". But in your case I suspect there is something wrong there, so it might not show anything. In that case, try reinstalling Omikuji. Or maybe you were previously using a Python virtual environment (venv) and Omikuji got installed in that venv, but now you have not activated it (with the command "source annif-venv/bin/activate")?
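
If grep is not available (it is not a built-in of the Windows Command Prompt), the same check can be done from Python. This is only a sketch: the helper name installed_version is made up for illustration, and it should be run with the same Python that runs Annif. It prints which interpreter is active and whether the annif and omikuji packages are visible to it.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The interpreter path shows whether a venv or the base install is active.
    print("Interpreter:", sys.executable)
    for name in ("annif", "omikuji"):
        print(name, "->", installed_version(name) or "not installed")
```

If the two packages are reported under different interpreters (for example one inside a venv and one in Anaconda), that mismatch would be worth fixing first.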

Also, you can check the usage of the most important Annif commands from these tutorial slides: https://github.com/NatLibFi/Annif-tutorial/blob/master/presentations/tfidf-project-slides.pdf

I'm happy to help further if needed :)

-Juho

christelann...@gmail.com

Jul 5, 2023, 7:08:54 AM
to Annif Users
Dear Juho,

According to the pip list I have

omikuji                       0.5.0

But still getting the same errors.



C:\Users\AnnemiekeR\Python\Policey_Bern\annif>annif train omikuji-parabel-de /annif_2023_traintestvalidate/
Usage: annif train [OPTIONS] PROJECT_ID [PATHS]...
Try 'annif train --help' for help.

Error: Invalid value for '[PATHS]...': Path '/annif_2023_traintestvalidate/' does not exist.

C:\Users\AnnemiekeR\Python\Policey_Bern\annif>

(I am sure this folder does exist! Though I do want it to look in the train_set folder.)



Would it be possible for you to have a brief video call?

Best,
Annemieke


juho.i...@helsinki.fi

Jul 5, 2023, 7:18:39 AM
to Annif Users
Hi,

Remove the first "/" character in the path, and give the full path to the tsv.gz file: use "annif_2023_traintestvalidate/train_set/yso-de.tsv.gz".
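
To illustrate why the leading "/" matters, here is a small sketch using Python's pathlib. It only does path arithmetic and does not touch the disk; the helper name resolve is hypothetical, and the working directory is the one from the prompt in the messages above.

```python
from pathlib import PureWindowsPath


def resolve(cwd, arg):
    """Combine a working directory with a path argument, Windows-style."""
    # A path starting with "/" is rooted: it keeps the drive of cwd
    # but discards the rest of the working directory.
    return PureWindowsPath(cwd) / arg


cwd = r"C:\Users\AnnemiekeR\Python\Policey_Bern\annif"

# Rooted: ends up directly under C:\, not under the annif folder.
print(resolve(cwd, "/annif_2023_traintestvalidate/train_set/yso-de.tsv.gz"))

# Relative: ends up inside the current working directory.
print(resolve(cwd, "annif_2023_traintestvalidate/train_set/yso-de.tsv.gz"))
```

The first result points to C:\annif_2023_traintestvalidate\..., which is why the path is reported as not existing even though the folder exists inside the annif directory.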

Yes, I'm available for a brief video call; there could still be something else wrong with Omikuji too.

-Juho