Vectorizer for Xtransformer

Ball

Jan 15, 2026, 1:16:02 PM
to Annif Users
Hi all,
We ran into a weird issue when trying to evaluate a trained Xtransformer model. Annif gives us the following error:
File "/usr/local/lib/python3.12/site-packages/joblib/numpy_pickle.py", line 650, in load
   with open(filename, 'rb') as f:
        ^^^^^^^^^^^^^^^^^^^^
IsADirectoryError: [Errno 21] Is a directory: 'data/projects/xtransformer-event-en/vectorizer'
 

We ran Annif on our own library server as well as on our campus HPCC machine, with the same dataset and project config on both. The only difference is the container runtime: we use a Docker container on our local library machine but have to use Singularity for the install on the campus machine. Training on our own server creates the vectorizer as a file, but training on the campus machine creates it as a directory (see attached images). Evaluation fails with the above error when we run it on the model trained on the campus machine. Has anyone run into this error, or does anyone have suggestions for troubleshooting it?

Thanks,

Lucas

Attachments: local_machine.png, hpcc_machine.png

Lakshmi rb

Jan 19, 2026, 3:06:41 AM
to Annif Users
Hello,

It appears the issue stems from a version discrepancy between your local Docker environment and the campus HPCC Singularity container.

The PECOS integration for the X-Transformer backend recently switched to using the native PECOS vectorizer, which offers significantly better performance than the previous implementation. A key difference in this update is how the model is stored:

  • Older versions: Saved the vectorizer as a single file (using joblib/numpy_pickle).

  • Newer versions: Save the vectorizer as a directory containing multiple PECOS artifacts.
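To see which of the two layouts described above a trained project has on disk, a quick check along these lines may help. This is only a minimal sketch: the helper name `vectorizer_format` is made up for illustration, and the path is the one from the traceback earlier in this thread.

```python
from pathlib import Path


def vectorizer_format(path):
    """Classify a saved vectorizer path as 'file', 'directory', or 'missing'.

    Hypothetical helper for illustration; it is not part of Annif or PECOS.
    """
    p = Path(path)
    if p.is_dir():
        # Directory layout, as described for the newer versions.
        return "directory"
    if p.is_file():
        # Single pickled file, as described for the older versions.
        return "file"
    return "missing"


if __name__ == "__main__":
    # Path copied from the IsADirectoryError message in this thread.
    print(vectorizer_format("data/projects/xtransformer-event-en/vectorizer"))
```

Running this on both machines should show whether a model was trained with the file-based or the directory-based code.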

Make sure both environments use the exact same commit from the pecos branch. If one machine is running an older version of the PR, its loading logic will fail on a vectorizer saved by the newer version.
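One rough way to compare the two containers is to print the installed package versions inside each of them. The sketch below assumes the distribution names `annif`, `libpecos`, and `joblib`; adjust them if your install differs. Note that a version string will not necessarily distinguish two commits of the same development branch, so treat matching output as a first check only.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("annif", "libpecos", "joblib")):
    """Return {distribution name: version string, or None if not installed}.

    The default package names are assumptions about how Annif and PECOS
    were installed; they are not taken from this thread.
    """
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # Run this inside both the Docker and the Singularity container
    # and compare the output line by line.
    for name, ver in installed_versions().items():
        print(f"{name}: {ver}")
```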

Thanks and regards,

Lakshmi Rajendram Bashyam



Angewandte Forscherin// Automatisierung der Sacherschließung

Applied researcher // Automation of Subject Indexing

Neuer Jungfernstieg 21

20354 Hamburg

T: +49 176 29139271

E: l.rajendr...@zbw.eu 

Ball

Jan 19, 2026, 3:19:41 PM
to Annif Users
Thanks for your explanation. We now have the newer version on our local machine after updating our Docker container.
Regards,
Lucas