Hi all,
I am trying to run the pre-trained gnd-all-xtransformer-en project with Annif (invoked with annif suggest -p projects.d -l 20 gnd-all-xtransformer-en) on Google Colab (link at the end), and I’m running into a two-step failure. Has anyone seen the same, and are there recommended dependency versions or a patch?
Environment:
annif 1.5.0.dev0 (source checkout in /content/Annif)
transformers 4.49.0
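What I did / command:
echo "Deep learning methods for multilingual information retrieval and neural ranking." \
  | annif suggest -p projects.d -l 20 gnd-all-xtransformer-en
Sequence of issues:
1. Initial failure: missing tokenizer config
terminate called after throwing an instance of 'std::runtime_error'
  what():  Unable to open config file at data/projects/gnd-all-xtransformer-en/vectorizer/tokenizer/config.json
I resolved this by fixing a linking/path problem (some model files and configs weren’t on the expected path in my projects.d layout). After correcting that, the command proceeded further.
2. New failure (after resolving the above)
Some stderr/log noise about XLA/CUDA plugin registration (which I think is unrelated), then an abort with:
terminate called after throwing an instance of 'nlohmann::detail::type_error'
  what():  [json.exception.type_error.302] type must be number, but is null
This happens immediately after the XLA messages. Disabling the GPU (CUDA_VISIBLE_DEVICES="") and setting TF_XLA_FLAGS="--tf_xla_enable_xla_devices=false" did not avoid the crash for me.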
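The second abort is raised by a JSON parser that expected a number and found null, so one thing worth checking is which JSON files under the project directory contain null values or cannot be read at all (for example because of a broken link). A minimal Python sketch for that follows. find_nulls and scan_json_configs are hypothetical helper names, the project directory is taken from the path in the first error message, and it is only an assumption that the offending file lives there. Some nulls may be perfectly legitimate, so the output is a list of candidates, not a diagnosis.

```python
import json
from pathlib import Path


def find_nulls(value, path="$"):
    """Return the JSON paths of every null found inside a parsed JSON value."""
    hits = []
    if value is None:
        hits.append(path)
    elif isinstance(value, dict):
        for key, child in value.items():
            hits.extend(find_nulls(child, f"{path}.{key}"))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            hits.extend(find_nulls(child, f"{path}[{index}]"))
    return hits


def scan_json_configs(root):
    """Map each *.json file under root to the null-valued paths it contains.

    Files that cannot be opened or parsed are reported as well, since a
    broken link would otherwise go unnoticed.
    """
    report = {}
    for json_file in sorted(Path(root).rglob("*.json")):
        try:
            with open(json_file, encoding="utf-8") as handle:
                parsed = json.load(handle)
        except (OSError, json.JSONDecodeError) as err:
            report[str(json_file)] = [f"unreadable: {err}"]
            continue
        nulls = find_nulls(parsed)
        if nulls:
            report[str(json_file)] = nulls
    return report


if __name__ == "__main__":
    # Assumption: the project lives where the first error message looked for it.
    project_dir = "data/projects/gnd-all-xtransformer-en"
    for name, paths in scan_json_configs(project_dir).items():
        print(name)
        for json_path in paths:
            print("   ", json_path)
```

Run from the same working directory as the annif command, since the path in the error message is relative.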
Colab link with editor access, for any suggestions:
https://colab.research.google.com/drive/11MHIlLDUmh6_UMprITtT__oEldNrW0zu?usp=sharing
Thanks in advance.
Dear Mohamad,
Great to hear that there is interest in working with the XTransformer backend.
Unfortunately, there are currently a number of version issues and incompatibilities with the underlying pecos library, which no longer seems to be maintained by its creator.
When I look into your notebook, I see a lot of install errors. You definitely need to pin or downgrade some of the libraries. A configuration that worked for me (I haven’t had time to test it recently) was:
- huggingface-hub==0.21.3
- libpecos==1.2.4
- numpy==1.26.4
- …
- safetensors==0.4.2
- sentencepiece==0.2.0
- tokenizers==0.15.2
- torch==1.13.1
- transformers==4.38.2
I’m not saying there is no other solution.
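To see how far an environment deviates from the pins above, the installed versions can be compared against them. The following is a minimal sketch using importlib.metadata from the Python standard library. PINS, installed_version and compare_with_pins are names made up for this illustration, and the entry elided with "…" in the list is deliberately left out, not guessed.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the list above; the elided entry ("…") is not guessed.
PINS = {
    "huggingface-hub": "0.21.3",
    "libpecos": "1.2.4",
    "numpy": "1.26.4",
    "safetensors": "0.4.2",
    "sentencepiece": "0.2.0",
    "tokenizers": "0.15.2",
    "torch": "1.13.1",
    "transformers": "4.38.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_with_pins(pins, lookup=installed_version):
    """Return (package, wanted, found) for every package that deviates from its pin."""
    mismatches = []
    for package, wanted in pins.items():
        found = lookup(package)
        # Ignore local build tags such as "+cu117" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches.append((package, wanted, found))
    return mismatches


if __name__ == "__main__":
    for package, wanted, found in compare_with_pins(PINS):
        print(f"{package}: pinned {wanted}, installed {found or 'not installed'}")
```

An empty output means every listed package matches its pin. It says nothing about packages outside the list.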
The backend is still in development status, and I don’t think anyone can promise right now how soon we will get it working for everyone. The dependency chaos caused by pecos is not easy to resolve in Annif main.
Hope this helps.
Best,
Maximilian
Maximilian Kähler
German National Library
Metadata, Automation
Deutscher Platz 1
D-04103 Leipzig
Germany
Phone: +49 341 2271-133
mailto:m.ka...@dnb.de
https://www.dnb.de/ki-projekt
From: annif...@googlegroups.com <annif...@googlegroups.com>
On behalf of Mohamad Mmdouh
Sent: Wednesday, 31 December 2025 19:22
To: Annif Users <annif...@googlegroups.com>
Subject: Fail to run XTransformer using pre-trained gnd-all-xtransformer-en on colab
Hi all,
I am trying to run the pre-trained gnd-all-xtransformer-en project with Annif (invoked with annif suggest -p projects.d -l 20 gnd-all-xtransformer-en) on Google Colab (link at the end), and I’m running into a two-step failure. Has anyone seen the same, and are there recommended dependency versions or a patch?
Environment:
· annif 1.5.0.dev0 (source checkout in /content/Annif)
· transformers 4.49.0
What I did / command:
echo "Deep learning methods for multilingual information retrieval and neural ranking." \
  | annif suggest -p projects.d -l 20 gnd-all-xtransformer-en
Sequence of issues:
1. Initial failure: missing tokenizer config
terminate called after throwing an instance of 'std::runtime_error'
  what():  Unable to open config file at data/projects/gnd-all-xtransformer-en/vectorizer/tokenizer/config.json
I resolved this by fixing a linking / path problem (some model files/configs weren’t on the expected path in my projects.d layout). After correcting that, the command proceeded further.
2. New failure (after resolving the above)
Short stderr/log noise about XLA/CUDA plugin registration (I think noisy and unrelated), then an abort with:
terminate called after throwing an instance of 'nlohmann::detail::type_error'
  what():  [json.exception.type_error.302] type must be number, but is null
This happens immediately after the noisy XLA messages. Disabling GPU (CUDA_VISIBLE_DEVICES="") and setting TF_XLA_FLAGS="--tf_xla_enable_xla_devices=false" did not avoid the crash for me.
Colab link with editor access, for any suggestions:
https://colab.research.google.com/drive/11MHIlLDUmh6_UMprITtT__oEldNrW0zu?usp=sharing
Thanks in advance.