Dear Lucas,
Can you share some details of what hyperparameters you used to train the backend? What was the size of your vocabulary? Did you use GPUs for training?
Generally, out-of-memory errors in deep learning are often caused by a batch size that is too large. You may want to try decreasing it, e.g. by halving batch_size in your projects.cfg until the error disappears.
In our work with X-Transformer we found it quite challenging to find hyperparameters that work well. Here is what we used for a large German vocabulary (~200K concepts) and a short-document corpus of 900K entities.
[x-transformer-roberta]
name="X-Transformer Roberta-XML"
language=de
backend=xtransformer
analyzer=spacy(de_core_news_lg)
batch_size=32
vocab=gnd
limit=100
min_df=2
ngram=2
max_leaf_size=400
nr_splits=256
Cn=0.52
Cp=5.33
cost_sensitive_ranker=true
threshold=0.015
max_active_matching_labels=500
post_processor=l3-hinge
negative_sampling=tfn+man
ensemble_method=concat-only
loss_function=weighted-squared-hinge
num_train_epochs=5
warmup_steps=200
logging_steps=500
save_steps=500
model_shortcut=FacebookAI/xlm-roberta-base
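For context, this is roughly how such a section is used from the command line (the project ID comes from the section header above; the corpus paths are placeholders for your own data):

# Train the project defined above; assumes the section is in projects.cfg
# and the GND vocabulary has already been loaded:
annif train x-transformer-roberta /path/to/train-corpus.tsv
# Evaluate on held-out documents afterwards:
annif eval x-transformer-roberta /path/to/test-corpus.tsv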
Maybe you can compare this with your settings. None of our experiments needed more than 128 GB of system (CPU) RAM. The neural matcher part of the script ran on the GPUs, however, and I have no record of the GPU memory footprint.
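If you want to capture that footprint on your side, nvidia-smi can log it periodically while training runs (adjust the interval and file name as you like):

# Sample GPU memory use every 10 s into a CSV log, in the background:
nvidia-smi --query-gpu=timestamp,memory.used,memory.total --format=csv -l 10 > gpu-mem.log &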
Best of luck,
Maximilian
Maximilian Kähler
Deutsche Nationalbibliothek
Automatische Erschließungsverfahren, Netzpublikationen
Deutscher Platz 1
D-04103 Leipzig
Phone: +49 341 2271-133
mailto:m.ka...@dnb.de
https://www.dnb.de/ki-projekt
Dear Lucas,
the parameters you copied are meant for a much smaller classification problem, I think. In particular, I suspect that max_leaf_size=18000 could be related to the increased model size. But I am not totally sure. Osma and his team have recently used X-Transformer with a larger vocabulary. You can find their parameter settings here:
https://github.com/NatLibFi/Annif-LLMs4Subjects-GermEval2025/blob/main/projects.toml
Maybe try these, or the ones I suggested above.
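To illustrate my suspicion about max_leaf_size (this is just my reading of the PECOS-style label tree that X-Transformer builds, so take it as a rough sketch): the parameter bounds how many labels land in each leaf cluster, so for a vocabulary of ~200K concepts the leaf count works out to roughly

# Approximate number of leaf clusters = n_labels / max_leaf_size:
echo "max_leaf_size=400:   ~$((200000 / 400)) leaf clusters"    # ~500
echo "max_leaf_size=18000: ~$((200000 / 18000)) leaf clusters"  # ~11

Fewer but much larger leaves mean each matched cluster contributes far more candidate labels during training, which might be what is inflating your memory use.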
Best,
Maximilian