Hi,
We have trained an Annif project using the Omikuji backend on a short-text corpus of about 67,000 records (the contents of MARC field 245, plus field 505 where present).
Is there a way to improve the resulting model, e.g., by adjusting its parameters?
Here are the current parameters:
language=lv
backend=omikuji
analyzer=simplemma(lv)
vocab=nllsh_2026_04
cluster_balanced=False
cluster_k=100
max_depth=3
Best regards,
Uldis