Hi,
I'm evaluating Annif, and so far I have been able to use the tfidf and mllm backends without problems. However, I get an error when training with the omikuji backend:
> annif train -j0 decs-omikuji train.tsv.gz
Backend omikuji: creating vectorizer
Backend omikuji: creating train file
2024-11-20T22:52:43.155Z INFO [omikuji::data] Loading data from data/projects/decs-omikuji/omikuji-train.txt
2024-11-20T22:52:44.377Z INFO [omikuji::data] Parsing data
2024-11-20T22:52:46.174Z INFO [omikuji::data] Loaded 249474 examples; it took 3.02s
2024-11-20T22:52:46.475Z INFO [omikuji::model::train] Training model with hyper-parameters HyperParam { n_trees: 3, min_branch_size: 100, max_depth: 20, centroid_threshold: 0.0, collapse_every_n_layers: 0, linear: HyperParam { loss_type: Hinge, eps: 0.1, c: 1.0, weight_threshold: 0.1, max_iter: 20 }, cluster: HyperParam { k: 2, balanced: true, eps: 0.0001, min_size: 2 }, tree_structure_only: false, train_trees_1_by_1: false }
2024-11-20T22:52:46.476Z INFO [omikuji::model::train] Initializing tree trainer
2024-11-20T22:52:46.512Z INFO [omikuji::model::train] Computing label centroids
Labels 22581 / 22581 [============================================================================================================================] 100.00 % 7858.36/s
2024-11-20T22:53:24.536Z INFO [omikuji::model::train] Start training forest
25229 / 68347 [===============================================>----------------------------------------------------------------------------------] 36.91 % 101.00/s 7m
Killed
Annif version:
1.2.0 (using Docker)
Project configuration:
[decs-omikuji]
name=DeCS Omikuji Parabel
language=es
backend=omikuji
analyzer=snowball(spanish)
vocab=decs
OS:
macOS 15.1
The same training data worked without problems with the tfidf and mllm backends.
Regards,