BigARTM classification is a fairly simple feature, but unfortunately there is no tutorial.
Basically, you need to set the predict_class_id argument in artm.ARTM.transform:
- predict_class_id (str) – class_id of a target modality to predict. When this option is enabled the resulting columns of theta matrix will correspond to unique labels of a target modality. The values will represent p(c|d), which give the probability of class label c for document d.
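A minimal sketch of how that call could be wrapped, in case it helps. The helper name predict_labels is hypothetical, and it assumes that transform() returns a pandas DataFrame of p(c|d) with one row per label and one column per document. Check the orientation in your BigARTM version and transpose first if documents come back as rows.

```python
import pandas as pd


def predict_labels(model, batch_vectorizer, class_id) -> pd.Series:
    """Return the most probable label for each document.

    `model` is assumed to be a fitted artm.ARTM instance and `class_id`
    the name of the label modality it was trained with (e.g. "@labels").
    Assumption: the returned DataFrame has labels as rows and documents
    as columns.
    """
    # p(c|d) for every label c and every document d in the batches
    p_cd = model.transform(batch_vectorizer=batch_vectorizer,
                           predict_class_id=class_id)
    # Hard classification: for each document (column) take the row label
    # with the highest probability.
    return p_cd.idxmax(axis=0)
```

Here batch_vectorizer would be the artm.BatchVectorizer built over the documents you want to classify.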
More details about classification are here:
First you need to infer the topic distribution p(t|d') for a given document d'. Then you use p(c|d') = sum_t p(c|t) p(t|d')
to find the distribution over classification labels. Then for hard classification you take the argmax, i.e. the class with the highest p(c|d').
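To make the steps concrete, here is a small NumPy sketch. It assumes the label distribution is obtained by marginalizing over topics, p(c|d') = sum_t p(c|t) p(t|d'). All numbers are made-up toy values, not BigARTM output.

```python
import numpy as np

# Hypothetical toy model: 2 class labels, 3 topics, 2 documents.
# p(c|t): rows are labels, columns are topics; each column sums to 1.
p_c_given_t = np.array([[0.9, 0.2, 0.5],
                        [0.1, 0.8, 0.5]])

# p(t|d'): rows are topics, columns are documents; each column sums to 1.
# In practice this is the theta matrix inferred for the new documents.
p_t_given_d = np.array([[0.7, 0.1],
                        [0.2, 0.8],
                        [0.1, 0.1]])

# Marginalize over topics: p(c|d') = sum_t p(c|t) * p(t|d')
p_c_given_d = p_c_given_t @ p_t_given_d

# Hard classification: index of the most probable label per document.
hard_labels = p_c_given_d.argmax(axis=0)

print(p_c_given_d)   # each column is a distribution over labels
print(hard_labels)   # one label index per document
```

With these values the first document gets label 0 (probability 0.72) and the second gets label 1 (probability 0.70).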
Kind regards,
Alex