Hi there.
BigARTM is really cool, and I'm glad to be using it.
I'm having trouble gathering the co-occurrence dictionary. I tried to follow the note in the BigARTM CLI Reference, so I entered this in the terminal:
bigartm -c jobs_corpus.txt -v jobs_cooc_vocab.txt --cooc-window 10 --cooc-min-tf 200 --write-cooc-tf cooc_tf_ --cooc-min-df 200 --write-cooc-df cooc_df_ --write-ppmi-tf ppmi_tf_ --write-ppmi-df ppmi_df_
where jobs_corpus.txt is my collection in VW format and jobs_cooc_vocab.txt is the dictionary that I saved with the following snippet:
dictionary = artm.Dictionary(data_path='jobs-simple-corpus')  # load the data into the dictionary
dictionary.save_text(dictionary_path='jobs_cooc_vocab.txt')
I suspect that jobs_cooc_vocab.txt is not the file I should actually pass as the -v argument, and that this is the cause of my problem. Anyway, after running the command in the terminal I see this output:
items per batch = 2365
Parsing text collection... OK.
32 batches created with total of 15739 items, and 36993 words in the dictionary; NNZ = 665556, average token weight is 1.22822
After that nothing changes: no new files are created and no existing files are modified.
FYI, I'm using BigARTM 0.9.0 on Ubuntu.
I hope someone can help me.
Regards, Nikolay