I've run into a problem. Until now I worked with dumps in UCI format converted to batch format, and everything was fine. Now I tried converting from Vowpal Wabbit and for some reason top_tokens_score.last_tokens comes back empty. Please tell me what I am doing wrong. Unfortunately I have to mask the string values, since they may fall under an NDA.
import artm

print artm.version()

pass_count = 2
default_background_topics_count = 3
topics_num = 20
weight1 = 10.0
weight2 = 10.0
background_tau = 0.5
foreground_tau = 0.5
decorrelation_tau = 1e+5
background_topics_count = default_background_topics_count

print 'Loading train batches...'
train_batch_vectorizer = artm.BatchVectorizer(
    data_path='C:\\Users\\m.statsenko\\PycharmProjects\\DwarfCatcher\\test_batches',
    data_format='batches')
print 'Batches have been loaded'

print 'Loading dictionary...'
dictionary = artm.Dictionary(
    data_path='C:\\Users\\m.statsenko\\PycharmProjects\\DwarfCatcher\\test_batches')
print 'Dictionary loaded'

background_tau = abs(background_tau)
foreground_tau = -abs(foreground_tau)
topics_num = int(round(topics_num))

model = artm.ARTM(
    num_topics=topics_num,
    class_ids={'class1': 1.0, 'class2': weight1, 'class3': weight2},
    dictionary=dictionary)

model.scores.add(artm.PerplexityScore(name='perplexity_score'))
model.scores.add(artm.TopTokensScore(name='top_tokens_score'))

model.regularizers.add(artm.SmoothSparsePhiRegularizer(
    name='sparse_phi_regularizer_background',
    tau=background_tau,
    topic_names=model.topic_names[-background_topics_count:]))
model.regularizers.add(artm.SmoothSparsePhiRegularizer(
    name='sparse_phi_regularizer_foreground',
    tau=foreground_tau,
    topic_names=model.topic_names[0: -background_topics_count - 1]))
model.regularizers.add(artm.SmoothSparseThetaRegularizer(
    name='sparse_theta_regularizer_background',
    tau=background_tau,
    topic_names=model.topic_names[-background_topics_count:]))
model.regularizers.add(artm.SmoothSparseThetaRegularizer(
    name='sparse_theta_regularizer_foreground',
    tau=foreground_tau,
    topic_names=model.topic_names[0: -background_topics_count - 1]))
model.regularizers.add(artm.DecorrelatorPhiRegularizer(
    name='decorrelator_phi_regularizer',
    tau=decorrelation_tau))

foreground_topics = model.topic_names[0: -background_topics_count]
background_topics = model.topic_names[-background_topics_count:]

model.fit_offline(train_batch_vectorizer, num_collection_passes=pass_count)

saved_top_tokens = model.score_tracker['top_tokens_score'].last_tokens
result = model.score_tracker['perplexity_score'].last_value
print saved_top_tokens, result

Output:
0.8.2
Loading train batches...
Batches have been loaded
Loading dictionary...
Dictionary loaded
Widget Javascript not detected. It may not be installed properly. Did you enable the widgetsnbextension? If not, then run "jupyter nbextension enable --py --sys-prefix widgetsnbextension"
(the warning above is printed three times)
{} 1023.86988819

Meanwhile:
model.get_phi()

outputs:
                             topic_0       topic_1       topic_2  \
xxxxxxxxxxxxxxxxxxxxxx  2.176410e-06  6.831331e-08  0.000000e+00
xxxxxxxxxxxxxxxxxxxxxx  3.880483e-08  0.000000e+00  5.567056e-07
xxxxxxxxxxxxxxxxxxxxxx  4.182876e-07  0.000000e+00  0.000000e+00
xxxxxxxxxxxxxxxxxxxxxx  0.000000e+00  2.700850e-08  2.053835e-07
xxxxxxxxxxxxxxxxxxxxxx  3.486495e-04  5.950908e-04  1.435101e-04
xxxxxxxxxxxxxxxxxxxxxx  2.816075e-06  1.750824e-04  1.173249e-04
xxxxxxxxxxxxxxxxxxxxxx  6.386458e-07  3.464642e-07  1.205484e-06
...                              ...           ...           ...
xxxxxxxxxxxxxxxxxxxxxxxx  2.348565e-10  3.576638e-08  0.000000e+00
xxxxxxxxxxxxxxxxxxxxxxxx  0.000000e+00  8.510758e-09  0.000000e+00
xxxxxxxxxxxxxxxxxxxxxxxx  1.924105e-08  0.000000e+00  0.000000e+00

[XXXXXX rows x 20 columns]

(the remaining masked rows, almost all zeros, are omitted here)
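Since get_phi() clearly returns non-empty probabilities, a workaround while the score is being debugged is to pull top tokens straight out of the phi matrix. A minimal sketch on toy data, assuming get_phi() returns the usual pandas DataFrame with tokens in the index and topics in the columns; tok_a..tok_d and the probability values are made up:

```python
import pandas as pd

# Toy stand-in for model.get_phi(): rows = tokens, columns = topics.
# (Real token strings are hidden here for the same NDA reasons.)
phi = pd.DataFrame(
    {'topic_0': [0.5, 0.3, 0.2, 0.0],
     'topic_1': [0.0, 0.1, 0.4, 0.5]},
    index=['tok_a', 'tok_b', 'tok_c', 'tok_d'])

# Top-N tokens per topic, read directly off the phi matrix.
top_tokens = {topic: list(phi[topic].nlargest(2).index)
              for topic in phi.columns}
```

With the real model the same dict comprehension over model.get_phi() gives per-topic top tokens independently of TopTokensScore.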
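Two small things I noticed while re-reading the script, in case they help. First, the foreground slices passed to the regularizers use [0: -background_topics_count - 1], which ends one topic earlier than the foreground_topics slice computed afterwards, so one topic (topic_16 here) gets no foreground regularization at all. A plain-Python check, no BigARTM needed:

```python
# Reproduce the two foreground slices from the script above on a plain list.
topics_num = 20
background_topics_count = 3
topic_names = ['topic_{}'.format(i) for i in range(topics_num)]

# Slice passed to the foreground regularizers in the script:
regularized_foreground = topic_names[0: -background_topics_count - 1]
# Slice computed later and stored as foreground_topics:
foreground_topics = topic_names[0: -background_topics_count]

# One topic falls through the gap between the two slices.
missing = set(foreground_topics) - set(regularized_foreground)
```

Second, on the empty last_tokens itself: if I remember the API right, artm.TopTokensScore without an explicit class_id collects tokens only for the default modality ('@default_class'), and batches converted from Vowpal Wabbit may assign every token to the named modalities ('class1' etc.). Adding one score per modality with class_id set, e.g. artm.TopTokensScore(name='top_tokens_class1', class_id='class1'), would be worth trying; I am not certain this is the cause.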