Hi,
I am referring to the examples posted at
http://nlp.stanford.edu/software/tmt/tmt-0.3/
specifically example-3-lda-infer.scala and example-6-llda-learn.scala.
I am trying to implement example-7-llda-infer.scala for Labeled LDA
inference.
In example-3-lda-infer.scala, which performs inference with plain LDA,
the dataset (during inference) is created as:
val dataset = LDADataset(text, termIndex = model.termIndex);
However, in example-6-llda-learn.scala, the dataset for Labeled LDA
(during learning) is created as:
val dataset = LabeledLDADataset(text, labels);
My first question is: which statement is appropriate for creating the
dataset for Labeled LDA during INFERENCE?
I am considering two alternatives:
(a) val dataset = LabeledLDADataset(text, labels);
(b) val dataset = LabeledLDADataset(text, labels, model.termIndex,
model.topicIndex);
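I suppose option (b) is more appropriate, since it reuses the term and
topic indexes of the trained model, as example-3 does with
model.termIndex. Please correct me if I am wrong.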
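To show why I lean towards reusing the model's indexes, here is a toy
sketch in plain Scala. It does not use the TMT API at all: buildIndex and
encode are hypothetical helpers I made up, and trainedIndex only stands in
for something like model.termIndex. It shows that an index rebuilt from
the new documents numbers the same words differently from the index the
model was trained with.

```scala
// Toy illustration only: plain Scala collections, NOT the TMT API.
// buildIndex and encode are hypothetical helper names.

// Build a term -> id index, numbering terms in order of first appearance.
def buildIndex(docs: Seq[String]): Map[String, Int] =
  docs.flatMap(doc => doc.split("\\s+").toSeq)
    .filter(_.nonEmpty)
    .distinct
    .zipWithIndex
    .toMap

// Encode a document as term ids, dropping terms the index does not know.
def encode(doc: String, index: Map[String, Int]): Seq[Int] =
  doc.split("\\s+").toSeq.flatMap(term => index.get(term))

val trainingDocs = Seq("topic model inference", "labeled topic model")
val newDocs = Seq("labeled inference")

// Stand-in for the index stored with a trained model (assumption).
val trainedIndex = buildIndex(trainingDocs)
// Index rebuilt from scratch on the documents seen at inference time.
val freshIndex = buildIndex(newDocs)

val withTrained = encode(newDocs.head, trainedIndex)
val withFresh = encode(newDocs.head, freshIndex)

println(s"ids under the trained index: $withTrained")
println(s"ids under a fresh index:     $withFresh")
```

Under the trained index, "labeled" and "inference" keep the ids the model
learned them with, while the fresh index assigns them ids that belong to
other words in the trained numbering. Whether LabeledLDADataset behaves
the same way is exactly what I am asking.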
My second question is: where should the labels that are passed as a
parameter come from? Should they be the complete set of labels in the
training corpus, or something else?
I am new to NLP, TMT, and Scala. I would highly appreciate a reply.