Hi,
I am new to topic modeling. I have gone through the examples provided
at the following link:
http://nlp.stanford.edu/software/tmt/tmt-0.3/
Now I am trying to implement 'Inference on the labeled LDA model',
which can be regarded as a follow-up to the 'example-6-llda-learn.scala'
provided on the website.
I have written 'example-6-llda-infer.scala' as follows:
--------------------------------------------------------------------------------------------------------------------------------------------------
import scalanlp.io._;
import scalanlp.stage._;
import scalanlp.stage.text._;
import scalanlp.text.tokenize._;
import scalanlp.pipes.Pipes.global._;

import edu.stanford.nlp.tmt.stage._;
import edu.stanford.nlp.tmt.model.lda._;
import edu.stanford.nlp.tmt.model.llda._;

// the path of the labeled LDA model to load
val modelPath = file("llda-cvb0-59ea15c7-31-61406081-7fb4b5bd");

println("Loading " + modelPath);
val model = LoadCVB0LabeledLDA(modelPath);

// A new dataset for inference. (Here we use the same dataset
// that we trained against, but this file could be something new.)
val source = CSVFile("pubmed-oa-subset.csv") ~> IDColumn(1);

val text = {
  source ~>                              // read from the source file
  Column(4) ~>                           // select the column containing text
  TokenizeWith(model.tokenizer.get)      // tokenize with the existing model's tokenizer
}

// define fields from the dataset we are going to slice against
val labels = {
  source ~>                              // read from the source file
  Column(2) ~>                           // take column two, the tags
  TokenizeWith(WhitespaceTokenizer()) ~> // turn the label field into an array
  TermCounter() ~>                       // collect label counts
  TermMinimumDocumentCountFilter(10)     // filter out labels in fewer than 10 docs
}

// base name of output files to generate
val output = file(modelPath,
  source.meta[java.io.File].getName.replaceAll(".csv", ""));

// turn the text into a dataset ready to be used with labeled LDA
val dataset = LabeledLDADataset(text, labels, model.termIndex, model.topicIndex);

println("Writing document distributions to " + output +
  "-document-topic-distributions.csv");
val perDocTopicDistributions =
  InferCVB0LabeledLDADocumentTopicDistributions(model, dataset);
CSVFile(output + "-document-topic-distributuions.csv").write(perDocTopicDistributions);
--------------------------------------------------------------------------------------------------------------------------------------------------
But the last line of the above code yields the following error message:
example-6-llda-infer.scala:55: error: could not find implicit value
for evidence parameter of type
scalanlp.serialization.TableWritable[scalanlp.collection.LazyIterable[(String, scalala.collection.sparse.SparseArray[Double])]]
CSVFile(output + "-document-topic-distributuions.csv").write(perDocTopicDistributions);
I have tried going through the documentation but could not resolve
this. Can you please suggest a fix?
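In case it is relevant, I was also considering bypassing the missing implicit entirely and writing the rows to CSV by hand. Below is a minimal, self-contained sketch of that idea; the `writeDistributions` name is mine, and I am assuming each row can be reduced to a document id paired with a dense `Array[Double]` (the `SparseArray` from the inference result would need converting first):

```scala
import java.io.{File, PrintWriter}

// Hypothetical manual CSV writer for per-document topic distributions,
// as a possible workaround for the missing TableWritable implicit.
// Each row is assumed to be a document id and its topic-weight array.
def writeDistributions(rows: Iterable[(String, Array[Double])], out: File): Unit = {
  val writer = new PrintWriter(out)
  try {
    for ((docId, dist) <- rows) {
      // one CSV line per document: id followed by its topic weights
      writer.println((docId +: dist.map(_.toString)).mkString(","))
    }
  } finally {
    writer.close()
  }
}
```

I am not sure whether the `LazyIterable` returned by `InferCVB0LabeledLDADocumentTopicDistributions` can simply be mapped into such `(String, Array[Double])` pairs, though, so I would prefer a proper fix if one exists.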
Thanks in advance.
Fahim