How to classify a new data set after models have been trained


Mei Meng

unread,
Oct 31, 2014, 10:58:23 AM
to rtextto...@googlegroups.com
Hello, I have used a data set with manually coded classifications to train and test models with RTextTools, and I have identified a few models that performed well. Now I want to use those models to classify a new data set that has no classification codes. I ran into the error shown below and would appreciate any tips on solving it. Thanks!
Below is my script:

custs <- read_data(system.file("data", "custsample.csv", package="RTextTools"), type="csv")
matrix <- create_matrix(custsample$note, language="english", removeNumbers=TRUE, stripWhitespace=TRUE, stemWords=TRUE, removeSparseTerms=.998, weighting=weightTfIdf, originalMatrix=doc_matrix)
container <- create_container(matrix, c(NA, 1:1000), testSize=1:1000, virgin=TRUE)
Error in create_container(matrix, c(NA, 1:1000), testSize = 1:1000, virgin = TRUE) : 
  All data in the training set must have corresponding codes
