When I try to convert a ranger random forest model to PMML with the r2pmml package, the error below shows up. It is a classification model, but I'd like the output to be class probabilities instead of labels. It looks like the r2pmml conversion doesn't support models trained with probability=TRUE? Any idea how to get this to work?
Thanks,
ZYe
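x <- dtrc  # training data
# Recode the class labels "C0"/"C1" as "0"/"1", then convert the target to a factor
x$DD[x$DD == "C0"] <- 0
x$DD[x$DD == "C1"] <- 1
x$DD <- as.factor(x$DD)
# Fit a probability forest (probability = TRUE), then try to export it to PMML
cla_rf <- ranger(DD ~ ., data = x, num.trees = 200, case.weights = sw, classification = TRUE, importance = "impurity", mtry = 6, write.forest = TRUE, probability = TRUE)
r2pmml(cla_rf, variable.levels = sapply(x, levels), paste("./src/cla_rf_", reg, ".pmml", sep = ""))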
====================
Feb 27, 2017 2:45:57 PM org.jpmml.rexp.Main run
INFO: Parsing RDS..
Feb 27, 2017 2:45:57 PM org.jpmml.rexp.Main run
INFO: Parsed RDS in 34 ms.
Feb 27, 2017 2:45:57 PM org.jpmml.rexp.Main run
INFO: Initializing default Converter
Feb 27, 2017 2:45:57 PM org.jpmml.rexp.Main run
INFO: Initialized org.jpmml.rexp.RangerConverter
Feb 27, 2017 2:45:57 PM org.jpmml.rexp.Main run
INFO: Converting..
Feb 27, 2017 2:45:57 PM org.jpmml.rexp.Main run
SEVERE: Failed to convert
java.lang.IllegalArgumentException
at org.jpmml.rexp.RangerConverter.encodeModel(RangerConverter.java:136)
at org.jpmml.rexp.RangerConverter.encodeModel(RangerConverter.java:44)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:78)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
Exception in thread "main" java.lang.IllegalArgumentException
at org.jpmml.rexp.RangerConverter.encodeModel(RangerConverter.java:136)
at org.jpmml.rexp.RangerConverter.encodeModel(RangerConverter.java:44)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:78)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
Error in .convert(tempfile, file, ...) : 1
Hi Villu,
Many thanks for the quick response. I think the tree type changes when the "probability" parameter is set to TRUE.
====================
Ranger result
Call:
ranger(DD ~ ., data = x, num.trees = 200, case.weights = sw, classification = TRUE, importance = "impurity", mtry = get(paste("cla_rf", reg, sep = "."))$bestTune$mtry, write.forest = TRUE, probability = TRUE)
Type: Probability estimation
Number of trees: 200
Sample size: 718
Number of independent variables: 47
Mtry: 7
Target node size: 10
Variable importance mode: impurity
OOB prediction error: 0.006251474
==========
If probability = FALSE, the Type becomes "Classification". So I wonder whether, in r2pmml, you just need to add another condition ("Probability estimation") to the "Classification" branch to fix the issue?
Ranger result
Call:
ranger(DD ~ ., data = x, num.trees = 200, case.weights = sw, classification = TRUE, importance = "impurity", mtry = get(paste("cla_rf", reg, sep = "."))$bestTune$mtry, write.forest = TRUE, probability = FALSE)
Type: Classification
Number of trees: 200
Sample size: 718
Number of independent variables: 47
Mtry: 7
Target node size: 1
Variable importance mode: impurity
OOB prediction error: 0.00 %
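The difference between the two fits can also be checked programmatically rather than by reading the printed summary. The snippet below is only a minimal sketch: it uses the built-in iris data as a stand-in for my training data (an assumption, since that data is not shown here) and reads the `treetype` element of the fitted ranger object, which holds the same value that is printed as "Type".

```r
library(ranger)

# iris stands in for the real training data, which is not part of this thread.
set.seed(42)
fit_prob  <- ranger(Species ~ ., data = iris, num.trees = 50, probability = TRUE)
fit_class <- ranger(Species ~ ., data = iris, num.trees = 50, probability = FALSE)

# The fitted object stores its forest type in the `treetype` element.
print(fit_prob$treetype)
print(fit_class$treetype)

# A probability forest predicts a matrix with one probability column per class.
pred <- predict(fit_prob, data = iris[1:3, ])
print(colnames(pred$predictions))
```

If the converter branches on this value, the two models would take different paths even though both were fitted on a factor target.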
Thanks,
ZYe
Hi Villu,
That would be great! Thank you very much!
Best,
ZYe