h2o.ensemble function outputs a list object which makes up the ensemble "model". This R object can be serialized to disk using the R base save
function. However, if you save the ensemble model to disk, then use it
in the future to generate predictions on a test set using a new H2O
cluster instance (with a different cluster IP address), this will not
work. This can be fixed by updating the cluster IP address in the saved
object with the new one. The model saving process will probably be
modified in the future to serialize each of the individual H2O base
models using the h2o::saveModel function. Therefore, the
saved H2O base models will be accessible individually. Currently, the
ensemble fit is stored as a single R list object which contains all the
base learner fits, the metalearner fit, and a few other pieces of data."
-- Erin LeDell Ph.D. Statistician & Machine Learning Scientist | H2O.ai
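The save-and-reload round trip described in the quoted passage can be sketched in base R. This is only an illustration: the `fit` list below is a stand-in with made-up contents (the component names are assumptions), not a real h2o.ensemble result, and the cluster IP update mentioned in the passage is not shown because the thread does not say where that address is stored.

```r
# Stand-in for an h2o.ensemble fit: a plain list. The component names
# and values here are illustrative assumptions, not real H2O models.
fit <- list(
  basefits = list(glm = "base model 1", gbm = "base model 2"),
  metafit  = "metalearner fit",
  family   = "binomial"
)

# Serialize the whole list to disk with the base R save function
ensemble_file <- tempfile(fileext = ".RData")
save(fit, file = ensemble_file)

# Later (possibly in a new R session): restore the object under the same name
rm(fit)
load(ensemble_file)

# Inspect the top-level components to see what was stored
str(fit, max.level = 1)
```

With a real ensemble, the restored list would still refer to the cluster it was trained on, which is why the passage says predictions fail against a new cluster until that reference is updated.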
> packageDescription("h2o")$Version
[1] "3.6.0.8"
> packageDescription("h2oEnsemble")$Version
[1] "0.1.5"
The code that I'm running is:
# Specify the base learner library & the metalearner
learner <- c("h2o.glm.wrapper", "h2o.randomForest.wrapper",
             "h2o.gbm.wrapper", "h2o.deeplearning.wrapper")
metalearner <- "SL.glm"

# Create the ensemble
fit <- h2o.ensemble(x = predictors, y = response,
                    training_frame = train,
                    family = "binomial",
                    learner = learner,
                    metalearner = metalearner,
                    cvControl = list(V = 5, shuffle = TRUE))
#print(paste0("Time spent:",as.integer(as.integer(Sys.time())-timestamp)," min"))

# Evaluate the models:
pred <- predict(fit, test_original)
labels <- as.data.frame(test_original[,c(response)])[,1]
AUC(predictions=as.data.frame(pred$pred)[,1], labels=labels)

# For each single model
L <- length(learner)
sapply(seq(L), function(l) AUC(predictions = as.data.frame(pred$basepred)[,l], labels = labels))
It works fine, until the moment that I try to save the model using the command:
setwd("~/FannieMae")
h2oEnsemble::h2o.save_ensemble(fit,path="ensemble",force=TRUE)

or either of:

h2oEnsemble::h2o.save_ensemble(fit,filename="file:/home/hanlon01/FannieMae/ensemble",force=TRUE)
h2oEnsemble::h2o.save_ensemble(fit,path="file:/home/hanlon01/FannieMae/ensemble",force=TRUE)

or this, which I read somewhere would work:

h2oEnsemble::h2o.save_ensemble(fit,path="file:///home/hanlon01/FannieMae/ensemble",force=TRUE)

None of those seems to work. The error is:

Error in h2o.saveModel(object = object$metafit, path = path, force = force) : `object` must be an H2OModel object

Can you help me with that?
Also, if I need to use base::save(), how can I modify the metadata for the saved model?
Thanks in advance.
Pedro Veronezi
h2oEnsemble::h2o.save_ensemble(fit,path="/home/hanlon01/FannieMae/ensemble",force=TRUE)
I've updated my h2o and ensemble packages.
I think you are suggesting that this is a path syntax problem.
h2o.save_ensemble(fit, path = "Project/hens10models", force = TRUE, export_levelone = FALSE)
It saves the individual models to the path but then fails on the meta model:
Error ... `object` must be an H2OModel object
Thanks
Ian
metalearner <- "SL.glm"
I will try changing it over.
I'm afraid I probably got the choice from an example and it was not really well thought through.
I'll try some other models now I can see the error of my ways!
Regards
Ian
This did indeed let me save the complete ensemble but broke most of the code that tried to process a prediction from the model.
metalearner <- "h2o.glm"
I would say the data structure returned from predict.h2o.ensemble is not consistent when you change the metalearner. This is not the most friendly behaviour. If I wasn't really supposed to use SL.glm, then it may be that model that is the issue. However, I did previously have my AUC and prediction file save working!
Regards
Ian
> pred <- predict.h2o.ensemble(fit, validation_frame)
> labels <- as.data.frame(validation_frame[,c(y)])[,1]
> cat("AUC",AUC(predictions=as.data.frame(pred$pred)[,1], labels=labels),"\n")
Error in prediction(predictions = predictions, labels = labels, label.ordering = label.ordering) :
Format of predictions is invalid.
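
One possible explanation for the error above, offered as an assumption rather than a confirmed diagnosis: H2O binomial models return a prediction frame with a label column followed by per-class probability columns (predict, p0, p1), so with an H2O metalearner the first column of pred$pred may be a factor of class labels rather than numeric scores. The sketch below uses a hand-made data frame in place of as.data.frame(pred$pred); the helper positive_class_probs is hypothetical and not part of h2oEnsemble.

```r
# Stand-in for as.data.frame(pred$pred) when the metalearner is an H2O model.
# Assumed layout: predicted label, then P(Y == 0) and P(Y == 1).
pred_df <- data.frame(
  predict = factor(c("0", "1", "1", "0")),
  p0 = c(0.9, 0.2, 0.4, 0.7),
  p1 = c(0.1, 0.8, 0.6, 0.3)
)

# Hypothetical helper: return positive-class probabilities as a numeric vector.
# Prefers a column named "p1"; otherwise falls back to the last numeric column,
# which also covers a single-column numeric prediction.
positive_class_probs <- function(df) {
  if ("p1" %in% names(df)) {
    return(df[["p1"]])
  }
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  if (length(numeric_cols) == 0) {
    stop("no numeric prediction column found")
  }
  df[[numeric_cols[length(numeric_cols)]]]
}

probs <- positive_class_probs(pred_df)
print(probs)
```

If the assumption holds, passing these probabilities (instead of column 1) as the predictions argument of AUC should avoid the format error, since the scores are then a plain numeric vector.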