Hi Samira,
Here is an example of performing cross-validation on
`h2o.ensemble`. Right now you can run this as a stand-alone
function, but in the future you can expect an "nfolds" argument to
be added to `h2o.ensemble`, similar to the other algos. This
function runs the cross-validation process in a simple loop.
I have only tested with binary classification (demo below), but it
should work for regression as well. Let me know if you have any
issues.
kfold_h2o_ensemble <- function(x, y, training_frame, family,
                               learner, metalearner, nfolds = 5,
                               cvControl = list(V = 5, shuffle = TRUE),
                               seed = 1, fold_column = NULL) {
  # Cross-validate an h2o.ensemble fit using an external k-fold loop.
  #
  # Args:
  #   x, y:           predictor column names and response column name.
  #   training_frame: H2O Frame holding the full training data.
  #   family:         "binomial" or "gaussian".
  #   learner:        character vector of base learner wrapper names.
  #   metalearner:    name of the metalearner wrapper function.
  #   nfolds:         number of outer cross-validation folds.
  #   cvControl:      control list for the inner CV inside h2o.ensemble.
  #   seed:           if numeric, seeds the fold assignment and each fit.
  #   fold_column:    optional name of a column in training_frame that
  #                   already holds fold ids; overrides random folds.
  #
  # Returns: list(models = per-fold ensemble fits,
  #               folds  = 1-col H2O Frame of fold ids,
  #               preds  = H2O Frame of cross-validated predictions).
  #
  # Note: Only tested on a binary classification ensemble
  # TO DO: Test regression

  # Create the cross-validation folds (for external cross-validation)
  N <- nrow(training_frame)
  # If seed is specified, set seed prior to the fold assignment
  if (is.numeric(seed)) set.seed(seed)
  if (is.null(fold_column)) {
    # 1-col H2O Frame of fold ids, one per row
    folds <- as.h2o(sample(rep(seq(nfolds), ceiling(N / nfolds)))[1:N])
  } else {
    folds <- training_frame[, c(fold_column)]
  }

  # For storing results
  models <- list()
  preds <- h2o.createFrame(rows = N, cols = 1,
                           randomize = FALSE,
                           value = 0.0,
                           categorical_fraction = 0.0,
                           integer_fraction = 0.0,
                           missing_fraction = 0.0)

  for (k in seq_len(nfolds)) {
    print(paste0("Begin outer cross-validation loop: ", k, " of ", nfolds))
    # Train an ensemble model on folds != k
    fold_idx_test <- which(as.data.frame(folds == k)[, 1] == 1)
    fold_idx_train <- which(as.data.frame(folds == k)[, 1] == 0)
    fold_train <- training_frame[fold_idx_train, ]
    fold_test <- training_frame[fold_idx_test, ]
    fold_fit <- h2o.ensemble(x = x, y = y,
                             training_frame = fold_train,
                             family = family,
                             learner = learner,
                             metalearner = metalearner,
                             cvControl = cvControl,
                             seed = seed)
    # Generate predictions on the held-out fold
    pp <- predict.h2o.ensemble(fold_fit, fold_test)
    # Insert preds into the rows belonging to fold k
    if (family == "binomial") {
      preds[fold_idx_test, ] <- pp$pred$p1
    } else if (family == "gaussian") {
      preds[fold_idx_test, ] <- pp$pred$predict
    }
    # Collect models
    models[[length(models) + 1]] <- fold_fit
  }

  # Return the results
  list(models = models, folds = folds, preds = preds)
}
# An example of binary classification on a local machine, which
# cross-validates h2o.ensemble
library(h2oEnsemble)  # Requires version >=0.0.4 of h2oEnsemble
library(cvAUC)        # Calculates test set AUC (requires version >=1.0.1 of cvAUC)

# Start an H2O cluster with nthreads = number of cores on your machine
localH2O <- h2o.init(nthreads = -1)

# Import a sample binary outcome train/test set into H2O
train <- h2o.importFile(
  "http://www.stat.berkeley.edu/~ledell/data/higgs_10k.csv")

# Identify response variable and predictor cols
y <- "C1"
x <- setdiff(names(train), y)

# Convert response to a categorical (for binary classification)
family <- "binomial"
train[, y] <- as.factor(train[, y])
# Specify the base learner library & the metalearner
# Let's use a reproducible library (set seed on RF and GBM):
h2o.randomForest.1 <- function(..., ntrees = 100, seed = 1) {
  h2o.randomForest.wrapper(..., ntrees = ntrees, seed = seed)
}
h2o.gbm.1 <- function(..., ntrees = 100, seed = 1) {
  h2o.gbm.wrapper(..., ntrees = ntrees, seed = seed)
}
learner <- c("h2o.glm.wrapper", "h2o.randomForest.1", "h2o.gbm.1")
metalearner <- "h2o.glm.wrapper"
# Cross-validate the ensemble with nfolds = 5
# nfolds relates to outer loop cross-validation (cross-validate the ensemble)
# cvControl$V relates to inner loop cross-validation (inside the ensemble)
# Note: nfolds and cvControl$V do not have to be the same number
cve <- kfold_h2o_ensemble(x = x, y = y,
                          training_frame = train,
                          family = family,
                          learner = learner,
                          metalearner = metalearner,
                          nfolds = 5,
                          cvControl = list(V = 5, shuffle = TRUE),
                          seed = 1)

# If we want to calculate the CV AUC of the ensemble,
# we can use the cvAUC package, but that requires us to
# pull the preds, labels and folds into R as follows
library(cvAUC)
folds <- as.data.frame(cve$folds)[, 1]    # Folds vector
preds <- as.data.frame(cve$preds)[, 1]    # Cross-validated predicted values
labels <- as.data.frame(train[, y])[, 1]  # Response vector
auc <- cvAUC(predictions = preds, labels = labels, folds = folds)
auc
# $fold.AUC
# [1] 0.7822530 0.7957805 0.7834457 0.7837479 0.7821255
#
# $cvAUC
# [1] 0.7854705