Is it possible to apply grid search to unsupervised isolation forest in H2O?


fwz...@gmail.com

Jun 17, 2020, 3:38:28 PM
to H2O Open Source Scalable Machine Learning - h2ostream

I am trying to apply grid search to H2O unsupervised isolation forest in R. Here is my code:

Accesses.hex <- as.h2o(Accesses)
x <- names(Accesses.hex)
seed <- 12345

# Model hyperparameters
hyper_params <- list(ntrees = c(50, 100, 150, 200),
                     max_depth = c(8, 15, 20, 30), # default is 8
                     sample_size = c(128, 256, 512))

# Early stopping criteria
search_criteria <- list(strategy = "RandomDiscrete",
                        max_models = 100,
                        max_runtime_secs = 4000,
                        stopping_rounds = 15,
                        seed = seed)

model.grid <- h2o.grid(algorithm = "isolationForest",
                       x = x,
                       grid_id = "model_grid",
                       training_frame = Accesses.hex,
                       hyper_params = hyper_params,
                       search_criteria = search_criteria,
                       seed = seed)

However, I got an error saying:

Error in h2o.grid(algorithm = "isolationForest", x = x, grid_id = "model_grid",  :
Must specify response, y

I am using isolation forest for unsupervised learning here, so I don’t have the response variable y. Is it possible to do a grid search in this case?

 

My computer: OS X 10.14.6, 16 GB memory

    H2O cluster version:        3.30.0.1
    H2O cluster total nodes:    1
    H2O cluster total memory:   15.00 GB
    H2O cluster total cores:    16
    H2O cluster allowed cores:  16
    H2O cluster healthy:        TRUE
    R Version:                  R version 3.6.3 (2020-02-29)

 

Please let me know if there is any other information I can provide. Thanks for your help!

Erin LeDell

Aug 21, 2020, 5:35:55 PM
to H2O Open Source Scalable Machine Learning - h2ostream
This was unsupported previously, but we now support grid search for isolation forest in 3.30.1.1.
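
To check whether an R session meets that version requirement, here is a minimal sketch. The helper `meets_min_version` is hypothetical (it is not part of the h2o package) and only compares dotted version strings with base R; the h2o lookup is skipped when the package is not installed.

# Hypothetical helper (not part of the h2o package): TRUE when `installed`
# is at least `required`, comparing dotted version strings numerically.
meets_min_version <- function(installed, required = "3.30.1.1") {
  package_version(installed) >= package_version(required)
}

# Only inspect the installed R package; nothing here starts or contacts a cluster.
if (requireNamespace("h2o", quietly = TRUE)) {
  pkg_version <- as.character(utils::packageVersion("h2o"))
  cat("h2o R package:", pkg_version,
      "- meets 3.30.1.1:", meets_min_version(pkg_version), "\n")
}

This only reports the version of the installed R package. The running cluster can be on a different version; after h2o.init(), h2o.getVersion() should report the version of the cluster the session is connected to, and the same helper can be applied to that string.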

Fanwei Zeng

Aug 22, 2020, 8:51:27 AM
to H2O Open Source Scalable Machine Learning - h2ostream
This is great! Thanks Erin. I will try it out. 

Fanwei Zeng

Sep 14, 2020, 11:22:17 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Hi Erin and other relevant H2O developers,

I have upgraded H2O to 3.30.1.2. However, when I try to apply grid search to isolation forest in R, I still get "Must specify response, y". Please see my code below and let me know if there is anything I can modify to make it work.

x <- names(data.hex)
seed <- 12345

hyper_params <- list(ntrees = c(50, 100, 150, 200),
                     max_depth = c(8, 15, 20, 30), # default is 8
                     sample_size = c(128, 256, 512, 1024)) # default is 256

# Early stopping criteria
search_criteria <- list(strategy = "RandomDiscrete",
                        max_models = 100,
                        max_runtime_secs = 4000,
                        stopping_rounds = 15,
                        seed = seed)

model.grid <- h2o.grid(algorithm = "isolationForest",
                       x = x,
                       grid_id = "model_grid",
                       training_frame = data.hex,
                       hyper_params = hyper_params,
                       search_criteria = search_criteria,
                       seed = seed)

Thanks a lot for your help!
