Error '3' when training a model

Pedro Henrique Veronezi e Sá

Feb 24, 2016, 12:20:37 PM

to H2O Open Source Scalable Machine Learning - h2ostream

Hello!

I'm training a model using R, and I'm getting the following error:

Got exception 'class java.lang.ArrayIndexOutOfBoundsException', with msg '3'
java.lang.ArrayIndexOutOfBoundsException: 3
	at hex.tree.SharedTree.initial_MSE(SharedTree.java:606)
	at hex.tree.SharedTree$Driver.compute2(SharedTree.java:168)
	at water.H2O$H2OCountedCompleter.compute(H2O.java:1069)
	at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
	at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
	at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
	at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
	at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)


Error: '3'
10: stop(m, call. = FALSE)
 9: doTryCatch(return(expr), name, parentenv, handler)
 8: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 7: tryCatchList(expr, classes, parentenv, handlers)
 6: tryCatch({
        while (keepRunning) {
            myJobUrlSuffix <- paste0(.h2o.__JOBS, "/", job_key)
            rawResponse <- .h2o.doSafeGET(urlSuffix = myJobUrlSuffix) ...
 5: .h2o.__waitOnJob(object@job_key)
 4: h2o.getFutureModel(job)
 3: .h2o.modelJob("drf", parms)
 2: h2o.randomForest(x = predictors_, y = response_, training_frame = train_,
        validation_frame = valid_, sample_rate = 0.7, ntrees = 50,
        max_depth = 20, min_rows = 1, nbins = 100) at get_results_fun.R#2125
 1: DRF_final_tier_userdefined(response_, selected_predictors_, train, valid)

This is the code I'm running. It loops over my categories and trains one model for each. It is the final step in a series of complex models, and it is the only place where I get an error.

train_valid <- as.data.frame(train_valid_)
  
  #executes the training by notches
  for (i in 1:length(notches)){
    df_train <- train_valid[which((train_valid$predict_picked == as.character(notches[[i]][1]))|(train_valid$predict_picked == as.character(notches[[i]][2]))|(train_valid$predict_picked == as.character(notches[[i]][3]))),]
    if (length(df_train[,1])>0){
      df_h2o <- as.h2o(df_train)
      # split the dataset into train (80%) and valid (20%); used only when the test data were already picked
      splits <- h2o.splitFrame(df_h2o, c(0.8))
      train  <- h2o.assign(splits[[1]], "train.hex") # 80%
      valid  <- h2o.assign(splits[[2]], "valid.hex") # 20%
      #transform all columns needed to factors
      train$rating_crude <- as.factor(train$rating_crude)
      train$rating_full <- as.factor(train$rating_full)
      train$sector <- as.factor(train$sector)
      train$Ticker <- as.factor(train$Ticker)
      valid$rating_crude <- as.factor(valid$rating_crude)
      valid$rating_full <- as.factor(valid$rating_full)
      valid$sector <- as.factor(valid$sector)
      valid$Ticker <- as.factor(valid$Ticker)
      DL_picked <- DL_gridsearch(response_,selected_predictors_,train,valid,num_models_=5,best_error_=1)
      DRF_picked <-  DRF_final_tier_userdefined(response_,selected_predictors_,train,valid)
      err_dl_model <- h2o.confusionMatrix(h2o.performance(DL_picked,valid=T))$Error[length(h2o.confusionMatrix(h2o.performance(DL_picked,valid=T)))-1]
      err_drf_model <- h2o.confusionMatrix(h2o.performance(DRF_picked,valid=T))$Error[length(h2o.confusionMatrix(h2o.performance(DRF_picked,valid=T)))-1]
      
      if ((err_dl_model <= err_drf_model)&((err_drf_model != 1)|(err_dl_model != 1))){
        assign(paste0("not",i,"_model"),DL_picked)
      }else if((err_drf_model != 1)|(err_dl_model != 1)){
        assign(paste0("not",i,"_model"),DRF_picked)
      }else{
        assign(paste0("not",i,"_model"),0)
      }
    }else{
      assign(paste0("not",i,"_model"),0)
    }
  }

Do you think there might be a limit on the number of models in a single H2O session?

I have no idea why this is happening, or where to find this error in the documentation. Has anyone run into this kind of problem before?

Thanks in advance

Pedro Veronezi

Spencer Aiello

Feb 24, 2016, 2:05:28 PM

to Pedro Henrique Veronezi e Sá, H2O Open Source Scalable Machine Learning - h2ostream

Which version of H2O are you using?

Pedro Henrique Veronezi e Sá

Feb 24, 2016, 2:22:23 PM

to H2O Open Source Scalable Machine Learning - h2ostream, veronez...@gmail.com

My h2o version is:

> packageVersion("h2o")
[1] ‘3.6.0.3’

and my R version is:

> R.version
               _                           
platform       x86_64-redhat-linux-gnu     
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          2.0                         
year           2015                        
month          04                          
day            16                          
svn rev        68180                       
language       R                           
version.string R version 3.2.0 (2015-04-16)
nickname       Full of Ingredients   

Should I update something?
