Error '3' when training a model


Pedro Henrique Veronezi e Sá

Feb 24, 2016, 12:20:37 PM
to H2O Open Source Scalable Machine Learning - h2ostream
Hello!

I'm training a model using R, and I'm getting the following error:

Got exception 'class java.lang.ArrayIndexOutOfBoundsException', with msg '3'
java.lang.ArrayIndexOutOfBoundsException: 3
	at hex.tree.SharedTree.initial_MSE(SharedTree.java:606)
	at hex.tree.SharedTree$Driver.compute2(SharedTree.java:168)
	at water.H2O$H2OCountedCompleter.compute(H2O.java:1069)
	at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
	at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
	at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
	at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
	at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)


Error: '3'
10: stop(m, call. = FALSE)
9: doTryCatch(return(expr), name, parentenv, handler)
8: tryCatchOne(expr, names, parentenv, handlers[[1L]])
7: tryCatchList(expr, classes, parentenv, handlers)
6: tryCatch({ while (keepRunning) { myJobUrlSuffix <- paste0(.h2o.__JOBS, "/", job_key) rawResponse <- .h2o.doSafeGET(urlSuffix = myJobUrlSuffix) ...
5: .h2o.__waitOnJob(object@job_key)
4: h2o.getFutureModel(job)
3: .h2o.modelJob("drf", parms)
2: h2o.randomForest(x = predictors_, y = response_, training_frame = train_, validation_frame = valid_, sample_rate = 0.7, ntrees = 50, max_depth = 20, min_rows = 1, nbins = 100) at get_results_fun.R#2125
1: DRF_final_tier_userdefined(response_, selected_predictors_, train, valid)

The code I'm running is below. It is a loop that trains a series of models, one for each category I have. This is the final step in a series of complex models, and it is the only place where I get an error.

train_valid <- as.data.frame(train_valid_)
  
  #executes the training by notches
  for (i in 1:length(notches)){
    df_train <- train_valid[which((train_valid$predict_picked == as.character(notches[[i]][1]))|(train_valid$predict_picked == as.character(notches[[i]][2]))|(train_valid$predict_picked == as.character(notches[[i]][3]))),]
    if (length(df_train[,1])>0){
      df_h2o <- as.h2o(df_train)
      #splits the dataset into train, valid, usually is 80%, 20% just used when the test data were picked already
      splits <- h2o.splitFrame(df_h2o, c(0.8))
      train  <- h2o.assign(splits[[1]], "train.hex") # 80%
      valid  <- h2o.assign(splits[[2]], "valid.hex") # 20%
      #transform all columns needed to factors
      train$rating_crude <- as.factor(train$rating_crude)
      train$rating_full <- as.factor(train$rating_full)
      train$sector <- as.factor(train$sector)
      train$Ticker <- as.factor(train$Ticker)
      valid$rating_crude <- as.factor(valid$rating_crude)
      valid$rating_full <- as.factor(valid$rating_full)
      valid$sector <- as.factor(valid$sector)
      valid$Ticker <- as.factor(valid$Ticker)
      DL_picked <- DL_gridsearch(response_,selected_predictors_,train,valid,num_models_=5,best_error_=1)
      DRF_picked <-  DRF_final_tier_userdefined(response_,selected_predictors_,train,valid)
      err_dl_model <- h2o.confusionMatrix(h2o.performance(DL_picked,valid=T))$Error[length(h2o.confusionMatrix(h2o.performance(DL_picked,valid=T)))-1]
      err_drf_model <- h2o.confusionMatrix(h2o.performance(DRF_picked,valid=T))$Error[length(h2o.confusionMatrix(h2o.performance(DRF_picked,valid=T)))-1]
      
      if ((err_dl_model <= err_drf_model)&((err_drf_model != 1)|(err_dl_model != 1))){
        assign(paste0("not",i,"_model"),DL_picked)
      }else if((err_drf_model != 1)|(err_dl_model != 1)){
        assign(paste0("not",i,"_model"),DRF_picked)
      }else{
        assign(paste0("not",i,"_model"),0)
      }
    }else{
      assign(paste0("not",i,"_model"),0)
    }
  }


Do you think there might be a limit on the number of models in a single H2O session?

I have no idea why this is happening or where to find this error in the documentation. Has anyone had this kind of trouble before?

Thanks in advance

Pedro Veronezi

Spencer Aiello

Feb 24, 2016, 2:05:28 PM
to Pedro Henrique Veronezi e Sá, H2O Open Source Scalable Machine Learning - h2ostream
What's the version of H2O that you're using?

Pedro Henrique Veronezi e Sá

Feb 24, 2016, 2:22:23 PM
to H2O Open Source Scalable Machine Learning - h2ostream, veronez...@gmail.com
My h2o version is:

packageVersion("h2o")
[1] ‘3.6.0.3’

and my R version is:

> R.version
               _                           
platform       x86_64-redhat-linux-gnu     
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          2.0                         
year           2015                        
month          04                          
day            16                          
svn rev        68180                       
language       R                           
version.string R version 3.2.0 (2015-04-16)
nickname       Full of Ingredients   

Should I update something?
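
One thing that may be worth ruling out (a guess, not a confirmed cause of the exception): in the loop from the first message, `as.factor` is applied to `train` and `valid` separately, after `h2o.splitFrame`. If a category happens to land only in the validation split, the two frames can end up with different level sets. The base-R sketch below uses made-up toy data (the `rating` and `x` columns and the row split are hypothetical) and plain data frames rather than H2O frames, only to show the effect and a way to check for it.

```r
# Hypothetical toy data standing in for one notch subset (not from the thread).
df <- data.frame(
  rating = c(1, 2, 3, 1, 2, 4),
  x      = c(0.1, 0.4, 0.2, 0.9, 0.5, 0.7)
)

# Hypothetical fixed split, so that the example is deterministic.
train_rows <- 1:4
train <- df[train_rows, ]
valid <- df[-train_rows, ]

# Converting AFTER the split: each part only sees its own categories.
train_levels <- levels(as.factor(train$rating))
valid_levels <- levels(as.factor(valid$rating))

# Categories present in the validation part but never seen in training.
unseen <- setdiff(valid_levels, train_levels)
print(unseen)

# Converting BEFORE the split: both parts inherit the full level set.
df$rating <- as.factor(df$rating)
train2 <- df[train_rows, ]
valid2 <- df[-train_rows, ]
same_levels <- identical(levels(train2$rating), levels(valid2$rating))
print(same_levels)
```

Whether H2O 3.6.0.3 behaves the same way on H2O frames is an assumption here. Comparing the levels of the response column in `train` and `valid` right before the call to `h2o.randomForest` would show whether this applies to the failing notch.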