H2O model building issue when creating models in parallel


anixa...@gmail.com

May 29, 2018, 5:05:05 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Hi,

We are facing an issue while using H2O in our model-building setup. We trigger three models in one go (GLM, RF and GBM), all of them using H2O. Building these three models takes some time. If we trigger another model-building task, again with GLM, RF and GBM, before the first one has completed, we get the following errors:

i) ERROR MESSAGE: class water.fvec.Frame trainingData is already in use. Unable to use it now. Consider using a different destination name.

ii) Error: java.lang.RuntimeException: Rollups not possible, because Vec was deleted:

Is there a way to resolve this issue?

Regards,
Anirban

Darren Cook

May 29, 2018, 5:53:49 AM
to h2os...@googlegroups.com
> We are facing some issue while using H2O within our model building
> setup. We are triggering 3 models at one go - GLM, RF and GBM. All of
> the using H2O. Now these 3 model building takes some time.

Is there a reason to do them in parallel? It will require less memory,
and almost certainly be quicker, to build them in serial. I.e. give each
exclusive use of the cluster to build its model.

Or, if you have loads of memory, set up three distinct H2O clusters.

(I think what you are trying to do is supposed to be supported, so you
are either running out of memory or triggering a bug.)

Darren

Tom Kraljevic

May 29, 2018, 10:10:25 AM
to Darren Cook, h2os...@googlegroups.com

hi,

it sounds like a user misuse of the API based on the message, not a bug.

the “destination” referred to here is the distributed key/value store output of an H2O job or rapids expression.

the destination frame is intentionally locked until the job finishes, precisely so you get the message you see if you have a second job competing to clobber the same destination.

if you want to run multiple jobs in parallel, give them different destinations.

the second error looks like a symptom of the same problem and not a separate error.

tom
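
To make "give them different destinations" concrete, here is a minimal sketch in Python, assuming the jobs are launched from a client that chooses the key names itself. The helpers `unique_key` and `keys_for_request` are hypothetical names invented for illustration, as are the file name and column name in the commented usage. The `destination_frame` and `model_id` parameters belong to the h2o Python client, but whether this matches the original poster's setup is a guess, since no code was posted.

```python
import uuid


def unique_key(prefix: str) -> str:
    """Return a key such as 'trainingData_3f2a9c1e4b7d' that differs on every call."""
    # Start with a letter and use only letters, digits and underscores,
    # which keeps the name safe to use as an H2O key.
    return f"{prefix}_{uuid.uuid4().hex[:12]}"


def keys_for_request() -> dict:
    """Build one set of distinct destination names for a single model-building request."""
    return {
        "frame": unique_key("trainingData"),
        "glm": unique_key("glm"),
        "rf": unique_key("rf"),
        "gbm": unique_key("gbm"),
    }


# Hypothetical usage with the h2o Python client (needs a running cluster):
#   import h2o
#   from h2o.estimators import H2OGradientBoostingEstimator
#   keys = keys_for_request()
#   train = h2o.import_file("train.csv", destination_frame=keys["frame"])
#   gbm = H2OGradientBoostingEstimator(model_id=keys["gbm"])
#   gbm.train(y="target", training_frame=train)
```

With names generated per request, a second request that arrives while the first is still building writes to its own frame and model keys instead of competing for `trainingData`.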


Darren Cook

May 29, 2018, 10:59:14 AM
to h2os...@googlegroups.com
> the “destination” referred to here is the distributed key/value store
> output of an H2O job or rapids expression.
>
> the destination frame is intentionally locked until the job finishes,

Good point, Tom - I took "trainingData" to mean the same training data
was being given to three different models. But maybe there are some data
preparation steps being done in parallel, too. It'd be good to see some
code.

Darren

palbha...@gmail.com

May 5, 2020, 1:43:39 PM
to H2O Open Source Scalable Machine Learning - h2ostream
Hi all,

Please suggest how we can change the destination name. Basically, I am trying to predict different parameters using AutoML, and after 1/2 run I end up with this error:

Exception in thread "main" java.lang.IllegalArgumentException: class hex.grid.Grid GLM_grid_1_AutoML_20200505_230253 is already in use. Unable to use it now. Consider using a different destination name.
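
One possible approach, sketched in Python: judging only by the name in the error message, the grid key seems to be derived from a timestamped AutoML name, so runs started close together could end up with the same key. Giving each run an explicit, distinct project name may avoid that. The helper `automl_project_name` and the column name `price` are hypothetical. `project_name` is a parameter of `H2OAutoML` in the h2o Python client; the Java API used in the post may expose it differently.

```python
import itertools
import re
import threading

_counter = itertools.count(1)
_lock = threading.Lock()


def automl_project_name(target: str) -> str:
    """Return a distinct project name for each AutoML run in this process."""
    with _lock:
        n = next(_counter)
    # Replace anything that is not a letter, digit or underscore.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", target)
    return f"automl_{safe}_{n}"


# Hypothetical usage with the h2o Python client (needs a running cluster):
#   from h2o.automl import H2OAutoML
#   aml = H2OAutoML(max_models=10, project_name=automl_project_name("price"))
#   aml.train(y="price", training_frame=train)
```

The counter is only unique within one client process. If the client restarts while the cluster keeps its old keys, a random suffix would be a safer choice.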