Specify cores?


Kevin Ummel

Mar 23, 2017, 11:03:00 PM
to h2os...@googlegroups.com, to...@0xdata.com
Hello,

Quick question:

I am doing some testing with H2O in R (Linux), local host, quad-core Intel processor. I initialize the H2O instance specifying a single thread/CPU:

h2o.init(nthreads = 1)

If I then call n-fold cross-validation, the Java process will begin using n-1 processor cores up to 4 (my physical maximum).

Is there no way to specify the number of cores the H2O instance is allowed to use -- e.g., h2o.init(ncores = 2)? Or simply to force H2O to run serially?

I am finding it impractical to let H2O take all of the cores automatically during development, when I want to retain a core (or two) for other work/apps running alongside R. I feel I must be missing something in the documentation...

Thank you!
Kevin

Erin LeDell

Mar 23, 2017, 11:32:43 PM
to Kevin Ummel, h2os...@googlegroups.com, to...@0xdata.com

Hi Kevin,

If you set nthreads = 1, it should only use one core, even for cross-validation.  Can you double-check how many cores are being used by typing h2o.clusterInfo() after you start up your cluster and looking at the value of "H2O cluster allowed cores":

> h2o.clusterInfo()

R is connected to the H2O cluster:
    H2O cluster uptime:         2 minutes 43 seconds
    H2O cluster version:        3.11.0.99999
    H2O cluster version age:    4 days 
    H2O cluster name:           H2O_started_from_R_me_ure553
    H2O cluster total nodes:    1
    H2O cluster total memory:   3.56 GB
    H2O cluster total cores:    8
    H2O cluster allowed cores:  1
    H2O cluster healthy:        TRUE
    H2O Connection ip:          localhost
    H2O Connection port:        54321
    H2O Connection proxy:       NA
    H2O Internal Security:      FALSE
    R Version:                  R version 3.3.2 (2016-10-31)

It's possible that you are connecting to a previously started H2O cluster that was using all of your cores. In that case, h2o.init(nthreads = 1) won't start a new 1-core cluster -- it will connect to the already-running H2O cluster on localhost.
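If a stale cluster turns out to be the culprit, one workaround is to shut it down explicitly and re-initialize with the desired thread count. A minimal sketch (assumes the h2o R package is loaded, the cluster runs on the default localhost:54321, and it is acceptable to stop it):

```r
library(h2o)

# h2o.init() attaches to an existing cluster on localhost:54321 if one is
# running, in which case nthreads is ignored. Shut it down first.
h2o.init()                    # connect to whatever is running (or start one)
h2o.shutdown(prompt = FALSE)  # stop the existing cluster
Sys.sleep(2)                  # give the JVM a moment to exit

h2o.init(nthreads = 1)        # starts a fresh cluster with a 1-thread pool
h2o.clusterInfo()             # "H2O cluster allowed cores" should now read 1
```

After the restart, h2o.clusterInfo() should report the new limit; if the Java process still exceeds it, the extra load is coming from somewhere other than the fork/join worker pool.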

Best,
Erin
--
You received this message because you are subscribed to the Google Groups "H2O Open Source Scalable Machine Learning - h2ostream" group.

-- 
Erin LeDell Ph.D.
Statistician & Machine Learning Scientist | H2O.ai

Tom Kraljevic

Mar 24, 2017, 1:37:25 AM
to Erin LeDell, Kevin Ummel, h2os...@googlegroups.com

What -nthreads really does is set the thread-pool size for the fork/join worker thread pools.

So, generally speaking, it should do a pretty good job of limiting CPU use, since that's how H2O dispatches work.

Sent from my iPhone

Kevin Ummel

Mar 24, 2017, 8:29:17 AM
to Erin LeDell, h2os...@googlegroups.com, to...@0xdata.com
Thanks for the quick response.

Very odd. clusterInfo is, indeed, showing 1 allowed core. I've attached a screenshot running the help(h2o.gbm) example with cross-validation; it includes the system monitor window so you can see the Java process's CPU use.

Any idea what is going on? Is it possible there is something machine-specific that is causing the odd behavior?

Thanks again,
Kevin


Kevin Ummel

Mar 24, 2017, 8:51:06 AM
to Erin LeDell, h2os...@googlegroups.com, to...@0xdata.com
Perhaps worth noting that the following h2o.grid() call:

iris.hex <- as.h2o(iris)
grid <- h2o.grid("gbm", x = 1:4, y = 5, training_frame = iris.hex,
                 hyper_params = list(ntrees = c(1, 2, 3) * 1e3, max_depth = 1:5))

...causes the Java process's CPU use to range from 30% to 60% (averaging around 50%).

Even if nthreads = 1 is being ignored for some reason, I would expect the CPU behavior to be similar for h2o.grid() and h2o.gbm(nfolds = 5), since both build multiple models in parallel. But perhaps there is a difference under the hood that explains this behavior?