h2o R Package - Memory Management


jeffl...@gmail.com

Aug 19, 2016, 4:11:17 PM
to H2O Open Source Scalable Machine Learning - h2ostream
Hello,

I'm running the h2o R package to process large datasets on my local computer, so memory is always an issue. I'm trying to figure out a way to manage memory usage.

I run h2o.ls() after each run of my code to see all the temporary h2o objects that were created during the run.

My thinking is that I could use h2o.rm() to delete the temporary h2o objects after each run. However, h2o.rm() seems to require me to specify the object I'm deleting. Is there any way I can just say "delete all h2o objects in the current h2o cluster"?

Thanks!

Jeff Li

Nick Karpov

Aug 19, 2016, 4:23:51 PM
to jeffl...@gmail.com, H2O Open Source Scalable Machine Learning - h2ostream
Hi Jeff, h2o.removeAll() should do the trick


--
You received this message because you are subscribed to the Google Groups "H2O Open Source Scalable Machine Learning  - h2ostream" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2ostream+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jeffl...@gmail.com

Aug 19, 2016, 5:07:06 PM
to H2O Open Source Scalable Machine Learning - h2ostream, jeffl...@gmail.com
Thanks Nick! That definitely worked. After running it, I can confirm with h2o.ls() that all temporary objects have been cleared out.

However, I now see that this doesn't solve the crux of my problem. When I run my data processing on the h2o cluster, Task Manager shows java.exe taking up about 4.5 GB of memory, which is about the size of the dataset I've uploaded to the cluster. I had assumed that this memory would be freed by h2o.removeAll(), but that is not the case: java.exe still takes up about 4.5 GB after all temporary objects have been removed.

In general, is there a good way to manage the memory taken up by h2o/java without shutting down the session altogether?

Jeff Li

Avkash Chauhan

Aug 20, 2016, 1:26:05 PM
to H2O Open Source Scalable Machine Learning - h2ostream, jeffl...@gmail.com
If you do not see memory go down immediately after removing objects, that does not mean new allocations will start from the current peak. In your case, if you load your data again after the removeAll() call, you will not see memory usage double. Several factors determine how removing objects from R relates to the memory of the Java process, so you should not expect an instant drop. Even if you also call gc() in R, the memory may or may not be released immediately; what does grow is the free space inside the Java process that is available for new allocations. You can run the following test yourself:

0. Start R with H2O.
1. Run JConsole and attach it to the running H2O server that R is connected to.
2. Assign lots of objects in R and watch the memory grow.
3. Remove all objects in R. You will see that the memory does not go down, even though no objects are left in R.
4. Assign a few objects in R. You will see different behaviors, some of which are listed below:
-- 1. You might see the memory go down to a significantly lower level while the new object is still in place.
-- 2. You might see the memory go up even higher, because the new object is too large to fit into the space that was freed.
-- 3. You might not see any change in memory at all, because the new object(s) were placed in the space freed by the previous call.

So when you remove unused objects in R, you are making sure that those objects are no longer available. That's all you can do from the R side.
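
To make the distinction above concrete outside of H2O, here is a minimal Java sketch. It is not H2O code: the class name HeapDemo and its helper methods are made up for illustration, and the 32 MB allocation is an arbitrary choice. It prints the heap space occupied by objects ("used") next to the heap space the JVM has reserved ("committed") before an allocation, after it, and after the objects are dropped. On many JVMs the used figure falls after the free while the committed figure stays where it was, which is roughly why a tool like Task Manager can keep showing a large java.exe. The exact numbers depend on the JVM, its garbage collector and its heap settings, so treat the output as indicative only.

```java
// Sketch only: compares heap space occupied by objects with heap space
// the JVM has reserved. Names are hypothetical and not part of H2O.
final class HeapDemo {
    static final int MB = 1 << 20;

    /** Heap bytes currently occupied by objects (including garbage not yet collected). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Heap bytes the JVM currently holds, whether or not objects occupy them. */
    static long committedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    static void report(String label) {
        System.out.printf("%-12s used=%d MB, committed=%d MB%n",
                label, usedBytes() / MB, committedBytes() / MB);
    }

    public static void main(String[] args) {
        report("start");

        // Allocate 32 blocks of 1 MB each and keep them reachable.
        byte[][] blocks = new byte[32][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[MB];
        }
        report("after alloc");

        // Drop the only reference, then ask for a collection.
        // System.gc() is a request; the JVM is free to ignore or delay it.
        blocks = null;
        System.gc();
        report("after free");
    }
}
```

Running it with `java HeapDemo.java` (Java 11 or later) prints three lines. If "committed" does not shrink on the last line, that matches step 3 of the test above: the freed space stays inside the Java process and is reused by later allocations.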

Tom Kraljevic

Aug 20, 2016, 1:40:02 PM
to Avkash Chauhan, H2O Open Source Scalable Machine Learning - h2ostream, jeffl...@gmail.com

See this discussion for my go-to recipe to debug memory problems.


Tom

