Here is some background on the problem I am solving: I am using H2O to build deep neural networks to forecast security prices. The models are used every day to construct a forecast and are updated (retrained from a checkpoint) every 3 months. I am simulating 50 assets (50 models) over a period of 14 years. In other words, each model is trained 56 times (14 years * 4 times per year).
I noticed that memory usage keeps climbing as the simulation continues. I have a server with 24GB of RAM and 20 cores. At first I thought this was due to all the temporary files that H2O creates, so I used h2o.ls() to get a list of all the objects and removed everything that isn't a model (i.e. I keep 50 items). Memory usage was still quite high. I then thought the problem might be in R, so I am now calling gc() continuously, but that doesn't help either.
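For reference, the cleanup step described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the simulation: the function names `keys_to_remove` and `cleanup_cluster` and the `keep_ids` argument are hypothetical, and it assumes that h2o.ls() returns a data frame with a `key` column and that a cluster is already running.

```r
# Pure helper: given every key on the cluster and the model ids to keep,
# return the keys that should be deleted.
keys_to_remove <- function(all_keys, keep_ids) {
  setdiff(as.character(all_keys), as.character(keep_ids))
}

# Hypothetical wrapper (name and arguments are illustrative only).
# Requires the h2o package and a running H2O cluster.
cleanup_cluster <- function(keep_ids) {
  all_keys <- h2o::h2o.ls()$key        # keys of all objects held by the cluster
  stale <- keys_to_remove(all_keys, keep_ids)
  if (length(stale) > 0) {
    h2o::h2o.rm(stale)                 # remove the stale objects from the cluster
  }
  invisible(gc())                      # garbage collection on the R side only
  length(stale)                        # number of keys that were removed
}
```

Calling `cleanup_cluster(model_ids)` after each retraining round, where `model_ids` is a character vector of the 50 model keys, corresponds to the procedure above. Note that gc() only affects the R process, not the Java process that H2O runs in.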
Through H2O Flow I can see that the memory used on the cluster hovers between 120 and 160MB. However, the total RAM used by the Java process keeps climbing until the entire program crashes. I have replicated this bug with the bleeding edge version, the current stable release (Turing), and the previous stable release (Turchin). Also, even if I set the maximum memory to 8GB, usage still goes over 8GB (and fairly quickly at that).
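For context, a memory cap of this kind is typically set when the cluster is started. The snippet below is a sketch that assumes H2O is launched from R on the local machine, which may not match the actual setup; the thread count and the "8g" value simply mirror the figures mentioned in this post.

```r
library(h2o)

# Start (or connect to) a local H2O instance with a capped Java heap.
# max_mem_size only takes effect when R itself starts the H2O process.
h2o.init(nthreads = 20, max_mem_size = "8g")

# Print cluster status as reported by H2O itself.
h2o.clusterInfo()
```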
I think that it is exactly the same bug as this one logged on JIRA: https://0xdata.atlassian.net/browse/PUBDEV-3203. Is anybody working on this issue? I would label it as critical since it prevents people from using H2O in many use cases. Right now I have no idea what to do. Is there a way to install an even older version of H2O? Or call the Java garbage collector from inside R? Or just fix the problem wherever it is?
Any help / suggestions would be greatly appreciated!
Kind regards
Stuart Gordon Reid
I just wanted to add one more piece of information: the computer is running Ubuntu Server.
Image 1 (H2O Flow): https://drive.google.com/file/d/0B8BsJ1DWY28VczdVNEpSUXdpWG8/view?usp=sharing
Image 2 (HTOP): https://drive.google.com/file/d/0B8BsJ1DWY28VdGtVcmlvNEFaWkE/view?usp=sharing