* A note on Bigger Data and GC: We do a user-mode swap-to-disk when the Java heap gets too full, i.e., you're using more Big Data than
physical DRAM. We won't die with a GC death-spiral, but we will degrade to out-of-core speeds. We'll go as fast as the disk will
allow. I've personally tested loading a 12Gb dataset into a 2Gb (32bit) JVM; it took about 5 minutes to load the data, and another 5 minutes to run a Logistic Regression.
This link refer to a similar case of mine. https://stackoverflow.com/questions/34082792/loading-data-bigger-than-the-memory-size-in-h2o . The answer mentioned user-mode swap-to-disk was disabled since performance was so bad. However, he did not explain any alternative method and how can enable the flags `--cleaner` in h2o?
# Work around 1:
I configured the java heap with options `java -Xmx10g -jar h2o.jar`. When I load dataset. The H2O information as follows:
However, JVM consumed all RAM memory and Swap, then operating system halted java h2o program.
# Work around 2:
I installed [H2O spark](http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.2/0/index.html). I can load dataset but spark was hanging with the following logs with a full swap memory:
+ FREE:426.8 MB == MEM_MAX:2.67 GB), desiredKV=841.3 MB OOM!
09-01 02:01:12.377 192.168.233.133:54321 6965 Thread-47 WARN: Swapping! OOM, (K/V:1.75 GB + POJO:513.2 MB + FREE:426.8 MB == MEM_MAX:2.67 GB), desiredKV=841.3 MB OOM!
09-01 02:01:12.377 192.168.233.133:54321 6965 Thread-48 WARN: Swapping! OOM, (K/V:1.75 GB + POJO:513.2 MB + FREE:426.8 MB == MEM_MAX:2.67 GB), desiredKV=841.3 MB OOM!
09-01 02:01:12.381 192.168.233.133:54321 6965 Thread-45 WARN: Swapping! OOM, (K/V:1.75 GB + POJO:513.3 MB + FREE:426.7 MB == MEM_MAX:2.67 GB), desiredKV=803.2 MB OOM!
09-01 02:01:12.382 192.168.233.133:54321 6965 Thread-46 WARN: Swapping! OOM, (K/V:1.75 GB + POJO:513.4 MB + FREE:426.5 MB == MEM_MAX:2.67 GB), desiredKV=840.9 MB OOM!
09-01 02:01:12.384 192.168.233.133:54321 6965 #e Thread WARN: Swapping! GC CALLBACK, (K/V:1.75 GB + POJO:513.4 MB + FREE:426.5 MB == MEM_MAX:2.67 GB), desiredKV=802.7 MB OOM!
09-01 02:01:12.867 192.168.233.133:54321 6965 FJ-3-1 WARN: Swapping! OOM, (K/V:1.75 GB + POJO:513.4 MB + FREE:426.5 MB == MEM_MAX:2.67 GB), desiredKV=1.03 GB OOM!
09-01 02:01:13.376 192.168.233.133:54321 6965 Thread-46 WARN: Swapping! OOM, (K/V:1.75 GB + POJO:513.2 MB + FREE:426.8 MB == MEM_MAX:2.67 GB), desiredKV=803.2 MB OOM!
09-01 02:01:13.934 192.168.233.133:54321 6965 Thread-45 WARN: Swapping! OOM, (K/V:1.75 GB + POJO:513.2 MB + FREE:426.8 MB == MEM_MAX:2.67 GB), desiredKV=841.3 MB OOM!
09-01 02:01:12.867 192.168.233.133:54321 6965 #e Thread WARN: Swapping! GC CALLBACK, (K/V:1.75 GB + POJO:513.2 MB + FREE:426.8 MB == MEM_MAX:2.67 GB), desiredKV=803.2 MB OOM!
In this case, I think the `gc` collector is waiting for cleaning some unused memory in swap.
How can I process huge dataset with a limited RAM memory ?