keyword: java.lang.OutOfMemoryError: unable to create new native thread, HBase
hi,
Long story short, I was struggling to run through the JanusGraph tests on macOS, and I'm finally in a 'better' shape. So I'm taking the opportunity to share the struggle on macOS and a couple of observations about the tests with HBase as the backend:
(1) Macbook's limitation on "max user processes" and "kern.num_threads"
(2) High thread demand during HBase testing, for example HBaseIDAuthorityTest and HBaseLockStoreTest
(3) HBase 0.98 vs HBase 1.0: the same tests in (2) passed on HBase 0.98 but failed on HBase 1.x
I have a "solution" for (1), while (2) and (3) are observations and potential improvements.
(0) failures
*environment: macOS Sierra 10.12.6 with 16GB memory
*java: "1.8.0_102", SE Runtime Environment (build 1.8.0_102-b14), 64-Bit Server VM (build 25.102-b14, mixed mode)
*mvn: Apache Maven 3.3.9
*test method: "mvn clean install" through iTerm2
Typical failures (both occur only when testing HBase 1.x):
HBaseLockStoreTest>LockKeyColumnValueStoreTest.parallelNoncontendedLockStressTest:364 expected:<100> but was:<80>
[pool-2-thread-2] ERROR diskstorage.LockKeyColumnValueStoreTest: Unexpected locking-related exception on iteration 81/100
java.lang.RuntimeException: java.lang.OutOfMemoryError: unable to create new native thread
org.janusgraph.diskstorage.hbase.HBaseIDAuthorityTest
testMultiIDAcquisition[0](org.janusgraph.diskstorage.hbase.HBaseIDAuthorityTest) Time elapsed: 27.565 sec <<< ERROR!
java.lang.RuntimeException: java.lang.OutOfMemoryError: unable to create new native thread
...
at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:910)
at org.janusgraph.diskstorage.hbase.HTable1_0.batch(HTable1_0.java:51)
(1) Macbook's limitation on "max user processes" and "kern.num_threads"
Both failures are due to "unable to create new native thread"; the relevant limits are set very low on Mac and are difficult to change. The defaults:
$ ulimit -u ==> max user processes (-u) 709
$ sysctl kern.num_threads ==> kern.num_threads: 10240
The direct cause is kern.num_threads, which is not independently changeable according to Apple. To 'influence' it, two hacking steps are needed:
step 1) change "max user processes" following the instruction
step 2) turn on performance mode, per Apple
Now my macOS shows: max user processes: 2499, and kern.num_threads: 25000.
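To see this failure mode in isolation, here is a minimal probe (my own sketch, not JanusGraph code; class and method names are made up) that keeps starting idle threads until either a cap is reached or the JVM throws the same OutOfMemoryError. With a low kern.num_threads, it should hit the error well before a typical Linux box would:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class ThreadLimitProbe {
    /** Starts idle threads up to {@code cap}; returns how many were actually created. */
    public static int probe(int cap) {
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();
        int created = 0;
        try {
            for (; created < cap; created++) {
                // Each thread just parks on the latch so it stays alive.
                Thread t = new Thread(() -> {
                    try { release.await(); } catch (InterruptedException ignored) { }
                });
                t.start();
                threads.add(t);
            }
        } catch (OutOfMemoryError e) {
            // "unable to create new native thread": the OS thread limit,
            // not the Java heap, is exhausted.
        } finally {
            release.countDown();
            for (Thread t : threads) {
                try { t.join(); } catch (InterruptedException ignored) { }
            }
        }
        return created;
    }

    public static void main(String[] args) {
        // Keep the cap modest so the probe is safe to run anywhere.
        System.out.println("created " + probe(200) + " threads");
    }
}
```

Raising the cap toward your `ulimit -u` / kern.num_threads values should eventually make `created` stop short of the cap, which is exactly what the failing tests run into.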
(2) High thread demand during HBase testing.
I dug into HBaseIDAuthorityTest, which calls HTable.batch() 15K+ times.
Something similar occurs in LockKeyColumnValueStoreTest.parallelNoncontendedLockStressTest(), although it uses a thread pool. BTW, its error message is a bit confusing; I originally thought it indicated a locking bug.
I'm wondering whether it would be OK to reduce the stress-testing level, for example changing lockOperationsPerThread from 100 to 70, which works fine for the testing purpose?
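To illustrate why the thread-pool shape matters here, a small sketch (my own, not the actual test code): with a fixed pool, the number of native threads stays bounded at poolSize no matter how many operations are submitted, whereas spawning a thread per batch call scales native-thread demand with the operation count:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedBatchRunner {
    /** Runs {@code operations} no-op "batch" tasks on a pool of {@code poolSize} threads. */
    public static int runBatches(int operations, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < operations; i++) {
            // A real test would do an HTable.batch() call here; the pool
            // reuses the same poolSize native threads for every submission.
            pool.execute(() -> completed.incrementAndGet());
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // 15,549 operations, but only 8 native threads ever exist.
        System.out.println(runBatches(15549, 8) + " operations completed");
    }
}
```

If the tests (or the HBase client underneath them) instead create short-lived threads per operation, the kernel has to keep up with thousands of thread creations, which is where a low kern.num_threads bites.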
(3) HBase 0.98 vs HBase 1.0: the same test cases in (2) passed on HBase 0.98 but failed on HBase 1.x
This is odd, as I can consistently reproduce the issue.
HBaseIDAuthorityTest: HBase 1.x failed at around the 10350th HTable1_0.batch() call, whereas HBase 0.98 tolerated all 15549 batches.
HBaseLockStoreTest#parallelNoncontendedLockStressTest: HBase 1.x began to throw OOM at around iteration 80/100, while HBase 0.98 saw no issues.
I am wondering what causes HBase 1.x to require more resources. BTW, table.flushCommits() was removed from HTable1_0.batch(), but that doesn't look like the cause.
---------------------
OK, that is all I have figured out so far, working backward and the hard way. I'm sharing it here in the hope of helping anyone else who uses macOS, or maybe someone knows a smarter way.
I'm also wondering whether (2) and (3) warrant further investigation or filing an issue? Thanks for reading.
Demai