zmalloc is well suited to production use in Java. As for the other allocators, how would you use them from Java? OK, through JNI :)
From the first day, I was aware that we would need this allocator. I planned to use the allocation method in Unsafe. But after reading an NH article about allocators and running a simple benchmark against Unsafe, I found that the existing FB/Google allocators are far faster than Unsafe's.
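For readers who want to repeat that kind of measurement, here is a minimal sketch of timing allocate/free pairs through Unsafe. It is not the benchmark I ran: the class and method names (UnsafeAllocBench, timeAllocFree, roundTrip) are made up for illustration, the chunk size and iteration counts are arbitrary, and a serious measurement should use a harness such as JMH. Recent JDKs also mark these Unsafe memory methods as deprecated, so expect warnings.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustrative sketch only; names and sizes are assumptions, not zmalloc code.
final class UnsafeAllocBench {
    private static final Unsafe UNSAFE = loadUnsafe();

    // Obtains the Unsafe singleton via reflection (the usual workaround).
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Allocates 'size' bytes (size must be at least 8), writes and reads back
    // one long, then frees the block.
    static long roundTrip(long size, long value) {
        long address = UNSAFE.allocateMemory(size);
        try {
            UNSAFE.putLong(address, value);
            return UNSAFE.getLong(address);
        } finally {
            UNSAFE.freeMemory(address);
        }
    }

    // Returns the elapsed nanoseconds for 'iterations' allocate/free pairs.
    static long timeAllocFree(int iterations, long size) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            long address = UNSAFE.allocateMemory(size);
            UNSAFE.putByte(address, (byte) i); // touch the memory once
            UNSAFE.freeMemory(address);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        timeAllocFree(100_000, 64); // warm-up pass, result ignored
        long nanos = timeAllocFree(1_000_000, 64);
        System.out.println("ns per alloc/free pair: " + nanos / 1_000_000.0);
    }
}
```

This only measures raw speed in a single thread with a single size, which is exactly the limitation of a simple benchmark.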
And if you are familiar with this field, you will know that glibc's malloc has a long history of memory problems. This is why FB and Google each have their own malloc.
Please note: our simple benchmark only tests speed, and speed is definitely not the most important factor for an allocator. The most important factor is stability. We definitely do not want any crash caused by a memory leak. This is, in fact, not shown in the benchmark :)
That is the reason I designed zmalloc from scratch, following KISS. It is a big challenge, I know. But simply repeating others' logic would be a stupid thing to do. While coding zmalloc, I often ask myself why I design it the way I do.
The current implementations have only one problem, which I mentioned on mechanical-sympathy the day before yesterday: the so-called "pathological" case or usage. In this case, you allocate a big batch of chunks but then "jump" when freeing them. "Jumping" means:
Assume I allocated chunks 1, 2, 3, ..., 10, 11, and then freed only 1, 2, 3, 5, 6, 7, 9, 10, 11 while keeping 4 and 8.
In this case, any allocator that has some level of cache unit may be unable to release any unit of that size. As a result, chunks of other sizes may fail to allocate even though you see that you have freed most of the space. Exactly how the jumps cause this depends on the details of each allocator's cache unit.
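To make the effect concrete, here is a toy model of the example above. It assumes a cache unit ("page") that holds 4 chunks of one size and can only be reused for other sizes once all of its chunks are free. The page size of 4 and the names (PinnedPageDemo, reclaimablePages) are assumptions for illustration; this is not how any particular allocator is implemented.

```java
// Toy model of the "jump free" case; CHUNKS_PER_PAGE and all names are hypothetical.
final class PinnedPageDemo {
    static final int CHUNKS_PER_PAGE = 4; // assumed cache-unit capacity

    // Chunks are numbered 1..allocated and fill pages in order.
    // Returns how many pages hold no kept chunk, i.e. can be reused by other sizes.
    static int reclaimablePages(int allocated, int... kept) {
        int pages = (allocated + CHUNKS_PER_PAGE - 1) / CHUNKS_PER_PAGE;
        int[] live = new int[pages];
        for (int id : kept) {
            live[(id - 1) / CHUNKS_PER_PAGE]++; // this page stays pinned
        }
        int free = 0;
        for (int n : live) {
            if (n == 0) {
                free++;
            }
        }
        return free;
    }

    public static void main(String[] args) {
        // Allocate 1..11, free everything except 4 and 8.
        int reusable = reclaimablePages(11, 4, 8);
        System.out.println("freed chunks: 9 of 11, reusable pages: " + reusable + " of 3");
    }
}
```

With this assumed layout, 9 of the 11 chunks are free, yet two of the three pages stay pinned by chunks 4 and 8, so most of the "freed" space is not available to other sizes.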
I have seen bug reports about this for ptmalloc and tcmalloc, and Netty's new allocator of course has it too (see more in the landz.kernel.test module). jemalloc does not seem to have a dedicated bug-reporting site, and I have not found any useful information. If we assume that Netty's implementation is similar to jemalloc, then I think jemalloc has the same problem.
Memcached has another problem, called "classification": chunks of one class (size) may take all the space and never give it back to the other classes. This is why I set the page level size to one. So ZMalloc does not have this problem.
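The "classification" problem can be sketched with a small model in which pages are handed to a size class on demand and freed chunks go back to that class, never to the shared page pool. This is only a simplified illustration under that assumption. The names (SlabClassDemo, allocate, free, demo) and the numbers are hypothetical, and it is neither Memcached's nor zmalloc's code.

```java
import java.util.HashMap;
import java.util.Map;

// Toy slab model; all names and sizes are assumptions for illustration.
final class SlabClassDemo {
    private int freePages;      // pages not yet assigned to any size class
    private final int pageSize; // bytes per page
    private final Map<Integer, Integer> freeChunks = new HashMap<>(); // per chunk size

    SlabClassDemo(int pages, int pageSize) {
        this.freePages = pages;
        this.pageSize = pageSize;
    }

    // Allocates one chunk; returns false when the class has no free chunk
    // and the page pool is empty.
    boolean allocate(int chunkSize) {
        int free = freeChunks.getOrDefault(chunkSize, 0);
        if (free == 0) {
            if (freePages == 0) {
                return false;
            }
            freePages--;                 // the page now belongs to this class
            free = pageSize / chunkSize;
        }
        freeChunks.put(chunkSize, free - 1);
        return true;
    }

    // Frees one chunk: it returns to its own class, not to the page pool.
    void free(int chunkSize) {
        freeChunks.merge(chunkSize, 1, Integer::sum);
    }

    // The 64-byte class takes every page, frees everything, and a 512-byte
    // request is then attempted; the result of that request is returned.
    static boolean demo() {
        SlabClassDemo slabs = new SlabClassDemo(2, 1024);
        int taken = 0;
        while (slabs.allocate(64)) {
            taken++;
        }
        for (int i = 0; i < taken; i++) {
            slabs.free(64);
        }
        return slabs.allocate(512);
    }

    public static void main(String[] args) {
        System.out.println("512-byte allocation after freeing all 64-byte chunks: " + demo());
    }
}
```

In this model the 512-byte request fails even though every chunk has been freed, because both pages remain owned by the 64-byte class.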
I will add more stats and tracing/logging options and expose them in the web console of landz. From the Java point of view, there is no memory allocator concept. The allocator is one attempt to organize the off-heap area in order to address the Java GC overriding problem. Besides zmalloc, you can try Netty's buffer allocator. I learned of Netty's allocator while I was in the middle of writing zmalloc. So we happened to arrive at the same idea! But by that time I had already finished coding the thread-local part. A simple review and benchmark of Netty's allocator just confirms that zmalloc should be good for backend use :)