Reliably allocating large arrays

Shevek

Oct 1, 2020, 2:43:59 PM
to mechanical-sympathy
When I do new byte[N], I get an OutOfMemoryError, even though the VM
claims to have more than enough free space (according to MemoryMXBean,
Runtime.freeMemory, VisualVM, etc.).
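
Roughly, this is the shape of what I'm doing (a cut-down sketch; the real
size, class name, and surrounding code differ):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryUsage;

    public class BigAlloc {
        public static void main(String[] args) {
            int n = Integer.parseInt(args[0]);   // requested array size, e.g. 1_500_000_000
            Runtime rt = Runtime.getRuntime();
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            System.out.printf("free=%d total=%d max=%d heapUsed=%d heapCommitted=%d heapMax=%d%n",
                    rt.freeMemory(), rt.totalMemory(), rt.maxMemory(),
                    heap.getUsed(), heap.getCommitted(), heap.getMax());
            byte[] data = new byte[n];           // OutOfMemoryError is thrown here
            System.out.println("allocated " + data.length + " bytes");
        }
    }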

My working assumption is that while I have enough free memory, I don't
have enough contiguous free memory. Is there a solution to this? Will I
get better results from any of:

* ByteBuffer.allocateDirect() - presumably yes, but has other issues
relating to overall memory usage on the system
* G1GC (or other GC which allocates (relocatable?) regions) - this is a
deep hole I haven't yet explored.
* Calling System.gc() before allocating a contiguous region [apparently
doesn't help].
* Other?

If we do follow a strategy using allocateDirect, will we end up with the
same fragmentation issue in the native heap, along with committed
off-heap memory which we can no longer effectively use, or is the
off-heap memory managed in some manner which avoids this problem?

Thank you.

S.

Peter Veentjer

Oct 2, 2020, 1:20:12 AM
to mechanica...@googlegroups.com
I forgot a very important part.

There is a memory allocator used by the process, and this allocator either makes use of the program break (brk/sbrk) or of mmap for an anonymous memory mapping.

ByteBuffer.allocateDirect forwards allocation requests to this memory allocator; it will not directly allocate memory by changing the program break or adding an anonymous memory mapping region itself (so what I said before was incorrect).

This memory allocator could also have a big impact on how the system deals with fragmentation.

On Fri, Oct 2, 2020 at 8:11 AM alarm...@gmail.com <alarm...@gmail.com> wrote:
I'm not an expert, so take my answer with a few grains of salt. Also the following applies to Linux; I don't know anything about other OSs.

On the physical memory level, memory for the array doesn't need to be contiguous. That is the whole point of having virtual memory in the first place.

On the virtual memory level, the memory for the array does need to be contiguous. So what is likely to happen is that an anonymous memory mapping for a particular region is created. This mapping is initialized with the zero page (its content is all zeros), so all page table entries for the region point to the same chunk of physical memory. Only when there is a write to such a zero page does the copy-on-write mechanism kick in: a page frame (physical memory) is allocated and assigned to that page table entry.

AFAIK ByteBuffer.allocateDirect just forwards the request to the OS to do an anonymous memory mapping. And since the virtual address space is huge, it should be easy to find a contiguous region of virtual memory for that buffer.
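
As a quick sketch (the 1 GiB size and class name are arbitrary), allocating off-heap looks like this; note that direct buffers are capped by -XX:MaxDirectMemorySize rather than by the Java heap size:

    import java.nio.ByteBuffer;

    public class DirectAlloc {
        public static void main(String[] args) {
            // 1 GiB off-heap buffer; limited by -XX:MaxDirectMemorySize, not by -Xmx
            ByteBuffer buf = ByteBuffer.allocateDirect(1 << 30);
            buf.put(0, (byte) 42);   // absolute put: touch the first byte
            System.out.println("capacity=" + buf.capacity() + " first=" + buf.get(0));
        }
    }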

When you use new Object[...] you are dealing with the JVM's own memory management, and then it depends on the GC algorithm. For example, with CMS there needs to be an entry in the free list large enough to hold that array (it needs to be contiguous). With G1, AFAIK it is less of an issue; if it is a large array, one or more non-contiguous humongous regions can be used. With the parallel/serial collectors it should also not be an issue, since they compact the heap and therefore there is no fragmentation.

Which GC algorithm are you using?
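
If you're not sure, one quick way to check from inside the process (just a sketch) is to list the garbage collector MXBeans:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class WhichGc {
        public static void main(String[] args) {
            // Prints e.g. "G1 Young Generation" / "G1 Old Generation" under G1,
            // or "PS Scavenge" / "PS MarkSweep" under the parallel collector.
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.println(gc.getName());
            }
        }
    }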

Peter Veentjer

Oct 2, 2020, 1:31:24 AM
to mechanica...@googlegroups.com
I should not have opened my mouth :)

I thought multiple humongous regions for a single array did not need to be contiguous. But apparently this is incorrect:

https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/g1_gc_tuning.html

Gil Tene

Oct 2, 2020, 11:13:19 AM
to mechanica...@googlegroups.com
Your stated working assumption (that while you have enough free memory, you don't have enough contiguous free memory)
is wrong. As a result, there is no need to seek a solution for that problem, because that is not the problem. At least on all production JVMs in the last 22 years or so, JVMs (through their garbage collectors) compact memory and create contiguous memory regions. Without doing that, most Java applications would generally stop working after running for a while.

Can you demonstrate (with specific output and numbers) your starting assertion, showing the situation in which you are getting an OOME under a condition where you think it should not happen, and how you are determining the available memory?

In general, the criterion for a JVM throwing an OOME is not that you have no more empty bytes left in the heap. It is that you are thrashing so badly that “you’d probably rather be dead”. This is a subjective criterion which is often configurable via flags. There are other reasons for throwing an OOME (like running out of non-Java-heap types of memory), but “running low enough on heap memory that the JVM is thrashing badly” is a common reason and the likely one in your case.

Efficiency of garbage collectors depends on having empty memory, and drops dramatically when empty (as in not live) memory is scarce. A significant contributing portion of garbage collector cost (in the vast majority of GC algorithms used in real-world VMs) is generally linear in 1/portion-of-heap-that-is-empty.

E.g. regardless of GC algorithm choice, if you are repeatedly allocating (and forgetting at the same rate) 64 byte objects in a 1GB heap that has only 256 bytes empty (not occupied by currently reachable objects) on a steady basis, you would need to run a complete GC cycle on every 4th allocation, and that GC cycle would have to examine the entire 1GB (minus 256 bytes) of live objects in the heap each time to find the empty 256 bytes and potentially compact the heap to make them contiguous. That would be so much work per allocation that you would never want the JVM to continue running under that condition (and no current JVM will, AFAIK).

On the other hand, the exact same application and load would run very efficiently when there was more empty memory in the heap (improving as portion-of-heap-that-is-empty grows). More empty heap means that the act of having the garbage collector collect the heap (or a portion of it, e.g. in generational collectors) will happen less often for a given allocation rate, and when the collection cost depends only on the amount of live memory (as it does in e.g. the young generation of most collectors, and in some collectors even in the older generation), more empty heap simply means less GC work per unit of allocation.
[Note that portion-of-heap-that-is-empty here refers to the portion of the heap that is not occupied by live, reachable objects, and not the much smaller portion that may be currently unused at some point in time, until the GC gets rid of unreachable objects and significantly increases it.]

E.g. some (stop-the-world) collectors will throw an OOME when too much (e.g. 98%) of a time interval (e.g. no less than 1 minute), during which multiple (e.g. at least N) full GC cycles have run, was spent in stop-the-world GC. This gets more intricate with concurrent collectors, but the principle is the same.
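
A rough way to see how much of your wall-clock time is going to GC from inside the process (just a sketch, not how the JVM itself does the accounting; the sleep and class name are placeholders for your real allocation-heavy workload):

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcOverhead {
        static long totalGcMillis() {
            long ms = 0;
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                ms += gc.getCollectionTime();   // cumulative GC time since JVM start, in ms
            }
            return ms;
        }

        public static void main(String[] args) throws Exception {
            long gcBefore = totalGcMillis();
            long t0 = System.nanoTime();
            Thread.sleep(60_000);               // placeholder: run the real workload here
            long wallMs = (System.nanoTime() - t0) / 1_000_000;
            long gcMs = totalGcMillis() - gcBefore;
            System.out.printf("~%.1f%% of wall time spent in GC%n", 100.0 * gcMs / wallMs);
        }
    }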

IMO, the most likely explanation (given the available data) is that your heap is not large enough to continue to run your application given its live set and load, and that increasing the heap will resolve your problem (assuming your application doesn’t then react by increasing the live set to fill the heap up too far again). If this explanation applies, then an OOME is a wholly appropriate indication of your situation.

HTH,

— Gil.
