Proper use of HeapFriendlyHashMap

Darren Bathgate

Apr 2, 2015, 11:25:14 AM
to netfli...@googlegroups.com
We have an application using Netflix Zeno that holds a decent amount of data (about 2 GB).
The data is indexed using HeapFriendlyHashMap, and we have about 6 unique datasets.
We swap the data into new HeapFriendlyHashMaps roughly every hour.

This means that we are creating 6 separate instances of HeapFriendlyHashMap.

My concern is that the singleton instance of HeapFriendlyMapArrayRecycler is being accessed by the 6 separate instances of HeapFriendlyHashMap.

The app has been encountering some issues with garbage collection.
We would see the heap space rise to about 90% capacity over a period of a few hours before rapidly dropping to around 30%.


So my questions are:
  • Is HeapFriendlyMapArrayRecycler designed to handle multiple instances of HeapFriendlyHashMap? If so:
    • What is the best practice for cycling data for multiple HeapFriendlyHashMaps?
  • What are the recommended GC configurations for an app using HeapFriendlyHashMaps?

Drew Koszewnik

Apr 2, 2015, 3:03:47 PM
to netfli...@googlegroups.com
Hi Darren,

Yes, HeapFriendlyMapArrayRecycler is intended to be used with multiple maps.  We use it this way here at Netflix.

Just to double-check, are you following the instructions listed on the wiki?  Specifically, on each refresh you'll want to call:

HeapFriendlyMapArrayRecycler.get().swapCycleObjectArrays();

try {
    // make data available to application
} finally {
    HeapFriendlyMapArrayRecycler.get().clearNextCycleObjectArrays();
}

Then, when making data available by creating new HeapFriendlyHashMaps, make sure you call releaseObjectArrays() on each of the old maps as you replace them with new ones.

You describe a scenario where you see heap usage rise to 90%, then drop to around 30%.  The HeapFriendlyMapArrayRecycler won't ever release any memory.  Assuming the above procedure is followed, once an Object array segment gets allocated to your HeapFriendlyHashMaps, it will shortly get promoted to tenured space, where it will live for the life of the application.  These long-lived object array segments will be reused every other (alternating) refresh cycle.
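
To illustrate why the pooled arrays stay on the heap and are reused every other refresh, here is a toy model of an alternating two-pool recycler. It is only a sketch of the idea described above, not Zeno's implementation: the class ToyArrayRecycler and its methods take, giveBack, swapCycles, and clearNextCycle are hypothetical stand-ins for the behavior of HeapFriendlyMapArrayRecycler.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Toy model only: names and structure are assumptions, not Zeno's source.
public class ToyArrayRecycler {
    // Segments that may be handed out during the current refresh.
    private Deque<Object[]> currentCycle = new ArrayDeque<>();
    // Segments released by old maps, reusable after the next swap.
    private Deque<Object[]> nextCycle = new ArrayDeque<>();
    private int allocations = 0;

    // Hand out a pooled segment if one exists, otherwise allocate a new one.
    public Object[] take(int size) {
        Object[] pooled = currentCycle.poll();
        if (pooled != null) {
            return pooled;
        }
        allocations++;
        return new Object[size];
    }

    // Called when an old map releases its segment.
    public void giveBack(Object[] segment) {
        nextCycle.add(segment);
    }

    // Start of a refresh: segments released last time become available.
    public void swapCycles() {
        Deque<Object[]> tmp = currentCycle;
        currentCycle = nextCycle;
        nextCycle = tmp;
    }

    // End of a refresh: null out released segments so stale data can be collected.
    public void clearNextCycle() {
        for (Object[] segment : nextCycle) {
            Arrays.fill(segment, null);
        }
    }

    public int allocations() {
        return allocations;
    }

    public static void main(String[] args) {
        ToyArrayRecycler recycler = new ToyArrayRecycler();
        Object[] live = null;
        for (int refresh = 1; refresh <= 4; refresh++) {
            recycler.swapCycles();
            Object[] fresh = recycler.take(8);
            fresh[0] = "data for refresh " + refresh;
            if (live != null) {
                recycler.giveBack(live);
            }
            live = fresh;
            recycler.clearNextCycle();
            System.out.println("refresh " + refresh
                    + ": total allocations = " + recycler.allocations());
        }
    }
}
```

In this model the allocation count stops growing after the second refresh: from then on the same two segments alternate between the live map and the pool, which is why the recycler itself never gives memory back.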

Can you please confirm that even after major collections, the heap usage remains at 90%?  If you grab a live histogram from an instance while it is stuck at 90% e.g. via:

jmap -histo:live <pid>

Do you see Object arrays as a top consumer?

Thanks,
Drew.

Darren Bathgate

Apr 2, 2015, 3:35:35 PM
to netfli...@googlegroups.com
Thanks Drew, this is very helpful information.

I believe I had the order of events wrong in my code.
I was calling releaseObjectArrays before filling my new map with data.

So instead of:
HeapFriendlyMapArrayRecycler.get().swapCycleObjectArrays();

HeapFriendlyHashMap<K, V> newAccessMap = new HeapFriendlyHashMap<>(objectCount);

// fill new access map

accessMap.releaseObjectArrays();

HeapFriendlyMapArrayRecycler.get().clearNextCycleObjectArrays();

I was doing:
HeapFriendlyMapArrayRecycler.get().swapCycleObjectArrays();

HeapFriendlyHashMap<K, V> newAccessMap = new HeapFriendlyHashMap<>(objectCount);

accessMap.releaseObjectArrays();

// fill new access map

HeapFriendlyMapArrayRecycler.get().clearNextCycleObjectArrays();

So just to confirm, creating multiple HeapFriendlyHashMaps is fine as long as this pattern is followed?

Map 1:
--
HeapFriendlyMapArrayRecycler.get().swapCycleObjectArrays();

HeapFriendlyHashMap<K, V> newAccessMap1 = new HeapFriendlyHashMap<>(objectCount);

// fill new access map #1

accessMap1.releaseObjectArrays();

HeapFriendlyMapArrayRecycler.get().clearNextCycleObjectArrays();
accessMap1 = newAccessMap1;

Map 2:
--
HeapFriendlyMapArrayRecycler.get().swapCycleObjectArrays();

HeapFriendlyHashMap<K, V> newAccessMap2 = new HeapFriendlyHashMap<>(objectCount);

// fill new access map #2

accessMap2.releaseObjectArrays();

HeapFriendlyMapArrayRecycler.get().clearNextCycleObjectArrays();
accessMap2 = newAccessMap2;

Map 3:
--
HeapFriendlyMapArrayRecycler.get().swapCycleObjectArrays();

HeapFriendlyHashMap<K, V> newAccessMap3 = new HeapFriendlyHashMap<>(objectCount);

// fill new access map #3

accessMap3.releaseObjectArrays();

HeapFriendlyMapArrayRecycler.get().clearNextCycleObjectArrays();
accessMap3 = newAccessMap3;

Drew Koszewnik

Apr 2, 2015, 4:09:29 PM
to netfli...@googlegroups.com
Hi Darren,

I don't think that ordering actually matters.  Calling .releaseObjectArrays() returns the arrays to the pool, but only the interactions with the HeapFriendlyMapArrayRecycler will modify them or make them available for subsequent cycles.

I noticed that you're wrapping the replacement of each map between the calls to the HeapFriendlyMapArrayRecycler.  Instead, you probably want to wrap the replacement of all maps between the calls to HeapFriendlyMapArrayRecycler (e.g. you want to only call each method once per refresh, not once per map per refresh).  Here's why:  calling HeapFriendlyMapArrayRecycler.get().clearNextCycleObjectArrays() is akin to calling map.clear() on all previously released maps.  If you do this before replacing the old accessMap, there will be a short period of time where the data is inaccessible.  For the same reason, you'll want to make sure your call to clearNextCycleObjectArrays() happens after all application threads are done with the old maps.  Alternatively, if knowing all threads are done with the old maps isn't feasible, you can use a volatile reference for the access maps, read from them eagerly, then do a check before returning the result to ensure that the maps from which you read are still the active maps.  Let me know if you'd like to see an example of the latter pattern.

However, this usage isn't likely to result in the behavior you described in your original post either.
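
The volatile-reference alternative mentioned above could be sketched roughly as follows. This is a minimal illustration under assumptions, not code from Zeno: ActiveMapReader and publish are hypothetical names, and a plain java.util.Map stands in for HeapFriendlyHashMap so the example is self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder: readers never block, they re-read if a swap happened mid-read.
public class ActiveMapReader<K, V> {
    // The currently active map; replaced as a whole on each refresh.
    private volatile Map<K, V> active;

    public ActiveMapReader(Map<K, V> initial) {
        this.active = initial;
    }

    // Called by the refresh thread once the replacement map is fully populated,
    // and before the old map's arrays are cleared.
    public void publish(Map<K, V> replacement) {
        this.active = replacement;
    }

    public V get(K key) {
        while (true) {
            Map<K, V> snapshot = active;   // remember which map we read from
            V value = snapshot.get(key);   // read eagerly
            if (snapshot == active) {
                return value;              // map was still active after the read
            }
            // The map was replaced during the read, so the value may have come
            // from a map that is being cleared; retry against the new map.
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new HashMap<>();
        first.put("a", 1);
        ActiveMapReader<String, Integer> reader = new ActiveMapReader<>(first);
        System.out.println(reader.get("a"));

        Map<String, Integer> second = new HashMap<>();
        second.put("a", 2);
        reader.publish(second);
        System.out.println(reader.get("a"));
    }
}
```

The check after the read only helps if the refresh thread publishes the new map before clearing the old arrays, which matches the ordering recommended above.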

Thanks,
Drew.

Darren Bathgate

Apr 2, 2015, 5:17:39 PM
to netfli...@googlegroups.com
So if I understand correctly, the implementation should look like this instead?


HeapFriendlyMapArrayRecycler.get().swapCycleObjectArrays();

HeapFriendlyHashMap<K, V> newAccessMap1 = new HeapFriendlyHashMap<>(objectCount);
HeapFriendlyHashMap<K, V> newAccessMap2 = new HeapFriendlyHashMap<>(objectCount);
HeapFriendlyHashMap<K, V> newAccessMap3 = new HeapFriendlyHashMap<>(objectCount);


// fill new access map #1
// fill new access map #2
// fill new access map #3


accessMap1.releaseObjectArrays();
accessMap2.releaseObjectArrays();
accessMap3.releaseObjectArrays();


accessMap1 = newAccessMap1;
accessMap2 = newAccessMap2;
accessMap3 = newAccessMap3;


HeapFriendlyMapArrayRecycler.get().clearNextCycleObjectArrays();

Darren Bathgate

Apr 2, 2015, 7:16:33 PM
to netfli...@googlegroups.com
Drew,

I am curious to see your thoughts on this pull request.

I am finding myself in a difficult situation with having to refresh multiple hash maps in a single cycle.

I noticed that HeapFriendlyMapArrayRecycler already had a public constructor, so I thought being able to pass in a separate instance of recycler to HeapFriendlyHashMap would be beneficial in my case.

Drew Koszewnik

Apr 2, 2015, 8:06:04 PM
to netfli...@googlegroups.com
Hi Darren,

First, yes your earlier post has the right idea.

Second, I merged the pull request. I wasn't a fan of the HeapFriendlyMapArrayRecycler only being available via a static reference -- this is a good improvement, thank you.

I'll let you know when this gets pushed to Maven Central.

Thanks again,
Drew.

Darren Bathgate

Apr 2, 2015, 8:59:41 PM
to netfli...@googlegroups.com
Thanks for merging, Drew.

I wasn't sure if backwards compatibility was important in this case, which is why I kept the original constructor of HeapFriendlyHashMap referencing the singleton instance of HeapFriendlyMapArrayRecycler.

This will definitely save me from having to do a bunch of refactoring!


--Darren

Darren Bathgate

Apr 6, 2015, 11:04:04 AM
to netfli...@googlegroups.com
Drew,

We still seem to be having some issues with garbage collection even after the HeapFriendlyMapArrayRecycler change.
The climb to 90% heap space is much slower now though, taking 2 days before reaching the peak again (see attached chart).

Also, I made one more pull request to clean up the remaining static references to HeapFriendlyMapArrayRecycler.

We have been using the CMS collector (-XX:+UseConcMarkSweepGC).
I've decided to switch to the G1 collector (-XX:+UseG1GC) to see how it behaves over the next few days.

I've also turned on Java 8's string deduplication feature (-XX:+UseStringDeduplication).
This seems to have lowered the average heap usage.


Do you have a preference between the CMS and G1 collectors for Zeno apps?

--Darren
heap_usage.png

Jon Stockdill

Apr 16, 2015, 8:37:58 AM
to netfli...@googlegroups.com
Darren, did you figure this one out?

Darren Bathgate

Apr 16, 2015, 10:07:05 PM
to netfli...@googlegroups.com
The issue I was having seems to have still been improper use of HeapFriendlyHashMap.

Adding the Recycler as a constructor argument to HeapFriendlyHashMap helped quite a bit, but I still found myself referencing the same recycler for multiple instances.
I found that the best approach is to not reference the recycler externally, and instead implement a new HashMap that manages the recycler internally.

I've created a pull request for a new map named PhasedHeapFriendlyHashMap.

This map eliminates improper use by internally maintaining its own recycler, and provides methods for switching in and out of a data swap phase.

Starting a data swap phase on a PhasedHeapFriendlyHashMap swaps the object arrays and allows data to be inserted via put.
Ending the swap phase makes the newly inserted data available, releases the previous object arrays, and prepares the recycler for the next cycle.
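
The phase idea can be shown with a small stand-alone sketch. This is not the code from the pull request: ToyPhasedMap, beginSwapPhase, and endSwapPhase are hypothetical names, plain HashMaps stand in for the recycled object arrays, and rejecting put outside a swap phase is an assumption of this toy.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of a map with an explicit data swap phase.
public class ToyPhasedMap<K, V> {
    // Data visible to readers.
    private volatile Map<K, V> live = new HashMap<>();
    // Data being loaded; non-null only while a swap phase is open.
    private Map<K, V> staging;

    // Open a swap phase: from here on, put() fills the staging data.
    public void beginSwapPhase() {
        if (staging != null) {
            throw new IllegalStateException("swap phase already in progress");
        }
        staging = new HashMap<>();
    }

    // Inserts are accepted only during a swap phase (assumption of this sketch).
    public void put(K key, V value) {
        if (staging == null) {
            throw new IllegalStateException("put is only allowed during a swap phase");
        }
        staging.put(key, value);
    }

    // Close the phase: staged data becomes visible, previous data is dropped.
    public void endSwapPhase() {
        if (staging == null) {
            throw new IllegalStateException("no swap phase in progress");
        }
        live = staging;
        staging = null;
    }

    // Readers always see the last completed phase.
    public V get(K key) {
        return live.get(key);
    }

    public static void main(String[] args) {
        ToyPhasedMap<String, String> map = new ToyPhasedMap<>();
        map.beginSwapPhase();
        map.put("k", "v1");
        System.out.println("during phase: " + map.get("k"));
        map.endSwapPhase();
        System.out.println("after phase: " + map.get("k"));
    }
}
```

Because callers only ever open and close a phase, there is no way to call the recycler steps in the wrong order or share one recycler across unrelated maps by accident.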

I have so far had the most success with this implementation.