Sizing caches by memory size instead of count

4,375 views
Skip to first unread message

Blair Zajac

unread,
Nov 30, 2011, 7:59:05 PM11/30/11
to guava-...@googlegroups.com
Just watched Bob's talk on Weak/Soft/Phantom References [1][2] and he
made a point about not using soft references, which we do in our
middle-tier app. We used to have an maximum sized single threaded LRU
cache before Google Collections existed, but I liked the idea of not
having to worry about the amount of memory to give to the cache since we
have systems with varying amounts of memory, so we switched to soft
reference cache instead with Google Collections.

After listening to the talk, it would be useful if Guava's cache could
be told the maximum size in memory to consume instead of the number of
elements. So I could say, this Java process is given 32 GB of memory,
I'll give the cache 512 MB.

This would make me feel better than just pulling a number out of the air
or manually sizing the cache by trying to find the object size.

Is this something that could be added to Guava? If not, I recall seeing
a page showing the memory usage of MapMaper's maps with varying settings
(softValues, softKeys, etc), but cannot find it now. This page seems to
have used some code that can determine how large a Java object is that I
could use; anybody know this project?

Blair

[1] http://crazybobs-talks.googlecode.com/svn/trunk/out/references.pdf
[2] http://www.parleys.com/#st=5&id=2657&sl=53

Tim Peierls

unread,
Nov 30, 2011, 8:05:27 PM11/30/11
to Blair Zajac, guava-...@googlegroups.com
What easily computed criteria could Guava use to determine the working size of the cache? You probably have a better sense of the average size of an application-specific entry in the cache than Guava does.

--tim

Charles Fry

unread,
Nov 30, 2011, 8:10:29 PM11/30/11
to Blair Zajac, guava-...@googlegroups.com
You could implement the logical equivalent of this using Guava 11.0's CacheBuilder.maximumWeight:


Charles

Dimitris Andreou

unread,
Nov 30, 2011, 8:11:05 PM11/30/11
to Blair Zajac, guava-...@googlegroups.com
I'm pretty confident that this can never happen in a general library
like this. Given a reference (i.e. the root to an object graph), there
is no unique answer of "this is how much memory the object graph
takes" in general, since you don't know what nodes of the graph are
really "owned" by the root object (they might be shared stuff instead,
which could lead to even more foreign objects). Let alone that, even
if you were willing to give a fine-grained predicate of "what parts of
the object graphs constitute the memory cost of it", it is very
expensive to measure the memory footprint of an object graph.

A weight is really the only option.

Blair Zajac

unread,
Dec 1, 2011, 3:46:37 AM12/1/11
to Dimitris Andreou, guava-...@googlegroups.com
I should have stated that it's a cache of protobuf like objects
(actually ZeroC Ice objects) from a remote server, so there isn't any
object sharing, maybe except for String objects. I'm fine with
knowing an rough estimate of an average size for sizing the cache, it
doesn't have to be exactly correct.

I don't care of one object is 50% larger than another one, so it
doesn't look like a weights will help solve the problem, as I still
need to know how much memory a single cached object will take. I'm
thinking I would effectively give each object the same weight.

Blair

Blair Zajac

unread,
Dec 1, 2011, 3:50:50 AM12/1/11
to Tim Peierls, guava-...@googlegroups.com
I'm hoping there's a library out there that can give you the size,
then I could build a representative one to size the cache.

Judging from what people have said, the web page showing the sizes the
different MapMaker maps was manually done?

Blair

Dimitris Andreou

unread,
Dec 1, 2011, 4:08:41 AM12/1/11
to Blair Zajac, guava-...@googlegroups.com
On Thu, Dec 1, 2011 at 12:46 AM, Blair Zajac <bl...@orcaware.com> wrote:
> I should have stated that it's a cache of protobuf like objects (actually
> ZeroC Ice objects) from a remote server, so there isn't any object sharing,
> maybe except for String objects.  I'm fine with knowing an rough estimate of
> an average size for sizing the cache, it doesn't have to be exactly correct.

Well, sounds like you have a relatively good estimate, the length of
the binary representation you get from the server.

>
> I don't care of one object is 50% larger than another one, so it doesn't
> look like a weights will help solve the problem, as I still need to know how
> much memory a single cached object will take.  I'm thinking I would
> effectively give each object the same weight.

I don't know about ZeroC, but for protobufs, the ratio is about this:
for N bytes of binary representation, you need ~10N memory footprint
for the corresponding object graph in java. I wouldn't expect ZeroC to
differ by much.

Dimitris Andreou

unread,
Dec 1, 2011, 4:13:19 AM12/1/11
to Blair Zajac, Tim Peierls, guava-...@googlegroups.com


On Thu, Dec 1, 2011 at 12:50 AM, Blair Zajac <bl...@orcaware.com> wrote:
> I'm hoping there's a library out there that can give you the size, then I
> could build a representative one to size the cache.
>
> Judging from what people have said, the web page showing the sizes the
> different MapMaker maps was manually done?

If you are talking about this page: http://code.google.com/p/memory-measurer/wiki/ElementCostInDataStructures, then no, it wasn't a manual process, I used the tool residing in that project. (I'm hoping to repackage it as part of caliper sooner than later.)

But that traverses reflectively the object graph, which is really expensive, not something you would want to be a part of your application, but of a benchmark, at most.
Reply all
Reply to author
Forward
0 new messages