Mysterious ghost object retrieval - no longer using hibernate-memcached in production

91 views
Skip to first unread message

Jason Buberel

unread,
Oct 10, 2009, 11:22:38 AM10/10/09
to hibernate-memcached
This is one of those stories about an impossibly hard to reproduce
error condition. It baffled me for quite a while, after which I gave
up and reverted to using EHCache for our hibernate object cache. I was
using he latest production versions of all components (hibernate-
memcached, spymemcache with binary protocol support, and memcached).

Our application used spymemcache+memcache in two ways: Directly and
via hibernate-memcached. The problem I will describe below never
affected our application in areas where we used just spymemcache
+memcache. We continue to use this combination in production without
problems.

Where problems occurred: Part of our application is a very long
running computation task. This is being run on a 64-bit CentOS5 server
with JDK1.6.0_16. The -Xmx is set to 8GB, and it will use every bit of
that over the course of several hours. The output of this task is what
we called a 'CitySummary' object - a large (hundreds of private fields/
members) that stores statistics data. We have a custom primary key
class - CitySummaryPK - that is composed of:

private Long mCityMasterId;
private Long mZipMasterId;
private String mState;
private ResidenceType mResidenceType;
private String mQuartile;
private String mDate;
private RollingAverage mRollingAverage;

Where ResidenceType and RollingAverage are simple beans, all
serializable. We use thse PKs to retrieve rows from our CitySummary
table, of course. After switching from EHCache to hibernate-memcached
+spycache, we started getting very, very rare errors: Our
CItySummaryPK's were retrieving the wrong CitySummary objects!!

For example, we would construct a CitySummaryPK for the combination of
"Sunnyvale-94086-CA-Single_Family-
Top_Quartile-2009-09-25-7_day_Average". We would then use hibernate's
session.load() method to retrieve the corresponding CitySummary
instance.

But much to our surprise, the CitySummary object returned was for a
DIFFERENT city or ZIP code - one that had been loaded earlier in the
statistics run. We process these statistics in order of State/City/ZIP
- and when the error occurred it was always the case that the
CitySummary object returned was one that had been loaded a few minutes
earlier. We were using a 900 second cache expiration:

<property name="hibernate.memcached.cacheTimeSeconds">900</property>

This would only happen very sporadically, and only on the largest of
our statistics tasks. As soon as we reverted back to EHCache, the
problem disappeared.

At the time, I wasn't savvy enough to configure an alternative key
strategy, and so we were using the default key strategy:
HashCodeKeyStrategy. Note that we'd never seen a problem with
conflicting hashcodes when using these CitySummaryPK objects to
retrieve CitySummary objects.

Q1: What is the suggested alternative key strategy for situations like
this/ SHA1 vs. String vs. Custom?
Q2: Even using the HashCodeKeyStrategy, what would cause this
condition to occur?

I am willing to give a different key strategy a try. For reference,
the hashCode method for our CitySummaryPK object is:

public int hashCode() {
int result = getCityMasterId().hashCode();
result = 31 * result + getZipMasterId().hashCode();
result = 31 * result + getState().hashCode();
result = 31 * result + getResidenceType().hashCode();
result = 31 * result + getQuartile().hashCode();
result = 31 * result + getDate().hashCode();
result = 31 * result + getRollingAverage().hashCode();
return result;
}

Thanks,
Jason

Jason Buberel

unread,
Oct 10, 2009, 1:13:43 PM10/10/09
to hibernate-memcached
One thought: Perhaps the hashCode() method on java.lang.Long is just
not 'unique enough':

return (int)(value ^ (value >>> 32));

I'll have to run a few tests, but because the conflicting keys were
always the wrong city/zip (not a mismatch of the other key components)
it makes sense that perhaps the conflicts could be caused by
Long.hashCode() returning identical values for dissimilar numbers.

Hmmm...

-jason

Jason Buberel

unread,
Oct 10, 2009, 1:26:29 PM10/10/09
to hibernate-memcached
Upon reading through the code, the other key strategies all seem to
rely on the 'concatenateKey()' method of AbstractKeyStrategy, which in
turn relies on the 'toString()' method of the key object returning a
unique value:

.append(regionName)
.append(":")
.append(clearIndex)
.append(":")
.append(String.valueOf(key)).toString();

So to use this key strategy for our entire deployment would require
that I revisit each toString() method and ensure it's uniqueness. Not
an appealing project :)

I'm starting to think a custom key strategy is going to be called for
- one that acts like hashcode but perhaps returns an MD5 or SHA1 hash
of all relevant member fields instead of returning an int value.

-jason

Ray Krueger

unread,
Oct 10, 2009, 1:53:57 PM10/10/09
to hibernate...@googlegroups.com
Yuck, composite keys.
If the hashcode key strategy isn't working you have a problem with
your hascodes. You're probably having problems with ehcache too.

-Ray
(blame the iPhone for typos)

Jason Buberel

unread,
Oct 10, 2009, 3:13:08 PM10/10/09
to hibernate-memcached
That was my initial thought too - but this problem never once occurred
while using EHCache, and has not recurred since switching back to
EHCache. It is only reproducible when using hibernate-memcached +
spymemcache + memcached.

So there must be some problem unique to that combination that results
in this problem. EHCache + hibernate have never shown this problem.

-jason

Ray Krueger

unread,
Oct 10, 2009, 3:41:46 PM10/10/09
to hibernate...@googlegroups.com
I'm back at home and took a peek at the code.
The Md5 and Sha1 key strategies extend the HashCodeKeyStrategy.
The HashCodeKeyStrategy uses the hashcode of the key object as the cache key.
The rest of the key logic, region name handling and what not, is in
the AbstractKeyStrategy.
Which implements that as...
String keyString = concatenateKey(regionName, clearIndex,
transformKeyObject(key));

The transformKeyObject method in HashCodeKeyStrategy gets the
hashCode, then turns it into a string so it can easily be combined
with the other key elements.
If you look closer at AbstractKeyStrategy it is not using the
"toString()" of your object, it is actually calling toString() on the
result of "transformKeyObject", which as I said, is the hashCode of
the key object. Granted, it's always calling String.valueOf on a
String, that's silly, but not a problem.

You're going to have to do some debugging to figure out how you're
getting duplicate keys. The hibernate-memcached keys are a combination
of values (region name, clearIndex and hashCode). Which in your case
probably looks like "CitySummaryPK.0.12345".
Unless this key gets over 255 characters it will be used as the cache
key, if it is too long it will be hashed automatically.

I've gone into a bit too much detail probably, but my point is that
this whole thing revolves around the hashCode returned from the key
object. Also, be sure to realize that the key Object hibernate gives
me is an instance of org.hibernate...CacheKey, not your object
directly.

Seems like you're going to have to bust out the debugger on this one.
The fun part is figuring out what object you asked for, and what
object you actually got. Then from there you can look at the hashCode
those objects generate and see if they're coming up the same.

And don't ever use composite keys :P Sorry, had to slip that in there :)

Jason Buberel

unread,
Oct 11, 2009, 12:59:54 AM10/11/09
to hibernate-memcached
This is now getting scary - consider this code:

CitySummaryPK cspk2 = new CitySummaryPK("CA", 2023L, "2009-09-11",
QuartileName.ALL, mf, RollingAverage.SEVEN_DAY, 1188L);
LOGGER.info("CSPK2 hashcode: " + cspk2.hashCode());
Thread.sleep(5000);
LOGGER.info("CSPK2 hashcode: " + cspk2.hashCode());

Run it once, and it produces the following output:

2009-10-10 21:52:06 INFO main - CSPK2 hashcode: 590201433
2009-10-10 21:52:06 INFO main - CSPK2 hashcode: 590201433
2009-10-10 21:52:11 INFO main - CSPK2 hashcode: 590201433
2009-10-10 21:52:11 INFO main - CSPK2 hashcode: 590201433

Then wait a few seconds, and run it again:

2009-10-10 21:53:05 INFO main - CSPK2 hashcode: 596433974
2009-10-10 21:53:05 INFO main - CSPK2 hashcode: 596433974
2009-10-10 21:53:10 INFO main - CSPK2 hashcode: 596433974
2009-10-10 21:53:10 INFO main - CSPK2 hashcode: 596433974

It seems as though each JVM instance will produce a new/unique value
for the hashCode for an otherwise identical component. I have traced
this down to one of the Enum members of the class, which uses the
JVM's native 'hashCode()' method:


/**
* Returns a hash code for this enum constant.
*
* @return a hash code for this enum constant.
*/
public final int hashCode() {
return super.hashCode();
}

which in turn invokes Object.hashCode():

public native int hashCode();

So when running this across JVMs in a multi-server configuration with
memcached, we are getting different (possibly colliding) hashKey()
values for a pair of identical keys.

This is not good. Not good at all. If I replace:

result = result * 31 * myEnum.hashCode()

with:

result = result * 31 * mEnum.name().hashCode()

The results are consistent across JVM invocations.

This is gonna need some fixing, and fast. Note to self: Do not use
Enum.hashCode on any object stored in memcached when memcached servers
are shared across JVMs.

Ray Krueger

unread,
Oct 11, 2009, 9:04:14 AM10/11/09
to hibernate...@googlegroups.com
Are you overriding hashcode on the enum or is that code from the JDK?

Enum instances are created by the jvm at runtime. Which means they use
Object.hashcode. You can't use that, it will be different with every
startup. Override hashcode on your enums. Use the ordinal value or name
().hashcode.




-Ray
(blame the iPhone for typos)

Ray Krueger

unread,
Oct 11, 2009, 11:09:45 AM10/11/09
to hibernate...@googlegroups.com
OK after looking closer at the email I see you came to the same
conclusion about the Enum.hashCode() problem. In fact, many others
have come to the same conclusion... http://bit.ly/4ooBQr

You're seeing it with memcached vs EHCache probably for the reasons
you mentioned. If you're limiting EHCache to 500 instances, how many
instances do you really end up storing in memcached when there is no
limit? The more instances you're storing the higher your chance of
running into this hashcode collision from your code.

The fix for this does not lie in hibernate-memcached or spymemcached,
the fix has to be in your code. Unfortunately due to the needs of the
equals() method the hashcode() method is marked as final in
Java.lang.Enum. You can see the explanation there in the first link
for the bug opened with Sun. You've already found the answer to the
problem when you started using enum.name() to prove the problem with
Enum hashcodes.

Honestly it's pretty astounding that they chose to do that with Enums,
it's amazingly lame. You could possibly implement a "safeHashCode()"
method on your enums that returns name().hashcode().

public enum SomeEnum {

ONE,
TWO;

private final int safeHashCode;

SomeEnum() {
this.safeHashCode = name().hashCode();
}


public int safeHashCode() {
return safeHashCode;

Jason Buberel

unread,
Oct 11, 2009, 11:34:36 AM10/11/09
to hibernate...@googlegroups.com
<nods> Yep, it is final. No overrides, and you'll like it!

For now, simply replaced every invocation of OneOfMyEnumClasses.hashCode() with OneOfMyEnumClasses.name().hashCode()

Hard to say for how long this has been affecting us, and to what extent. Quite scary really. But all gone now. I also disagree with their implementation as it makes the assumption that different JVMs will never interact at runtime. A very bad assumption.

-jason
--
Jason L. Buberel
ja...@buberel.org

Ray Krueger

unread,
Oct 11, 2009, 11:49:17 AM10/11/09
to hibernate...@googlegroups.com
Yeah, it's even worse :P
1.6.0_14 Enum.hashcode() on my ubuntu box returns something random every time.
1.6.0_16 Enum.hashcode() on my Macbook Pro here returns a consistent
hashcode on every round.

1.6.0_14 on Ubuntu implements Enum.hashCode() using super.hashCode()
1.6.0_16 on the Mac implements Enum.hashCode() using System.identityHashCode()

I'm updating java on my ubuntu box to see if 1.6.0_16 changes that
implementation.
It'd be interesting if Apple fixed it and Sun didn't.

Ray Krueger

unread,
Oct 11, 2009, 11:51:54 AM10/11/09
to hibernate...@googlegroups.com
Sorry, that didn't read right...

On Sun, Oct 11, 2009 at 10:49 AM, Ray Krueger <raykr...@gmail.com> wrote:
> Yeah, it's even worse :P
> 1.6.0_14 Enum.hashcode() on my ubuntu box returns something random every time.
> 1.6.0_16 Enum.hashcode() on my Macbook Pro here returns a consistent
> hashcode on every round.
>
> 1.6.0_14 on Ubuntu implements Enum.hashCode() using super.hashCode()
> 1.6.0_16 on the Mac implements Enum.hashCode() using System.identityHashCode()
>
> I'm updating java on my ubuntu box to see if 1.6.0_16 changes that
> implementation.
> It'd be interesting if Apple fixed it and Sun didn't.

The implementation of Enum.hashCode() is effectively the same (in
theory). It'd be interesting if Apple fixed Enums in the JVM native
code and Sun didn't.

Jason Buberel

unread,
Oct 11, 2009, 11:58:17 AM10/11/09
to hibernate...@googlegroups.com
1.6.0_16 on Win (from Sun distribution) also implements Enum.hashCode() as super.hashCode(). At least that doesn't return a random value every invocation - the value only changes when run in a new JVM instance.

FWIW, late last night I did restore the use of hibernate-memcached in our production cluster, and the stats processing ran completely without errors this time. All I did was replace all calls to Enum.hashCode() with Enum.name().hashCode() as a quick fix.  Hooray!

Onwards,
Jason

Ray Krueger

unread,
Oct 11, 2009, 11:59:59 AM10/11/09
to hibernate...@googlegroups.com
hahaha this is amazing.
OK I updated Ubuntu to 1.6.0_16 and it returns a consistent hashcode
*MOST* of the time lol.
I ran it like 10 times, and it returned the same hashcode like 80% of
the time. This seems more consistent with what you're seeing.

Let me try a larger sample on my MBP...
Amazing. Yep, the apple JDK returns the same value every single time.

Ray Krueger

unread,
Oct 11, 2009, 12:00:47 PM10/11/09
to hibernate...@googlegroups.com
> 1.6.0_16 on Win (from Sun distribution) also implements Enum.hashCode() as
> super.hashCode(). At least that doesn't return a random value every
> invocation - the value only changes when run in a new JVM instance.

Yeah, that's what I'm doing it, starting a new instance over and over.

Ray Krueger

unread,
Oct 11, 2009, 12:07:53 PM10/11/09
to hibernate...@googlegroups.com
> FWIW, late last night I did restore the use of hibernate-memcached in our
> production cluster, and the stats processing ran completely without errors
> this time. All I did was replace all calls to Enum.hashCode() with
> Enum.name().hashCode() as a quick fix.  Hooray!
>

You might get better performance by pre-calculating the hashcode in
the Enum constructor as in my example, rather than calculating it on
every invocation.
Probably a trivial difference if there is one at all. The JVM can
probably optimize it out.
*shrug*

Glad we figured that out though, that's awful.

Ran Enrico Magen

unread,
Oct 11, 2009, 12:08:44 PM10/11/09
to hibernate...@googlegroups.com
Guys, I must say I've been following your discussion very closely (since we're also using hibernate-memcached in production, and naturally I followed to see if there's a problem I should be aware of) and your conclusions are nothing less than fascinating. :)

Jason Buberel

unread,
Oct 11, 2009, 12:09:02 PM10/11/09
to hibernate...@googlegroups.com
Probably - but for now, things work. I'll run it through the profiler later.

I also just added a comment to the project wiki to warn users that no keyed object should rely on Enum.hashCode() as part of its hashCode() implementation. Safer to avoid this completely until it has been fully fixed across all platforms (not holding breath).

-jason

Ray Krueger

unread,
Oct 11, 2009, 12:14:25 PM10/11/09
to hibernate...@googlegroups.com
> Guys, I must say I've been following your discussion very closely (since
> we're also using hibernate-memcached in production, and naturally I followed
> to see if there's a problem I should be aware of) and your conclusions are
> nothing less than fascinating. :)

Insanity! heh well hopefully you don't run into this. It should really
only come up if you're using composite key objects with Enums in their
value and calculating hashcodes.

In my experience I stopped implementing equals() and hashcode() on my
Hibernate domain objects a long time ago. It leads to nothing but
trouble. And of course, not bashing Jason here, don't ever use
composite keys unless you're stuck integrating with some legacy
database.

If your domain objects use surrogate IDs the hibernate Type system
will always use the ID as the hashcode() vaue of the CacheKey object.
On a composite key it seems it uses the real hashcode of the PK
object.

Hucmuc

unread,
Dec 5, 2009, 7:38:27 PM12/5/09
to hibernate-memcached
Hashcode is not guaranteed to return a unique value and one should not
assume this. It works for ehcache since since it uses a map to store
the values and uses equals when the hashcodes are the same.

Ray Krueger

unread,
Dec 5, 2009, 8:54:43 PM12/5/09
to hibernate...@googlegroups.com
Please read the entire thread. Hashcodes themselves were not the
problem.

-Ray
(blame the iPhone for typos)

> --
>
> You received this message because you are subscribed to the Google
> Groups "hibernate-memcached" group.
> To post to this group, send email to hibernate...@googlegroups.com
> .
> To unsubscribe from this group, send email to hibernate-memca...@googlegroups.com
> .
> For more options, visit this group at http://groups.google.com/group/hibernate-memcached?hl=en
> .
>
>
Reply all
Reply to author
Forward
0 new messages