The ShardResolutionStrategy


H.Taylor

Nov 21, 2008, 5:04:26 AM
to Hibernate Shards Dev
I have unfortunately encountered a major flaw in my design. I have an
idea of how to fix it, but sharing the issue may lead to a better
solution.

My data model consists of object hierarchies that all share a root
object, "Company". The business requirements call for sharding the data
by Company, so I have included a shard id in the persistent entity.

Every object in scope must have a Company object associated with it
according to predefined business rules, which by the nature of the
architecture are defined in the ShardSelectionStrategy, i.e.

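if (obj instanceof Company)
{
    // A Company carries its own shard id.
    id = ((Company) obj).getShardId();
}
if (obj instanceof Job)
{
    // A Job lives on the shard of its creator's Company.
    id = ((Job) obj).getCreator().getCompany().getShardId();
}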


So this enforces the following:
1) Every object must be associated with a Company
2) All objects in the hierarchy must be on the same shard.
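
A minimal self-contained sketch of the rule above. Only Company, Job, getCreator(), getCompany() and getShardId() come from the snippet; the User class, the CompanyShardSelector name and the exception for unknown types are assumptions made for illustration, and nothing here uses the Hibernate Shards API.

```java
// Hypothetical stand-ins for the entities; field layout is assumed.
class Company {
    private final int shardId;
    Company(int shardId) { this.shardId = shardId; }
    int getShardId() { return shardId; }
}

// Assumed type of Job.getCreator(); the post does not name it.
class User {
    private final Company company;
    User(Company company) { this.company = company; }
    Company getCompany() { return company; }
}

class Job {
    private final User creator;
    Job(User creator) { this.creator = creator; }
    User getCreator() { return creator; }
}

class CompanyShardSelector {
    // Returns the shard id of the Company that owns the given object.
    static int selectShardId(Object obj) {
        if (obj instanceof Company) {
            return ((Company) obj).getShardId();
        }
        if (obj instanceof Job) {
            return ((Job) obj).getCreator().getCompany().getShardId();
        }
        // Rule 1: an object with no route to a Company is rejected.
        throw new IllegalArgumentException(
                "No owning Company for " + obj.getClass().getName());
    }
}
```

Because every branch ends at a Company, a Job and its Company always map to the same shard id, which is rule 2.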


I have extended the AnnotationSessionFactoryBean to create a
"ShardedAnnotationSessionFactory" so I can add shard configurations
into the application in the applicationContext.xml

It is flexible in the sense that I can store multiple Companies on one
shard or, if single DB tenancy is required, I can have one Company on a
shard.

At present I have 4 shards configured with multiple Companies on each
shard. I have a Hibernate Search Index which spans all the shards and
I have created a ShardedFullTextSession and Query implementation which
is a bit of a hack, and beyond the scope of this post.

So at present I have multiple shards (MySQL), a single index using
Hibernate Search, and a DAO design pattern with Hibernate
implementations, all within a Spring MVC web application.

So what's working is search, retrieval and persistence of a transient
object, but not persistence of a persistent object (in layman's terms,
I can create a new object but not update an existing one).


The problem is that the default ShardResolutionStrategy cannot resolve
which shard the object resides on.
This is the AllShardsShardResolutionStrategy. After stepping through,
it appears to take the following steps:

a) Check each shard session for existence of the context object. In my
case it's not finding it in any of the sessions (I am not sure why).

b) Next it tries to use the shard resolution strategy to find it. It
resolves to 0, which is incidentally the id of my first shard, so if
the object exists on any of the other shards it won't persist and I
get a StaleStateException.
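
Steps a) and b) can be modelled roughly as below. This is a simplified simulation, not the ShardedSessionImpl code: SimShard and SimShardLookup are hypothetical names, a "session" is reduced to a set of tracked objects, and taking the first candidate from the strategy is an assumption that matches the behaviour I am seeing.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of a shard: its "session" is just the set of objects it tracks.
class SimShard {
    final int id;
    final Set<Object> sessionContents = new HashSet<>();
    SimShard(int id) { this.id = id; }
}

class SimShardLookup {
    // Step a): return the id of the shard whose session tracks obj, or -1 if none does.
    static int findInSessions(List<SimShard> shards, Object obj) {
        for (SimShard shard : shards) {
            if (shard.sessionContents.contains(obj)) {
                return shard.id;
            }
        }
        return -1;
    }

    // Step b): if no session tracks obj, fall back to the strategy's candidates
    // and take the first one (assumed behaviour for this sketch).
    static int resolve(List<SimShard> shards, Object obj, List<Integer> strategyCandidates) {
        int found = findInSessions(shards, obj);
        return found >= 0 ? found : strategyCandidates.get(0);
    }
}
```

With candidates 0, 1, 2, 3 and an object that no session tracks, resolve() returns 0 regardless of where the row really lives, which is consistent with the update hitting the wrong shard.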

Once I had established this I thought I could just write a new
ShardResolutionStrategy which would work along the same lines as my
ShardSelectionStrategy.

With the present framework, as far as I can see, I can't do that
because the context object is not in scope: the
ShardResolutionStrategyData only contains the name and the id of the
entity.

Let's say, for instance, that ShardResolutionStrategyData is modified
to hold a context object (if one exists). Do you think this would
work? It should not affect the operation of any other resolution
strategy, and the resolution strategy I intend to write will fall back
to the AllShardsShardResolutionStrategy if a context object does not
exist in the ShardResolutionStrategyData. This should give the
resolution strategy the versatility required.

I am prototyping so everything is up in the air so any comments or
criticism will be welcome.





Max Ross

Nov 28, 2008, 12:10:04 AM
to hibernate-...@googlegroups.com
Thanks for the thorough write-up. We've definitely considered extending the ShardResolutionStrategy interface to allow more information to be provided. The issue is that we use the resolution strategy in scenarios where the only info we have about what we're trying to find is the pk and the class, and if we're asking developers to implement different logic based on how much data is provided, I think that's an indication that we don't have a specific enough api. I'd rather consider adding another component to the ShardStrategy that is invoked for update/merge/replicate.

That said, before we consider a change of that magnitude I think we need to figure out why your object is not being found in any of the existing sessions. If you've looked it up before you updated it, it really should be there. My advice would be to write a small testcase that establishes a txn, looks up your object using get(), modifies a property on the object, and then calls save. Step through ShardedSessionImpl.getShardForObject() and into the shard.getSession().contains() call and see what's there. Perhaps you have an issue with your equals() or hashCode() impl? The object that was returned from get() _should_ reside in one of those Sessions.

Max

H.Taylor

Nov 28, 2008, 4:43:46 AM
to Hibernate Shards Dev

Hi Max,

Thanks for your reply.

I am in agreement with your analysis, above all that the object should
exist in one of the sessions.

In the meantime I implemented what I described above, and it appears
to work pretty well.

I added the context object to ShardResolutionStrategyData and added a
constructor to take this object.


In determineShardsObjectViaResolutionStrategy I pass the reference to
the context object into the ShardResolutionStrategyData instance via
the constructor.

My ResolutionStrategy tests for this object and, if it is null, calls
the superclass (AllShardsShardResolutionStrategy) method
selectShardIdsFromShardResolutionStrategyData.
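
In outline the change looks like the sketch below. It is a standalone model rather than my actual patch: ResolutionData, AllShardsResolver, ContextAwareResolver and ShardAware are hypothetical names standing in for ShardResolutionStrategyData, AllShardsShardResolutionStrategy, my strategy and my entities, and plain Integer ids replace the framework's shard id type.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker for objects that can report their shard (e.g. via their Company).
interface ShardAware {
    int getShardId();
}

// Models the resolution data: entity name and id, plus an optional context object.
class ResolutionData {
    final String entityName;
    final Serializable id;
    final Object contextObject; // null when only name and id are known

    ResolutionData(String entityName, Serializable id) {
        this(entityName, id, null);
    }

    ResolutionData(String entityName, Serializable id, Object contextObject) {
        this.entityName = entityName;
        this.id = id;
        this.contextObject = contextObject;
    }
}

// Models the default behaviour: every shard is a candidate.
class AllShardsResolver {
    private final List<Integer> allShardIds;

    AllShardsResolver(List<Integer> allShardIds) {
        this.allShardIds = new ArrayList<>(allShardIds);
    }

    List<Integer> selectShardIds(ResolutionData data) {
        return Collections.unmodifiableList(allShardIds);
    }
}

// Narrows to one shard when a usable context object is present, else defers to the superclass.
class ContextAwareResolver extends AllShardsResolver {
    ContextAwareResolver(List<Integer> allShardIds) {
        super(allShardIds);
    }

    @Override
    List<Integer> selectShardIds(ResolutionData data) {
        if (data.contextObject instanceof ShardAware) {
            return Collections.singletonList(((ShardAware) data.contextObject).getShardId());
        }
        return super.selectShardIds(data);
    }
}
```

Callers that only know the entity name and id keep using the two-argument constructor and get the all-shards behaviour unchanged.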

This approach seems to be "minimally invasive". It is by no means the
best from a design standpoint, but it works for my prototype and the
framework appears to hold up to a good hammering from my suite of
JMeter tests.

Having said that, I am going to take your points on board, namely:

1) Trying to establish why the object does not exist in any of the
sessions. It should, as you said. I am pretty sure it's something to do
with my implementation; it's just a case of finding out what.

2) If I cannot solve that, then looking at a better way to do what I
have done.

To be honest my goal will be 1) as then I can move back to your latest
release and I will have no integration issues if you add new features.

Max Ross

Nov 28, 2008, 9:44:55 AM
to hibernate-...@googlegroups.com
Sounds very reasonable.  Let me know if you need more assistance, and when (let's be optimistic) you figure out why your object isn't being found in any of the Sessions please post again, I'm curious to know what's going on.

Thanks,
Max

H.Taylor

Dec 9, 2008, 10:54:13 AM
to Hibernate Shards Dev
Hi Max

I have spent a bit of time on this and I have a question.

I will step through the code. I am debugging an update operation:

Ln 788 ShardedSessionImpl.applySaveOrUpdateOperation()

If I understand what is going on here, for the purposes of
optimization you are looking in the session for the object (which in
theory should be cached in the session associated with the shard it is
located on). So from ln 788, the execution jumps to ln 1531,
getShardIdForObject, and this is where the problem originates.

In my case my "obj" resides on shard id 2 (I have assigned the shards
ids 0, 1, 2 and 3, so 4 in total), but it appears that the only shard
that has a session associated with it is the first shard in the list.

Consequently, in getShardForObject (ln 1485), since the session is null
for the other shards in the shardsToConsider list, it will not try to
find the object on the shard it exists on.

Once it cannot resolve the shard from the sessions, it uses the
resolution strategy, and then the merge and save logic if the shard
has still not been resolved.

So it appears the reason it is not finding the object in a session is
that, in the context of an update action, the sessions are not
associated with the shards in the ShardedSessionImpl, so those shards
are not considered during the iteration.

My next questions to resolve my issue would be:

Under these circumstances, would I expect every Shard to have a
session associated with it, or should I expect only the shard that
contains the object to have a session associated with it? I would
imagine all 4 shards should have associated sessions?

If I can clear that up, I can further my investigation in the
knowledge that it is my implementation and not an issue with the
framework.

Thanks

Max Ross

Dec 9, 2008, 2:55:05 PM
to hibernate-...@googlegroups.com
Very interesting. In order to conserve memory we avoid allocating Sessions until we absolutely need them, so there certainly are scenarios where a Shard might not have a Session associated with it. You mentioned 'merge.' Are you explicitly calling merge() on your object? If you were loading the object using ShardedSession 1 and then calling merge using ShardedSession 2, I think you would see the behavior you're describing, since the object would be associated with a Session that belonged to ShardedSession 1 (which is probably closed at this point) and not with a Session that belongs to ShardedSession 2.
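
To illustrate the lazy allocation, here is a rough standalone model. LazyShard and LazySession are hypothetical names for this sketch and not the Hibernate Shards classes; the point is only that a shard whose session has never been opened cannot answer "yes" to a contains check.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for a per-shard session; it only records which objects it has loaded.
class LazySession {
    private final Set<Object> loaded = new HashSet<>();
    void track(Object obj) { loaded.add(obj); }
    boolean contains(Object obj) { return loaded.contains(obj); }
}

class LazyShard {
    final int id;
    private LazySession session; // stays null until first use

    LazyShard(int id) { this.id = id; }

    // Returns the session only if one has already been opened; never allocates.
    LazySession getSession() { return session; }

    // Opens the session on first use and reuses it afterwards.
    LazySession establishSession() {
        if (session == null) {
            session = new LazySession();
        }
        return session;
    }

    // A shard without a session cannot report that it holds the object.
    boolean sessionContains(Object obj) {
        return session != null && session.contains(obj);
    }
}
```

An object loaded through one set of shards is invisible to a second, freshly created set of LazyShard instances, which is the two-ShardedSession situation described above.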

H.Taylor

Dec 10, 2008, 5:43:33 AM
to Hibernate Shards Dev
Hi Max

Thanks for your reply.

A few points:

a) Re merge functionality: I am not explicitly merging anything. I
was referring to the high-level functionality, as I understand it, in
the method applySaveOrUpdateOperation in ShardedSessionImpl, which
starts on line 805 and runs if the object has not been resolved in the
session or via the resolution strategy. Apologies for the confusion.

I think the technical issue here is that the session is not associated
with the shard when trying to resolve the object from the session.

In my opinion this raises the following points:

a) Under what circumstances does the shard release the session?

b) If this behavior is better from an overall performance perspective
(it seems it would be), then the Shard Resolution Strategy would be
used more heavily to determine the shard id. When we use the
resolution strategy to determine the shard, we only have the entity
name and id in context, so our resolution strategy would have to
retrieve the object and then update it (unless the object to update is
in the context of the ShardResolutionStrategy).

c) I am mindful of the fact that all this can be avoided if the PK has
the shard id encoded into it. However, these numbers are big and
unwieldy. But I suppose it is fair to say that using the
ShardedUUIDGenerator would solve any issues I have.
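
As an illustration of the idea in c), here is a sketch that packs a shard id into the top bits of a 64-bit key. The 16/48 bit split and the ShardEncodedId class are assumptions chosen for this example; this is not the layout ShardedUUIDGenerator uses.

```java
class ShardEncodedId {
    // Assumed layout: top 16 bits hold the shard id, low 48 bits hold a per-shard sequence.
    private static final int SEQUENCE_BITS = 48;
    private static final int MAX_SHARD_ID = (1 << 16) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    // Combines a shard id and a sequence number into one key.
    static long encode(int shardId, long sequence) {
        if (shardId < 0 || shardId > MAX_SHARD_ID) {
            throw new IllegalArgumentException("shardId out of range: " + shardId);
        }
        if (sequence < 0 || sequence > SEQUENCE_MASK) {
            throw new IllegalArgumentException("sequence out of range: " + sequence);
        }
        return ((long) shardId << SEQUENCE_BITS) | sequence;
    }

    // Recovers the shard id from the key alone; no session or context object is needed.
    static int extractShardId(long id) {
        return (int) (id >>> SEQUENCE_BITS);
    }

    // Recovers the per-shard sequence number.
    static long extractSequence(long id) {
        return id & SEQUENCE_MASK;
    }
}
```

For example, encode(2, 12345) gives 562949953433657, which shows both the benefit (the shard is recoverable from the key) and the cost (the keys are large).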


H