Session replication with Hazelcast


mongonix

Nov 28, 2011, 6:41:35 AM
to haze...@googlegroups.com
Hi,

I'm trying to implement a specific use case involving a stateful service. After I started a discussion here: https://groups.google.com/d/topic/hazelcast/TJIeAvMxNpk/discussion , I realized that the proposed solutions may work, but they feel more like a hack.

I looked around to see how other people implement concepts such as session replication and sticky sessions using distributed caches (please see the links below). Based on that, I have a number of questions:

1) Hazelcast WM implements HTTP session replication with configurable sticky-session support. Right now, the implementation is rather specific to HTTP sessions. But I think it could be generalized to support any kind of session, as long as it follows the same pattern, i.e. each session's data is a collection of key/value pairs that needs to be replicated among the serving nodes. This generic component then becomes reusable. It would always be combined with a technology-specific part (e.g. for the HTTP servlet stack) that hooks into the session mechanism used by the run-time platform.

Does this proposal make sense to you?
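
To make the split a bit more concrete, here is a minimal sketch of what I have in mind. All names (`GenericSessionStore`, `BoundSession`, the `::` key format) are made up for illustration and are not part of Hazelcast WM. A `ConcurrentHashMap` stands in for the distributed map; since Hazelcast's `IMap` implements `ConcurrentMap`, I assume it could be passed in instead.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Generic part: a session is only a set of key/value pairs in a shared map. */
final class GenericSessionStore {
    // One flat entry per attribute, keyed by "<sessionId>::<attributeName>".
    private final ConcurrentMap<String, Serializable> attributes;

    GenericSessionStore(ConcurrentMap<String, Serializable> attributes) {
        this.attributes = attributes;
    }

    private static String key(String sessionId, String name) {
        return sessionId + "::" + name;
    }

    void setAttribute(String sessionId, String name, Serializable value) {
        attributes.put(key(sessionId, name), value);
    }

    Serializable getAttribute(String sessionId, String name) {
        return attributes.get(key(sessionId, name));
    }

    /** Drops all attributes of one session (linear scan; good enough for a sketch). */
    void invalidate(String sessionId) {
        String prefix = key(sessionId, "");
        // Iterate over a copy of the keys so removal never disturbs the iteration.
        for (String k : new ArrayList<>(attributes.keySet())) {
            if (k.startsWith(prefix)) {
                attributes.remove(k);
            }
        }
    }

    public static void main(String[] args) {
        // ConcurrentHashMap is a local stand-in for a distributed map.
        GenericSessionStore store = new GenericSessionStore(new ConcurrentHashMap<>());
        BoundSession onNodeA = new BoundSession(store, "s1");
        BoundSession onNodeB = new BoundSession(store, "s1");
        onNodeA.set("user", "leo");
        System.out.println(onNodeB.get("user")); // read goes through the shared map
    }
}

/** Technology-specific part (hypothetical): what a servlet filter or protocol handler would wrap. */
final class BoundSession {
    private final GenericSessionStore store;
    private final String sessionId;

    BoundSession(GenericSessionStore store, String sessionId) {
        this.store = store;
        this.sessionId = sessionId;
    }

    void set(String name, Serializable value) {
        store.setAttribute(sessionId, name, value);
    }

    Serializable get(String name) {
        return store.getAttribute(sessionId, name);
    }
}
```

Only `BoundSession` would need to know about servlets (or any other protocol); the store itself stays reusable.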

2) JBoss Cache used to have something called TreeCache, which was essentially a distributed multi-level Map. The implementation really used a tree-like internal representation, IIRC. Later, when Infinispan was introduced, they provided an implementation of a TreeCache interface on top of their flat distributed Map implementation (and Hazelcast also uses a flat Map representation).

    Now, TreeCache allows for a very straightforward mapping of session-like concepts or hierarchical resources onto a distributed data structure. Therefore, it would be extremely useful to have such an abstraction in Hazelcast.

    Among other things, it would allow for a very straightforward implementation of (HTTP) session replication. An initial implementation of TreeCache on top of Hazelcast could take inspiration from Infinispan's implementation, I guess. Later, it could be implemented even more efficiently using low-level Hazelcast specifics.

What do you think?
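
Just to illustrate the idea of a tree view over a flat map, here is a rough sketch. It is neither the JBoss TreeCache API nor Infinispan's actual layout; the `FlatTree` class, its method names and the path format are my own assumptions. Node data and child names are kept as separate flat entries keyed by the node's path, and `ConcurrentHashMap` stands in for the distributed maps.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical tree view over flat maps. Paths look like "/sessions/s1" (leading slash, no trailing slash). */
final class FlatTree {
    // Flat entries keyed by path: node data and the names of direct children.
    private final ConcurrentMap<String, Map<String, String>> data = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, Set<String>> children = new ConcurrentHashMap<>();

    private static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }

    private static String nameOf(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Stores a key/value pair on a node, registering the node under all its ancestors. */
    void put(String path, String key, String value) {
        for (String p = path; !p.equals("/"); p = parentOf(p)) {
            Set<String> names = new TreeSet<>(getChildren(parentOf(p)));
            names.add(nameOf(p));
            children.put(parentOf(p), names);
        }
        // Copy, modify, re-put: a distributed map only sees changes made through put().
        // This read-modify-write is not atomic; a real implementation would need locking.
        Map<String, String> node = new HashMap<>(data.getOrDefault(path, Collections.emptyMap()));
        node.put(key, value);
        data.put(path, node);
    }

    String get(String path, String key) {
        Map<String, String> node = data.get(path);
        return node == null ? null : node.get(key);
    }

    Set<String> getChildren(String path) {
        return children.getOrDefault(path, Collections.emptySet());
    }

    /** Removes a node and its whole subtree, then unlinks it from its parent. */
    void remove(String path) {
        for (String child : getChildren(path)) {
            remove(path.equals("/") ? "/" + child : path + "/" + child);
        }
        children.remove(path);
        data.remove(path);
        if (!path.equals("/")) {
            Set<String> names = new TreeSet<>(getChildren(parentOf(path)));
            names.remove(nameOf(path));
            children.put(parentOf(path), names);
        }
    }

    public static void main(String[] args) {
        FlatTree tree = new FlatTree();
        tree.put("/sessions/s1", "user", "leo");
        tree.put("/sessions/s2", "user", "talip");
        System.out.println(tree.getChildren("/sessions"));
        tree.remove("/sessions/s1"); // invalidating a session is one subtree removal
        System.out.println(tree.getChildren("/sessions"));
    }
}
```

With such a structure a session maps naturally to one node, and invalidating it becomes a single subtree removal.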

3) As an additional point, which is not as important as the first two but still worth mentioning, I'd like to ask about the following feature:

    Sessions are complex objects, potentially with a lot of data, so replicating the whole state of a session upon every change can be too expensive. One idea is to replicate at the level of individual attributes (i.e. key/value pairs). But even that can be rather expensive if attribute values are big objects. Therefore, some frameworks (Terracotta, JBoss PojoCache, JBoss HTTP clustering with field-level replication) allow for very fine-grained replication based on detecting deltas between changes at the object field level. This greatly reduces the amount of data that needs to be replicated, but it is much harder to implement.
    Question: Are there any plans or ideas to implement something similar for Hazelcast?
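
For reference, the simpler attribute-level variant could look roughly like the sketch below. It only tracks which attributes changed and pushes those; it does not do field-level delta detection as the frameworks above do. The class `DirtyTrackingSession` and the key format are assumptions of mine, and `ConcurrentHashMap` again stands in for the distributed map.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical session wrapper that replicates only the attributes that changed. */
final class DirtyTrackingSession {
    private final String sessionId;
    private final ConcurrentMap<String, Serializable> shared; // stand-in for a distributed map
    private final Map<String, Serializable> local = new HashMap<>();
    private final Set<String> dirty = new LinkedHashSet<>();

    DirtyTrackingSession(String sessionId, ConcurrentMap<String, Serializable> shared) {
        this.sessionId = sessionId;
        this.shared = shared;
    }

    /** Changes stay local until flush(); null values are not supported in this sketch. */
    void setAttribute(String name, Serializable value) {
        local.put(name, value);
        dirty.add(name);
    }

    Serializable getAttribute(String name) {
        Serializable value = local.get(name);
        return value != null ? value : shared.get(sessionId + "::" + name);
    }

    /** Pushes only the changed attributes and returns how many entries were written. */
    int flush() {
        for (String name : dirty) {
            shared.put(sessionId + "::" + name, local.get(name));
        }
        int written = dirty.size();
        dirty.clear();
        return written;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Serializable> shared = new ConcurrentHashMap<>();
        DirtyTrackingSession session = new DirtyTrackingSession("s1", shared);
        session.setAttribute("user", "leo");
        session.setAttribute("cart", "3 items");
        session.setAttribute("locale", "en");
        System.out.println("first flush wrote " + session.flush() + " entries");
        session.setAttribute("cart", "4 items");
        System.out.println("second flush wrote " + session.flush() + " entries");
    }
}
```

Calling `flush()` at the end of each request would mean that a request touching one attribute costs one map write, regardless of how many attributes the session holds. A large attribute value is still sent in full, which is exactly the gap that field-level deltas are meant to close.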



And here are some links related to this discussion:

Infinispan has TreeCache support implemented on top of their simple flat Map cache APIs:

JBoss TreeCache implementation can be found here:

Hazelcast-based HTTP session replication implementation can be found here:

JBoss TreeCache based HTTP session replication:

Regards,
   Leo

mongonix

Nov 30, 2011, 9:30:20 AM
to haze...@googlegroups.com
Talip or anyone,

any feedback on these ideas?

mongonix

Jan 4, 2012, 10:39:03 AM
to haze...@googlegroups.com
Fuad or Talip,

Now that you are back from your USA trip, maybe you could provide some feedback?

Thanks!

Talip Ozturk

Jan 4, 2012, 6:08:43 PM
to haze...@googlegroups.com
> Now that you are back from your USA trip, may be you could provide some
> feedback?

The USA trip was very productive and busy. It was mainly a business
trip: giving 2-day training/consulting sessions and visiting
prospects/customers, but we added more to it of course. Some feedback:

* Top financials and banks in the USA are in production with Hazelcast.
A telecom giant is using Hazelcast as the backbone of their new
architecture and the first phase of the project will be in production
this month. Some of these guys know Hazelcast amazingly well; they know
the history, internals, implementation details, issues and the ways to
fix them.

* The developers and architects we talked to were quite creative in
their use of Hazelcast. They have been using topics, queues, maps,
the executor service and atomic numbers in so many different ways, and
the simplicity of the API helped them use it efficiently. But we have
also seen design mistakes; not all of the designs are cost-conscious.
Designing the whole thing so that it uses fewer resources (network,
memory and CPU) and is less affected by failures is still very
challenging. Looks like we should write more about best practices...
Common use cases should also be documented.

* Some companies have already converted from other IMDG products to
Hazelcast, while some are in the process of converting. IMDGs look
good on paper but each of them, even the 10-year-old ones, fails
miserably to cover all edge cases (this is also true for Hazelcast :).
No product is perfect yet. So whoever covers the most edge cases has
a greater chance to win. Performance and feature set are still the
biggest concerns when choosing an IMDG product, though.

* We returned with a good list of improvement/feature requests, from
socket-level custom authentication to more sophisticated backup
strategies.

* We also heard bad Hazelcast stories from our customers; they
mentioned connection/network issues... or stability issues when
restarting a super-client node... or edge cases specific to them.
These are high-priority issues for us as they are reported by
customers. 2.0 should fix most of these issues. By the way, Hazelcast
2.0 will be internally very different from Hazelcast 1.x.

We also had a lot of fun; we met so many nice people and enjoyed the
live country music in Nashville. Biking by the beach in LA and then
driving to SF on Route 101N, via Malibu and Santa Barbara, was awesome.
An evening at the Marriott Revolving Rooftop Bar in NYC was among the
many other great things we did... It was also an excellent opportunity
for us to see and spend time with some of the people on this
mailing list.


-talip

mongonix

Jan 5, 2012, 3:16:28 AM
to haze...@googlegroups.com
Hi Talip,

Thanks a lot for this report about your trip to the USA! Sounds very interesting!

It would also be very interesting to see more of the trip's outcomes in the form of:
- filed issues/feature enhancements
- improved best-practice guidelines
- real-life use cases from big players (banks, telcos), described or modified and anonymized (if you are allowed to disclose some details). These could be very valuable for convincing new enterprise users and giving Hazelcast more credibility
- etc.

So, I guess many Hazelcast users are looking forward to hearing and seeing more from you over the next weeks and months!

Thanks again,
   Leo

P.S. Coming back to my original question about feedback, I actually meant feedback on the topic of this thread, i.e. on session replication. But I do not regret that you understood it as a request for feedback on your trip to the USA. Due to this small misunderstanding we all got a lot of very interesting information! :-) It is way more interesting than the topic of this thread, I'd say. Nevertheless, if you find some time to provide feedback on the original first post in this thread, I'd really appreciate it. Thanks!

Talip Ozturk

Jan 5, 2012, 4:45:34 AM
to haze...@googlegroups.com
> 1) Hazelcast WM implements HTTP session replication with configurable sticky
> sessions support. Right now, the implementation is rather HTTP-sessions
> specific. But I think that it could be generalized to support any kind of
> sessions, as long as they follow the same pattern, i.e. each session's data
> is a collection of key/value pairs that need to be replicated among serving
> nodes. This generic component becomes then re-usable. This generic part
> should be always combined with a technology specific part (i.e. HTTP
> servlets stack) to hook into the session mechanism used by a run-time
> platform.
>
> Does this proposal make sense for you?

It surely does make sense. We have to play with it and see, because
sometimes generalization might not be that good: you can optimize
things specifically for, say, HTTP sessions. But I agree; I have seen
people code their own session clustering solution just because we
didn't have a general-purpose one.

> 2) JBoss Cache had in the past something called TreeCache, which was a
> distributed multi-level Map essentially. The implementation was really using
> a tree-like internal representation, IIRC. Later, when Infinispan was
> introduced, they provided an implementation of a TreeCache interface on top
> of their flat distributed Map implementation (and Hazelcast uses also a flat
> Map representation).
>
>     Now, TreeCache allows for a very straight forward mapping of
> session-like concepts or hierarchical resources to a distributed data
> structure. Therefore, it would be extremely useful to have such an
> abstraction in Hazelcast.
>  Among other things, it would allow for a very straight-forward
> implementation of (HTTP-)session replication. And an initial implementation
> of TreeCache on top of Hazelcast can take the inspiration from the
> Infinispan's implementation, I guess. Later, it may be implemented even more
> efficiently using low-level Hazelcast specifics.
> What do you think?

Distributed Tree implementation is planned and yes I agree it is a good fit.

> 3) As an addition point, which is not as important as the first two, but
> still worth mentioning I'd like to ask about the following feature:
>
>     Sessions are complex objects with eventually a lot of data and therefore
> replicating the whole state of the session upon every change can be too
> expensive. The idea may be then to replicate at the attribute (i.e.
> key/value) pair level. But even that can be rather expensive, if attribute
> values are big objects. Therefore, some frameworks (Terracotta, JBoss
> PojoCache, JBoss HTTP clustering with field-level replication) allow for
> very fine-grained replication based on detection of deltas between changes
> at the object field level. This greatly reduces the amount of data that
> needs to be replicated, but is much harder to implement.
>     Question: Are there any plans or ideas to implement something similar
> for Hazelcast?

I agree with what you are saying but I don't think this is good for
Hazelcast as we focus on scalability. Replication is not scalable.
Even if you send only the small updates, you still copy all these
small updates onto all JVMs. And replication has to be synchronous to
ensure data consistency. I am not saying it is wrong or bad. What they
have will be useful for some people for sure and it is good for them.
Sending deltas over reliable multicast might fix things to some
extent. Hazelcast is not trying to be the answer for everything.
Hazelcast is all about scaling so I don't think we will implement
that; at least not until we first implement reliable multicast to
make it scalable enough.
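
To put rough numbers on the scaling argument, here is a back-of-the-envelope sketch. The class and method names are made up, and the cost model is a deliberate simplification: it only counts how many copies of an update travel over the network, assuming full replication sends each update to every other node while a partitioned map sends it only to a fixed number of backups.

```java
/** Hypothetical cost model: copies sent over the network per batch of updates. */
final class CopyCount {
    /** Full replication: every other node receives each update. */
    static long replicatedCopies(int nodes, long updates) {
        return (long) (nodes - 1) * updates;
    }

    /** Partitioning: only the backups of the owning partition receive each update. */
    static long partitionedCopies(int backupCount, long updates) {
        return (long) backupCount * updates;
    }

    public static void main(String[] args) {
        long updates = 1_000;
        int backupCount = 1; // assumed for illustration
        for (int nodes : new int[] {2, 10, 50}) {
            System.out.println(nodes + " nodes: replicated=" + replicatedCopies(nodes, updates)
                    + ", partitioned=" + partitionedCopies(backupCount, updates));
        }
    }
}
```

In this model the replicated cost grows linearly with the cluster size, while the partitioned cost stays constant, which is the point above regardless of how small each delta is.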

-talip

mongonix

Jan 5, 2012, 5:02:42 AM
to haze...@googlegroups.com
Hi Talip,

Thanks a lot for a quick reply!
 

> It surely does make sense. We have to play with it and see because
> sometimes generalization might not be that good. because you can
> optimize things specific to ,say HTTP sessions. But I agree I have
> seen people coded their own session clustering solution just because
> we didn't have a general purpose one.
>
>> 2) JBoss Cache had in the past something called TreeCache, which was a
>> distributed multi-level Map essentially. The implementation was really using
>> a tree-like internal representation, IIRC. Later, when Infinispan was
>> introduced, they provided an implementation of a TreeCache interface on top
>> of their flat distributed Map implementation (and Hazelcast uses also a flat
>> Map representation).
>>
>>     Now, TreeCache allows for a very straight forward mapping of
>> session-like concepts or hierarchical resources to a distributed data
>> structure. Therefore, it would be extremely useful to have such an
>> abstraction in Hazelcast.
>>  Among other things, it would allow for a very straight-forward
>> implementation of (HTTP-)session replication. And an initial implementation
>> of TreeCache on top of Hazelcast can take the inspiration from the
>> Infinispan's implementation, I guess. Later, it may be implemented even more
>> efficiently using low-level Hazelcast specifics.
>> What do you think?
>
> Distributed Tree implementation is planned and yes I agree it is a good fit.


Both statements from you are very encouraging and promising! Do you have any idea when you are going to introduce these features, i.e. improved session replication and distributed trees? Any hope to see them in 2012?

Thanks,
  Leo

Talip Ozturk

Jan 5, 2012, 6:57:56 AM
to haze...@googlegroups.com
>> Distributed Tree implementation is planned and yes I agree it is a good
>> fit.
>
>
> Both statements from you are very encouraging and promising! Do you have any
> idea when you are going to introduce these features, i.e. improved session
> replication and distributed trees? Any hope to see them in 2012?

Yes, there will be more improvements to session replication. I cannot
promise anything for the distributed tree implementation, though.

-talip

mongonix

Jan 5, 2012, 7:25:06 AM
to haze...@googlegroups.com
Hi Talip

Just in case: I filed two JIRA issues on these topics a few weeks ago. If they match your thinking about session replication and distributed trees, you could use them to track the progress.


