Solved: Memory leak in Hazelcast 1.9.4.6


Eddy van Oosterbosch

Feb 1, 2012, 8:56:38 AM
to haze...@googlegroups.com, Eddy van Oosterbosch

Hi,

 

The following thread mentions a Hazelcast memory leak:

http://groups.google.com/group/hazelcast/browse_frm/thread/711003d81cb46ea?tvc=1&q=Re%3A+Memory+leak+in+1.9.4.6+using+multicast

Which has been fixed in the trunk already.

 

However, I found another, related memory leak using the "JoinTest" below. This memory leak was also introduced in version 1.9.6.X

After an extensive search and elimination, I found that the following class is causing this memory leak:

 

http://hazelcast.googlecode.com/svn/trunk/hazelcast/src/main/java/com/hazelcast/impl/ThreadContext.java , Repo version: 2417

 

Solution: Use this version of ThreadContext.java instead:

http://hazelcast.googlecode.com/svn/tags/1.9.3/1.9.3.4/hazelcast/src/main/java/com/hazelcast/impl/ThreadContext.java

 

One small change is needed in order to compile the code:

84c84
<     public Transaction getTransaction() {
---
>     public TransactionImpl getTransaction() {

 

Note: jvisualvm is quite handy to visualize and analyze the heap usage for this problem.
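
For a quick check without attaching a profiler, heap growth can also be logged from inside a reproduction loop such as the JoinTest in this post. The sketch below uses only the standard `Runtime` API; the class name `HeapProbe` and the idea of calling it once per iteration are assumptions for illustration and are not part of Hazelcast. `Runtime.gc()` is only a hint to the JVM, so the figures are approximate: a steadily rising trend over many iterations is what points to a leak, not any single reading.

```java
public class HeapProbe {
    /**
     * Returns the approximate number of heap bytes in use.
     * A garbage collection is requested first, but the JVM may ignore the hint.
     */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();

        // Simulate retained memory: keep 16 blocks of 1 MiB reachable.
        byte[][] retained = new byte[16][];
        for (int i = 0; i < retained.length; i++) {
            retained[i] = new byte[1024 * 1024];
        }

        long after = usedHeapBytes();
        System.out.println("used before: " + before + " bytes");
        System.out.println("used after:  " + after + " bytes (" + retained.length + " blocks retained)");
    }
}
```

In a loop like JoinTest, one could log `HeapProbe.usedHeapBytes()` after each shutdown and compare the values across attempts.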

 

Can this solution be committed into the hazelcast repo?

 

Thanks!

 

Eddy van Oosterbosch

 

 

import java.util.logging.Logger;

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class JoinTest
{
    private static final int COUNT = 1000; // number of times to try
    private static final int SIZE = 2; // size of the cluster

    public static void main(String[] args) throws Exception
    {
        final Logger logger = Logger.getLogger("JoinTest");

        final HazelcastInstance[] instances = new HazelcastInstance[SIZE];
        for (int i = 0; i < COUNT; ++i)
        {
            logger.info("Attempt " + (i + 1));

            // use a unique group name for each attempt
            final Config config = new Config();
            config.getGroupConfig().setName("uniqueName:" + i);

            // create cluster (pass the config prepared above)
            for (int j = 0; j < instances.length; ++j)
            {
                instances[j] = Hazelcast.newHazelcastInstance(config);
            }

            // verify cluster
            for (int j = 0; j < instances.length; ++j)
            {
                final int size = instances[j].getCluster().getMembers().size();
                if (size != SIZE)
                {
                    logger.severe("OOPS, the cluster size of instance " + j + " was " + size);
                    System.exit(1); // non-zero exit code signals the failure
                }
            }

            // shutdown cluster
            for (int j = 0; j < instances.length; ++j)
            {
                instances[j].getLifecycleService().shutdown();
                instances[j] = null; // just for good measure
            }
        }
    }
}

 

Fuad Malikov

Feb 1, 2012, 11:19:35 AM
to haze...@googlegroups.com
Eddy,

Thanks for the input. I found the problem with the latest ThreadContext and fixed it: http://code.google.com/p/hazelcast/source/detail?r=2420
 
It is in trunk now.

Regards



--
You received this message because you are subscribed to the Google Groups "Hazelcast" group.
To post to this group, send email to haze...@googlegroups.com.
To unsubscribe from this group, send email to hazelcast+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/hazelcast?hl=en.

Fuad Malikov

Feb 1, 2012, 11:22:39 AM
to haze...@googlegroups.com
BTW, the Eclipse Memory Analyzer is super super good.

-fuad

Fuad Malikov

Feb 1, 2012, 11:29:16 AM
to haze...@googlegroups.com
A short workaround for 1.9.4.6 users would be to start each new HazelcastInstance on a separate thread. Note that the example given in this thread starts each new HazelcastInstance in the main thread, and the ThreadContext of the main thread leaks. If you start it each time on a new thread that eventually dies, there won't be a leak.

-fuad
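
The workaround described above could be sketched roughly as follows. This is a minimal illustration, not code from Hazelcast: the helper class `ShortLivedThread`, its `call` method, and the thread name `hazelcast-starter` are hypothetical names chosen for this example. The helper runs a task on a fresh thread, waits for it to finish, and hands the result back, so whatever state was tied to that thread can go away once the thread has died.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

public class ShortLivedThread {
    /**
     * Runs the task on a newly created thread, waits for that thread to
     * terminate, and returns the task's result. An exception thrown by the
     * task is rethrown in the calling thread.
     */
    public static <T> T call(final Callable<T> task) throws Exception {
        final AtomicReference<T> result = new AtomicReference<T>();
        final AtomicReference<Exception> failure = new AtomicReference<Exception>();

        Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    result.set(task.call());
                } catch (Exception e) {
                    failure.set(e);
                }
            }
        }, "hazelcast-starter");

        worker.start();
        worker.join(); // the worker thread has terminated after this returns

        if (failure.get() != null) {
            throw failure.get();
        }
        return result.get();
    }

    // Hypothetical usage with Hazelcast (not compiled here):
    //
    // HazelcastInstance hz = ShortLivedThread.call(new Callable<HazelcastInstance>() {
    //     public HazelcastInstance call() {
    //         return Hazelcast.newHazelcastInstance(config);
    //     }
    // });
}
```

Whether this fully avoids the leak in 1.9.4.6 depends on how ThreadContext is keyed internally, which this thread does not show, so it is worth confirming with a heap analysis.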

Byron Albert

Feb 1, 2012, 12:38:32 PM
to haze...@googlegroups.com
Fuad,

 Will you be creating a new release soon with all the memory fixes that I have seen over the last few days?  We have been running into some GC issues and would like to rule out any of these known fixed bugs.

Thanks
Byron

Fuad Malikov

Feb 2, 2012, 3:56:40 AM
to haze...@googlegroups.com
Hi Byron,

These memory leaks happen only if you shut down and start a new Hazelcast instance in the same JVM. It doesn't have anything to do with GC. And as far as I know, you are not doing anything like that. If you have a 10G heap, you have to tune the JVM, which may help to some extent, or use Enterprise, which has off-heap storage enabled.

We are currently working hard on releasing 2.0, so there might not be another release on the 1.9.4 branch.

-fuad



eddy.van.o...@ericsson.com

Feb 2, 2012, 4:53:48 AM
to Hazelcast
Hi Fuad,

I have tested the "JoinTest" on http://hazelcast.googlecode.com/svn/trunk/hazelcast
(r2422) and it is working fine now.
It's really great to have such a quick response, thanks!

However, I'm still investigating the "TestMemLeak" test on different
versions, which is still causing another memory leak:
http://groups.google.com/group/hazelcast/browse_frm/thread/711003d81cb46ea?tvc=1&q=Re%3A+Memory+leak+in+1.9.4.6+using+multicast
In this thread it was mentioned the problem was fixed, but I'm still
able to reproduce this memory leak.

This particular test still shows a memory leak in versions:
1.9.3
1.9.4.6
latest trunk (r2422)

The funny thing about this test is that it has a 50/50 chance of
reproducing the fault.
I will get back on this as soon as I have some news/solution.

Greetings,

Eddy


eddy.van.o...@ericsson.com

Feb 2, 2012, 10:58:35 AM
to Hazelcast
Hi Fuad,

I've done some more testing on the "TestMemLeak" test, and found the
following cause:

FactoryImpl
  \_ LifeCycleServiceImpl lifeCycleService
       \_ :
          Node node
            \_ SimpleBoundedQueue serviceThreadPacketQueue

The "serviceThreadPacketQueue" is the main occupier of the heap space
(its contents are visible as byte[]).
I was not able to simply clean up these queues. Maybe you have some
suggestions?
This test uses the one main thread as well.

I will check whether we can use your proposed workaround (using a
separate thread).
Nevertheless, we are interested in the 2.0 release. Do you have an
idea when it could be available?

Thanks a lot,

Eddy



eddy.van.o...@ericsson.com

Feb 10, 2012, 4:37:55 AM
to Hazelcast
Hi Fuad,

The "JoinTest" memory leak has also been solved in Hazelcast 1.9.4.8:
there is no visible leak after 1000 iterations.

Thanks!

Eddy

Fuad Malikov

Mar 1, 2012, 4:29:48 AM
to haze...@googlegroups.com
Hi Eddy,



On Thu, Feb 2, 2012 at 5:58 PM, <eddy.van.o...@ericsson.com> wrote:
> Hi Fuad,
>
> I've done some more testing on the "TestMemLeak" test, and found the
> following cause:
>
> FactoryImpl
>   \_ LifeCycleServiceImpl lifeCycleService
>        \_ :
>           Node node
>             \_ SimpleBoundedQueue serviceThreadPacketQueue
>
> The "serviceThreadPacketQueue" is the main occupier of the heap space
> (which are visible as byte[]).
> I was not able to simple cleanup these queues, maybe you have some
> suggestions?

That queue is simply a bounded queue. It can never cause a leak.
 
> This test uses the one main thread as well.
>
> I will check if we can use your proposed workaround (of using separate
> thread).
> Never the less, we are interested in the 2.0 release. Do you have an
> idea when this could be available?

Hopefully this week.  
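
On the bounded-queue remark in this message: the source of `SimpleBoundedQueue` is not shown in this thread, so the sketch below uses the JDK's `ArrayBlockingQueue` as a stand-in (an assumption) to illustrate the general property being referred to. A queue with a fixed capacity rejects further elements once it is full, so the memory it holds is capped at roughly capacity times element size. The class name `BoundedQueueDemo` and the `fill` helper are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedQueueDemo {
    /**
     * Offers `attempts` packets of `packetSize` bytes each to the queue
     * and returns how many of them the queue accepted.
     */
    public static int fill(BlockingQueue<byte[]> queue, int attempts, int packetSize) {
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            // offer() returns false instead of blocking when the queue is full
            if (queue.offer(new byte[packetSize])) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<byte[]>(100);
        int accepted = fill(queue, 1000, 1024);
        // prints "accepted=100, size=100"
        System.out.println("accepted=" + accepted + ", size=" + queue.size());
    }
}
```

The cap applies per queue. If the object that owns a queue stays reachable after shutdown, its contents stay on the heap as well, which is a separate question from whether the queue itself can grow without bound.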