[2.1] Memory leak: too many NioAcceptedSocketChannel instances loaded by "sun.misc.Launcher$AppClassLoader" cannot be garbage-collected

Mark HUANG

May 8, 2013, 1:16:53 AM
to play-fr...@googlegroups.com
I have been suffering from a memory leak in my web project for a long time, and I could not figure out what is wrong with it.
When the project starts, it needs only about 600 MB of memory, and then usage grows gradually. After 3 days, memory usage is around 4 GB and the server soon freezes, which means it won't accept any more requests.
Here is the jmap output I captured a few days ago, just before it froze.

num     #instances         #bytes  class name
----------------------------------------------
   1:         14496     2845179264  [Ljava.lang.Object;
   2:        285578       48162912  [B
   3:          8754       40270440  [I
   4:       1029353       32939296  java.util.HashMap$Entry
   5:        686482       27459280  org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext
   6:       1704148       27266368  java.lang.Object
   7:        170290       20434800  org.jboss.netty.channel.socket.nio.NioAcceptedSocketChannel
   8:        170401       19084912  java.net.SocksSocketImpl
   9:        170306       19074272  sun.nio.ch.SocketChannelImpl
  10:        681195       16348680  java.net.InetSocketAddress
  11:        170290       14985520  org.jboss.netty.handler.codec.http.HttpRequestDecoder
  12:        160676       14204712  [C
  13:        183171       11868480  [Ljava.util.HashMap$Entry;
  14:        340694       10902208  java.net.Inet4Address
  15:        340578       10898496  org.jboss.netty.util.internal.ConcurrentHashMap$HashEntry
  16:         74687       10846416  <constMethodKlass>
  17:         74687        8972232  <methodKlass>
  18:        171989        8255472  java.util.HashMap
  19:        511038        8176608  java.util.concurrent.atomic.AtomicInteger
  20:        170306        8174688  sun.nio.ch.SocketAdaptor
  21:        170291        8173968  org.jboss.netty.channel.AbstractChannel$ChannelCloseFuture
  22:        170290        8173920  org.jboss.netty.channel.socket.nio.DefaultNioSocketChannelConfig
  23:          6637        7213720  <constantPoolKlass>
  24:        215371        6891872  java.lang.String
  25:        170810        6832400  java.lang.ref.Finalizer
  26:        170289        6811560  sun.nio.ch.SelectionKeyImpl
  27:        107356        6780112  <symbolKlass>
  28:          6637        6192456  <instanceKlassKlass>
  29:        347511        5560176  java.lang.Integer
  30:        170307        5449824  [Ljava.nio.channels.SelectionKey;
  31:        170291        5449312  org.jboss.netty.channel.DefaultChannelPipeline
  32:        169465        5422880  org.jboss.netty.channel.AdaptiveReceiveBufferSizePredictor
  33:        170482        4091568  java.util.concurrent.ConcurrentLinkedQueue$Node
  34:        170420        4090080  java.io.FileDescriptor
  35:        170300        4087200  java.util.concurrent.ConcurrentLinkedQueue
  36:        170290        4086960  org.jboss.netty.channel.socket.nio.AbstractNioChannel$WriteRequestQueue
  37:        170290        4086960  org.jboss.netty.util.internal.ThreadLocalBoolean
  38:        170290        4086960  org.jboss.netty.handler.codec.replay.ReplayingDecoderBuffer
  39:          5787        4024984  <constantPoolCacheKlass>
  40:         89657        3586280  java.util.LinkedHashMap$Entry
  41:        106274        3400768  play.core.server.netty.PlayDefaultUpstreamHandler
  ...

Today I used MAT (Eclipse Memory Analyzer) to analyze a heap dump. The leak suspect is:

51,084 instances of "org.jboss.netty.channel.socket.nio.NioAcceptedSocketChannel", loaded by "sun.misc.Launcher$AppClassLoader @ 0x715c0a240", occupy 756,863,880 (89.90%) bytes.

Biggest instances:

  • org.jboss.netty.channel.socket.nio.NioAcceptedSocketChannel @ 0x73941d418 - 16,785,968 (1.99%) bytes.
  • org.jboss.netty.channel.socket.nio.NioAcceptedSocketChannel @ 0x73d3e1270 - 16,785,928 (1.99%) bytes.
  • org.jboss.netty.channel.socket.nio.NioAcceptedSocketChannel @ 0x71b69cb78 - 16,785,496 (1.99%) bytes.
  • org.jboss.netty.channel.socket.nio.NioAcceptedSocketChannel @ 0x726f98950 - 16,785,464 (1.99%) bytes.


Keywords
sun.misc.Launcher$AppClassLoader @ 0x715c0a240
org.jboss.netty.channel.socket.nio.NioAcceptedSocketChannel

The project is quite simple: it accepts photos sent from clients and uploads them to Amazon S3, or pushes APNs messages to clients.

Application.java:

    @BodyParser.Of(value = BodyParser.Raw.class, maxLength = ImgHandler.MAX_PIC_LENGTH)
    public static Result upload_pic(final String key) {
        final Request req = Application.request();
        return async(
                future(new Callable<String>() {
                  public String call() {
                    return ImgHandler.upload_pic(req, key);
                  }  
                }).map(new F.Function<String,Result>() {
                  public Result apply(String i) {
                    return ok(i);
                  }
                })
        );
    }


I found a similar problem here: http://web.archiveorange.com/archive/v/ZVMdI4hq8Mz7GLAFDnbS
In that case the problem was that a TimerTask could not be garbage-collected. I also have a timer task in my code, but I cannot find a bug in it...

    public static class MyActor extends UntypedActor {

        static ActorRef instance = Akka.system().actorOf(new Props(MyActor.class));

        public static void init() {
            try {
                // Send a TICK message every minute
                Akka.system().scheduler().schedule(
                        Duration.Zero(),
                        Duration.create(1, TimeUnit.MINUTES),
                        instance, "TICK",
                        Akka.system().dispatcher()
                );
            } catch (Exception ex) {
                System.out.println("timer error:" + ex.toString());
            }
            System.out.println("schedule started");
        }

        public void onReceive(Object message) {
            if ("TICK".equals(message))
                doSth();
        }
    }

I initialize the timer when the server starts (MyActor.init()).


Chanan Braunstein

May 9, 2013, 9:56:11 AM
to play-fr...@googlegroups.com
Hi,

Did you find a solution to this? The reason I ask is that I always put my scheduled-task code directly in Global or, if there are many tasks, I create a Setup class that holds all of them. I like your code, where it is inside the actor, but I am wondering whether that could somehow be the cause of the memory leak. Doubtful, true, but still... If you solve it and that is not the cause, I will start using this pattern.

Mark HUANG

May 10, 2013, 4:58:36 AM
to play-fr...@googlegroups.com
Not yet, I am still looking for a solution. All I can do for now is raise the JVM max heap limit from 4G to 8G, which lets the project run a little longer...

On Thursday, May 9, 2013 at 9:56:11 PM UTC+8, Chanan Braunstein wrote:

Alexandru Nedelcu

May 10, 2013, 5:31:02 AM
to play-fr...@googlegroups.com
Increasing the heap limit like that is a really bad idea. The concurrent garbage collector (CMS) does a pretty good job in general; however, it still needs to do a periodic blocking sweep, during which it stops the whole process. With big heap sizes it can take tens of seconds or even minutes (!!!) for the GC to finish. G1 could be an alternative, but people's experiences with it haven't been good.

You mentioned that when your project starts, it needs 600m memory. I really hope that's just the heap memory allocated by the JVM and NOT the heap memory used by the JVM.
Because 600m is huge. My apps when they start only need 30-50 MB, with only one exception that needs about 100 MB because it preallocates an in-memory buffer.
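The difference between heap that the JVM has allocated (committed) and heap that is actually in use can be read from inside the process. The following is only a minimal sketch using the standard java.lang.management API; the class name HeapReport is hypothetical, and it reports on whichever JVM it runs in, so for the Play app it would have to be called from application code (for example from a periodic task).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reports used vs. committed heap of the current JVM.
final class HeapReport {

    // Snapshot of the heap as seen by the JVM's memory MXBean.
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // "used" is live data plus garbage not yet collected; "committed" is what
    // the JVM has reserved from the OS; "max" is the -Xmx ceiling.
    // getMax() returns -1 when no maximum is defined, which prints as 0 MB here.
    static String describe(MemoryUsage u) {
        long mb = 1024L * 1024L;
        return String.format("used=%d MB, committed=%d MB, max=%d MB",
                u.getUsed() / mb, u.getCommitted() / mb, u.getMax() / mb);
    }

    public static void main(String[] args) {
        System.out.println(describe(heap()));
    }
}
```

A figure taken from the operating system (for example from top) is closer to the committed size than to the used size, which is why the two can differ a lot right after startup.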

You can have memory leaks with TimerTasks, that's true, but I'm not seeing your code creating TimerTasks.
I would be more worried about the code that does the file uploading to S3.

Also, when something keeps sockets open, try to see which connections are open, using tools such as "lsof". Are they incoming, or outgoing? Are they sitting idle? In TIME_WAIT, or in CLOSE_WAIT? Do you have a load-balancer or something in front that uses keep-alive connections?
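One way to answer the questions above is to tally TCP connection states for the server's listening port. The sketch below is an illustration, not part of the original advice: it swaps in `netstat -tan` instead of lsof, assumes the Linux column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), and assumes the app listens on port 9000. The class name TcpStateTally is hypothetical.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts TCP states of connections on one local port.
final class TcpStateTally {

    // Assumes Linux "netstat -tan" columns:
    // Proto Recv-Q Send-Q Local-Address Foreign-Address State
    static Map<String, Integer> tally(List<String> netstatLines, String localPortSuffix) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 6 || !cols[0].startsWith("tcp")) {
                continue; // header lines and non-TCP rows
            }
            if (!cols[3].endsWith(localPortSuffix)) {
                continue; // keep only sockets bound to the server's port
            }
            counts.merge(cols[5], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // ":9000" is an assumption; pass another suffix as the first argument.
        String port = args.length > 0 ? args[0] : ":9000";
        Process p = new ProcessBuilder("netstat", "-tan").redirectErrorStream(true).start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        tally(lines, port).forEach((state, n) -> System.out.println(state + " " + n));
    }
}
```

A count of ESTABLISHED connections that keeps growing while traffic stays flat would point at connections that are never closed.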



--
You received this message because you are subscribed to the Google Groups "play-framework" group.
To unsubscribe from this group and stop receiving emails from it, send an email to play-framewor...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
Alexandru Nedelcu
http://bionicspirit.com

Alexandru Nedelcu

May 10, 2013, 5:40:17 AM
to play-fr...@googlegroups.com
Also, try to reproduce the problem on your localhost. If the app just uploads user files, create a script that sends file-upload requests to your local server and monitor the process with VisualVM or something similar.
I personally start removing functionality until it stops leaking :-)
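The local reproduction suggested above could be driven by a small client like the following. This is a rough sketch under assumptions: the URL http://localhost:9000/upload/test-key is a made-up route, the payload is dummy zero bytes rather than a real photo, and the class name UploadLoadTest is hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

// Hypothetical load script: POSTs a raw body repeatedly to one URL.
final class UploadLoadTest {

    // Sends the payload once as a raw POST body and returns the HTTP status.
    static int uploadOnce(URL target, byte[] payload) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(30_000);
        conn.setRequestProperty("Content-Type", "application/octet-stream");
        conn.setFixedLengthStreamingMode(payload.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload);
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    // Runs the given number of sequential uploads; returns how many got HTTP 200.
    static int run(URL target, int requests, int payloadBytes) throws IOException {
        byte[] payload = new byte[payloadBytes]; // dummy content
        int ok = 0;
        for (int i = 0; i < requests; i++) {
            if (uploadOnce(target, payload) == 200) {
                ok++;
            }
        }
        return ok;
    }

    public static void main(String[] args) throws Exception {
        // The default URL is an assumption; pass the real route as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:9000/upload/test-key";
        int ok = run(URI.create(url).toURL(), 100, 512 * 1024);
        System.out.println(ok + " of 100 uploads returned 200");
    }
}
```

Running it while VisualVM is attached to the server shows whether heap usage keeps climbing after the uploads have finished.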

Mark HUANG

May 10, 2013, 6:02:19 AM
to play-fr...@googlegroups.com
Thanks for your reply. I restarted the project yesterday. Actually it needs 400+ MB at first, maybe because lots of images were being uploaded (around 90 per minute).

The timer task is created in ImgHandler.java, in a static block:

public class ImgHandler {
    ...
    static {
        ...
        MyActor.init();
        System.out.println("timer started!");
    }

    public static String upload_pic(...) { ... }
    ... // some other similar methods

    public static class MyActor extends UntypedActor {

        static ActorRef instance = Akka.system().actorOf(new Props(MyActor.class));

        public static void init() {
            try {
                // Send a TICK message every minute
                Akka.system().scheduler().schedule(
                        Duration.Zero(),
                        Duration.create(1, TimeUnit.MINUTES),
                        instance, "TICK",
                        Akka.system().dispatcher()
                );
            } catch (Exception ex) {
                System.out.println("timer error:" + ex.toString());
            }
            System.out.println("schedule started");
        }

        public void onReceive(Object message) {
            if ("TICK".equals(message))
                doSth();
        }
    }

}

Is there anything wrong with my timer task?
I will try the "lsof" tool. I have no load balancer, and I don't need keep-alive connections because I just upload photos.

On Friday, May 10, 2013 at 5:31:02 PM UTC+8, Alexandru Nedelcu wrote:

Mark HUANG

May 18, 2013, 1:22:36 AM
to play-fr...@googlegroups.com
Thank God, I finally found the problem. The leak is caused by keep-alive connections. Strangely, Play does not release a connection until the client disconnects properly. I had assumed a keep-alive connection would normally last only 2~5 minutes. According to my 'lsof' analysis, if the client disconnects abnormally (e.g. by turning off Wi-Fi), the connection is kept on the server side forever and cannot be garbage-collected.
It seems that we cannot set a keep-alive timeout in Play framework. So I set up nginx as a proxy with a keep-alive timeout of 1 minute, and the leak went away.
There is no problem with my timer task, so you could try this pattern.
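What a keep-alive (idle) timeout does on the server side can be illustrated with plain sockets. This is only a sketch of the principle, not Play's or nginx's actual implementation; the class name IdleTimeoutDemo and the timeout values are hypothetical. Without a read timeout, a client that vanishes without closing the connection leaves the server waiting forever; with one, the server gives up and closes the socket.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo: closes a connection that stays silent for too long.
final class IdleTimeoutDemo {

    // Accepts one connection and reads from it until the peer closes it or it
    // stays silent for idleMillis. Returns true if it was closed for being idle.
    static boolean serveOne(ServerSocket server, int idleMillis) throws IOException {
        try (Socket client = server.accept()) {
            client.setSoTimeout(idleMillis); // applies to each blocking read
            InputStream in = client.getInputStream();
            byte[] buf = new byte[4096];
            try {
                while (in.read(buf) != -1) {
                    // Request bytes would be handled here.
                }
                return false; // peer closed the connection cleanly
            } catch (SocketTimeoutException idle) {
                return true; // nothing arrived in time; the socket is closed on exit
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 lets the OS pick a free port; 60 s mirrors the 1-minute timeout.
        try (ServerSocket server = new ServerSocket(0)) {
            System.out.println("listening on port " + server.getLocalPort());
            boolean idle = serveOne(server, 60_000);
            System.out.println(idle ? "closed idle connection" : "peer closed connection");
        }
    }
}
```

Once the socket is closed, the objects that reference it become unreachable and can be garbage-collected, which is the effect the proxy timeout had here.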

