Memory Issues in Akka/Spray Application


Arpit

Apr 30, 2015, 3:52:42 AM
to spray...@googlegroups.com

I have built a spray service that runs on two 8 GB boxes. It receives a JSON payload every 5 seconds, which is converted to a MyJsonMessage. Each MyJsonMessage contains 3000 MyObjects, so 3000 MyObjects are created every 5 seconds.

Internally I am batching these 3000 objects into batches of 1000, which are sent to a consumer actor.

My heap size is set to 5 GB.

I am emitting metrics and have seen the YGC time increasing at a very high rate; the heap also grows and frequently touches 5 GB. I am new to Akka, so I am not sure whether there is a memory leak here or whether the only option is to add more boxes. Solutions/suggestions?

case class MyJsonMessage
(
  inputString1:String,
  inputString2:String,
  objectList:List[MyObject]
)

case class MyObject
(
  objectName : String,
  objectValue : String,
  data : Map[String,String]
)
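
For context, MyJsonProtocol (mixed into the service below) presumably boils down to standard spray-json formats for these two case classes. A minimal sketch, assuming plain jsonFormat derivation:

import spray.json.DefaultJsonProtocol

// Rough sketch of MyJsonProtocol: plain spray-json formats for the case classes above
trait MyJsonProtocol extends DefaultJsonProtocol {
  implicit val myObjectFormat      = jsonFormat3(MyObject)
  implicit val myJsonMessageFormat = jsonFormat3(MyJsonMessage)
}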


class MyHttpService(implicit val context: akka.actor.ActorRefFactory
.....) extends MyJsonProtocol {

  def worker = MyHttpServiceWorker

  val multicastRoute = path("service" / "task" / Segment) { (configName: String) =>
    post { ctx =>
      // Parse the incoming JSON payload, keeping the exception if conversion fails
      val payload = try {
        Left(JsonParser(ctx.request.entity.asString).convertTo[MyJsonMessage])
      } catch {
        case ex: Exception =>
          log.error("Error converting message payload: ", ex)
          Right(ex)
      }
      payload match {
        case Left(message) => ctx.complete(worker.process(message).toString)
        case Right(ex)     => ctx.complete(StatusCodes.BadRequest, ex.getMessage)
      }
    }
  }
}

object MyHttpServiceWorker {

  val batchSize = 1000

  def process[T](request: T) = {
    request match {
      case request: MyJsonMessage =>
        val objectListCount = request.objectList.size
        val batches =
          if (objectListCount > batchSize) math.ceil(objectListCount * 1.0 / batchSize).toInt
          else 1

        // Send each batch to the consumer actor as a (start, end) slice of the object list
        (0 until batches).foreach { batch =>
          val split = MyBulkObjectRequest(request, batch * batchSize, (batch + 1) * batchSize)
          MyObjectRequestConsumer ! split
        }

        // requestId and UUID come from surrounding code not shown here
        MyObjectSuccessResponse(objectListCount, batches, requestId, UUID)
    }
  }
}
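
Simplified sketches of MyBulkObjectRequest and MyObjectSuccessResponse, just to make the snippet above hang together (field names and types here are illustrative, reconstructed from how they are used):

import java.util.UUID

// Assumed shapes, reconstructed from the calls above; names and types are illustrative
case class MyBulkObjectRequest(request: MyJsonMessage, start: Int, end: Int)
case class MyObjectSuccessResponse(objectCount: Int, batches: Int, requestId: String, uuid: UUID)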

The following is the dispatcher configured for MyObjectRequestConsumer in my conf:

myobject-dispatcher {
  type = Dispatcher
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 16
    parallelism-factor = 4.0
    parallelism-max = 16
  }
  throughput = 1
}
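
The consumer actor is created with that dispatcher attached, roughly like this (a sketch; the ActorSystem setup and the actor class name are illustrative):

import akka.actor.{ActorRef, ActorSystem, Props}

// Sketch: wiring the consumer actor to the dedicated dispatcher defined above.
// MyObjectRequestConsumerActor stands in for the (not shown) consumer actor class.
object Consumers {
  val system = ActorSystem("my-service")

  val MyObjectRequestConsumer: ActorRef =
    system.actorOf(
      Props[MyObjectRequestConsumerActor].withDispatcher("myobject-dispatcher"),
      name = "myObjectRequestConsumer")
}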

Mathias Doenitz

Apr 30, 2015, 8:02:27 AM
to spray...@googlegroups.com
Arpit,

the first thing you’ll want to do is find out which object type(s) make up the majority of the memory held when the heap is very large.
Some simple memory profiling should do the trick.

Once you know the object type it shouldn’t be too hard to identify the reference(s) that keep these objects from being GCed.
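
For example, something like the following grabs an on-demand heap dump of live objects that you can then open in MAT or VisualVM (just a sketch, HotSpot JVMs only; running jmap -dump:live,format=b,file=heap.hprof <pid> gives you the same thing from the shell):

import java.lang.management.ManagementFactory
import com.sun.management.HotSpotDiagnosticMXBean

// Sketch: dump the live heap to a file for offline inspection (HotSpot-specific)
object HeapDump {
  def dump(path: String): Unit = {
    val bean = ManagementFactory.newPlatformMXBeanProxy(
      ManagementFactory.getPlatformMBeanServer,
      "com.sun.management:type=HotSpotDiagnostic",
      classOf[HotSpotDiagnosticMXBean])
    bean.dumpHeap(path, /* live objects only = */ true)
  }
}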

Cheers,
Mathias

---
mat...@spray.io
http://spray.io

Arpit

Apr 30, 2015, 8:51:28 AM
to spray...@googlegroups.com
Hi Mathias,

Yes, I created a heap dump and tried analysing it with MAT, but MAT only shows the biggest objects at around 60-70 MB, which I don't think is giving me the right information.

-Arpit