High response time for file-upload in Play 2.5(Scala)

43 views
Skip to first unread message

shailes...@gslab.com

unread,
Jul 19, 2017, 6:03:05 AM7/19/17
to Play Framework
We have upgraded one component from play(scala) 2.4.6 to 2.5.8. After upgrade, when we deployed the build on production, we observed significant increase in response time while uploading files(API allows client to upload images). To dig out more, I created few clients which can upload-image at high-rate(2K uploads per minutes) and found that:

If client sends upload-image requests from AWS, then image-upload API has low response time.
If client sends upload-image requests from local-network, then image-upload API has high response time.
To confirm this behavior, we deployed build again on production started sending image-upload requests using client. With server side logs, we could see that image-upload requests which were sent by our clients(deployed on AWS) were consuming less time compared to other requests which were coming from real clients i.e. cameras(which used local network).

def uploadImage(custId: String, devId: String) = Action.async(parse.multipartFormData) { implicit request =>
  val startT
= System.currentTimeMillis()
  authenticateRole
(Device, Admin) { implicit session =>
    authenticateVendor
{
      authenticateDevice
(custId, devId) {
        val timestamp
= System.currentTimeMillis()
       
Future {
          val imageFile
= new File(s"$pathToImageDirectory/$devId/$timestamp.jpg")
         
if (!imageFile.getParentFile.exists())
            imageFile
.getParentFile.mkdirs()
         
if (request.body.files != null && !request.body.files.isEmpty) {
            val image
= request.body.files.head.ref.file
            org
.apache.commons.io.FileUtils.copyFile(image, imageFile, false)
            val totalT
= System.currentTimeMillis() - startT
           
Logger.info(s"image-upload for [${request.uri}] took [$totalT] milli-seconds")
           
Ok(<response><status>success</status></response>)
         
} else {
           
BadRequest()
         
}
       
} recover {
         
case ex: IOException =>
           
InternalServerError()
       
}
     
}
   
}
 
}
}


With logs its already clear that controller method takes very less time and in filter log message we can see high response time. I suspect that there might be change in way how play used to parse request body because request body parsing is completed before controller method gets invoked. 

I have also verified that there are enough thread in thread-pool for Future's execution(other APIs which using Future are having less response time).
As a simple test, I created new scala-play project and added API to upload-file and observed exactly above behavior.
Has anybody observed similar behavior? Any kind of help/suggestions to fix this will be appreciated.

NOTE:
1. We use new-relic agent(3.40.0) to monitor response time.
2. Scala 2.11.7 is used.
3. We have logs in filters which tells about time consumed while serving each request. This time matches with what new-relic shows.
4. We have added more messages in controller method which tells about time consumed by controller method. This time is significantly less compared to time logged in filters or new-relic reports.
5. Image size in upload request is less than 100 KB.

Marcos Pereira

unread,
Jul 21, 2017, 3:40:18 PM7/21/17
to play-fr...@googlegroups.com
Hi,

I understand that by "local network" you meant the device network, right? Which could be 3G, 4G, user wifi, etc.

Is there any other update happening here? Like NewRelic Agent or other libraries? If you are updating the NewRelic agent, is it possible that it is now collecting the metrics the right way?

Also, comparing requests from AWS network itself to "real user" network does not seem "fair", since user network usually will be slower to complete the upload.

Furthermore, my main doubt here is how you are measuring and comparing the response times. Can you elaborate more on that? Finally, it would be good to know much higher the numbers are.
 
As a simple test, I created new scala-play project and added API to upload-file and observed exactly above behavior.

By "above behavior" you mean "upload from aws is faster"? Or "upload using 2.4 is faster than 2.5"?

Best.
--
Marcos Pereira
Software Engineer, Lightbend.com

shailes...@gslab.com

unread,
Jul 22, 2017, 8:17:02 AM7/22/17
to Play Framework
I understand that by "local network" you meant the device network, right? Which could be 3G, 4G, user wifi, etc.

Yes, by local network I mean device network, which can be 3G, 4G or wifi etc.


Is there any other update happening here? Like NewRelic Agent or other libraries? If you are updating the NewRelic agent, is it possible that it is now collecting the metrics the right way?
 
No, I am using same new-relic agent(3.40) for both builds(with play 2.4 and 2.5). 

Also, comparing requests from AWS network itself to "real user" network does not seem "fair", since user network usually will be slower to complete the upload.

Yes, I do agree. 

Furthermore, my main doubt here is how you are measuring and comparing the response times. Can you elaborate more on that? Finally, it would be good to know much higher the numbers are.

I have added logs in filters which indicate time taken to serve the request. Also I am having new-relic agent which reports response time for all requests. With play 2.4 build, new-relic reported 3-4 milli-seconds as response time but logs in filters reported that upload-request took 185 milli-seconds. With play 2.5 build, new-relic reported 185 milli-seconds as response time and logs in filters reported that upload-request took 185 milli-seconds. I performed the above test after posting this question. It seems like with play 2.4, new-relic was reporting time taken by only controller method and with play 2.5 new-relic agent is reporting complete response time(from when request is received to when controller method is completed). Now it seems that the issue is not with play but with new-relic agent itself.

By "above behavior" you mean "upload from aws is faster"? Or "upload using 2.4 is faster than 2.5"?

Yes, I mean upload from AWS is faster which makes sense as AWS has higher bandwidth. Now, I can say that upload with play 2.4 and play 2.5 is same. It is the new-relic agent which reports less response time with play 2.4 and high response time with play 2.5. Also I have captured response time at client side for upload-file request, which was approximately same with play 2.4 and play 2.5.

One question that I now have is:

When upload-file request is sent from client, request-headers are first received at play-app and once complete file i.e. request body is received at server, controller method gets invoked. Does play gets blocked after request-headers are received until request body is completely received? As play is top to bottom asynchronous, I hope this operation(waiting for complete body to arrive at server after request-headers are received) is asynchronous.
Reply all
Reply to author
Forward
0 new messages