Web2py DAL slows down the server response when faced with multiple concurrent requests


Saurabh Kumar

Nov 25, 2013, 12:23:37 PM
to web...@googlegroups.com
The web2py response slows down when faced with concurrent requests. After some profiling, I found that this happens because the DAL starts responding slowly. Here is the profile log, which confirms this:

root@replica:/home/www-data/lambda# python c.py  mod-20131125T164445.prof
         16217 function calls (16199 primitive calls) in 1.558 seconds

   Ordered by: internal time, call count
   List reduced from 188 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1710    0.342    0.000    0.530    0.000 /home/www-data/lambda/gluon/dal.py:8247(__getitem__)
       14    0.188    0.013    0.188    0.013 /usr/lib/python2.7/dist-packages/MySQLdb/cursors.py:97(nextset)
      183    0.152    0.001    0.152    0.001 /usr/lib/python2.7/threading.py:146(acquire)
      366    0.131    0.000    0.132    0.000 /usr/lib/python2.7/genericpath.py:38(isdir)
      102    0.107    0.001    0.241    0.002 /home/www-data/lambda/gluon/dal.py:8025(__init__)
      595    0.102    0.000    0.556    0.001 /home/www-data/lambda/gluon/dal.py:9100(__init__)
     2242    0.086    0.000    0.100    0.000 /home/www-data/lambda/gluon/dal.py:8317(__setattr__)
      596    0.055    0.000    0.055    0.000 /home/www-data/lambda/gluon/dal.py:5060(cleanup)
      379    0.052    0.000    0.052    0.000 /usr/lib/python2.7/posixpath.py:68(join)
        3    0.048    0.016    0.048    0.016 /usr/lib/python2.7/dist-packages/MySQLdb/cursors.py:277(_do_query)
      101    0.038    0.000    0.681    0.007 /home/www-data/lambda/gluon/dal.py:7658(lazy_define_table)
      101    0.036    0.000    0.073    0.001 /home/www-data/lambda/gluon/dal.py:8198(_create_references)
      520    0.035    0.000    0.214    0.000 /home/www-data/lambda/gluon/dal.py:6620(sqlhtml_validators)
      101    0.035    0.000    0.717    0.007 /home/www-data/lambda/gluon/dal.py:7628(define_table)
        1    0.029    0.029    1.369    1.369 /home/www-data/lambda/applications/chat/compiled/models/db.py:1(<module>)
      158    0.024    0.000    0.024    0.000 /home/www-data/lambda/gluon/validators.py:2564(__init__)
     1384    0.015    0.000    0.138    0.000 /home/www-data/lambda/gluon/dal.py:8339(__iter__)
      102    0.014    0.000    0.014    0.000 /home/www-data/lambda/gluon/dal.py:457(pluralize)
      695    0.013    0.000    0.013    0.000 /home/www-data/lambda/gluon/dal.py:8288(__setitem__)
      414    0.013    0.000    0.013    0.000 /home/www-data/lambda/gluon/dal.py:7762(__getattr__)
       89    0.012    0.000    0.014    0.000 /home/www-data/lambda/gluon/validators.py:450(__init__)
        6    0.005    0.001    0.005    0.001 /usr/lib/python2.7/socket.py:223(meth)


To further confirm this, I removed most of the table definitions (~100) and kept just the auth_user table in db.py. This improved the response time significantly.

Concurrency Level:      25
Time taken for tests:   2.150 seconds
Complete requests:      50
Failed requests:        0
Write errors:           0
Total transferred:      588512 bytes
HTML transferred:       567000 bytes
Requests per second:    23.25 [#/sec] (mean)
Time per request:       1075.173 [ms] (mean)
Time per request:       43.007 [ms] (mean, across all concurrent requests)
Transfer rate:          267.27 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    1   0.1      1       1
Processing:   291  901 296.5   1011    1230
Waiting:      288  899 296.4   1011    1230
Total:        292  902 296.5   1012    1232

Percentage of the requests served within a certain time (ms)
  50%   1012
  66%   1103
  75%   1120
  80%   1140
  90%   1182
  95%   1229
  98%   1232
  99%   1232
 100%   1232 (longest request)


Earlier, the benchmark logs looked like:


Concurrency Level:      25
Time taken for tests:   8.882 seconds
Complete requests:      50
Failed requests:        0
Write errors:           0
Total transferred:      588190 bytes
HTML transferred:       567000 bytes
Requests per second:    5.63 [#/sec] (mean)
Time per request:       4440.767 [ms] (mean)
Time per request:       177.631 [ms] (mean, across all concurrent requests)
Transfer rate:          64.67 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    1   0.7      1       6
Processing:  1883 4011 1211.9   3863    5865
Waiting:     1882 4008 1211.6   3856    5861
Total:       1884 4012 1211.9   3864    5867

Percentage of the requests served within a certain time (ms)
  50%   3864
  66%   4845
  75%   5009
  80%   5165
  90%   5557
  95%   5748
  98%   5867
  99%   5867
 100%   5867 (longest request)
 
Any clues on what could be the cause of this? 

Anthony

Nov 25, 2013, 12:37:52 PM
to web...@googlegroups.com
With 100 table definitions, even a single request will probably be relatively slow. You might want to see if you can conditionally define your tables depending on the controller/function called (you can set up conditional model files or move the definitions to modules that get imported and run when needed).
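A minimal sketch of the module-based variant, with a stub standing in for web2py's DAL (the names `StubDAL` and `define_chat_tables` are illustrative, not web2py API): table definitions live in a function that only the controllers needing those tables call, with a guard so repeated calls cost nothing.

```python
# Sketch only: StubDAL and define_chat_tables are illustrative names,
# not part of web2py.

_defined = set()  # table groups already defined for this process/request

class StubDAL:
    """Stand-in for web2py's DAL so the sketch runs on its own."""
    def __init__(self):
        self.define_calls = 0

    def define_table(self, name, *fields):
        self.define_calls += 1  # the real DAL would build a Table here

def define_chat_tables(db):
    """Define chat tables once; controllers that need them call this."""
    if "chat" in _defined:
        return
    db.define_table("chat_message")  # Field(...) args omitted in sketch
    _defined.add("chat")

db = StubDAL()
define_chat_tables(db)
define_chat_tables(db)  # no-op: the guard prevents a second definition
```

In real web2py the function would live under modules/, be imported by the controllers that need those tables, and receive the request's db instance.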

Anthony

Saurabh Kumar

Nov 25, 2013, 12:55:06 PM
to web...@googlegroups.com
Apparently not. Below are the logs for a single request with all 100 tables defined.

  14877 function calls (14855 primitive calls) in 0.117 seconds

   Ordered by: internal time, call count
   List reduced from 220 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        8    0.025    0.003    0.025    0.003 /home/www-data/lambda/gluon/restricted.py:190(compile2)
      102    0.017    0.000    0.031    0.000 /home/www-data/lambda/gluon/dal.py:8025(__init__)
      596    0.012    0.000    0.015    0.000 /home/www-data/lambda/gluon/dal.py:9100(__init__)
     1726    0.006    0.000    0.006    0.000 /home/www-data/lambda/gluon/dal.py:8247(__getitem__)
     2242    0.005    0.000    0.007    0.000 /home/www-data/lambda/gluon/dal.py:8317(__setattr__)
      520    0.005    0.000    0.015    0.000 /home/www-data/lambda/gluon/dal.py:6620(sqlhtml_validators)
      101    0.004    0.000    0.063    0.001 /home/www-data/lambda/gluon/dal.py:7658(lazy_define_table)
     1392    0.004    0.000    0.007    0.000 /home/www-data/lambda/gluon/dal.py:8339(__iter__)
        1    0.003    0.003    0.091    0.091 /home/www-data/lambda/applications/chat/models/db.py:1(<module>)
      101    0.003    0.000    0.066    0.001 /home/www-data/lambda/gluon/dal.py:7628(define_table)
      695    0.003    0.000    0.003    0.000 /home/www-data/lambda/gluon/dal.py:8288(__setitem__)
      101    0.003    0.000    0.007    0.000 /home/www-data/lambda/gluon/dal.py:8198(_create_references)
      596    0.002    0.000    0.002    0.000 /home/www-data/lambda/gluon/dal.py:5060(cleanup)
      151    0.002    0.000    0.002    0.000 /home/www-data/lambda/gluon/custom_import.py:37(custom_importer)
       89    0.002    0.000    0.003    0.000 /home/www-data/lambda/gluon/validators.py:450(__init__)
      102    0.002    0.000    0.002    0.000 /home/www-data/lambda/gluon/dal.py:457(pluralize)
      181    0.001    0.000    0.004    0.000 /home/www-data/lambda/gluon/languages.py:867(translate)
      160    0.001    0.000    0.007    0.000 /home/www-data/lambda/gluon/validators.py:77(translate)
       82    0.001    0.000    0.004    0.000 /home/www-data/lambda/gluon/validators.py:689(__init__)
      184    0.001    0.000    0.001    0.000 /home/www-data/lambda/gluon/languages.py:338(__init__)
      181    0.001    0.000    0.005    0.000 /home/www-data/lambda/gluon/languages.py:367(__str__)

Also, if you look closely, there might be some slowdown due to locking. Look at the following line, taken from the log I posted earlier.

Leonel Câmara

Nov 25, 2013, 1:10:42 PM
to web...@googlegroups.com
Your slowdown seems entirely expected. It's spending most of the time locked or waiting for the DB.

You can of course make sure you don't do things that prevent tables from being lazy, but there's not much you can do about the slowdown from concurrency apart from tuning your database. Did you set the pool_size in your DAL?

What I would do, if possible, is aggressively cache as much as possible so you avoid hitting the DB so much.

Think about what really needs to be updated in real time; maybe some things can be 5 minutes out of date, and you can cache those selects.
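A rough sketch of that idea in plain Python (not web2py's cache API; the names are illustrative): memoize the result of a query for a fixed time window. In web2py itself, select() accepts a cache argument, e.g. db(query).select(cache=(cache.ram, 300)), which achieves the same effect.

```python
# Plain-Python sketch of "let it be 5 minutes out of date": cache a
# query's result and recompute only after the TTL expires.
import time

def cached(ttl_seconds):
    def decorator(fn):
        state = {"value": None, "expires": 0.0}
        def wrapper():
            # Single cached value, no arguments: fine for one fixed query.
            now = time.time()
            if now >= state["expires"]:
                state["value"] = fn()
                state["expires"] = now + ttl_seconds
            return state["value"]
        return wrapper
    return decorator

calls = {"n": 0}

@cached(ttl_seconds=300)  # five minutes, as suggested above
def expensive_select():
    calls["n"] += 1          # stands in for a real db(...).select()
    return ["row1", "row2"]

expensive_select()
expensive_select()  # second call is served from memory
```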

Saurabh Kumar

Nov 25, 2013, 1:17:09 PM
to web...@googlegroups.com
Comments inline.


On Monday, November 25, 2013 11:40:42 PM UTC+5:30, Leonel Câmara wrote:
Your slowdown seems entirely expected. It's spending most of the time locked or waiting for the DB.

You can of course make sure you don't do things that prevent tables from being lazy, but there's not much you can do about the slowdown from concurrency apart from tuning your database. Did you set the pool_size in your DAL?

Pool size = 100.
Number of web2py daemon processes = 10.

I didn't get the second point. Although I have done caching and database tuning, it does not matter here: I am profiling an HTTP request that makes zero database queries, and it is being slowed down while loading the models (specifically the DAL in db.py).

Massimo Di Pierro

Nov 25, 2013, 1:48:29 PM
to web...@googlegroups.com
What database? Are you using ab for testing?

Saurabh Kumar

Nov 25, 2013, 1:52:40 PM
to web...@googlegroups.com
@Database: MySQL 5.5
@Testing: Yes, I am using ab.

Also, I would like to know: what does lazy_define_table() do? I was looking at the code, and with migrate=False set, it looks like it does not do anything expensive. But the logs say otherwise.

Thanks,
Saurabh




Niphlod

Nov 25, 2013, 3:57:53 PM
to web...@googlegroups.com
Sorry..... but are you **really** asking why your MySQL instance suffers from 100 concurrent connections?
And why web2py takes some time to parse 100 table definitions?
And.... are you sure that even with migrate=False and lazy_tables=True those timings are comparable?

Anthony

Nov 25, 2013, 4:14:09 PM
to web...@googlegroups.com
So, is time for 25 concurrent requests significantly greater than 25*[time for a single request]?

Saurabh Kumar

Nov 26, 2013, 7:41:33 AM
to web...@googlegroups.com
I am sorry if I failed to put it clearly, or if I am asking something absurd.

I am not sure why you are relating it to the MySQL server. I am summarizing my observations below.

Context:
I am testing using ab. The HTTP request does not make any database query.
Number of Daemon processes: 10
Max Number of Mysql connections allowed: 1000
DAL pool size: 100

Case 1:
Number of Tables in db.py: ~100 (all with migrate=False)
No. of concurrent requests: 1
Time taken per request: 900 ms

Case 2:
Number of Tables in db.py: ~100 (all with migrate=False)
No. of concurrent requests: 25
Time taken per request: 4440 ms


Case 3:
Number of Tables in db.py: =2 (all with migrate=False)
No. of concurrent requests: 25
Time taken per request: 1075 ms



In case 2, the extra time is consumed loading the models in web2py's main.py.

Logs for case 2:
Logs for case 3:

14877 function calls (14855 primitive calls) in 0.117 seconds

   Ordered by: internal time, call count
   List reduced from 220 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        8    0.025    0.003    0.025    0.003 /home/www-data/lambda/gluon/restricted.py:190(compile2)
      102    0.017    0.000    0.031    0.000 /home/www-data/lambda/gluon/dal.py:8025(__init__)
      596    0.012    0.000    0.015    0.000 /home/www-data/lambda/gluon/dal.py:9100(__init__)
     1726    0.006    0.000    0.006    0.000 /home/www-data/lambda/gluon/dal.py:8247(__getitem__)
     2242    0.005    0.000    0.007    0.000 /home/www-data/lambda/gluon/dal.py:8317(__setattr__)
      520    0.005    0.000    0.015    0.000 /home/www-data/lambda/gluon/dal.py:6620(sqlhtml_validators)
      101    0.004    0.000    0.063    0.001 /home/www-data/lambda/gluon/dal.py:7658(lazy_define_table)
     1392    0.004    0.000    0.007    0.000 /home/www-data/lambda/gluon/dal.py:8339(__iter__)
        1    0.003    0.003    0.091    0.091 /home/www-data/lambda/applications/chat/models/db.py:1(<module>)
      101    0.003    0.000    0.066    0.001 /home/www-data/lambda/gluon/dal.py:7628(define_table)
      695    0.003    0.000    0.003    0.000 /home/www-data/lambda/gluon/dal.py:8288(__setitem__)
      101    0.003    0.000    0.007    0.000 /home/www-data/lambda/gluon/dal.py:8198(_create_references)
      596    0.002    0.000    0.002    0.000 /home/www-data/lambda/gluon/dal.py:5060(cleanup)
      151    0.002    0.000    0.002    0.000 /home/www-data/lambda/gluon/custom_import.py:37(custom_importer)
       89    0.002    0.000    0.003    0.000 /home/www-data/lambda/gluon/validators.py:450(__init__)
      102    0.002    0.000    0.002    0.000 /home/www-data/lambda/gluon/dal.py:457(pluralize)
      181    0.001    0.000    0.004    0.000 /home/www-data/lambda/gluon/languages.py:867(translate)
      160    0.001    0.000    0.007    0.000 /home/www-data/lambda/gluon/validators.py:77(translate)
       82    0.001    0.000    0.004    0.000 /home/www-data/lambda/gluon/validators.py:689(__init__)
      184    0.001    0.000    0.001    0.000 /home/www-data/lambda/gluon/languages.py:338(__init__)
      181    0.001    0.000    0.005    0.000 /home/www-data/lambda/gluon/languages.py:367(__str__)



Doubt 1:
If migrate is set to False, then ideally I would expect the models to load in the same amount of time irrespective of the number of concurrent requests.

Doubt 2:
For a given number of concurrent requests (25 here) and migrate set to False, reducing the number of table definitions should only marginally improve the response time, not reduce it by 80%. 


I have not set lazy_tables=True anywhere yet, so it is using the default setting. I will try it and update the thread.

Thanks for your time,
Saurabh



Saurabh Kumar

Nov 26, 2013, 7:45:25 AM
to web...@googlegroups.com
On Tue, Nov 26, 2013 at 2:44 AM, Anthony <abas...@gmail.com> wrote:
So, is time for 25 concurrent requests significantly greater than 25*[time for a single request]?


No, the time for 25 concurrent requests is smaller than 25*[time for a single request].
The average time taken per request is 4 to 5 times higher, though. They should be processed in parallel, not sequentially, right?

Ideally, I would expect time for 25 concurrent requests ~= 1*[time for a single request]. 


Thanks,
Saurabh
 

Anthony

Nov 26, 2013, 8:57:49 AM
to web...@googlegroups.com
Case 1:
Number of Tables in db.py: ~100 (all with migrate=False)
No. of concurrent requests: 1
Time taken per request: 900 ms

Case 2:
Number of Tables in db.py: ~100 (all with migrate=False)
No. of concurrent requests: 25
Time taken per request: 4440 ms

For an apples-to-apples comparison, you should look at the "(mean, across all concurrent requests)" value. In case #2, that is only 177.6 ms. With multiple concurrent requests, of course each request is going to take longer to complete from start to finish (each thread is sharing system resources, so can't run as fast as when only a single thread is processing a request). However, because the requests are being processed in parallel, you have to divide the overall average time per request by the concurrency level to get the true time spent on each request (i.e., if 50 requests take an average of 4.4 seconds per request processed 25 at a time, then the true time spent on each request is 4.4 seconds / 25 = 177.6 ms). This is the number that should be compared to the single-request case. Alternatively, you can just compare the total test time in both cases (assuming you ran the same number of total requests in each case).
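The division can be checked directly against the ab numbers quoted earlier in the thread:

```python
# Checking the arithmetic against the ab output quoted in this thread
# (numbers copied from the 25-concurrency run above).
total_time_s = 8.882            # "Time taken for tests"
completed_requests = 50         # "Complete requests"
concurrency = 25
mean_per_request_ms = 4440.767  # "Time per request (mean)"

# ab's "(mean, across all concurrent requests)" is total time / requests:
across_all_ms = total_time_s / completed_requests * 1000
print(round(across_all_ms, 1))                      # → 177.6

# ...which equals the per-request mean divided by the concurrency level:
print(round(mean_per_request_ms / concurrency, 1))  # → 177.6
```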

Anthony

Anthony

Nov 26, 2013, 9:05:29 AM
to web...@googlegroups.com
On Tue, Nov 26, 2013 at 2:44 AM, Anthony <abas...@gmail.com> wrote:
So, is time for 25 concurrent requests significantly greater than 25*[time for a single request]?


No, time for 25 concurrent requests is smaller than 25*[time for a single request]. 
Avg. time taken per request is 4 to 5 times though. They should be processed in parallel not sequentially, right?

Ideally, I would expect time for 25 concurrent requests ~= 1*[time for a single request]. 

No, that's not how concurrency works. When you process 25 requests simultaneously, the computer doesn't suddenly gain 25x its normal computing power. Each request is handled in a separate thread, but all of those threads still have to share the same computing resources. So, each request will take longer to process than it would in the single-request case.

It doesn't look like concurrency is slowing things down for you. The problem is that you have 100 table definitions. Even with migrations turned off, table definitions take some time, so you should avoid having so many in each request. At a minimum, you should set lazy_tables=True. That will postpone much of the table definition code until each table is actually referenced in the code (so tables that are never referenced within a given request will not get fully defined). In addition, you can use conditional models to execute only some of the models on each request (i.e., those needed for the particular controller/function being called) and/or define models in modules and import where needed.
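The effect lazy_tables aims for can be sketched with a toy registry (an illustration of the behaviour, not web2py's actual implementation): each define_table call stores a cheap thunk, and the expensive construction runs only on first access.

```python
# Toy registry illustrating the lazy_tables idea; a sketch of the
# behaviour, not web2py's implementation.
class LazyTables:
    def __init__(self):
        self._thunks = {}   # name -> deferred definition
        self._tables = {}   # name -> fully built table
        self.defined = 0    # how many tables were actually built

    def define_table(self, name, builder):
        self._thunks[name] = builder  # cheap: nothing is built yet

    def __getattr__(self, name):
        # Called only for attributes not found normally, i.e. tables.
        if name in self._thunks:
            self._tables[name] = self._thunks.pop(name)()
            self.defined += 1
        return self._tables[name]

db = LazyTables()
for i in range(100):
    db.define_table("t%d" % i, lambda: object())  # 100 cheap thunks

_ = db.t7  # only the table actually touched pays the construction cost
```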

Anthony 

Saurabh Kumar

Nov 26, 2013, 9:32:37 AM
to web...@googlegroups.com
I am still unclear.

Have a look at the longest response time, which is 5.8 s compared to <1 s. This means that as soon as the website faces 25-odd users, the response time goes up six times, which is certainly not a good thing. I am sure CPU/memory/bandwidth are not the bottleneck here.

Comparing the total time taken in both cases won't be a fair comparison. One of them is doing things concurrently and the other sequentially. If both were expected to be the same, then why do things in parallel in the first place?

I understand that expecting time for 25 concurrent requests ~= 1*[time for a single request] is wrong because of the sharing of resources. But it should be considerably less than 25*[time for a single request]. And as 25 is a very small number, I'd expect it to be close to 1*[time for a single request]. The CPU time taken in processing a light request should be close to zero, and it should not be a bottleneck while processing 25 such requests in parallel. If the CPU is turning out to be the bottleneck here, there must be something wrong in what we are doing. A lot of table definitions is one such thing. And, just curious, why is a table definition expensive in the first place if migrate is set to False and there is no need for database interaction while defining tables?

Regarding the suggestion to optimize the table definitions: yes, it definitely makes sense and answers my doubts. I will make the changes and post how it turns out.
 


Ricardo Pedroso

Nov 26, 2013, 10:12:46 AM
to web...@googlegroups.com
How many cores do you have?
Are all the cores being used when you run the tests?
What webserver are you testing against?

Keep in mind that one core can only process one instruction (machine code, not python instruction) at a time.

Ricardo




Saurabh Kumar

Nov 26, 2013, 10:25:37 AM
to web...@googlegroups.com
On Tue, Nov 26, 2013 at 8:42 PM, Ricardo Pedroso <rmdpe...@gmail.com> wrote:
How many cores do you have?
Are all the cores being used when you run the tests?
What webserver are you testing against?


I have 4 cores and 16 GB RAM, running Apache. But I think this is irrelevant: Apache easily serves 1000 static-file requests (which are not processed through web2py) with 100 concurrent connections on the same machine, with the largest response time being <133 ms.

Concurrency Level:      100
Time taken for tests:   0.884 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      54234000 bytes
HTML transferred:       53846000 bytes
Requests per second:    1130.96 [#/sec] (mean)
Time per request:       88.421 [ms] (mean)
Time per request:       0.884 [ms] (mean, across all concurrent requests)
Transfer rate:          59898.81 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        3   19   6.5     19      34
Processing:    36   67  16.4     64     117
Waiting:       11   25   8.4     23      64
Total:         40   86  17.2     84     133

Percentage of the requests served within a certain time (ms)
  50%     84
  66%     94
  75%     98
  80%    101
  90%    107
  95%    115
  98%    125
  99%    128
 100%    133 (longest request)

Anthony

Nov 26, 2013, 10:30:04 AM
to web...@googlegroups.com
Are you comparing how long it takes to do 25 requests in parallel to how long it takes to do 25 requests serially, or are you comparing how long it takes to do 25 requests in parallel to how long it takes to do 1 request all by itself? You seem to be making the latter comparison and expecting the machine to be able to do 25 requests just as fast as it can do a single request. Instead, try comparing:

ab -n 50 -c 1 [url]

with

ab -n 50 -c 25 [url]

and observe the total time taken for the entire 50 requests in each case.

Regarding table definitions, yes, they do take up non-trivial amounts of CPU time. When you define a table, it calls the gluon.dal.Table.__init__ method -- check it out.

Anthony

Saurabh Kumar

Nov 26, 2013, 10:43:23 AM
to web...@googlegroups.com
Here is the output of the two commands, plus one with -n 500 -c 100.

I think, as you say, the problem is with the table definition code, which I am trying to sort out. If you have any other suggestions regarding the same (speeding up and concurrency), please let me know. I am really thankful to all of you for helping me out :)

I have one more question. Is it possible to lazy-load only some tables instead of all of them? Currently, what I found is that a lazy_tables=True argument can be passed to the DAL constructor. Is there a simple way to say: load these tables immediately and the rest lazily?

On Tue, Nov 26, 2013 at 9:00 PM, Anthony <abas...@gmail.com> wrote:
Are you comparing how long it takes to do 25 requests in parallel to how long it takes to do 25 requests serially, or are you comparing how long it takes to do 25 requests in parallel to how long it takes to do 1 request all by itself? You seem to be making the latter comparison and expecting the machine to be able to do 25 requests just as fast as it can do a single request. Instead, try comparing:

ab -n 50 -c 1 [url]


Concurrency Level:      1
Time taken for tests:   5.791 seconds
Complete requests:      50
Failed requests:        0
Write errors:           0
Total transferred:      558150 bytes
HTML transferred:       531500 bytes
Requests per second:    8.63 [#/sec] (mean)
Time per request:       115.813 [ms] (mean)
Time per request:       115.813 [ms] (mean, across all concurrent requests)
Transfer rate:          94.13 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    1   0.2      1       2
Processing:    79  115  90.7     86     629
Waiting:       79  114  90.8     86     629
Total:         80  116  90.7     87     631

Percentage of the requests served within a certain time (ms)
  50%     87
  66%     90
  75%     95
  80%     98
  90%    218
  95%    240
  98%    631
  99%    631
 100%    631 (longest request)

 
with

ab -n 50 -c 25 [url]


Concurrency Level:      25
Time taken for tests:   3.461 seconds
Complete requests:      50
Failed requests:        0
Write errors:           0
Total transferred:      558150 bytes
HTML transferred:       531500 bytes
Requests per second:    14.45 [#/sec] (mean)
Time per request:       1730.347 [ms] (mean)
Time per request:       69.214 [ms] (mean, across all concurrent requests)
Transfer rate:          157.50 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    2   0.5      1       3
Processing:   392 1214 599.6   1093    2531
Waiting:      392 1214 599.6   1093    2531
Total:        393 1216 599.6   1095    2533
WARNING: The median and mean for the initial connection time are not within a normal deviation
        These results are probably not that reliable.

Percentage of the requests served within a certain time (ms)
  50%   1095
  66%   1520
  75%   1650
  80%   1762
  90%   2118
  95%   2203
  98%   2533
  99%   2533
 100%   2533 (longest request)


ab -n 500 -c 100 [url]

Concurrency Level:      100
Time taken for tests:   29.829 seconds
Complete requests:      500
Failed requests:        1
   (Connect: 0, Receive: 0, Length: 1, Exceptions: 0)
Write errors:           0
Total transferred:      5581331 bytes
HTML transferred:       5314831 bytes
Requests per second:    16.76 [#/sec] (mean)
Time per request:       5965.761 [ms] (mean)
Time per request:       59.658 [ms] (mean, across all concurrent requests)
Transfer rate:          182.73 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    2   1.8      1       9
Processing:   379 5392 2761.9   5080   14603
Waiting:      379 5391 2761.6   5080   14603
Total:        387 5394 2761.1   5081   14605

Percentage of the requests served within a certain time (ms)
  50%   5081
  66%   6040
  75%   6914
  80%   7500
  90%   9714
  95%  10752
  98%  11747
  99%  12821
 100%  14605 (longest request)


 
and observe the total time taken for the entire 50 requests in each case.

Regarding table definitions, yes, they do take up non-trivial amounts of CPU time. When you define a table, it calls the gluon.dal.Table.__init__ method -- check it out.


Checking.

Anthony

Nov 26, 2013, 11:02:19 AM
to web...@googlegroups.com

ab -n 50 -c 1 [url]


Concurrency Level:      1
Time taken for tests:   5.791 seconds
Time per request:       115.813 [ms] (mean)
Time per request:       115.813 [ms] (mean, across all concurrent requests)
 
ab -n 50 -c 25 [url]

Concurrency Level:      25
Time taken for tests:   3.461 seconds
Time per request:       1730.347 [ms] (mean)
Time per request:       69.214 [ms] (mean, across all concurrent requests)

In both cases above, 50 requests were completed. In the case of concurrency=25, however, the 50 requests were completed faster (i.e., 3.46 seconds vs. 5.79 seconds in the serial case). So, with concurrent requests, the time per request dropped from 115ms to 69ms. Did you expect it to drop all the way down to 4.6ms (i.e., 115/25)?

Anthony

Saurabh Kumar

Nov 26, 2013, 11:15:27 AM
to web...@googlegroups.com
No, not all the way down to 4.6 ms certainly, but to some intermediate small value.

Look at this while serving static files:
Serving the same 50 requests at different concurrency levels.

Concurrency Level:      1
Time taken for tests:   0.288 seconds
Complete requests:      50
Failed requests:        0
Write errors:           0
Total transferred:      2711700 bytes
HTML transferred:       2692300 bytes
Requests per second:    173.85 [#/sec] (mean)
Time per request:       5.752 [ms] (mean)
Time per request:       5.752 [ms] (mean, across all concurrent requests)
Transfer rate:          9207.35 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    1   0.2      1       2
Processing:     4    5   1.0      4       8
Waiting:        1    2   1.1      2       6
Total:          5    6   1.0      5      10
WARNING: The median and mean for the processing time are not within a normal deviation
        These results are probably not that reliable.

Percentage of the requests served within a certain time (ms)
  50%      5
  66%      6
  75%      6
  80%      6
  90%      7
  95%      8
  98%     10
  99%     10
 100%     10 (longest request)



and 

Concurrency Level:      25
Time taken for tests:   0.049 seconds
Complete requests:      50
Failed requests:        0
Write errors:           0
Total transferred:      2711700 bytes
HTML transferred:       2692300 bytes
Requests per second:    1020.26 [#/sec] (mean)
Time per request:       24.504 [ms] (mean)
Time per request:       0.980 [ms] (mean, across all concurrent requests)
Transfer rate:          54036.05 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    4   3.8      2      15
Processing:     7   17   7.6     18      34
Waiting:        1    9   6.8      6      23
Total:          9   21   7.7     23      36

Percentage of the requests served within a certain time (ms)
  50%     23
  66%     26
  75%     27
  80%     27
  90%     30
  95%     35
  98%     36
  99%     36
 100%     36 (longest request)


I feel things will get much better once I optimize table definitions.

Anthony

Nov 26, 2013, 11:35:05 AM
to web...@googlegroups.com

In both cases above, 50 requests were completed. In the case of concurrency=25, however, the 50 requests were completed faster (i.e., 3.46 seconds vs. 5.79 seconds in the serial case). So, with concurrent requests, the time per request dropped from 115ms to 69ms. Did you expect it to drop all the way down to 4.6ms (i.e., 115/25)?


No, not all the way down to 4.6 ms certainly, but to some intermediate small value.

Look at this while serving static files:

Serving static files is very different from having a web framework process dynamic requests. 

Ricardo Pedroso

Nov 26, 2013, 11:36:49 AM
to web...@googlegroups.com
On Tue, Nov 26, 2013 at 8:42 PM, Ricardo Pedroso <rmdpe...@gmail.com> wrote:
How many cores do you have?
Are all the cores being used when you run the tests?
What webserver are you testing against?


I have 4 cores and 16 GB RAM, running Apache. But I think this is irrelevant: Apache easily serves 1000 static-file requests (which are not processed through web2py) with 100 concurrent connections on the same machine, with the largest response time being <133 ms.


Everything is relevant.

Assuming that you are using an OS with sendfile support:

Webservers are optimized to serve static content; you cannot expect to get anywhere near that with dynamic content.

When serving static files (in your case, always the same file), the kernel will cache the file, so there is no disk access.
Also, to serve the file you will not have the typical user-space system calls (open/read/write); you will have only one system call, sendfile, which works in kernel space.


from https://httpd.apache.org/docs/2.2/misc/perf-tuning.html:

If your OS supports a sendfile(2) system call, make sure you install the release and/or patches needed to enable it. (With Linux, for example, this means using Linux 2.4 or later. For early releases of Solaris 8, you may need to apply a patch.) On systems where it is available, sendfile enables Apache 2 to deliver static content faster and with lower CPU utilization.
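For the curious, here is a rough, Linux-only sketch of what sendfile(2) looks like from user space, via Python's os.sendfile wrapper (a demonstration of the system call, not of Apache's internals; the payload is kept small so the socket buffer absorbs it without a concurrent reader):

```python
# Linux-specific sketch: os.sendfile pushes a file's bytes into a
# socket in kernel space, with no userspace read/write loop.
import os
import socket
import tempfile

payload = b"x" * 8192
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

sender, receiver = socket.socketpair()
with open(path, "rb") as src:
    offset = 0
    while offset < len(payload):
        # sendfile(out_fd, in_fd, offset, count) -> bytes transferred
        offset += os.sendfile(sender.fileno(), src.fileno(),
                              offset, len(payload) - offset)
sender.close()

received = bytearray()
while len(received) < len(payload):
    received += receiver.recv(65536)
receiver.close()
os.unlink(path)
```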

What I can suggest right now, assuming that you have the database and webserver on the same machine,
is to put the database on another machine.

If you are doing the ab tests in the same machine, do it in another one.


Ricardo

Anthony

unread,
Nov 26, 2013, 11:40:17 AM11/26/13
to web...@googlegroups.com
I think as you say the problem is with table definition code, which I am trying to sort out. If you have any other suggestions regarding the same (speeding up and concurrency) please let me know. I am really thankful to all of you for helping me out :)


Also, see http://web2py.com/books/default/chapter/29/04/the-core#Workflow for details on conditional models as well as http://web2py.com/books/default/chapter/29/04/the-core#markmin_response_models_to_run for more fine-grained control over which models get run.
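Roughly, response.models_to_run is a list of regexes matched against model file paths, and only the matching model files are executed for a given request. A stdlib-only sketch of that filtering idea (the function and file names below are illustrative, not web2py internals):

```python
import re

def select_models(patterns, model_paths):
    """Keep only model files whose path matches one of the patterns
    (a sketch of what response.models_to_run filtering amounts to)."""
    compiled = [re.compile(p) for p in patterns]
    return [path for path in model_paths
            if any(rx.search(path) for rx in compiled)]

models = ['models/db.py', 'models/menu.py', 'models/orders.py']
print(select_models([r'db\.py$'], models))  # → ['models/db.py']
```

With 100 table definitions, running only the model files a request actually needs avoids paying the full definition cost on every hit.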

And of course, there is the DAL(..., lazy_tables=True) option.
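A minimal sketch of that option as it might appear in a model file (the connection string and table name here are illustrative, not the actual project code; in web2py model files DAL and Field are already in scope):

```python
# models/db.py -- sketch only.
# With lazy_tables=True, define_table() just records the definition;
# the Table object is built on first access, so a request that touches
# only a few of the ~100 tables skips most of the definition cost.
db = DAL('mysql://user:pass@localhost/mydb', lazy_tables=True)

db.define_table('person',        # cheap: stores a lazy definition
                Field('name'),
                Field('email'))

# The real table construction happens here, on first attribute access:
rows = db(db.person.name == 'Max').select()
```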

Anthony

Ricardo Pedroso

unread,
Nov 26, 2013, 11:44:11 AM11/26/13
to web...@googlegroups.com
And to see how much you lose when serving dynamic content versus static content, just create a web2py
application with only one file:

models/0.py:

raise HTTP(200, 'OK')



--
You received this message because you are subscribed to the Google Groups "web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py+un...@googlegroups.com.

Saurabh Kumar

unread,
Nov 26, 2013, 11:44:48 AM11/26/13
to web...@googlegroups.com
I am NOT comparing static files and dynamic requests. I was just illustrating the behaviour I am expecting with respect to concurrency. I understand to some extent how the Linux kernel and Apache work.

All I am trying to do is serve 25 light concurrent requests [in which the controller looks like the following] with decent response time. Currently the response time degrades to 5-6 seconds. Is that an absurd expectation on a 4-core, 16 GB machine?

def controller():
     return

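For what it's worth, an ab-style measurement like this can be reproduced with only the Python standard library; the tiny threaded server below just stands in for the app (point url at the real web2py instance instead; all names here are illustrative):

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'OK'
        self.send_response(200)
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = ThreadingHTTPServer(('127.0.0.1', 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = 'http://127.0.0.1:%d/' % server.server_address[1]

def fetch(durations):
    t0 = time.time()
    urllib.request.urlopen(url).read()
    durations.append(time.time() - t0)  # list.append is thread-safe under the GIL

durations = []
workers = [threading.Thread(target=fetch, args=(durations,)) for _ in range(25)]
wall0 = time.time()
for w in workers:
    w.start()
for w in workers:
    w.join()
print('%d requests, wall %.3fs, mean per-request %.3fs'
      % (len(durations), time.time() - wall0, sum(durations) / len(durations)))
server.shutdown()
```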
Saurabh



Leonel Câmara

unread,
Nov 26, 2013, 11:48:41 AM11/26/13
to web...@googlegroups.com
100 table definitions make it look like this project's scope has grown a bit too big. Have you considered breaking the project into more than one web2py application?

Saurabh Kumar

unread,
Nov 26, 2013, 11:50:25 AM11/26/13
to web...@googlegroups.com
On Tue, Nov 26, 2013 at 10:14 PM, Ricardo Pedroso <rmdpe...@gmail.com> wrote:
And to see how much you lose when serving dynamic content versus static content, just create a web2py
application with only one file:

models/0.py:

raise HTTP(200, 'OK')


Yes, I tried something similar and it worked well. It's all about loading models, which I am going to optimize now. My original question was why the models load so slowly. I have got some useful input on that part :)
 



Ricardo Pedroso

unread,
Nov 26, 2013, 11:52:58 AM11/26/13
to web...@googlegroups.com
On Tue, Nov 26, 2013 at 4:44 PM, Saurabh Kumar <saurab...@gmail.com> wrote:
I am NOT comparing static files and dynamic requests. I was just illustrating the behaviour I am expecting with respect to concurrency. I understand to some extent how the Linux kernel and Apache work.
 
 
All I am trying to do is serve 25 light concurrent requests [in which the controller looks like the following] with decent response time. Currently the response time degrades to 5-6 seconds. Is that an absurd expectation on a 4-core, 16 GB machine?

def controller():
     return


As an experiment, just lower the number of threads that Apache spawns to, e.g., 4, and see if you notice any difference.


Ricardo

Saurabh Kumar

unread,
Nov 26, 2013, 11:54:35 AM11/26/13
to web...@googlegroups.com
I think I should first try lazy loading and the other suggestions. I hope it works out that way.


On Tue, Nov 26, 2013 at 10:18 PM, Leonel Câmara <leonel...@gmail.com> wrote:
100 table definitions make it look like this project's scope has grown a bit too big. Have you considered breaking the project into more than one web2py application?


Saurabh Kumar

unread,
Nov 26, 2013, 11:55:13 AM11/26/13
to web...@googlegroups.com
Yes, makes sense. Will try :)