Questions for large deployment

Anand Vaidya

Jul 20, 2009, 11:54:09 PM
to web2py-users
Hi

After a couple of web2py projects, I am confident of coding a fairly
big app in web2py.

My previous projects did not need any database (we had to use
flatfiles), the new project is also similar. I intend to bypass the
models etc completely.

The app is likely to be used in a corporate setting with ~ 8000 users,
and at least 1000-2000 simultaneous users.

The users authenticate to an LDAP server.

The app is not computationally intensive.

It queries another service and displays results.

No SQL DB is required.

It will most likely sit behind a few Apache 2.x front-end servers.

I'd like to know:

- Are there any large web2py installations that I can quote as an
example?

- How are the issues of caching (say, rendered pages) handled? I have
done a few Drupal sites and can see the performance effects of caching
very clearly. IIRC only Django has caching in the Python world?

- Has anyone done any work with web2py in a cluster (similar to a
Tomcat cluster behind mod_jk)? That is, multiple machines running
web2py with the session data synced, etc. (I can put the session info
on a shared filesystem, though.)

Regards
Anand

Bottiger

Jul 21, 2009, 1:20:27 AM
to web2py-users
If it is truly not computationally intensive, and does not even use a
database, it should not be a problem.

I have benchmarked Web2Py on the static welcome page at 700 requests/
second with a concurrency level of 50.

To increase the level of concurrency (if you have additional CPU
cores), you should increase the number of Web2Py processes.

"~ 8000 users, and at least 1000-2000 simultaneous users."

This is not really a large installation if it doesn't use a database.

"How are the issues of caching (say rendered pages) handled? I have
done a few Drupal sites and can see the performance effects of caching
very clearly. IIRC only Django has caching in the python world?"

Drupal, Django, and Web2Py have equivalent caching mechanisms. Any
external caching mechanism you have seen with Drupal should also be
usable with Web2Py or Django.

"I can put the session info in a shared FS though"

You can either do that or use a database for sessions.
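
For what it's worth, a minimal sketch of the database-backed session
option in a model file (the connection string is only an example;
session.connect with db= stores sessions in a database table instead
of the sessions/ folder):

# models/db.py -- illustrative only
db = DAL('postgres://user:password@localhost/mydb')
session.connect(request, response, db=db)  # sessions now stored in the database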

mdipierro

Jul 21, 2009, 9:52:16 AM
to web2py-users
> - Are there any large web2py installations that I can quote as an
> example?

Not that I know of that handle >1000 requests/second.

> - How are the issues of caching (say, rendered pages) handled? I have
> done a few Drupal sites and can see the performance effects of caching
> very clearly. IIRC only Django has caching in the Python world?

If you use multiple installations behind a load balancer, I suggest you
use the "pound" load balancer to keep sessions sticky. In that case
the different processes do not need to share any data.

> - Has anyone done any work with web2py in a cluster (similar to a
> Tomcat cluster behind mod_jk)? That is, multiple machines running
> web2py with the session data synced, etc. (I can put the session info
> on a shared filesystem, though.)

If you need sessions and you need them synced, I suggest you share
the sessions folder.

Massimo

Alex Fanjul

Jul 21, 2009, 9:40:06 PM
to web...@googlegroups.com
Hello Massimo (and all), these days I'm reading about horizontally scaled architectures, key/value and graph DBs, etc., and the awakening of the cloud computing environment.
In my latest reading I saw a Redis benchmark of about 50 to 100 thousand req/sec on a standard Linux box.

I know those figures are due to the key/value DB architecture (and they are really incredible), but:
 - what is really limiting web2py to 1000 req/sec? cherrypy/apache? mysql/postgres? wsgi/fastcgi? the web2py framework? python?
 - what do you think the upper limit (req/sec) would be in the best production environment (a great Linux server or servers, apache/cherokee?, the best connection)?
 - as a matter of curiosity, have you ever thought about implementing an API for any of such databases? Redis? Tokyo? CouchDB?

regards,
alex f

P.S. As always, I'm sorry for my poor English.
--
Alejandro Fanjul Fdez.
alex....@gmail.com
www.mhproject.org

Fran

Jul 22, 2009, 3:51:54 AM
to web2py-users
On Jul 22, 2:40 am, Alex Fanjul <alex.fan...@gmail.com> wrote:
>   - As a matter of curiosity, have you ever thought about implementing the
> API for any of such databases? Redis? Tokyo? CouchDB?

Yes, the new DAL is looking to make it easier to add more alternative
DBs.
See this thread for Yarko's thoughts on this:
http://groups.google.com/group/web2py/browse_thread/thread/2bd0d613079a600a/45881995869439cf?#45881995869439cf

At this stage no specific DB is being looked at (beyond GAE); the aim
is just to make it as easy to add new flavours in future as it
currently is for relational SQL...

F

mdipierro

Jul 22, 2009, 4:03:37 AM
to web2py-users
This is a complex issue that spans different topics: speed,
efficiency, scalability.

I am not aware of any major bottleneck in web2py, except for the
database (not the DAL, the actual database), so it is efficient.
There are many little tricks you can use to speed applications up
further (see the sketch below):
- use connection pooling
- run your app bytecode compiled (press the button)
- move as much code as you can to modules instead of models
- discard sessions unless they have been modified
- store sessions in a memory-mapped file
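
For illustration, a rough sketch of the first and fourth tricks (the
connection string and pool size are placeholders, not recommendations):

# models/db.py -- connection pooling (placeholder connection string)
db = DAL('postgres://user:password@localhost/mydb', pool_size=20)

# in a controller action that never modifies the session:
def index():
    session.forget()   # tell web2py not to save the session for this request
    return dict(message='hello')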

How fast it is (with or without optimizations) depends on the
architecture. Is there a machine that can give you 1000 requests/
second? I do not know. It is possible. On my virtual machine I get
about 100.

There are some things that definitely will NOT help:
- a multicore machine, because the Python interpreter cannot use
multiple cores efficiently
- a key/value database. This helps with scalability (i.e. running lots
of concurrent servers) but does not necessarily speed up a single
server. I will actually argue that most of the map/reduce DBs out
there are slower than PostgreSQL.

Massimo

Bottiger

Jul 22, 2009, 4:22:28 AM
to web2py-users
"There are some things that definitely will NOT help: a multicore
machine, because the Python interpreter cannot use
multiple cores efficiently"

If you use the prefork flup server (not included with Web2Py), each
request is handled by a separate process, so this bypasses the GIL.
It makes a huge difference on my 8-core server. Presumably Jython,
which is GIL-less, would yield the same performance on a multicore
machine.
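
For anyone who wants to try it, a minimal handler along these lines
(just a sketch, not an official web2py script; the socket address is a
placeholder and flup must be installed separately):

# fcgi_handler.py -- serve web2py over FastCGI with flup's preforking
# server, so each request is handled by a separate OS process and the
# GIL is no longer a bottleneck.
import gluon.main
from flup.server.fcgi_fork import WSGIServer

WSGIServer(gluon.main.wsgibase,
           bindAddress=('127.0.0.1', 9000)).run()  # point Apache/lighttpd here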

"I will actually argue most of the map/reduce DB out there are slower
then postresql. "

For minimal latency this is probably true, but I am willing to bet
that the latency grows far more slowly in a map/reduce DB as the
number of simultaneous connections rises.

mdipierro

Jul 22, 2009, 4:48:13 AM
to web2py-users
With 8 cores and Flup you can probably get close to 1000 requests/
second.
If you can run any tests, let us know what you get.

Massimo

Alex Fanjul

Jul 22, 2009, 5:23:02 AM
to web...@googlegroups.com
Thanks Massimo and Bottiger, it is good to learn some things I didn't know, like the tricks to speed up an app and Python's efficiency on multicore.
Excuse my ignorance, but let me enquire a bit more...
  • 10x more requests just by installing a simple package (flup)? Or is it a special server? Or is it superman?
    • Bottiger, I tried to find some information and documentation about flup, but apart from the official Trac wiki it is lacking.
    • Could you give us some introduction to Flup, the prefork server (or whatever it is) and how to install it to improve performance?
  • So (according to what I've been reading) I conclude that a high-performance setup for a stressed production environment would be made up of:
    • a multi-core machine +
    • the Flup server packages +
    • Cherokee +
    • Postgres +
    • web2py tricks
  • Massimo, in an old thread I read something about a prebuilt VMware image with Cherokee and web2py, ready to install; is there any news about this?
    • In terms of "marketing", I think that if we could upload a VMware turnkey appliance there (like the Django one), it would increase our presence.
Thanks,
alex f

Anand Vaidya

Jul 22, 2009, 9:16:42 AM
to web2py-users
Hi All,

Thanks for the enlightening discussion. The web2py community is awesome!

I am hoping to create a rudimentary implementation and perform load
testing etc. My plan is to spread out over multiple machines (2 for a
start) by, say, end of August. I am planning to use apache+mod_wsgi.

I will share the lessons I learn on this forum and this wiki page:
https://mdp.cti.depaul.edu/wiki/default/page/8692b276-3ddf-4038-9525-bc51060ad1dc


Regards
Anand


what_ho

Aug 23, 2009, 2:55:02 PM
to web2py-users
Intrigued by the recommendation to put code in modules instead of
models if possible.

At present I have db.define_table .. method calls in a model file. The
database structure will stay the same between releases, so it does not
feel optimal at present to have such definitions run on every page
request. Same for the auth and mail objects in my model file, these
are created each page request at present.

Would it be possible to define db, auth and mail just the once at a
module level and then refer to these shared objects between page
requests? If per-request copies are required, would a .copy operation
on a template mail/auth/db object run faster than just defining these
in the model?

I am going to have a play with this. Very impressed with web2py so
far, it is great how code changes I make are picked up on the fly by
default. But going for the nth degree of performance (just for the
hell of it) I'm interested to see if it is possible to restrict things
like running db.define_tables to a startup task or manual admin
operation instead of on each page request, and in general pull logic
out of the models for anything that is static or changes infrequently.

If anyone else has experience with putting their models on a diet and
shifting code to modules, I would be interested to hear. Cheers.

Yarko Tymciurak

Aug 23, 2009, 4:04:36 PM
to web...@googlegroups.com
On Sun, Aug 23, 2009 at 1:55 PM, what_ho <al...@viovi.com> wrote:

> Intrigued by the recommendation to put code in modules instead of
> models if possible.
>
> At present I have db.define_table .. method calls in a model file. The
> database structure will stay the same between releases, so it does not
> feel optimal at present to have such definitions run on every page
> request. Same for the auth and mail objects in my model file, these
> are created each page request at present.

This defines the structure of the interface to the tables in question (it does not create the tables in the database);

If you will have rare or no access to the database, or perhaps many tables, of which you usually only access a very small percentage, then you could put this in modules and import only the tables you need to reference (e.g. read, write, create a query, login, add a user, add ... etc., etc.).
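
For illustration, a minimal sketch of that approach (the module and
table names are made up; the idea is that a model file or controller
passes its db instance in and only the tables it needs get defined):

# applications/yourapp/modules/tables.py  (hypothetical module)
from gluon.sql import SQLField

def define_person(db, migrate=False):
    # define only the table(s) the current request actually needs
    db.define_table('person',
        SQLField('name', 'string'),
        SQLField('email', 'string'),
        migrate=migrate)
    return db.person

# then, in models/db.py or a controller:
#     import applications.yourapp.modules.tables as tables
#     tables.define_person(db)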

If you will have ONLY ONE application running in your web installation, then your other alternative is to put your data table definitions file in gluon, and import it from main.py.   This way, it will be defined upon server startup and be available to all request threads.   In fact, you could do this kind of "quick hack" to compare performance.

I think in most cases, the performance difference will not be significant, but I look forward to what you find.

- Yarko

 

what_ho

Aug 24, 2009, 6:07:49 AM
to web2py-users
I did some quick and dirty performance tests to measure the time taken
by db.define_table calls.

Test was as follows:
- Run web2py locally, with built-in web server
- Create new application 'perftest' in web2py admin interface
- Add 5 dummy table definitions to db.py, uncomment mail and auth
settings (see copy of my db.py at end of this message)
- Run the HP tool httperf to access the perftest home page 1000 times
using the following command:
httperf --hog --server=localhost --port 8000 --uri /perftest/default/index --num-conns=1000
- Time tests with db.py as defined below, with table definitions
removed, and with the db.py file removed completely

To remove the table definitions I commented out the line
'auth.define_tables()' and all lines starting 'db.define_table' in
db.py

For reference, I am running Ubuntu Linux (Karmic alpha), Intel Dual
Core 2.0GHz, 4GB RAM. The httperf version used is 0.9.0 (available
from the Synaptic package manager); web2py version 1.66.2, Python 2.5.

Results - time to receive 1000 pages from web2py built-in webserver:

Not compiled:
Full db.py - 22.4s
No table defs in db.py - 11.5s
No db.py at all - 9.6s

Compiled:
Full db.py - 16s
No table defs - 5.7s
No db.py at all - 4.1s

This does appear to show there would be a performance benefit to
sharing the db object definitions between page requests.

Yarko - I like your suggestion of putting table definitions in gluon
just as a quick hack to compare performance - I will try this next.

Please find the source of the db.py file I used below:

# coding: utf8

if request.env.web2py_runtime_gae:            # if running on Google App Engine
    db = DAL('gae')                           # connect to Google BigTable
    session.connect(request, response, db=db) # and store sessions and tickets there
else:                                         # else use a normal relational database
    db = DAL('sqlite://storage.sqlite')       # if not, use SQLite or other DB

from gluon.tools import *
auth=Auth(globals(),db)                       # authentication/authorization
auth.settings.hmac_key='6944f00d-1758-41e4-8fdb-625b0be17e1a'
auth.define_tables()                          # creates all needed tables
crud=Crud(globals(),db)                       # for CRUD helpers using auth
service=Service(globals())                    # for json, xml, jsonrpc, xmlrpc, amfrpc
crud.settings.auth=auth                       # enforces authorization on crud
mail=Mail()                                   # mailer
mail.settings.server='smtp.gmail.com:587'     # your SMTP server
mail.settings.sender='y...@gmail.com'         # your email
mail.settings.login='username:password'       # your credentials or None
auth.settings.mailer=mail                     # for user email verification
auth.settings.registration_requires_verification = True
auth.settings.registration_requires_approval = True
auth.messages.verify_email = \
    'Click on the link http://.../user/verify_email/%(key)s to verify your email'

db.define_table('perftest1',
    SQLField('string1','string'),
    SQLField('text1','text'),
    SQLField('blob1','blob'),
    SQLField('password1','password'),
    SQLField('upload1','upload'),
    SQLField('boolean1','boolean'),
    SQLField('integer1','integer'),
    SQLField('double1','double'),
    SQLField('date1','date'),
    SQLField('time1','time'),
    SQLField('datetime1','datetime'))

db.define_table('perftest2',
    SQLField('string2','string'),
    SQLField('text2','text'),
    SQLField('blob2','blob'),
    SQLField('password2','password'),
    SQLField('upload2','upload'),
    SQLField('boolean2','boolean'),
    SQLField('integer2','integer'),
    SQLField('double2','double'),
    SQLField('date2','date'),
    SQLField('time2','time'),
    SQLField('datetime2','datetime'))

db.define_table('perftest3',
    SQLField('string3','string'),
    SQLField('text3','text'),
    SQLField('blob3','blob'),
    SQLField('password3','password'),
    SQLField('upload3','upload'),
    SQLField('boolean3','boolean'),
    SQLField('integer3','integer'),
    SQLField('double3','double'),
    SQLField('date3','date'),
    SQLField('time3','time'),
    SQLField('datetime3','datetime'))

db.define_table('perftest4',
    SQLField('string4','string'),
    SQLField('text4','text'),
    SQLField('blob4','blob'),
    SQLField('password4','password'),
    SQLField('upload4','upload'),
    SQLField('boolean4','boolean'),
    SQLField('integer4','integer'),
    SQLField('double4','double'),
    SQLField('date4','date'),
    SQLField('time4','time'),
    SQLField('datetime4','datetime'))

db.define_table('perftest5',
    SQLField('string5','string'),
    SQLField('text5','text'),
    SQLField('blob5','blob'),
    SQLField('password5','password'),
    SQLField('upload5','upload'),
    SQLField('boolean5','boolean'),
    SQLField('integer5','integer'),
    SQLField('double5','double'),
    SQLField('date5','date'),
    SQLField('time5','time'),
    SQLField('datetime5','datetime'))

mdipierro

Aug 24, 2009, 6:19:07 AM
to web2py-users
Thanks for this test. Would you also be able to test them with

db.define_table(....,migrate=False)

?

which is what people should do in production.

Massimo

what_ho

Aug 24, 2009, 6:51:16 AM
to web2py-users
Sure thing -

I modified the auth line to auth.define_tables(migrate=False), and
added migrate=False as the last parameter of each of the 5 test tables.

So with migrate=False added:

full db.py, not compiled - 18.2s
full db.py, compiled - 12.1s

compared to previous results with migration on:
not compiled: 22.4s
compiled: 16s

- Alex

mdipierro

Aug 24, 2009, 7:03:01 AM
to web2py-users
Thank you!

Just to clarify: 12ms/request includes the time to define all system
tables and your tables, create a new session, and execute the
controller, action, view and layout, not just the time to execute
db.py. Correct?

Can you post your results on AlterEgo? Or would you mind if I post
them and quote you as the author of the benchmarks?

Massimo

what_ho

Aug 24, 2009, 8:04:26 AM
to web2py-users
No problem!

Correct - 12ms per request is the time I recorded to completely
process one webpage request, from the point at which the HTTP request
headers are sent to the server to the last byte of the HTTP 200
response being successfully received by the client.

This is using all default settings for session (enabled) etc. and the
default controller and 'Hello World' view one gets when using the
admin webpages to create a new application. The only addition to the
defaults are the 5 dummy table definitions.

I have added this to alter ego here:
http://www.web2py.com/AlterEgo/default/show/248

(Some of the bullet point lists did not come out right in the markdown
in this article, I will need to go edit at a later date)

Here is a full output from the httperf load test tool for the last
test, compiled with migrate=False on:

Total: connections 1000 requests 1000 replies 1000 test-duration 12.119 s

Connection rate: 82.5 conn/s (12.1 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 0.1 avg 12.1 max 75.3 median 10.5 stddev 6.0
Connection time [ms]: connect 0.9
Connection length [replies/conn]: 1.000

Request rate: 82.5 req/s (12.1 ms/req)
Request size [B]: 84.0

Reply rate [replies/s]: min 79.4 avg 84.1 max 88.8 stddev 6.6 (2 samples)
Reply time [ms]: response 11.2 transfer 0.1
Reply size [B]: header 391.0 content 5619.0 footer 2.0 (total 6012.0)
Reply status: 1xx=0 2xx=1000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 1.46 system 5.97 (user 12.1% system 49.2% total 61.3%)
Net I/O: 491.1 KB/s (4.0*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

mdipierro

Aug 24, 2009, 8:14:04 AM
to web2py-users
Fantastic. Email me for the edit code.

Iceberg

Aug 24, 2009, 8:28:21 AM
to web2py-users
Thanks for the interesting test, Pearson (What_Ho).

Besides trying Yarko's "putting table definitions in gluon just as
a quick hack to compare performance", would you mind trying a more
general way?

A. Put the table definitions into applications/yourapp/modules/
yourtable.py, and then just import them from your db.py. I assume
this is enough to bypass the source code parsing overhead on each
request, although the db objects are still rebuilt in every request.

B. If you do plan A, you might also like to do a plan B: just
compile your app with the full db.py, then benchmark again. Maybe A
and B give similar results.

C. I am not sure yet, but if plan A helps a bit, we can again use the
cache trick to reuse the same db object for every request. Something
like this:
from applications.yourapp.modules.yourtable import _init_db
db = cache.ram('my_db', lambda: _init_db(), 99999999)
Can it give another boost?

Sincerely,
Iceberg

mdipierro

Aug 24, 2009, 8:38:46 AM
to web2py-users
I rearranged it a bit. Hope it is ok.

http://www.web2py.com/AlterEgo/default/show/248

This is very valuable. It would be nice if somebody would post a
reddit link about this.

Massimo

mdipierro

Aug 24, 2009, 8:43:29 AM
to web2py-users
It would be nice to try the same benchmark with postgresql (running
locally) and connection pools enabled

DAL(...,pool_size=100)

Connection pools are ignored with SQLite because it does not support
concurrent requests.

Also notice that because of the GIL this benchmark is only using one
of the two cores.

I added some comments about this too.



what_ho

Aug 30, 2009, 2:45:54 PM
to web2py-users
Hi all - just had time this weekend to look further into this.

Thanks to Iceberg, Yarko and Massimo for the suggestions. I have tried
putting the table definitions into a module, and have also run some
timing tests against a Postgres database.

As a note: my previous tests, which came out at a 12ms page response,
now come out at 10ms on my computer. I think something must have
changed in my configuration just from picking up the latest Ubuntu
patches etc.; anyway, all the figures below are just a rough guide.

Here are the changes I made:

db and mail object creation is now in modules/application.py.
These objects are created once and then shared.

Setting 'ENABLE_APPLICATION_SCOPE = False' in application.py switches
back to the standard behaviour (separate db and mail objects on every
page request).

The auth, crud and service objects depend on the current page request,
so I left these in models/db.py.

One exception is auth.define_tables - this is run in
modules/application.py. It usually runs only once; if
ENABLE_APPLICATION_SCOPE is False, auth.define_tables runs on
every page request.

The Postgres version is 8.4.0-2, installed on the same machine web2py
is running on. The computer set-up is the same as for the previous
test (except for the Ubuntu patches mentioned above); the test is to
request the index page created by default for a new web2py application.

Running 1000 page requests with httperf tool:

postgres, compiled, migrate=false, application scope objects disabled,
pool_size parameter not set:
58.2ms average page response time

postgres, compiled, migrate=false, application scope objects disabled,
pool_size=100:
10.9ms average page response time

postgres, compiled, migrate=false, application scope objects
enabled,pool_size=100:
5.2ms average page response time

sqlite, compiled, migrate=false, application scope objects disabled:
10.1ms average page response time

sqlite, compiled, migrate=false, application scope objects enabled:
5.4ms average page response time

---------

My summary:

Sharing objects like the database model definition at the website
application scope does give a speed boost, but it is not as big a
difference as I thought - I have been pleasantly surprised that the
existing per-webpage request evaluation of models was already pretty
fast as it is.

The difference in the Postgres timings when pool_size is not set at
all is significant: this almost warrants defaulting the pool size for
this database to 100 anyway, or at least writing a friendly warning to
the log suggesting that a pool size parameter be set.

I can see some downsides when attempting to use objects at application
scope:

- I'm not yet sure objects like DAL and Mail will be happy being
shared at the application level. Could well be locking / threading
issues, would need to investigate further

- In general in Python, module source code changes do not get picked
up automatically. During development I would work around this with the
reload(...) statement

- My example shared module probably needs additional try..except
statements to handle the case where it fails to run, so that every
webpage request keeps trying to create the shared objects until they
have been successfully created.

- There may be good reason to refresh these shared objects
periodically anyway, maybe tie the shared object creation to a cron
job.

If I could be confident there are no threading issues then using
shared objects could be an option. But until then I am now happier
sticking with the standard setup - putting maybe quite complex model
structures in the regular model folder, these will be evaluated on
every webpage request, but the site will still run plenty fast - good
enough for me anyway!

Here is my updated db.py and module source code at the end of this
message. Just add both to a blank application created by the web2py
admin.

Cheers,
- Alex

--
db.py model file, pulling in shared 'application scope' objects:

# coding: utf8
from gluon.tools import *
import applications.perftest.modules.application as app

# Get objects shared between page requests
app.Helper.get_application_scope(globals())

# Rest of this file sets up per-page model state:
if not globals().has_key('auth'):
    auth=Auth(globals(),db)

auth.settings.hmac_key='6944f00d-1758-41e4-8fdb-625b0be17e1a'
auth.settings.mailer=mail                    # for user email verification
auth.settings.registration_requires_verification = True
auth.settings.registration_requires_approval = True
auth.messages.verify_email = \
    'Click on the link http://.../user/verify_email/%(key)s to verify your email'
crud=Crud(globals(),db)
crud.settings.auth=auth

# Connect gae session if required
if request.env.web2py_runtime_gae:
    session.connect(request, response, db=db)

# Set up service for this request
from gluon.tools import *
service=Service(globals())

----------------------------------------
application.py module file:

# coding: utf8

from gluon.storage import Storage
from gluon.sql import SQLDB, SQLField, DAL, Field
from gluon.tools import *
import thread

ENABLE_APPLICATION_SCOPE = True

class Helper(object):

    singleton = None

    def __init__(self, environment):
        request = environment['request']
        self.storage = Storage()
        self.storage.db = self.get_db(request)
        self.storage.mail = self.get_mail()

    @staticmethod
    def get_application_scope(globals):

        if ENABLE_APPLICATION_SCOPE:

            if Helper.singleton is None:
                locker = thread.allocate_lock()
                locker.acquire()
                Helper.singleton = Helper(globals)
                Helper.singleton.init_auth(globals)
                locker.release()

            globals.update(Helper.singleton.storage)
        else:
            request_scope = Helper(globals)
            request_scope.storage.auth = request_scope.init_auth(globals)
            globals.update(request_scope.storage)

    def get_db(self, request):

        if request.env.web2py_runtime_gae:
            db = DAL('gae')
        else:
            #db = DAL('sqlite://storage.sqlite')
            db = DAL('postgres://testuser:test@localhost:5432/web2py',
                     pool_size=100)

        for i in range(1, 5):
            db.define_table('perftest' + str(i),
                SQLField('string1','string'),
                SQLField('text1','text'),
                SQLField('blob1','blob'),
                SQLField('password1','password'),
                SQLField('upload1','upload'),
                SQLField('boolean1','boolean'),
                SQLField('integer1','integer'),
                SQLField('double1','double'),
                SQLField('date1','date'),
                SQLField('time1','time'),
                SQLField('datetime1','datetime'), migrate=False)

        return db

    def get_mail(self):
        mail = Mail()                               # mailer
        mail.settings.server = 'smtp.gmail.com:587' # your SMTP server
        mail.settings.sender = 'y...@gmail.com'     # your email
        mail.settings.login = 'username:password'   # your credentials or None
        return mail

    # Just used to initialise auth database tables
    def init_auth(self, globals):
        auth = Auth(globals, self.storage.db)
        auth.define_tables(migrate=False)
        return auth

what_ho

Aug 30, 2009, 2:53:54 PM
to web2py-users
Hi Yarko -

Just some feedback for you - from my tests I found, with all
optimisations enabled (migrate=False, Python files compiled etc.),
that running in the database model definitions does not take a lot of
time, just as you suggested.

Running the model definitions in only once takes even less time, but I
still have reservations about whether such objects would behave
themselves, when all the code that creates them currently expects
those objects to last for just one page request.

With regards to your idea of splitting up a large database definition
into chunks and only loading the chunk related to the particular page
request: that's a good idea, but given the speed of the definition
evaluation I now don't think, in most cases, it would be worth it for
performance anyway. The best reason, if any, might be just readability
for other developers - breaking a complex database down into logical
components.

Cheers,
-Alex


what_ho

Aug 30, 2009, 3:19:34 PM
to web2py-users
Hi Iceberg -

With regards to bypassing the source code parsing overhead: I think
that overhead only occurs when the site is not compiled. Once you
compile a site in the web2py admin pages, the .pyc bytecode files are
created for the models, and then on each webpage request it looks like
these .pyc files are loaded directly into the prepared environment
(with the request, response objects etc.) and then the models are run
in.

(Apologies if you are already well versed in the above; I am only just
finding these things out myself!)

Just to confirm this: attached to another post in this thread I have
some example code with the model definitions now in a module. Setting
ENABLE_APPLICATION_SCOPE to False at the top of that file causes it to
create the model objects on every page request, so this is basically
your suggestion above to move the model definitions into a module
file. With this arrangement I clocked much the same timings as when I
just had all the model definitions in one models/db.py file - no speed
increase from using modules alone, unfortunately (although I did get a
speed increase by reusing a shared object between page requests).

As for using cache.ram, I did not go with this for now. My thinking,
from my experience in other development environments, was that caching
usually involves serialising and deserialising objects. It will never
be as fast as just holding onto one shared object in a static
variable, so I concentrated my next test around this.

Cheers,
- Alex

mdipierro

Aug 31, 2009, 8:50:53 AM
to web2py-users
Thanks for this test Alex.

Massimo