possible major performance improvement


Massimo Di Pierro

Apr 12, 2012, 10:20:21 AM
to web2py-developers
Say you have this code:

# in models/db.py
db = DAL()
db.define_table('person',Field('name'))

The problem is that the table definition is executed at every request. That is how web2py works and we live with it. It has its own benefits and keeps things simple. If you have 100 tables you may have performance issues. You can use conditional tables (break models into subfolders), but here I am going to propose yet another, better solution.

Now you can do this:

# in modules/tables.py
from gluon import *
db = DAL()
db.define_table('person',Field('name'))

# in models/db.py
from tables import db
db._adapter.reconnect()

The tables are defined once and only once per thread, when the module is first imported, and not at every request. db stays cached. Yet the connection pooling logic still runs at every request thanks to the call to db._adapter.reconnect(). Despite the name, it does not always reconnect. It only does so if a new connection is necessary; otherwise it uses the pool.
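
The difference between executing a model file and importing a module can be sketched without web2py. This is only an illustration under stated assumptions: demo_tables is a hypothetical module name, and a plain object() stands in for DAL() plus the define_table calls. Python keeps imported modules in sys.modules, so the module body runs only on first import, while exec mimics a model file being run again for each request.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for the table definitions (assumption: object() replaces DAL()).
SOURCE = "db = object()\n"


def run_as_model():
    """Mimic a model file: the source is executed again on every call."""
    env = {}
    exec(SOURCE, env)
    return env["db"]


def run_as_module(name="demo_tables"):
    """Mimic `from tables import db`: the module body runs on first import only."""
    return importlib.import_module(name).db


with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "demo_tables.py"), "w") as handle:
        handle.write(SOURCE)
    sys.path.insert(0, folder)
    importlib.invalidate_caches()
    try:
        # Two "requests" through the model path build two separate objects.
        model_objects_differ = run_as_model() is not run_as_model()
        # Two "requests" through the import path get the same cached object.
        module_objects_same = run_as_module() is run_as_module()
    finally:
        sys.path.remove(folder)
        sys.modules.pop("demo_tables", None)

print(model_objects_differ, module_objects_same)
```

The cached object is what makes the import approach fast, and it is also why any state stored on it outlives a single request.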

This can still be improved a little in the implementation (for example, reconnect currently always commits, but it does not need to).

Comments? Suggestions?

Richard Vézina

Apr 12, 2012, 10:31:43 AM
to web2py-d...@googlegroups.com
Do we still have to care about determining which models we need for a given controller/function/request, or are the models executed only once since they are defined in modules?

Seems an elegant solution to the problem.

Richard


massimo....@gmail.com

Apr 12, 2012, 10:38:48 AM
to web2py-d...@googlegroups.com
It does not matter any more. But again, this needs more testing.



villas

Apr 12, 2012, 10:40:57 AM
to web2py-d...@googlegroups.com
I believe this concept is a major improvement.  It is important that we can finally have an "official",  flexible method of avoiding the rebuilding of table defs on every request.  If this can become a proper documented feature,  I think it will remove one of the main recurrent criticisms of the framework.

Ross Peoples

Apr 12, 2012, 10:54:55 AM
to web2py-d...@googlegroups.com
+1

Mariano Reingart

Apr 12, 2012, 11:13:02 AM
to web2py-d...@googlegroups.com
It looks promising, but what about:

db.Field('created_by_ip',readable=False,writable=False,default=request.client),
db.Field('created_on','datetime',readable=False,writable=False,default=request.now),

That would not work anymore, would it?

With care, runtime definitions could be left in models:

db.auth_user.reviewer.writable=db.auth_user.reviewer.readable=auth.has_membership('manager')
db.auth_user.speaker.writable=db.auth_user.speaker.readable=auth.has_membership('manager')

so the first two would become:

db.auth_user.created_by_ip.default = request.client
db.auth_user.created_on.default = request.now

Should the Field default keyword be banned, or should it trigger a warning?

Best regards

Mariano Reingart
http://www.sistemasagiles.com.ar
http://reingart.blogspot.com

Bruno Rocha

Apr 12, 2012, 11:21:06 AM
to web2py-d...@googlegroups.com
Can lambdas solve this problem?

default=lambda: request.now

Bruno Rocha

Apr 12, 2012, 11:22:42 AM
to web2py-d...@googlegroups.com


On Thu, Apr 12, 2012 at 11:20 AM, Massimo Di Pierro <massimo....@gmail.com> wrote:
> from tables import db
> db._adapter.reconnect()

Can I do this only in controllers?

def index():
    from tables import db
    db._adapter.reconnect()  # we need a more elegant method name here!

    ...

    return dict()


Can it work this way? Then we could avoid models altogether.

Massimo Di Pierro

Apr 12, 2012, 11:23:13 AM
to web2py-d...@googlegroups.com
You are exactly right. All of the field attributes would have to
stay in the model or in the controller.

I think we should not change the way people use web2py today. At least
not by default. There are certain advantages in not having to worry
about this aspect.
The new feature will allow re-factoring of code to increase speed in
those cases when this is a critical issue.
The new feature may open the road to a web3py that is not backward
compatible and has similar syntax but different rules.

Massimo

Massimo Di Pierro

Apr 12, 2012, 11:24:42 AM
to web2py-d...@googlegroups.com
Yes, for default:

default=lambda: current.request.now


We could make lambdas work for every field attribute, but is it worth it? It would make everything slower compared to setting defaults in models anyway.
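
Why a lambda helps here can be shown with a small stand-in. This is a sketch, not DAL code: SketchField and resolve_default are hypothetical names, and the dictionary named clock plays the role of the per-request value (request.now in the thread). A plain value is captured when the field is defined, while a callable is evaluated when the default is actually needed.

```python
class SketchField:
    """Hypothetical stand-in for a DAL Field, only to show default handling."""

    def __init__(self, name, default=None):
        self.name = name
        self.default = default

    def resolve_default(self):
        # A callable default is evaluated on use, not at definition time.
        return self.default() if callable(self.default) else self.default


clock = {"now": "request-1"}

# Defined once, at import time, while the first request is being served.
frozen = SketchField("created_on", default=clock["now"])
lazy = SketchField("created_on", default=lambda: clock["now"])

# A later request arrives; the module-level fields are not redefined.
clock["now"] = "request-2"

print(frozen.resolve_default())  # still the value captured at definition
print(lazy.resolve_default())    # the value of the current request
```

The same trick does not extend to attributes that are read directly, such as readable and writable, unless every reader is changed to call them.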

Massimo Di Pierro

Apr 12, 2012, 11:24:50 AM
to web2py-d...@googlegroups.com
yes


Mariano Reingart

Apr 12, 2012, 11:31:42 AM
to web2py-d...@googlegroups.com
Lambdas will not help with, e.g., readable and writable (and in general
it would be a big refactoring, not only of the DAL)...

Anyway, I think reconnect looks nice and, knowing its caveats, is the best
approach so far.

BTW, could db._adapter.reconnect() be simplified to just db.reconnect()?

What happens if I call reconnect twice?

Best regards,

Massimo Di Pierro

Apr 12, 2012, 11:35:31 AM
to web2py-d...@googlegroups.com

On Apr 12, 2012, at 10:31 AM, Mariano Reingart wrote:

> Lambdas will not help with, e.g., readable and writable (and in general
> it would be a big refactoring, not only of the DAL)...

Here is a big problem, now that you make me think about it. There is
only one db object shared by all threads.

If one thread sets

db.table.field.readable = True

this will affect the field attribute (in the example, readable) for all
concurrent and subsequent requests. As usual, there is a reason web2py
does what it does the way it does it.
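
The leak described above can be reproduced with plain Python threads. This is a minimal sketch under assumptions: SharedField is a hypothetical stand-in for a field on the module-level db, and the two handlers stand in for two requests served by different threads.

```python
import threading


class SharedField:
    """Hypothetical stand-in for a field living on the single shared db."""

    def __init__(self):
        self.readable = True  # value set when the table is defined


shared_field = SharedField()  # created once at import, seen by every thread
observed = []


def request_a():
    # A per-request tweak, as a model or controller might do today.
    shared_field.readable = False


def request_b():
    # A later request expects the definition-time value.
    observed.append(shared_field.readable)


for handler in (request_a, request_b):
    worker = threading.Thread(target=handler)
    worker.start()
    worker.join()

print(observed)  # request_b sees the change made by request_a
```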


>
> Anyway, I think reconnect looks nice and, knowing its caveats, is the best
> approach so far.
>
> BTW, could db._adapter.reconnect() be simplified to just
> db.reconnect()?

yes.

> What happens if I call reconnect twice?

nothing bad happens.


Bruno Rocha

Apr 12, 2012, 12:17:18 PM
to web2py-d...@googlegroups.com


On Thu, Apr 12, 2012 at 12:35 PM, Massimo Di Pierro <massimo....@gmail.com> wrote:
> db.table.field.readable = True

That belongs to the forms. Anyone who chooses the new approach has to know that this kind of thing will need to be done at the form level, by setting hidden fields...

Massimo Di Pierro

Apr 12, 2012, 12:51:38 PM
to web2py-d...@googlegroups.com
But it is complicated. Changing any attribute (except default=lambda) will no longer be thread safe.


Mariano Reingart

Apr 12, 2012, 1:40:35 PM
to web2py-d...@googlegroups.com
I still think we have to decouple table definition (define_table) from
instantiation (i.e. reconnect, but thread-safe), so what about:

from tables import db

db = db.connect()

This will bind a new db object (assuming connect creates a new copy of
the database object).
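
A rough sketch of the copy-per-request idea, under stated assumptions: the nested dict named TEMPLATE stands in for the db object with its tables and fields, and connect() is a hypothetical helper that mirrors the proposed db.connect() only in spirit. It uses copy.deepcopy from the standard library; how deep a real copy of a DAL object would need to be is not settled here.

```python
import copy

# Built once at import time and treated as read-only afterwards.
TEMPLATE = {"person": {"name": {"readable": True, "writable": True}}}


def connect():
    """Hypothetical per-request copy of the shared definitions."""
    return copy.deepcopy(TEMPLATE)


first_request = connect()
first_request["person"]["name"]["readable"] = False  # request-local change

second_request = connect()

# The change made during the first request did not leak.
print(second_request["person"]["name"]["readable"])
print(TEMPLATE["person"]["name"]["readable"])
```

Each request pays for the copy, so whether this beats re-executing the models is something only a measurement can answer.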

Best regards,


Massimo Di Pierro

Apr 12, 2012, 1:54:54 PM
to web2py-d...@googlegroups.com
Yes, but if we are copying db and all the stuff it contains (tables,
fields, etc.) then it is no different from what web2py does today.

Mariano Reingart

Apr 12, 2012, 2:20:58 PM
to web2py-d...@googlegroups.com
Yes, but if db were decoupled, it could be cached per thread ;-)

Anyway, copying should be faster than executing models.

Best regards,


Massimo Di Pierro

Apr 12, 2012, 2:49:04 PM
to web2py-d...@googlegroups.com
That should be easy to try. I will try in the next few days.

Michele Comitini

Apr 12, 2012, 6:21:23 PM
to web2py-d...@googlegroups.com
Threads... ;)

Why not use an extended Storage class? This storage can cope with two
values for each property: one value is thread local and the other is
not.
Behind the extended Storage object, two buckets (dicts?) contain the
properties. One bucket is thread local, the other is a singleton.

* Request start
The thread-local bucket is instantiated empty at each request.

* Writing
When writing directly to the property, the value is written to the
thread-local bucket.
When the Field constructor is called, the value is written to the
singleton bucket.

* Reading
A value for the requested property in the thread-local bucket
always overrides the singleton bucket.

* End of request
The thread-local bucket reference is thrown away.

Could this work?
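
The two-bucket idea above could look roughly like this. It is a sketch under assumptions, not gluon's Storage: LayeredStorage and reset are hypothetical names, the shared bucket is a plain dict filled by the constructor, and the per-thread bucket uses threading.local from the standard library.

```python
import threading


class LayeredStorage:
    """Hypothetical two-bucket storage: shared defaults, per-thread overrides."""

    def __init__(self, **shared):
        # Bypass our own __setattr__ so these land in the instance dict.
        object.__setattr__(self, "_shared", dict(shared))
        object.__setattr__(self, "_local", threading.local())

    def _overrides(self):
        # Each thread lazily gets its own empty override bucket.
        if not hasattr(self._local, "values"):
            self._local.values = {}
        return self._local.values

    def __getattr__(self, name):
        # Called only for names not found normally: override first, then shared.
        overrides = self._overrides()
        if name in overrides:
            return overrides[name]
        try:
            return self._shared[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Direct writes go to the thread-local bucket only.
        self._overrides()[name] = value

    def reset(self):
        """Throw away this thread's overrides (request start or end)."""
        self._overrides().clear()


field = LayeredStorage(readable=True)  # "constructor" value, shared
field.readable = False                 # write from the current thread

seen_elsewhere = []
worker = threading.Thread(target=lambda: seen_elsewhere.append(field.readable))
worker.start()
worker.join()

print(field.readable, seen_elsewhere)  # this thread's override vs shared value
field.reset()
print(field.readable)                  # back to the shared value
```

In this sketch a request that reuses a thread sees clean values only if reset() runs at its start or end.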

mic



Massimo Di Pierro

Apr 13, 2012, 9:49:09 AM
to web2py-d...@googlegroups.com
I do not think this will work. Even if a second request is executed in the same thread and there is one db, it should see all its properties initialized and not altered by previous requests. Copy is the only solution. The problem is that there are some ambiguities in how deep the copy should be.

Jonathan Lundell

Apr 13, 2012, 9:56:01 AM
to web2py-d...@googlegroups.com
On Apr 13, 2012, at 6:49 AM, Massimo Di Pierro wrote:
>
> I do not think this will work. Even if a second request is executed in the same thread and there is one db, it should see all its properties initialized and not altered by previous requests. Copy is the only solution. the problem is that there are some ambiguities in how deep the copy should be.

Have you made performance measurements of DAL initialization? How expensive is it, and where does the time go?
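
One way to start answering this is with the standard library's timeit. The sketch below is only a measuring harness: define_tables is a placeholder workload, not real DAL code, and in an actual test it would be replaced by a function that builds DAL() and calls define_table for every table.

```python
import timeit


def define_tables(count=100):
    """Placeholder workload standing in for DAL() plus define_table calls."""
    return {"table_%d" % i: {"fields": ["id", "name"]} for i in range(count)}


runs = 200
total_seconds = timeit.timeit(define_tables, number=runs)
per_request_ms = total_seconds / runs * 1000.0

print("average cost per simulated request: %.4f ms" % per_request_ms)
```

To see where the time goes rather than just how much there is, the same function can be run under cProfile, also from the standard library.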
