DAL usage


Arnon Marcus

Jan 16, 2014, 3:34:35 AM
to web...@googlegroups.com
Our schema is quite large (200+ tables) and changes very seldom.
I was wondering: is there a way to avoid rebuilding it in its entirety on every request?
In other words, is there a way to separate the schema definition from the connection object? Ideally, I would put the schema definition code in a separate module in the modules folder, import it into the model file that creates the connection, and somehow pass the ready-made schema object(s) into the newly created connection object on each request.
Can this be done?
What are the "gotchas" (if any)?
Are there restrictions, i.e. some schema definitions that cannot be used this way?

Richard Vézina

Jan 16, 2014, 4:10:52 PM
to web2py-users
lazy_tables is supposed to do just that, no?

Richard




Anthony

Jan 16, 2014, 4:43:36 PM
to web...@googlegroups.com
If you set DAL(..., lazy_tables=True), most of the table definition code is deferred until the first time you call db.tablename, so tables that are not accessed on a given request are not fully defined. Of course, you can put the schema definition code into a function or class in a module and import and execute that function/class at request time, but the actual definition code would still have to run at each request (albeit, only for the tables you actually need for that request).
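
For reference, enabling it is just a keyword argument on the connection. A minimal model-file sketch (the connection string and table are only illustrative):

db = DAL('sqlite://storage.sqlite', lazy_tables=True)

# stored as a lazy definition; the Table object is not fully built yet
db.define_table('person', Field('name'), Field('email'))

# db.person is only constructed the first time it is referenced, e.g.:
# rows = db(db.person.email != None).select()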

It sounds like instead, you would prefer for the module to create a table object so the table object can be imported directly (and then added to the DAL connection object of the current request) -- that way, the table object would be created only once, the first time it is imported. This isn't possible with the current API, though I suppose you could create instances of the dal.Table class in the module and then write your own version of DAL.define_table to attach the imported Table objects to a DAL instance. Perhaps this could be made an option of the current .define_table() method (as it is, if you pass a Table object to .define_table(), it just makes copies of all the Field objects, but perhaps there could be an option to pass in a Table object that gets used as is). Not sure how much benefit there would be to this approach, assuming any given request doesn't involve too many table definitions -- lazy_tables, conditional models, and importing functions/classes from modules may be sufficiently speedy for most use cases.

Anthony

Anthony

Jan 16, 2014, 4:52:32 PM
to web...@googlegroups.com
It sounds like instead, you would prefer for the module to create a table object so the table object can be imported directly (and then added to the DAL connection object of the current request) -- that way, the table object would be created only once, the first time it is imported.

Of course, the problem with this approach is that you couldn't dynamically change any attributes of the table or its fields (e.g., .readable and .writable) at request time, because that would affect other requests as well.

Anthony

Arnon Marcus

Jan 16, 2014, 6:16:23 PM
to web...@googlegroups.com
It sounds like instead, you would prefer for the module to create a table object so the table object can be imported directly (and then added to the DAL connection object of the current request) -- that way, the table object would be created only once, the first time it is imported.


That's exactly what I meant.
It would then refresh itself only when you make a change to the schema, as the module would have to be recompiled at import time on the next request.

In most ORMs/DALs there is a clear separation between the connection object and the schema object(s), and for good reason: schema changes are few and far between, while requests/connections are numerous and rapid, so it makes no sense to tie them together like that... It's a poor design choice, IMHO.

Of course, the problem with this approach is that you couldn't dynamically change any attributes of the table or its fields (e.g., .readable and .writable) at request time, because that would affect other requests as well.


I don't know that needing to define per-request readable/writable is such a common use case - or else I don't fully understand what you are alluding to...
Are these security-related settings that get modified based on auth definitions, after being cross-referenced with the request/session user?
If so, I guess they would have to be reset automatically by the DAL object on each request, but that doesn't mean the entire schema needs to be regenerated - the auth definitions are already in the database anyway, which means it's not a schema-related issue; it's just data that got bolted onto the schema objects for convenience... If it is taken from the database anyway (or even if it's cached), the bolted-on variables could be reset on each request in a thread-safe way - though that may require further thinking...

Anthony

Jan 16, 2014, 9:01:30 PM
to web...@googlegroups.com
I don't know that needing to define per-request readable/writable is such a common use case - or else I don't fully understand what you are alluding to...

Sometimes you want to show/hide particular fields in a form, grid, etc. depending on some condition, so you set the readable and writable attributes dynamically in the controller. Also, default, represent, and compute attributes as well as validators sometimes use request specific data. 
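
For example, a controller might do something like this (the table, field, and role names are made up):

def edit_person():
    # hide the salary field from non-admins, for this request only
    if not auth.has_membership('admin'):
        db.person.salary.readable = False
        db.person.salary.writable = False
    form = SQLFORM(db.person, request.args(0, cast=int))
    return dict(form=form)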
 
If so, I guess they would have to be reset automatically by the DAL object on each request,

It's not just a matter of resetting. If there is only a single fixed instance of an object that gets imported from a module, any change to an attribute during a request will persist (and affect other simultaneous requests).

Anthony

Anthony

Jan 16, 2014, 9:04:48 PM
to web...@googlegroups.com
In most ORMs/DALs there is a clear separation between the connection object and the schema object(s), and for good reason: schema changes are few and far between, while requests/connections are numerous and rapid, so it makes no sense to tie them together like that... It's a poor design choice, IMHO.

The issue isn't connecting the schema objects with the connection object (that can be handled). The issue is that the schema object may include request-specific mutable attributes.

Anthony

Arnon Marcus

Jan 17, 2014, 2:18:50 AM
to web...@googlegroups.com
A database schema is a description of the structure of a database - it has nothing to do with requests.
You are talking about a convenience feature that could have been implemented differently - this coupling is convenient but makes no sense from a performance standpoint.
I am looking for a way around that...
If there isn't one, it only means that for this to be possible, it would need to be re-implemented in a way that makes it possible.

Vinicius Assef

Jan 17, 2014, 8:27:31 AM
to web2py
On Fri, Jan 17, 2014 at 5:18 AM, Arnon Marcus <a.m.m...@gmail.com> wrote:
> A database schema is a description of the structure of a database - it has nothing to do with requests.
> You are talking about a convenience feature that could have been implemented differently - this coupling is convenient but makes no sense from a performance standpoint.
> I am looking for a way around that...

migrate=False is here to accomplish that.

BTW, I think there is a point of misunderstanding here. migrate=True is
useful mainly in a development environment, and it's an excellent
feature: it saves you from having to run commands from the shell, Django-style.
Changing your model and knowing the db schema will automatically stay
in sync with it is a great feature. It allows you to take baby steps
with your db structure. In a development environment.

On the other hand, migrate=True on production servers is really bad
practice. On production machines, your database user shouldn't have
permission to run DDL statements. It's a security commandment, right?

So, in a system running in production, migrate=True should raise a
permission (grant) error when it tries to sync the db schema.

migrate=True is our good friend, but only in the dev environment.
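
In code, that amounts to something like the following (the connection strings are placeholders; migrate_enabled=False disables migrations for the whole connection):

# development: let web2py issue DDL and keep the database in sync with the models
db = DAL('sqlite://storage.sqlite', migrate=True)

# production: never run DDL from the application
# db = DAL('postgres://user:pass@dbhost/mydb', migrate_enabled=False)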

--
Vinicius Assef

Anthony

Jan 17, 2014, 9:52:53 AM
to web...@googlegroups.com
Yes, as I mentioned, the problem is not coupling the schema with the connection but the fact that the model definition includes some relatively static attributes (such as the database schema) as well as some more dynamic attributes that might change from request to request. Your approach would require a re-implementation, with the database schema held in a structure separate from the rest of the model definition. On a typical request, it's not clear how much gain that would give you -- maybe a few milliseconds (though depending on your performance needs, I suppose that could be meaningful).

Anthony

Arnon Marcus

Jan 17, 2014, 10:01:35 AM
to web...@googlegroups.com
I am trying to shave off the needless DDL model definition on each request - when you have hundreds of tables with (cumulatively) thousands of fields, the overhead of rebuilding all these objects on every request starts to add up. It has nothing to do with auto-migration - you can still have auto-migration with schema objects defined in an imported module: every time the module changes, it is recompiled, and then auto-migration kicks in.
I am talking about the coupling of mostly static data with highly dynamic data - all within the same execution architecture - it's a sub-optimal design. For small projects that usually wouldn't matter, as the overhead would be small. For bigger projects, though...



Anthony

Jan 17, 2014, 10:19:33 AM
to web...@googlegroups.com
On Friday, January 17, 2014 10:01:35 AM UTC-5, Arnon Marcus wrote:
I am trying to shave off the needless DDL model definition on each request - when you have hundreds of tables with (cumulatively) thousands of fields,

Are there requests where you need to use all or most of the database tables, or do you typically need only a handful of tables in any given request? If the latter, there is certainly no need to define all the tables on every request.

Anthony

Michele Comitini

Jan 17, 2014, 10:31:48 AM
to web...@googlegroups.com
There are conditional models in web2py.  Normal flow control statements can be used to reduce the number of definitions too.



Arnon Marcus

Jan 17, 2014, 10:45:24 AM
to web...@googlegroups.com
I guess we could try conditional execution...
It's quite a hassle, though, as the controllers don't do all the work - most of our queries are in fact in separate modules that the controllers use, which makes them much easier to maintain, but it would make it more difficult to segregate the conditionality of schema definitions based on controllers... Not sure what other criteria we could use, though...

Anthony

Jan 17, 2014, 11:11:34 AM
to web...@googlegroups.com
Why don't you put your model definitions in modules (either inside of functions or classes)? Then, when you need a particular model or set of models, just import the relevant function/class and call it with whatever arguments are needed to give you the models you want.

Michele Comitini

Jan 17, 2014, 11:52:24 AM
to web...@googlegroups.com
I agree with Anthony: if you put table definitions inside functions you get a great deal of flexibility and a speedup.

e.g.

def define_user_tables(db):
    db.define_table('user', ...)
    db.define_table('user_data', ...)
    db.define_table('user_image', ...)

then you just call:

    define_user_tables(db)

wherever you need those tables...
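
For instance, a controller action could then call it on demand (the module name 'mymodels' and the use of SQLFORM.grid are just illustrative):

from mymodels import define_user_tables   # e.g. in modules/mymodels.py

def profile():
    define_user_tables(db)        # the user tables exist only for requests that reach this action
    grid = SQLFORM.grid(db.user)  # db.user can now be used like any other table
    return dict(grid=grid)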




Arnon Marcus

Jan 17, 2014, 2:56:27 PM
to web...@googlegroups.com
That's a good idea, thanks.



黄祥

Jan 17, 2014, 7:18:00 PM
to web...@googlegroups.com
Are conditional models the same as response.models_to_run()? I searched this forum for "conditional models", but the experts in those discussions talk about putting the models into modules and importing them when needed.
ref:

Another thing: how does the performance of putting the models into modules compare with response.models_to_run()? Which is better?

thanks and best regards,
stifan

Anthony

Jan 17, 2014, 9:05:42 PM
to web...@googlegroups.com
On Friday, January 17, 2014 7:18:00 PM UTC-5, 黄祥 wrote:
Are conditional models the same as response.models_to_run()? I searched this forum for "conditional models",

Yes, "conditional models" generally refers to use of models_to_run. Note, there is a default models_to_run that runs models in folders named the same as the current controller and (optionally) function. You can change that by specifying a custom models_to_run.
 
Another thing: how does the performance of putting the models into modules compare with response.models_to_run()? Which is better?

Assuming you use models_to_run to define the exact same set of models as you import from modules, then performance should be similar, with a slight edge to the import method (using models_to_run requires putting models in model files, which must be read in and executed on each request, whereas module files do not need to be read on every request, only on the first import).

If you need very fine-grained control over exactly which models are defined when, importing from modules exactly when/where needed might be simpler than trying to conditionally define models_to_run (which must be done within the model files, before you get to the controller code).

Anthony

Arnon Marcus

Jan 18, 2014, 8:11:29 AM
to web...@googlegroups.com
Thanks Anthony, you're right - this models_to_run approach makes no sense in my use case - it assumes a 1:1 correlation between controllers and the models they use, which would only occur in very simple apps. The whole point of using a relational database is to have great flexibility and diversity of relations. The models_to_run approach seems to assume a very restricted usage pattern of relations.

Anthony

Jan 18, 2014, 2:08:16 PM
to web...@googlegroups.com
Only the default value of models_to_run assumes a 1:1 correspondence between controllers and models. You can set the value of models_to_run to whatever you want conditionally (and it can even change multiple times from model file to model file). So, for example, you could do:

if request.controller in ['controller1', 'controller2', 'controller3']:
    response.models_to_run = ['path/to/modelA.py', 'path/to/modelB.py']

In any case, you still might prefer to put model definitions in modules and import where needed.

Anthony

Anthony

Jan 18, 2014, 2:10:07 PM
to web...@googlegroups.com
Also, note that models_to_run is a list of regexes (relative to the /models folder), so you don't need to list each individual model file.
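
For example, the list might look like this (the file names are hypothetical):

response.models_to_run = [r'common_.*\.py', r'orders/.*']
# runs every model file whose name starts with 'common_', plus everything under models/orders/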

Anthony

黄祥

Jan 20, 2014, 10:19:12 AM
to web...@googlegroups.com
hi anthony,

/modules/mymodule.py:

from gluon import Field

def mymodels(db):
    db.define_table('table1', Field('field1'), Field('field2'))
    db.define_table('table2', Field('field1'), Field('field2'))

/models/db.py or /controllers/some_controller.py:

from mymodule import mymodels
mymodels(db)

In your example above, you said the import can be done in models or in controllers. I've tested both (putting it in models and in controllers): if I only put it in a controller, the database administration (appadmin) pages for table1 and table2 are not shown; they are shown only if I call mymodels(db) in a model.

My question: is it possible to import mymodels in controllers yet still have access to database administration (appadmin) for table1 and table2, without importing it in models?

ref:

Anthony

Jan 20, 2014, 1:17:57 PM
to web...@googlegroups.com
appadmin is just a controller, like any other, so it only sees models defined in model files (note, there is special logic that runs all model files when the appadmin controller is called, even if response.models_to_run would normally make some models conditional). appadmin can't know what models might be imported by other controllers. In a model file, you could do something like:

if request.controller == 'appadmin':
    [code to import and define all models]
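
For instance, reusing the mymodels function from your module, the model file could contain something like:

if request.controller == 'appadmin':
    from mymodule import mymodels
    mymodels(db)   # define every table so appadmin can browse them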

I suppose one advantage of using models_to_run rather than importing from modules is you get full appadmin functionality without a workaround like the above.

Anthony

黄祥

Jan 21, 2014, 7:41:49 AM
to web...@googlegroups.com
Brilliant, I never thought of that. Thank you so much. Yes, I agree with you about the advantage of using models_to_run rather than importing tables from modules.

thanks and best regards,
stifan