Database Creation


Mark Wallsgrove

Apr 11, 2012, 8:12:40 AM4/11/12
to peewe...@googlegroups.com
Hi all,

I am building an application where the database location is not known until a configuration file has been read. I have created a proxy object that delays the creation of a SqliteDatabase until a method on the proxy is called (rather bad code: http://pastebin.com/cRZhnmz6). This hasn't helped much, though: it looks like Python calls the __new__ method on BaseModel when I import the class:

  File "/home/smoky/Documents/Workspace/TASv3/Code/master/MasterBusinessLogic/Results/Loggers/SQLLiteLogger.py", line 5, in <module>
    from MasterBusinessLogic.Results.Model import ModelUtils
  File "/home/smoky/Documents/Workspace/TASv3/Code/master/MasterBusinessLogic/Results/Model/ModelUtils.py", line 100, in <module>
    class BaseModel(Model):
  File "/usr/local/lib/python2.7/dist-packages/peewee-0.9.1-py2.7.egg/peewee.py", line 2353, in __new__
    if _meta.db_table in _meta.database.adapter.reserved_tables:
  File "/home/smoky/Documents/Workspace/TASv3/Code/master/MasterBusinessLogic/Results/Model/ModelUtils.py", line 28, in __getattribute__
    return getattr(CreationDelayProxy.__getObject(self), name)
  File "/home/smoky/Documents/Workspace/TASv3/Code/master/MasterBusinessLogic/Results/Model/ModelUtils.py", line 23, in __getObject
    config = ConfigurationManager().getConfiguration('resultworker').configuration
  File "/home/smoky/Documents/Workspace/TASv3/Code/common/Utils/Configuration/ConfigurationManager.py", line 66, in getConfiguration
    raise InvalidConfiguration("Cannot find configuration: %s" % name)
InvalidConfiguration: Cannot find configuration: resultworker

Should this occur? Does __new__ normally get called when a class is imported? Has anyone else got a workaround for this?

# EDIT:
Would the code below create a static instance of Model by any chance?

class BaseModel(Model):
    class Meta:
        database = CreationDelayProxy(SqliteDatabase)
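
For illustration, a delayed-creation proxy along these lines (a rough sketch only -- not the pastebin code; the database filename and attribute names are placeholders):

class CreationDelayProxy(object):
    """Stand-in that builds the real database object only on first use."""

    def __init__(self, factory):
        # Only remember how to build the object; nothing is created yet.
        object.__setattr__(self, '_factory', factory)
        object.__setattr__(self, '_obj', None)

    def __get_object(self):
        obj = object.__getattribute__(self, '_obj')
        if obj is None:
            factory = object.__getattribute__(self, '_factory')
            # In the real code the database path would be read from the
            # configuration manager at this point ('results.db' is a placeholder).
            obj = factory('results.db')
            object.__setattr__(self, '_obj', obj)
        return obj

    def __getattribute__(self, name):
        # Forward every attribute access to the lazily created object.
        return getattr(CreationDelayProxy.__get_object(self), name)

The catch, as the traceback above shows, is that peewee's metaclass touches the database while the model class is still being defined, so the lazy creation fires before the configuration exists.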
  
Best Regards,
Mark

Mark Wallsgrove

Apr 11, 2012, 8:40:31 AM4/11/12
to peewe...@googlegroups.com
.. .. Found the "problem". __new__ is called on peewee's BaseModel because it's defined as the metaclass for Model. A metaclass's __new__ runs whenever a class that uses it is created, i.e. at class definition time, which is why it fires on import.

Mark Wallsgrove

Apr 11, 2012, 9:07:54 AM4/11/12
to peewe...@googlegroups.com
Figured it out .. .. Lines 2353 & 2380 require the database. It looks like the default values are only overridden for PostgresqlDatabase.

Charles

Apr 11, 2012, 9:09:50 AM4/11/12
to peewe...@googlegroups.com
Gahh...I typed up a whole thing, damn this new google groups interface.

Anyways, I was just thinking out loud about an API for this.  It could work as a new option, i.e.

def load_database():
    return SqliteDatabase('foo')

class Meta:
    database_loader = load_database

I'm not in love with this, though.  Generally it's good practice to share a single Database object across as many models as will be using that particular database, since connections are managed by the database instance.

Mark Wallsgrove

Apr 11, 2012, 9:58:50 AM4/11/12
to peewe...@googlegroups.com
I have attached a patch that stops the connection being created until it is needed. The attributes that are required belong to the adapter classes. The code within BaseModel.__new__ could refer to the adapter class definition rather than an instance. What do you think?
patch

Mark Wallsgrove

Apr 11, 2012, 10:00:10 AM4/11/12
to peewe...@googlegroups.com
Btw, the problem isn't about sharing a database across multiple models, it's about when the database object is created.

Charles

May 9, 2012, 11:41:46 AM5/9/12
to peewe...@googlegroups.com
I think I've got a decent solution.  The details are in github issue 79:


The docs are taking a while to update, but you can find notes here:

NeoPolus

May 13, 2012, 1:35:33 PM5/13/12
to peewe...@googlegroups.com
Hi.
We (some friends and I) have been considering using Peewee for a small Flask project - I'm a Peewee newbie :) - and yesterday we struggled for some time trying to find the right way to do just what is being discussed here: deferring the initialization of the database until the configuration has been loaded.

We started with a simple Flask application, much like the "walrus-mix" sample, where the database is configured before the base model is defined, something like this:

from flask import Flask
from peewee import *
...
app = Flask(__name__)
app.config.from_object('config.Config')
database = SqliteDatabase(app.config['DATABASE'])
...
class BaseModel(Model):
    class Meta:
        database = database

class User(BaseModel):
    username = CharField()
    ...

When using a single Python file, it remains easy and works well. But if you try to refactor the code to define the models in a separate source file (think of having a "models.py" file) it doesn't look so nice anymore, as you need to do something like this in the main flask-app source file:

from flask import Flask
from peewee import *

app = Flask(__name__)
app.config.from_object('config.Config')

from models import User, ...   # We cannot put this import on the top of the file
                               # with the rest of imports! => ugly and non pythonic :(



It gets even worse if you try to put every model in its own source file (something like having 'users/models.py', 'posts/models.py'... or 'models/users.py', 'models/posts.py'...): not only must every model import be done after the configuration has been loaded, but every model file would depend on a common file defining the 'BaseModel', which does not look very decoupled or reusable (you may want to reuse your 'users/models.py' in several projects...).


Not knowing about this thread - :( - we tried to solve both problems by doing something like this:

from flask import Flask
import peewee
from users.models import User  # Note: User is a direct subclass of peewee.Model
from posts.models import Post, Comment
...
def use_db(db, basemodel=peewee.Model):    
    for cls in basemodel.__subclasses__():
        cls._meta.database = db

app = Flask(__name__)
app.config.from_object('config.Config')
db = peewee.SqliteDatabase(app.config['DATABASE'])
use_db(db) # TADAAA!


TL;DR


What are we missing in Peewee?

  • A way of setting the database to use *after* the models have been defined.
    • I understand that version 0.9.6 adds a solution to this; too bad I didn't notice this thread until today :(
  • A decoupled way of setting such a database, without having to define a 'BaseModel' in a shared file (a relative import).
    • Any ideas on what is the right way to accomplish this?
    • Maybe some utility function similar to 'use_db' can be defined on Peewee to allow this.

Best regards!

Charles

May 13, 2012, 7:29:44 PM5/13/12
to peewe...@googlegroups.com
I'm excited to hear you guys are considering using peewee for your project -- I much prefer using peewee w/flask over django for any size project these days, but small projects are a great fit (by the way, have you seen flask-peewee? https://github.com/coleifer/flask-peewee).  Sounds like you've got two questions

1. how to defer loading database until config loaded
2. how to structure imports in a non-gross way

I will try and answer them the best i can:

1. this is a new feature in 0.9.6 -- it is handy for when you don't exactly know what database you need until runtime.  the way it works is stupid simple: just pass in None instead of your db name, then when you're ready, set the db name (and any other connection info) using the init method:
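
A minimal sketch of what that might look like (the model, field, and filename are placeholders, not from the docs):

from peewee import Model, CharField, SqliteDatabase

# Declare the database without a name -- nothing is opened or connected yet.
database = SqliteDatabase(None)

class User(Model):
    username = CharField()

    class Meta:
        database = database

# Later, once the real configuration is known, supply the connection details.
database.init('app.db')
database.connect()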


2. this is not a peewee problem, but a python problem, and is somewhat more pronounced when using a framework like flask where there is no automagical module loading/discovery (as in django).  just remember it's python.  the way i structure my flask apps is shown in the flask-peewee example project: https://github.com/coleifer/flask-peewee/tree/master/example - i have used this structure to build complex apps w/multiple blueprints and have not had issues w/imports.

The gist of it is thus -- we don't want circular imports, but we want stuff in its own modules.  so i do it like this:

app.py -- holds the Flask app and any other init, like creating a peewee database or loading up config
models.py -- imports the database and app from app.py and then defines models using those databases
views.py -- imports the app from app.py and models from models.py and does interesting things
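
As a rough sketch (module contents assumed here, not copied from the example project), app.py and models.py in this layout could be as small as:

# app.py -- flask app, config, and the shared peewee database
from flask import Flask
from peewee import SqliteDatabase

app = Flask(__name__)
app.config.from_object('config.Config')
db = SqliteDatabase(app.config['DATABASE'])


# models.py -- imports the database from app.py and defines models against it
from peewee import Model, CharField
from app import db

class BaseModel(Model):
    class Meta:
        database = db

class User(BaseModel):
    username = CharField()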

All good so far.  now here's the key:

main.py -- imports app.py, models.py and views.py, then exposes your flask app and *this* is what you point your wsgi server at.

so main.py looks like this and ensures that everything is loaded in the correct order to avoid circular imports:

from app import app, db
from models import *
from views import *

# load up any blueprints here as well:
from some_blueprint.models import *
from some_blueprint.views import *

if __name__ == '__main__':
    app.run()

i do not like defining models in a closure, this is bad juju.  hope these suggestions help.

charlie

NeoPolus

May 14, 2012, 12:49:05 PM5/14/12
to peewe...@googlegroups.com
Thanks for your reply Charles :)

Yes, we've heard about Flask-Peewee, and we use it. And again yes, you grokked it, I'm talking about two different but related problems: deferring (the db creation) and structuring (the source code).

Note: You may want to just skip to the "TL;DR" section.

To put you in context: we are testing and learning Flask, and we studied some options for the database/ORM layer. Originally we selected SqlAlchemy and Peewee as test cases, one being complex and explicit and the other one being simple and implicit; then we created a sample app with each one, and finally decided that Peewee is closer to our needs :)

This is our story with Peewee in Flask:

First step - Peewee

We created a sample Flask app that just used Peewee (we wanted to learn how to use Peewee on a standard project, without the Flask-Peewee 'magic'), with a simple structure much like the one you propose, but separating the Flask app and the Peewee database (the code is online if you want to take a look), similar to this:
    app.py -- Flask app initialization.
    database.py -- Peewee database initialization and base model.
    user.py (your models.py) -- imports database.py and then defines the user model.
    cowlab.py (your main.py + views.py) -- imports app.py, database.py..., defines the views (controller) and runs the app. (note: 'cowlab' is the project name)


Why did we split app.py into app.py and database.py? Because if we want to use a different database in our unit tests, we are forced to overwrite the application configuration (in the test files) before loading the database or models (before Database(app) is called):
from app import app                              # Flask app with default config
app.config.from_object('config.TestingConfig')   # Load testing config *before* setting up the database. FIXME: Ugly!
from database import database                    # 'Database(app)' is called here
from user import User                            # Import the peewee models, using the testing database.

It works, but it's ugly not having all the imports together at the beginning of the file :(

With the 0.9.6 deferred loading, our code could be improved, and instead of doing things like:
# ...load the app configuration (set 'app.testing') before reaching this point:
database = peewee.SqliteDatabase('testing.db' if app.testing else 'development.db', threadlocals=True)

We would do:

database = peewee.SqliteDatabase(None, threadlocals=True)
# ...load the app configuration anytime (i.e. 'app.testing' or the database name), and afterwards:
database.init('testing.db' if app.testing else 'development.db')

That would solve the ugliness I was talking about :)

...as long as we don't want to switch database engines for developing and testing (like Postgres for developing and Sqlite for testing) :(

Q: Is there a way to defer the selection of the database engine?
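
(For illustration only: one way around it, sketched here with made-up config keys and a made-up helper name, would be to pick the engine class from configuration at runtime and only then create the database.)

import peewee

# Hypothetical mapping from a configuration string to a peewee database class.
ENGINES = {
    'sqlite': peewee.SqliteDatabase,
    'postgres': peewee.PostgresqlDatabase,
}

def make_database(app):
    # 'DATABASE_ENGINE' and 'DATABASE_NAME' are assumed config keys.
    engine_cls = ENGINES[app.config['DATABASE_ENGINE']]
    return engine_cls(app.config['DATABASE_NAME'])

The database object would still only exist after the configuration is loaded, so the models would need to pick it up afterwards (e.g. via something like the 'use_db' function above).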

Second step - Flask-Peewee

Afterwards we switched to flask-peewee, to take advantage of its features (the authentication and the admin views). (code)

Thanks to Flask-Peewee, we don't even really need to define a 'BaseModel' and its Meta inner class to set the database; Flask-Peewee does it for us nicely:
>>> from flask_peewee.db import Database
>>> from app import app
>>> database = Database(app)
>>> database.database # 'Automagically' setup with the database configured in app.config
<peewee.SqliteDatabase at 0x92ef48c>
>>> database.Model    # The base model is also setup with the configured database.
flask_peewee.db.BaseModel
>>> database.Model._meta.database
<peewee.SqliteDatabase at 0x92ef48c>
>>> database.Model._meta.database.database
'development.db'

And (I have just checked) we can use deferred loading with Flask-Peewee too:

from app import app # Note: app.config['DATABASE'] = {'engine': 'peewee.SqliteDatabase', 'name': None, 'check_same_thread': False }
from flask_peewee.db import Database

db = Database(app)
# ...overwrite the app configuration anytime (set 'app.testing'), and afterwards:
db.database.init('testing.db' if app.testing else 'development.db')

But again, it only allows deferring the database name, not the database engine :(

Third step - Decoupling

Now, as a third step, we are trying to refactor the code to make the models independent of the Flask app, so they can be reused (for example, in a -long and distant- future, we could have a non-Flask backend that needs to access the database). That's why I don't like having to import app.py (or database.py) in 'models.py' (or wherever the models are defined), as that makes the models depend on Flask. I know this is not a problem on small projects, but since we are doing this to learn, we want to check whether Peewee can scale to bigger setups.

So, to allow this decoupling, we need a way to set the database to use *after* the models have been loaded, without requiring the models to "import app".

I think that we can accomplish it by using both the deferred database feature of 0.9.6 *and* a sort of 'use_db' (name-it-as-you-like) function.


TL;DR - What I want to do

An example of what I would like to be able to do:

import peewee

class A(peewee.Model):
    name = peewee.CharField()
class B(peewee.Model):
    text = peewee.CharField()

# Currently the database for 'A' and 'B' models is the default 'peewee.db' Sqlite one.

peewee.use(peewee.SqliteDatabase('first.db'))

A.create_table()

peewee.use(peewee.PostgresqlDatabase('second', user='code'))

B.create_table()
# Now we have a Sqlite database named 'first.db' with a table named 'a'
# and a Postgres database with a table named 'b'.
peewee.use(peewee.SqliteDatabase('third.db'), A)
peewee.use(peewee.SqliteDatabase('fourth.db'), B)

A.create_table() # Creates the table in the 'third.db' Sqlite database.
B.create_table() # Creates the table in the 'fourth.db' Sqlite database.

A.create(name="a test") # Inserts a record in the 'a' table of 'third.db'
B.create(text="some text") # Inserts a record in the 'b' table of 'fourth.db'


The use function would be something like this:

def use(db, basemodel=peewee.Model):
    """
    Sets the database to use for the Peewee models.
    If basemodel is specified, only that model and its subclasses are updated.
    """
    if basemodel != peewee.Model:  # Don't touch peewee.Model
        basemodel._meta.database = db
    for cls in basemodel.__subclasses__():
        cls._meta.database = db
        use(db, cls)  # Recursive!

What do you think?
Does anybody see a hidden problem with this idea?

Thanks for your thoughts!

On 14/05/12 01:29, Charles wrote:
-- 
Borja López Soilán
bo...@kami.es

Charles

May 14, 2012, 1:26:41 PM5/14/12
to peewe...@googlegroups.com
I have done several projects using flask where I use my models outside the context of the flask app, for example with command-line tools, so I believe I can speak to your questions about decoupling things.  There are several things it sounds like you want to decouple:

* your models from your database
* your database from your flask app

Decoupling models from the database is not designed to be all that easy, since a model must talk to a database.  You can accomplish this to some degree by deferring loading your database.  By deferring loading your db, you can do things like point at a particular db depending on whether you're staging locally, testing, or running in production.  Beyond that you're getting into a gray area and I'm not sure of the use-cases.

Decoupling your db from your flask app shouldn't be too hard:

1. create some shared configuration that will be used by your flask app and whatever other tools you intend to write
2. create your database instance and models using the config
3. write whatever tools using just the db and models
4. write whatever views/web stuff using the db and models

This is more or less what I wrote in my previous post, the only exception being that the db, models, and config live outside the context of the flask app.
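
A sketch of that layout (file names, the config value, and the model are illustrative):

# config.py -- shared settings, no flask import here
DATABASE_NAME = 'myapp.db'


# db.py -- builds the peewee database from the shared config
from peewee import SqliteDatabase
import config

database = SqliteDatabase(config.DATABASE_NAME)


# models.py -- depends only on db.py, not on flask
from peewee import Model, CharField
from db import database

class BaseModel(Model):
    class Meta:
        database = database

class User(BaseModel):
    username = CharField()


# tool.py -- a command-line tool can use the models with no flask at all
from models import User

def list_users():
    for user in User.select():
        print(user.username)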

Finally, I would caution against a setup like the one in your example: how are you going to keep track of which db your models point to?  Why would a model need to exist on multiple databases?  You can use deferred loading to point the model at different dbs for testing/staging/production.