Porting myself to Python/Pylons, I have to learn to use SQLAlchemy
properly because looking through the 300+ pages of its documentation it
seems complex but very powerful, especially its ORM.
However, in my PHP apps I always confine all the data logic to the model
itself. The controllers only call specific methods in order to do
something with the model, never the data directly: loadById() would load
model data into its properties from the relevant tables; save() would
check whether the pkey is set, and then insert or update existing rows
accordingly.
Also, since I work exclusively with transactional Postgres, I like to
rely on constraints to tell me what's wrong. Instead of locking an entire
table to see if new data would produce duplicates in columns that should
be unique, I rely on the DB transaction to raise a constraint-violation
error, from which I read the column in question and can tell the user
to modify this or that field in the form because it's a duplicate.
The problem with that is that the DB engine throws a generic Constraint
Violation from which I have to "extract" constraint name, and produce a
meaningful message to the user, like "Name already exists, please try
another". Therefore I catch generic SQL exceptions in the model save()
method, and re-throw custom exceptions made for the model like, for
example, NameDuplicateError, or EmailDuplicateError, etc., so the
controller can catch these and send back a proper message, or alter the
form to highlight the fields in question.
I wonder what other patterns would you guys use?
Vlad
Generally, yes, but I think there may also be a case for "services" that
use multiple models.
> Porting myself to Python/Pylons, I have to learn to use SQLAlchemy
> properly because looking through the 300+ pages of its documentation it
> seems complex but very powerful, especially its ORM.
>
> However in my PHP apps I always contain all the data logic to the model
> itself. The controllers only call specific methods in order to do
> something with the model, not the data directly:
This is the approach I use. My thinking is that the model should be
separate from any user interface or application. I view a Pylons app
as a (thin) layer that adapts a model (and perhaps services) to the
Web (as a UI or Web services). As such, I usually implement the model
in a separate Python package.
> loadById() would load model data into its properties from relevant tables.
Nitpicky style point: the prevalent style for function/method names in
Python is load_by_id. See PEP 8.
There are several philosophies on this. MVC has strayed far from its
original meaning, and there are several equally valid interpretations
of it.
There are several ways to implement the model. The minimal approach is
to put only the table definitions in it. The maximal approach is to
put all data logic into it. The super-maximal approach is to write a
model abstraction layer, so that you can replace the storage backend
completely (e.g., SQLAlchemy to an object database or CouchDB). A
fourth approach is to put as much logic and permission checking as
possible into the database itself, which is especially feasible with
PostgreSQL.
One way to say it is that the model is for things specific to your
business, the view is for things specific to the UI, and the
controller is for things specific to Pylons (the
request/response/session API). But some people are more pragmatic,
and pull things into the controller that are more easily done in
Python code, because Pylons doesn't have a place for "model logic"
and "view logic". Clearly, the model can be separated into logic vs.
data structure, and the view likewise contains Python code vs. template.
That would argue for the model to "contain" the database structure
rather than being the database structure, and the view would be a
Python function that invokes a template, rather than just the template
itself. But Pylons tutorials have always gone for a minimal model and
minimal view, and so that's what most people write.
I prefer a medium model. I have class methods in my ORM objects for
all the logical collections of records my app requires. So one method
to iterate the recent records for the home page. Another to do a
search. Another to iterate all records. But I don't resolve queries
into lists. I return the query object so that the controller or
template can iterate it, get a record count, etc. That means the
controller/template knows it's a SQLAlchemy query, which violates some
containment principles, but it leads to the most streamlined code.
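A rough sketch of what such a medium model can look like (the Article
class, its columns, and the in-memory SQLite engine are illustrative
stand-ins, not anyone's actual code):

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Article(Base):
    __tablename__ = "articles"
    id = Column(Integer, primary_key=True)
    title = Column(String(100))
    created = Column(DateTime, default=datetime.utcnow)

    # Class methods return Query objects, not lists, so the caller can
    # iterate, count, or slice them as it likes.
    @classmethod
    def recent(cls, session, limit=10):
        return session.query(cls).order_by(cls.created.desc()).limit(limit)

    @classmethod
    def search(cls, session, term):
        return session.query(cls).filter(cls.title.contains(term))

engine = create_engine("sqlite://")  # in-memory DB for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([Article(title="First"), Article(title="Second post")])
session.commit()

q = Article.search(session, "post")  # still a Query, not a list
assert q.count() == 1
assert Article.recent(session).count() == 2
```

The trade-off mentioned above is visible here: the caller gets `.count()`
and lazy iteration for free, at the cost of knowing it holds a SQLAlchemy
Query.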
There are a few people who have been pushing most of their logic into
PostgreSQL stored procedures and triggers, and using its roles to
enforce permissions. Aruynn Shaw has done the most work on this that
I've seen, but last I looked it hadn't gotten onto PyPI and was hard
to find. This looks like the best link:
https://projects.commandprompt.com/public/
- Exceptable: convert PostgreSQL exceptions to Python exceptions.
- Simpycity: a Python db layer for calling PostgreSQL stored
procedures. (Not compatible with SQLAlchemy.)
- VerticallyChallenged: authorization via database roles, with
Repoze.who for authentication. (This one doesn't seem to be online
yet.)
If you want to use these and reach a dead end, let me know and I can
put you in touch with Aurynn. Maybe that will help prod her to make
PyPI releases. :)
> The problem with that is that the DB engine throws a generic Constraint
> Violation from which I have to "extract" constraint name, and produce a
> meaningful message to the user, like "Name already exists, please try
> another". Therefore I catch generic SQL exceptions in the model save()
> method, and re-throw custom exceptions made for the model like, for
> example, NameDuplicateError, or EmailDuplicateError, etc... so the
> controller can catch these and send back proper message, or alter the
> form to highlight fields in question.
That's what Exceptable is for. I think it could be made more pythonic,
but I haven't had a need for it myself so I haven't had an incentive
to.
--
Mike Orr <slugg...@gmail.com>
Yeah, I just wasn't sure of the best practice in Pylons. The examples
seem to use the models as pure ORM maps, with minor exceptions like the
QuickWiki tutorial and the formatting method it adds to the model.
> I prefer a medium model. I have class methods in my ORM objects for
> all the logical collections of records my app requires. So one method
> to iterate the recent records for the home page. Another to do a
> search. Another to iterate all records. But I don't resolve queries
> into lists. I return the query object so that the controller or
> template can iterate it, get a record count, etc.
Your approach is very much like mine, with the exception that it is
Pylons/SQLAlchemy-specific in returning the query object. Good idea,
because my current practice is to return arrays of model classes (PDO's
FETCH_CLASS mode), so that the controller can call individual methods on
"rows" of classes if needed. The only drawback to this is separating
full vs. limited data (for example, if you needed only some fields that
make sense in the list), so the method returning these takes a list of
class fields to load, in addition to the limit and offset of the rows.
Complicated, but IMHO it provides the best separation of concerns and
with that a good level of DRY.
Also, a good number of row retrievals require a hierarchical list, so I
have a helper class that constructs multilevel arrays (which would be
dictionaries in Python), and "linear" arrays with an appended field that
designates each row's level in the otherwise hierarchical list. This is
particularly useful when using the same set of rows to construct:
- tables (pure lists of rows)
- navigational constructs (hierarchical dictionaries)
- SELECT html elements with '--' prepended for the number of "levels"
the item is nested in the tree.
But for all that I need the class methods (static methods in PHP) to
always return the same format of data, be it classes of the models in
question or dictionaries.
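The helper described above might be sketched in Python roughly like this
(build_tree, flatten, and the row fields are hypothetical names, not the
actual PHP code being ported):

```python
def build_tree(rows, id_key="id", parent_key="parent_id"):
    """Group flat rows (dicts) into a nested structure by parent id."""
    by_id = {r[id_key]: dict(r, children=[]) for r in rows}
    roots = []
    for node in by_id.values():
        parent = by_id.get(node[parent_key])
        (parent["children"] if parent else roots).append(node)
    return roots

def flatten(nodes, level=0):
    """Linear rows with an appended 'level' field, e.g. for SELECT options."""
    for node in nodes:
        yield dict(node, level=level)
        yield from flatten(node["children"], level + 1)

rows = [
    {"id": 1, "parent_id": None, "name": "Root"},
    {"id": 2, "parent_id": 1, "name": "Child"},
]
tree = build_tree(rows)                      # hierarchical dictionaries
options = ["--" * n["level"] + n["name"]     # '--' per nesting level
           for n in flatten(tree)]
assert options == ["Root", "--Child"]
```

The same flat result set thus feeds tables (the raw rows), navigation
(the tree), and indented SELECT options (the flattened list).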
> There are a few people who have been pushing most of their logic into
> PostgreSQL stored procedures and triggers, and using its roles to
> enforce permissions. Aurynn Shaw has done the most work on this that
> I've seen, but last I looked it hadn't gotten onto PyPI and was hard
> to find. This looks like the best link:
> https://projects.commandprompt.com/public/
>
I like to push only data-related logic into stored procedures and
triggers, if for no other reason than this: tampering with pure SQL will
then have the same result as if the application interface were used;
otherwise one risks corrupting the data, especially if more than one
administrator is involved.
I also toyed with having no queries at all in the application except
simple calls to stored procedures that deal with the data. This approach
for example would turn models into nothing more than simple ORM maps,
with a few extra methods that directly call stored procedures. I
understand that this might lose the benefit of a certain level of
"caching" that occurs within SQLAlchemy...
> - Exceptable: convert PostgreSQL exceptions to Python exceptions.
>
Thanks, I'll be needing this!
> - Simpycity: a Python db layer for calling PostgreSQL stored
> procedures. (Not compatible with SQLAlchemy.)
>
I'll take a look.
> - VerticallyChallenged: authorization via database roles, with
> Repoze.who for authentication. (This one doesn't seem to be online
> yet.)
>
Other than for DB administration, I don't see the point in this. :) But,
oh, well...
> That's what Exceptable is for. I think it could be made more pythonic,
> but I haven't had a need for it myself so I haven't had an incentive
> to.
How else are SQL errors raised in Python, then? I suppose it is driver
specific?
Thanks,
Vlad
Precisely. But it also cuts down the code required, especially in
complex apps where more than one controller wants the same operation on
a model.
> Nitpicky style point: the prevalent style for function/method names in
> Python is load_by_id. See PEP 8.
>
Yup, I've read the style guidelines, I just used the method names that I
currently use in PHP. Thanks, though!
Vlad
Oh, I do this too. If the controller really wants a dict or
hierarchical structure or scalar result rather than an iterable of ORM
instances, I calculate that and return it.
> I like to push only data related logic into stored procedures and
> triggers. If nothing then for one reason: tampering with pure SQL will
> have same result as if the application interface was used, otherwise one
> risks corrupting the data. Especially if more than one administrator is
> involved.
Well, they believe strongly in PostgreSQL and use its unique features
(which means the applications aren't portable to other SQLAlchemy
databases). The argument is that you get a lot of efficiency, data
integrity guarantees, and security if you push logic down to the
database level, so that you won't ever have another client accessing
or writing the data in an inconsistent way. Plus it's neutral as to
client programming languages.
> I also toyed with having no queries at all in the application except
> simple calls to stored procedures that deal with the data. This approach
> for example would turn models into nothing more than simple ORM maps,
> with a few extra methods that directly call stored procedures. I
> understand that this might lose the benefit of a certain level of
> "caching" that occurs within SQLAlchemy...
The SQLAlchemy session caches mapped records in case you ask for them
again. I'm not sure that matters much in a Pylons application. The
session is cleared after every request, and how often do you make
repeated calls for the same record within a request? You have the
record in a variable, why do you need to fetch it again?
Postgres functions can return ad hoc "rows" or "tables", so you'd have
to have a lot of ORM classes if you wanted to map all those. Either
that or just use the raw ResultProxy.
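Executing such a function through raw SQL and reading the result rows
directly might look like the sketch below. It runs against SQLite's
built-in upper() for the sake of a runnable example; with Postgres the
text() call would name your stored procedure instead:

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite://")  # stand-in engine for illustration

with engine.connect() as conn:
    # Against Postgres this might be: text("SELECT * FROM my_proc(:arg)")
    row = conn.execute(text("SELECT upper(:s) AS shouted"),
                       {"s": "hi"}).first()
    assert row.shouted == "HI"   # ad hoc row, no ORM class needed
```

This avoids defining an ORM class per function result shape, at the cost
of losing mapped objects and working with bare rows.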
>> - Exceptable: convert PostgreSQL exceptions to Python exceptions.
>>
>
> Thanks, I'll be needing this!
It does what you said your code does: it catches general Postgres
exceptions and parses the message to raise a specific Python
exception.
>> - VerticallyChallenged: authorization via database roles, with
>> Repoze.who for authentication. (This one doesn't seem to be online
>> yet.)
>>
>
> Other than for DB administration, I don't see the point in this. :)
The argument is that Postgres' built-in authorization is better tested
than anything you can write on your own. Plus, by protecting the data
at the lowest database level, you can be sure that no Python or other
utility can bypass your security policy.
The downside is that it's meant for a limited number of Postgres
users, and is not attuned to a web application that may create dozens
of users in a week. So it only works if you can map your web users
into a few pre-specified database users. You would also have to log
into the database with the highest-permission user and then switch
down using a SQL statement, similar to Unix 'su'. But I'm not sure how
you'd go up in privilege again in another request (if you're reusing
the same connection).
The other thing is it doesn't have row-based permissions, only
table-based. Unless it was added in PostgreSQL 8.4. Although I suppose
you can mimic row-based permissions in a stored procedure.
The other issue is how to get all your stored procedures into the
database. The solution seems to be text SQL files in your version
control system.
> How else are SQL errors raised in Python, then? I suppose it is driver
> specific?
I did a few tests with MySQL (I'm not using Postgres myself yet) and got:
sqlalchemy.exc.ProgrammingError # SQL syntax
sqlalchemy.exc.IntegrityError # Duplicate primary key
OperationalError: (OperationalError) (1048, "Column 'name' cannot be null")
# I can't tell who raised this exception.
InvalidRequestError: The transaction is inactive due to a rollback in
a subtransaction. Issue rollback() to cancel the transaction.
# This was the primary error raised when the previous
# occurred during a session.commit().
FlushError: New instance <User at 0xa65b5ac> with identity key (<class
'rlink.model.auth.User'>, ('mike...@example.com',)) conflicts with
persistent instance <User at 0xa749e6c>
# Raised during another session.commit()
So it looks like all of them except maybe the third are generic
SQLAlchemy exceptions.
--
Mike Orr <slugg...@gmail.com>
The links are, in order:
https://projects.commandprompt.com/public/exceptable/
https://projects.commandprompt.com/public/simpycity/
https://projects.commandprompt.com/public/verticallychallenged/
VC I haven't updated the wiki or written docs for yet, but it's coming. :)
The repo can be viewed at
https://projects.commandprompt.com/public/verticallychallenged/repo/
>
> If you want to use these and reach a dead end, let me know and I can
> put you in touch with Aurynn. Maybe that will help prod her to make
> PyPI releases. :)
I *am* trying on this, I ran into a bug with my setup.py not pulling in
the required SQL files. :(
>
>> The problem with that is that the DB engine throws a generic Constraint
>> Violation from which I have to "extract" constraint name, and produce a
>> meaningful message to the user, like "Name already exists, please try
>> another". Therefore I catch generic SQL exceptions in the model save()
>> method, and re-throw custom exceptions made for the model like, for
>> example, NameDuplicateError, or EmailDuplicateError, etc... so the
>> controller can catch these and send back proper message, or alter the
>> form to highlight fields in question.
>
> That's what Exceptable is for. I think it could be made more pythonic,
> but I haven't had a need for it myself so I haven't had an incentive
> to.
>
Yes, this is exactly the problem that Exceptable is designed to solve.
If you can give me an exact example of the constraint error, I can add a
check into Exceptable proper.
Thanks,
--
Aurynn Shaw
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 ext 103
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
We are all in the gutter, but some of us are looking at the stars.
-- Oscar Wilde
Yes. :) By enforcing it at the database level, any client connecting to
the database will need to play by the same rules, short of being a
database superuser.
>
> The downside is that it's meant for a limited number of Postgres
> users, and is not attuned to a web application that may create dozens
> of users in a week. So it only works if you can map your web users
> into a few pre-specified database users. You would also have to log
> into the database with the highest-permission user and then switch
> down using a SQL statement, similar to Unix 'su'. But I'm not sure how
> you'd go up in privilege again in another request (if you're reusing
> the same connection).
This is also true; Vertically Challenged requires that the database
roles being used are just that - roles that define specific actions.
Users are then stored in a table as normal, and mapped to the correct
role at session initiation via the Postgres command SET ROLE.
SET ROLE also allows the permissions change to be undone, so that a
database handle can be reset to use the more permissive role at the
end of the execution session.
>
> The other thing is it doesn't have row-based permissions, only
> table-based. Unless it was added in PostgreSQL 8.4. Although I suppose
> you can mimic row-based permissions in a stored procedure.
It wasn't, though column-level permissions are available.
So far, stored procedures are the only way to allow for row-level
permissions.
>
> The other issue is how to get all your stored procedures into the
> database. The solution seems to be text SQL files in your version
> control system.
This is what we do, though PG9.0 is going to allow for DO blocks.
Combined with pl/pythonu, it should be possible to issue queries which
encapsulate (for example) pickled code from the Python layer.
As pl/pythonu is an untrusted language, this *will* require a database
superuser, though. :(
Regards,
I know I'm a bit off-topic, but does SQLAlchemy allow for row-based
permissions?
Thanks,
--
Frank
Permissions are handled by the database engine, not SQLAlchemy.
SQLAlchemy is just a front end for passing SQL statements to the
database.
You pretty much have to go to commercial database engines for
row-based permissions at this point. MySQL might have it in one of its
backends somewhere. SQLite does not have any permission restrictions,
although it's subject to the filesystem permissions of the database
file.
--
Mike Orr <slugg...@gmail.com>
create table rolemask (
    role_name name,
    role_mask int4,
    constraint rolemask_pkey primary key (role_name)
);
Not the most elegant solution, but it does a very limited job reasonably well because the rule allows the planner to cache the query plan that includes the security check once for a connection/table combination. It's a simple _RETURN rule that (if memory serves) looks like this:
create rule "_RETURN" as on select to sometable do instead
select * from sometable where row_mask is null or
row_mask >= (select role_mask from rolemask where role_name = SESSION_USER);
Similar rules on insert/update can be used a little more creatively, and can throw an error. The select rule simply excludes rows entirely. Also, the insert/update rules don't apply to COPY operations.
I've done a few different variations of the above, including using multiple masks (adding a mask_type to the tables involved), functions, or functions and rules together, but this seems to give the best performance, even compared to a caching function (pl/pythonu) that stores role and permissions data at the connection level.
Just a thought :-)
Richard
> I wonder what other patterns would you guys use?
>
> Vlad
Little late chiming in here.
In the past I've tried to fully utilize all the features of the model
classes, and I found myself bending my requirements to fit the
framework. So I've backed off from that in favor of treating models as
strictly row representations. This led to my controllers being a lot
more complicated, which I didn't like either. So I've recently found
that BFG model traversal (I call it context traversal) fits my brain
better than Routes. I'm not saying that everyone should ditch Pylons
in favor of BFG, however. Just saying that I'm finding that there's
potentially another object role in the system besides a
model/view/controller.
I guess my line of thinking right now is....
1 - views inject data into templates
2 - controllers receive web requests and dispatch to something that
produces something else to hand off to the views
3 - models represent rows and maybe relationships
4 - context receives data from the controller, decides what to do, and
does it, whether that be getting some models or validating form posts.
So far I feel this has led to a fairly rigid pattern that makes
interaction between all the pieces consistent. But I'm sure that
there's some complex scenario out there that breaks everything.
--
Thomas G. Willis
I still, pretty much, always query the db to check for a duplicate
value or possible constraint violation.
1 - It's easier than parsing errors, plus you don't have to
continually flush() or save() to trap the error.
2 - There's really not much of a performance hit on any site I've ever
worked on.
In terms of the MVC debate - I've gone through the entire array of
options that Mike Orr has illustrated ( though not all in Pylons ).
I've had Models that have most of the logic, and the Controllers just
call them... I've had Models that have no logic... and I've had Models
that wrap ORMs as an abstraction layer, so I can use multiple backends
(or ORMs ) easily.
After years of struggle, I've ended up with this approach:
- I put 'search' and 'new' functions in the model. I also put
'render' functions in (like model.F_url() to display a URL).
- If more than one controller needs to do similar operations on the
model, I break it out into a helper method and stash it as a library
function.
The problem I encountered with putting too much logic in the models,
is that MANY of my models were highly interrelated with other models.
From a maintenance and testing standpoint, this created too many
headaches. I really don't like 'horizontal' interactions of models
creating and modifying one another. So many issues arose. Examples
that caused issues for me were Group creates / controls Members ( and
vice versa ); or if Account creates AccountAsset or ApiKey.
I can't recall if I had the most issues with Python or Perl or
something else... but with horizontal interaction within the model, I
always had issues of inclusion order / dependencies / etc.
So I've found it simpler that each Model is only concerned with a
representation of One member of its own class -- and that a
Controller , or helper function , handles the interrelation of all of
them.
But if you have the constraints, you're doing the checks twice: first
from your application, and then the db engine does it anyway, since
there are constraints. Perhaps you don't have a performance hit, but I
can assure you, if you had a (highly) concurrent system and lots of rows
to check and lock through (because you need SELECT ... FOR UPDATE, or
alternatively just lock the entire table), the performance hit would
become significant and visible.
Vlad
Well, I mostly rely on SQLSTATE 23505, which is a unique-constraint
violation. I have checked the Exceptable code, but I am not sure how to
proceed with building my own exception class based on a regex pattern.
AFAIK (I need to check this), Postgres reports constraint violations with
the constraint name in double quotes at the end of the string, for example:
duplicate key value violates unique constraint "constraint_name"
insert or update on table "table_name" violates foreign key constraint
"constraint_name"
So I guess the constraint-violation exceptions should match both the
code 23505 and the regex 'constraint "([a-z_0-9]+)"$', with the constraint
name extracted as a property of the constraint exception. That way we can
extend it into custom exceptions like I am doing now in PHP.
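That idea can be sketched in plain Python (the constraint name, the
mapping table, and the exception classes here are hypothetical):

```python
import re

# Postgres appends the constraint name in double quotes, e.g.:
#   duplicate key value violates unique constraint "users_email_key"
CONSTRAINT_RE = re.compile(r'constraint "([a-z0-9_]+)"')

class ConstraintViolation(Exception):
    def __init__(self, constraint):
        self.constraint = constraint
        super().__init__(constraint)

class EmailDuplicateError(ConstraintViolation):
    pass

# Hypothetical mapping from constraint name to a model-level exception.
BY_CONSTRAINT = {"users_email_key": EmailDuplicateError}

def translate(pgcode, message):
    """Re-raise a raw Postgres error as a model-specific exception."""
    match = CONSTRAINT_RE.search(message)
    name = match.group(1) if match else None
    if pgcode == "23505" and name in BY_CONSTRAINT:
        raise BY_CONSTRAINT[name](name)
    raise ConstraintViolation(name or pgcode)

try:
    translate("23505",
              'duplicate key value violates unique constraint "users_email_key"')
except EmailDuplicateError as exc:
    assert exc.constraint == "users_email_key"
```

Since the constraint name is carried on the exception, the controller can
map it straight to a form field without re-parsing the message.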
Vlad
On Feb 3, 12:28 pm, Haron Media <i...@haronmedia.com> wrote:
> But if you have the constraints, you're doing the checks twice. First
> from your application, and then the db engine does it anyways since
> there are constraints. Perhaps you don't have a performance hit, but I
> can assure you, if you had a (highly) concurrent system and lots of rows
> to check and lock through (because you need SELECT ... FOR UPDATE;
> alternatively just lock the entire table) the performance hit would
> become significant and visible.
Absolutely -- one would have a significant performance hit on a highly
concurrent system. However I've only worked on a handful of sites in
my entire life where this would cause a noticeable issue. Very few
folks deal with this amount of traffic, and approaches like yours are
premature optimization for most folks.
It's great that this is a real issue for you.