Model method versus overriding save()


Victor Hooi

Dec 8, 2012, 1:54:37 AM
to django...@googlegroups.com
Hi,

I have a "ranking" field for an item that returns an integer between 1 and 10 based on a number of criteria of each item.

My question is - what are the pros and cons of using a model method to return this, versus overriding the save() method and saving it directly into a normal IntegerField on that item?

I understand that model methods won't let me use them within QuerySet filters on that item - is there any way around that?

If I just override the model's save() method to recalculate and save that field each time, I can use it within QuerySet filters. Any cons with that approach?

What do you guys tend to use in your projects?

Cheers,
Victor

Chris Cogdon

Dec 8, 2012, 2:27:50 AM
to django...@googlegroups.com
It's a simple performance vs storage question.

Storing a calculable field also risks it getting out of sync with reality, but if you're querying on it _so_ much, then it's usually worth it.

Also, with the right database and a trigger, that's something the database can ensure for you, i.e. a field that the database keeps up to date for you.
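
As a rough illustration of the trigger idea, here is a minimal sketch using Python's built-in sqlite3 module. The item table, its votes and quality columns, and the ranking formula are all made up for the example (the thread never gives the real criteria), and trigger syntax differs between database engines.

```python
import sqlite3

# Hypothetical schema: "votes" and "quality" stand in for the real criteria.
SCHEMA = """
CREATE TABLE item (
    id      INTEGER PRIMARY KEY,
    votes   INTEGER NOT NULL,
    quality INTEGER NOT NULL,
    ranking INTEGER
);

-- Made-up formula, clamped to the range 1..10.
CREATE TRIGGER item_rank_insert AFTER INSERT ON item
BEGIN
    UPDATE item
       SET ranking = MAX(1, MIN(10, (NEW.votes + NEW.quality) / 2))
     WHERE id = NEW.id;
END;

CREATE TRIGGER item_rank_update AFTER UPDATE OF votes, quality ON item
BEGIN
    UPDATE item
       SET ranking = MAX(1, MIN(10, (NEW.votes + NEW.quality) / 2))
     WHERE id = NEW.id;
END;
"""


def make_db():
    """Create an in-memory database with the table and both triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def ranking_of(conn, item_id):
    """Read back the ranking that the triggers maintain."""
    row = conn.execute(
        "SELECT ranking FROM item WHERE id = ?", (item_id,)
    ).fetchone()
    return row[0]
```

With this in place, any INSERT or any UPDATE of votes or quality refreshes ranking inside the database, whichever client issued the statement.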

Derek

Dec 8, 2012, 1:37:00 PM
to django...@googlegroups.com
Rather than use a trigger (which is DB-specific and also hard to debug because it is not part of your code base), I suggest you use signals [1].

Derek

[1] https://docs.djangoproject.com/en/dev/topics/signals/

Thomas Lockhart

Dec 9, 2012, 6:54:08 AM
to django...@googlegroups.com
On 12/8/12 5:37 AM, Derek wrote:
> Rather than use a trigger (which is DB-specific and also hard to debug because not part of your code base), suggest you use signals[1].

Hmm. Triggers have advantages over application-level code where they can be used. They are likely more efficient (no data needs to be transferred to the client) and are more likely to ensure data integrity (by operating within a database transaction in an atomic fashion, and without application-level edge case failures). For cases like this one, debugging should be pretty easy because one can add or increment values in a table and watch the trigger do the additional work.

That's not to say that signals aren't a great feature, just that compact trigger code can have advantages and, well, that code is part of the code base too.

                              - Tom


Mike Dewhirst

Dec 9, 2012, 11:30:01 AM
to django...@googlegroups.com
On 9/12/2012 5:54pm, Thomas Lockhart wrote:
> On 12/8/12 5:37 AM, Derek wrote:
>> Rather than use a trigger (which is DB-specific and also hard to debug
>> because not part of your code base), suggest you use signals[1].
> Hmm. Triggers have advantages over application-level code where they can
> be used. They are likely more efficient (no data needs to be transferred
> to the client) and are more likely to ensure data integrity (by
> operating within a database transaction in an atomic fashion, and
> without application-level edge case failures). For cases like this one,
> debugging should be pretty easy because one can add or increment values
> in a table and watch the trigger do the additional work.
>
> That's not to say that signals aren't a great feature, just that compact
> trigger code can have advantages

For the sake of maintainability it might be better to keep all database
manipulation in the model layer rather than split it between models and
database.

> and, well, that code is part of the
> code base too.

Which means you need to keep pretty comprehensive documentation if you
are doing database stuff in two areas.

Personally, I'd keep it all in the django ORM until the project is
mature and requires the final molecule of optimisation.

Mike


Chris Cogdon

Dec 9, 2012, 11:49:22 PM
to django...@googlegroups.com
Even though I'm a total database junkie (by which I mean postgresql > mysql :) ), I have to agree with Mike. If you can keep it in the model layer, do that. Once you start putting optimisations into the database layer, you lose a lot of portability between databases: there is no such thing as "standard SQL" for anything other than toy applications. Optimisations tend to be very engine-specific.

However, just remember that those optimisations are possible, and the database is far more reliable for maintaining your invariants than the client is.

Victor Hooi

Dec 9, 2012, 11:59:23 PM
to django...@googlegroups.com
Hi,

Hmm, so are we saying that:
  • Using model methods uses less storage, but performs worse.
  • Overriding the model's save() method uses more storage, but performs better.
I understand the storage part - but I'm a bit confused about the performance part - how does one perform better than the other?

Also - in terms of using them with QuerySets - there aren't any workarounds to use model methods with QuerySets are there? It seems like that would be a definite argument in favour of using the second method, right?

Finally - thanks for the tip about signals() - so should I be using something like django.db.models.signals.post_save in addition to overriding save(), or instead of it?

Cheers,
Victor

Mike Dewhirst

Dec 10, 2012, 5:08:18 AM
to django...@googlegroups.com
On 10/12/2012 10:59am, Victor Hooi wrote:
> Also - in terms of using them with QuerySets - there aren't any
> workarounds to use model methods with QuerySets are there? It seems like
> that would be a definite argument in favour of using the second method,
> right?

I'm not sure what you mean exactly here but you can use them in model
methods. Use the class manager.

ClassName.objects.filter(thing=self.this, whatever=self.that, etc=etc)

>
> Finally - thanks for the tip about signals() - so should I be using
> something like django.db.models.signals.post_save in addition to
> overriding save(), or instead of it?

I haven't used signals yet but I think I'm going to in the near future
because I can't think of another way to deal with certain actions I want
to happen after deletion of related objects.

Mike


Chris Cogdon

Dec 12, 2012, 9:21:03 PM
to django...@googlegroups.com
On Sunday, December 9, 2012 9:08:18 PM UTC-8, Mike Dewhirst wrote:
> On 10/12/2012 10:59am, Victor Hooi wrote:
>> Also - in terms of using them with QuerySets - there aren't any
>> workarounds to use model methods with QuerySets are there? It seems like
>> that would be a definite argument in favour of using the second method,
>> right?
>
> I'm not sure what you mean exactly here but you can use them in model
> methods. Use the class manager.
>
> ClassName.objects.filter(thing=self.this, whatever=self.that, etc=etc)

What I believe Victor is asking is whether the model method can appear on the left-hand side of that equals sign, i.e. as the lookup name inside filter().

As I understand the framework, my best answer is "no", because the query system is unable to construct a SQL query that will encompass your model method, since that's in Python.

This is a very classic "pre-calculate vs re-calculate" argument that a lot of applications go through. Django handles a bunch of these using "lazy" queries that don't calculate anything until the result is really needed, but even so it needs some hints to get the best performance (see things like select_related and prefetch_related).

Here are some options for you:

1. Stick with a model method. Do most of your querying through Django's ORM, but then do a post-filter in Python using the results of the model method. This saves database space, and if the pre-filter result set is small, this will still be pretty fast. However, if this is called a _lot_ you'll be hurt on performance.

2. Whenever you "save" a model, re-calculate the field and save it. This is very flexible, while chewing up a bit of space. This opens up the possibility of a desync between the source data and the calculated value, but that should be rare if you're never saving the model data outside of the save method (and your database handles transactions properly). Very fast if you're doing a lot of searching vs updating. But if you're updating FAR more often than running your search, you're wasting performance on the rarely used pre-calculation.

3. If the calculation can be done at the database level, use the "extra" ORM method to do a search. This is better than 1, but will take a fair bit more database knowledge to get going, AND might tie you to a specific database implementation.

3a. Use a database "view" that presents the pre-calculation, making the ORM query a lot simpler. This means ensuring that you are allowed to update a view (postgresql 9.3 planned) or can somehow do a query on the view (which is a different table according to Django). More Django work here, and again it might tie you to a particular database implementation.
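
To make options 1, 3 and 3a a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the Django ORM. The table layout, the votes and quality columns, and the ranking formula are all hypothetical. In Django, option 3 would go through the ORM instead of hand-written SQL, but the shape of the query is the same.

```python
import sqlite3


def ranking(votes, quality):
    """Hypothetical stand-in for the model method (result clamped to 1..10)."""
    return max(1, min(10, (votes + quality) // 2))


# The same made-up formula written in SQL, for options 3 and 3a.
RANKING_SQL = "MAX(1, MIN(10, (votes + quality) / 2))"


def make_db(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY, votes INTEGER, quality INTEGER)"
    )
    conn.executemany("INSERT INTO item (id, votes, quality) VALUES (?, ?, ?)", rows)
    # Option 3a: a view that exposes the calculation as an ordinary column.
    conn.execute(
        "CREATE VIEW item_ranked AS "
        f"SELECT id, votes, quality, {RANKING_SQL} AS ranking FROM item"
    )
    return conn


def ids_post_filter(conn, minimum):
    # Option 1: fetch every row, then filter in Python with the method.
    rows = conn.execute("SELECT id, votes, quality FROM item ORDER BY id")
    return [i for i, votes, quality in rows if ranking(votes, quality) >= minimum]


def ids_sql_expression(conn, minimum):
    # Option 3: let the database evaluate the expression in the WHERE clause.
    sql = f"SELECT id FROM item WHERE {RANKING_SQL} >= ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (minimum,))]


def ids_from_view(conn, minimum):
    # Option 3a: query the view as if the ranking were a stored column.
    sql = "SELECT id FROM item_ranked WHERE ranking >= ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (minimum,))]
```

All three functions return the same ids; what differs is where the work happens. The post-filter pulls every candidate row into Python, while the other two leave the filtering to the database at the cost of duplicating the formula in SQL.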


So, my guess is... as long as you're doing a reasonable number of "queries" versus "updates" to the table, to make the pre-calculation worthwhile, just go ahead and set that pre-calculation in the model's save() method.
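
A minimal sketch of that recalculate-on-save pattern, again with the built-in sqlite3 module and a made-up formula so that it runs without Django. Here save_item plays the role of an overridden save(); in a Django model the equivalent would be to assign the field and then call super().save(). Keep in mind that QuerySet.update() and bulk operations do not call save(), which is one way the stored value can go stale.

```python
import sqlite3


def compute_ranking(votes, quality):
    """Hypothetical formula; the real criteria are not given in the thread."""
    return max(1, min(10, (votes + quality) // 2))


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item ("
        "id INTEGER PRIMARY KEY, votes INTEGER, quality INTEGER, ranking INTEGER)"
    )
    return conn


def save_item(conn, item_id, votes, quality):
    # Plays the role of an overridden save(): recalculate, then write the
    # source fields and the derived field together in one transaction.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO item (id, votes, quality, ranking) "
            "VALUES (?, ?, ?, ?)",
            (item_id, votes, quality, compute_ranking(votes, quality)),
        )


def ids_with_ranking_at_least(conn, minimum):
    # The stored column can be filtered on directly (and indexed if needed).
    sql = "SELECT id FROM item WHERE ranking >= ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (minimum,))]
```

Any write that bypasses save_item, such as a raw UPDATE of votes, leaves the old ranking in place until the row is next saved through it.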

 