Model Creation Optimization

10 views
Skip to first unread message

David Cramer

unread,
Dec 23, 2007, 9:02:19 PM12/23/07
to Django developers
I was looking over some code today and noticed that the __new__
definition of ModelBase does if x in dir(y). Doing some quick tests
showed this was a lot slower than doing if hasattr(x, y). Is there any
reason it's done like this?

http://www.pastethat.com/Qxayj

This is the specific group of lines:
for base in bases:
# TODO: Checking for the presence of '_meta' is hackish.
if '_meta' in dir(base):

Adrian Holovaty

unread,
Dec 24, 2007, 12:03:58 AM12/24/07
to django-d...@googlegroups.com

Hmm, I can't think of a reason it uses dir() instead of hasattr(). I
tracked down the change to changeset 2432 on the magic-removal branch
in February 2006:

http://code.djangoproject.com/changeset/2432

Do all the unit tests still pass if you convert it to a hasattr()?

Adrian

--
Adrian Holovaty
holovaty.com | djangoproject.com

David Cramer

unread,
Dec 24, 2007, 2:47:08 AM12/24/07
to Django developers
With the latest trunk I actually had 4 unit test failures -- maybe my
setup is screwed. http://www.pastethat.com/VulU5

These happened either way though, the same errors.

On Dec 23, 9:03 pm, "Adrian Holovaty" <holov...@gmail.com> wrote:

Ivan Sagalaev

unread,
Dec 24, 2007, 3:29:47 AM12/24/07
to django-d...@googlegroups.com
David Cramer wrote:
> With the latest trunk I actually had 4 unit test failures -- maybe my
> setup is screwed. http://www.pastethat.com/VulU5
>
> These happened either way though, the same errors.

David, looks like your test DB is not created in utf-8 charset. Try
these two settings:
http://www.djangoproject.com/documentation/settings/#test-database-charset

Malcolm Tredinnick

unread,
Dec 24, 2007, 3:31:49 AM12/24/07
to django-d...@googlegroups.com

On Sun, 2007-12-23 at 23:47 -0800, David Cramer wrote:
> With the latest trunk I actually had 4 unit test failures -- maybe my
> setup is screwed. http://www.pastethat.com/VulU5
>
> These happened either way though, the same errors.

Looks like you might need to set TEST_DATABASE_CHARSET='utf-8'. Django's
tests need UTF-8 (or something comparable) to test a few and
unfortunately MySQL's default charset seems to be Latin1 on most
installations. See docs/tests.txt or docs/settings.txt for details.

Malcolm

--
On the other hand, you have different fingers.
http://www.pointy-stick.com/blog/

David Cramer

unread,
Dec 24, 2007, 3:37:31 AM12/24/07
to Django developers
Any reason it's not the default?

On Dec 24, 12:31 am, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:
> On Sun, 2007-12-23 at 23:47 -0800, David Cramer wrote:
> > With the latest trunk I actually had 4 unit test failures -- maybe my
> > setup is screwed.http://www.pastethat.com/VulU5

Malcolm Tredinnick

unread,
Dec 24, 2007, 4:05:36 AM12/24/07
to django-d...@googlegroups.com

On Mon, 2007-12-24 at 00:37 -0800, David Cramer wrote:
> Any reason it's not the default?

It might not always be available, was one concern (although I suspect
not a real problem). The other one is that if you're running your
application's tests, you should be using the same encoding that your
production database is using, which might not be UTF-8.

Malcolm

--
Atheism is a non-prophet organization.
http://www.pointy-stick.com/blog/

ivan.illarionov

unread,
Dec 24, 2007, 12:59:11 PM12/24/07
to Django developers
Please take a look at my patch at http://code.djangoproject.com/ticket/6214
There's no need to check for _meta
if base is not Model and issubclass(base, Model)

All Model subclasses except Model class itself have _meta attribute.

It can be optimized further if we build the list of parents in the
beginning of __new__ where a Model subclass check already exists and
Model class will raise NameError

ivan.illarionov

unread,
Dec 24, 2007, 2:04:07 PM12/24/07
to Django developers
Just benchmarked my patch. Class creation runs faster. 1000 simple
classes was created in 2.5 seconds vs. 2.7 seconds and 1000 subclasses
in 2.6 vs. 2.8

On 24 дек, 20:59, "ivan.illarionov" <ivan.illario...@gmail.com> wrote:
> Please take a look at my patch athttp://code.djangoproject.com/ticket/6214

ivan.illarionov

unread,
Dec 24, 2007, 3:41:42 PM12/24/07
to Django developers
My patch is little faster against hasattr() 2.54 vs 2.56 and 2.56 vs
2.59

ivan.illarionov

unread,
Dec 24, 2007, 4:08:53 PM12/24/07
to Django developers
I was able to optimize it even more by moving classmethods of Model to
ModelBase as normal methods. It's really faster and clearer because
separates initialization of Model classes and Model instances

Aaron Krill

unread,
Dec 24, 2007, 4:19:33 PM12/24/07
to Django developers
Please make sure these changes do not cause any unit test failures.
Also, benchmarks!!! :-) How much of a speed difference is there thanks
to these latest changes you've made to ModelBase? And you may want to
run benchmarks and ensure its not effecting speeds adversely
elsewhere.

On Dec 24, 1:08 pm, "ivan.illarionov" <ivan.illario...@gmail.com>
wrote:

ivan.illarionov

unread,
Dec 24, 2007, 4:58:29 PM12/24/07
to Django developers
All tests passed. Very simple benchmarks show speed increase of about
0.04-0.07 seconds per 1000 model classes compared to hasattr() version
and about 0.15 seconds per 1000 model classes compared to dir(base). I
know it is not dramatically faster but it *is* faster (and IMHO
clearer and more DRY). Logic tells me that those changes can effect
speeds only in positive way.

On 25 дек, 00:19, Aaron Krill <corpor...@infinitephilosophy.com>
wrote:

Ivan Illarionov

unread,
Jan 11, 2008, 3:40:35 PM1/11/08
to Django developers
Updated the patch http://code.djangoproject.com/ticket/6214
Another small optimization.

1. It passes all unit tests
2. It is IMO more clear
3. It is faster (1-1.5 seconds per 10000 models on my machine) than
current django model creation

Luper Rouch

unread,
Feb 14, 2008, 8:47:18 PM2/14/08
to django-d...@googlegroups.com
I use custom metaclasses to do some additional initialization in my code :

class CustomMetaclass(ModelBase):
def __new__(cls, name, attrs, bases):
# Skip Django initialization if we are looking at CustomModel
try:
CustomModel
except NameError:
return type.__new__(cls, name, bases, attrs)
...
# Do Django initialization
return super(CustomMetaclass, cls).__new__(cls, name, bases, attrs)

class CustomModel(Model):
__metaclass__ = CustomMetaclass

class MyModel(CustomModel):
...

This worked perfectly before the r7098 optimizations, but now I get
"AttributeError: type object 'CustomModel' has no attribute '_meta'" (in
django/db/models/base.py, line 46) :

42 # Build complete list of parents
43 for base in parents:
44 if base is not Model:
45 new_class._meta.parents.append(base)
46 new_class._meta.parents.extend(base._meta.parents)

CustomModel is not Model, but as it is not initialized by Django it does
not have a _meta attribute either (and I don't want it initialized by
Django because it would create a *_custommodel table, plus the extra
unnecessary initializations).

The quick hack was to add a fake _meta to CustomModel :

class CustomModel(Model):
class _meta:
parents = ()
fields = ()
__metaclass__ = CustomMetaclass

I understand this may be a corner case, but the ability to extend
Django's initialization is really nice, and this optimization makes the
code ugly on the applications side (and was there really a need for an
optimization ? "1-1.5s per 10000 models" does not tell much on what has
been gained, what is the total time needed to initialize 10000 models ?)

Malcolm Tredinnick

unread,
Feb 15, 2008, 12:37:38 AM2/15/08
to django-d...@googlegroups.com

On Fri, 2008-02-15 at 02:47 +0100, Luper Rouch wrote:
[...]

> CustomModel is not Model, but as it is not initialized by Django it does
> not have a _meta attribute either (and I don't want it initialized by
> Django because it would create a *_custommodel table, plus the extra
> unnecessary initializations).

As I mentioned in the ticket you filed, if CustomModel is meant to be a
Model subclass, you must have a _meta attribute. That's one of the
things about being a Model subclass: you have a _meta. You want to do
the equivalent of creating an int subclass that doesn't support integer
operations. It's not like it's a huge problem for you to adjust to this.
Also, keep in mind that model subclassing is entirely unsupported in
Django at the moment and changes are undergoing to support it. That
might well involve a few other internal shufflings (ModelBase.__new__
gets simpler, in fact), although I don't think they will affect this
particular case.

I don't feel that Django should have to work around the fact that people
are creating things pretending to be a Model subclass that don't meet
the requirements. The fact that it worked before was accidental, not
some feature of the construct (go back over the history of that method
and you can see that 30 months ago it didn't support much subclassing
ability at all and we have slowly been tidying it up and in [7098]
continued that trend by removing some more unneeded code).

[...]


> I understand this may be a corner case, but the ability to extend
> Django's initialization is really nice, and this optimization makes the
> code ugly on the applications side (and was there really a need for an
> optimization ? "1-1.5s per 10000 models" does not tell much on what has
> been gained, what is the total time needed to initialize 10000 models ?)

The change wasn't made solely because of the speed benefits. Models just
aren't created that often for it to be the most relevant factor (__new__
is only called import time normally). It was also made because it makes
the code cleaner, particularly with some upcoming further changes in
there.

Anyway, I'm not going to comment any further here, since I obviously
think the change is a plus. I want to here what some of the other
maintainers think.

Malcolm

--
For every action there is an equal and opposite criticism.
http://www.pointy-stick.com/blog/

Ivan Illarionov

unread,
Feb 15, 2008, 1:35:51 AM2/15/08
to Django developers
> I understand this may be a corner case, but the ability to extend
> Django's initialization is really nice, and this optimization makes the
> code ugly on the applications side (and was there really a need for an
> optimization ? "1-1.5s per 10000 models" does not tell much on what has
> been gained, what is the total time needed to initialize 10000 models ?)

This change was not about speed - there was a comment: # TODO:
Checking for the presence of '_meta' is hackish - so I created the
patch to fix that solely because of this comment. Only later, when
Malcolm started this thread I noticed that this is also faster.

Malcolm Tredinnick

unread,
Feb 15, 2008, 1:41:15 AM2/15/08
to django-d...@googlegroups.com

On Fri, 2008-02-15 at 16:37 +1100, Malcolm Tredinnick wrote:
[...]

> The change wasn't made solely because of the speed benefits. Models just
> aren't created that often for it to be the most relevant factor (__new__
> is only called import time normally). It was also made because it makes
> the code cleaner, particularly with some upcoming further changes in
> there.

I've changed it to a hasattr() approach in [7716]; on further reflection
this seemed more in line with a duck-typing approach.

Regards,
Malcolm

--
If it walks out of your refrigerator, LET IT GO!!
http://www.pointy-stick.com/blog/

Luper Rouch

unread,
Feb 15, 2008, 5:50:07 AM2/15/08
to django-d...@googlegroups.com
Malcolm Tredinnick a écrit :

> On Fri, 2008-02-15 at 16:37 +1100, Malcolm Tredinnick wrote:
> [...]
>
>> The change wasn't made solely because of the speed benefits. Models just
>> aren't created that often for it to be the most relevant factor (__new__
>> is only called import time normally). It was also made because it makes
>> the code cleaner, particularly with some upcoming further changes in
>> there.
>>
>
> I've changed it to a hasattr() approach in [7716]; on further reflection
> this seemed more in line with a duck-typing approach.
>
> Regards,
> Malcolm
>
>
This also solves my particular issue (no need for fake _meta anymore in
Model subclasses), so I guess everyone is happy now :)

Cheers,
Luper

Reply all
Reply to author
Forward
0 new messages