Partial Models Discussion


David Cramer

May 6, 2008, 11:35:45 PM
to Django developers
I'd like to present my concept for partial models, which would be an
attempt to replace the use of .values() returning a dictionary
(although .values() still has uses if you don't actually want an
instance). Keep in mind, the way I'm presenting this would keep #17
working :)

values, values_tuple, and partial models, in my opinion, should all be
replaced with one method call (what if there's a values_list or
values_set, seems ugly). This would become a new values method:

values(results=[object|tuple|dict]) or something similar

When objects are requested, you would get proxies, each holding a model
instance. The proxy would let you override any field (e.g. this would be
useful for .extra(select)), while any field that isn't set on the proxy
would be passed through to the model (so you could override a foreign key
without affecting the model instance, and thus not affecting #17).

When the proxy is instantiated, we would identify which fields are
currently available, and map those in. We would similarly make sure all
fields are identifiable. We would then (possibly, I didn't think this
through) add proxy attributes for each field/relation that isn't
available in the dataset, and set each one as a LazyLookup. Accessing
such an attribute would run an SQL query for that single attribute and
issue a warning via warn() to make the developer aware (as you most
likely don't want to do this).

The biggest thing about this for me, is I don't want to return
dictionaries or tuples, but I want to optimize my SQL usage. I don't
need to return an article's body just to show headlines, but I do want
to be able to call get_absolute_url, for example.
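
As a rough illustration of the proxy idea described above, here is a minimal sketch in plain Python. It is not Django code: `Article`, `PartialProxy`, and `fetch_single_field` are hypothetical names invented for this example, and the "query" is simulated by a function that returns a string.

```python
import warnings


class Article:
    """Stand-in for a model class; not a real Django model."""

    def __init__(self, pk, headline):
        self.pk = pk
        self.headline = headline

    def get_absolute_url(self):
        return "/articles/%d/" % self.pk


def fetch_single_field(instance, name):
    """Hypothetical helper standing in for a one-column SQL query."""
    return "<%s loaded lazily for pk=%d>" % (name, instance.pk)


class PartialProxy:
    """Wraps an instance; overrides live on the proxy, not on the model."""

    def __init__(self, instance, missing=(), overrides=None):
        self._instance = instance
        self._missing = set(missing)
        self._overrides = dict(overrides or {})

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup on the proxy fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._overrides:
            return self._overrides[name]
        if name in self._missing:
            warnings.warn("lazy lookup of %r costs an extra query" % name,
                          stacklevel=2)
            value = fetch_single_field(self._instance, name)
            self._overrides[name] = value  # cache so we only "query" once
            self._missing.discard(name)
            return value
        # Everything else is passed back to the wrapped instance.
        return getattr(self._instance, name)


proxy = PartialProxy(Article(1, "Hello"), missing=["body"],
                     overrides={"comment_count": 3})
print(proxy.headline, proxy.get_absolute_url(), proxy.comment_count)
```

Because overrides are stored on the proxy, the wrapped instance is never modified, which is the property the proposal relies on for keeping #17 workable.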

Russell Keith-Magee

May 6, 2008, 11:46:20 PM
to django-d...@googlegroups.com
On Wed, May 7, 2008 at 11:35 AM, David Cramer <dcr...@gmail.com> wrote:
>
> I'd like to present my concept for partial models, which would be an
...

> When the proxy is instantiated, we would identify which fields are
> currently available, and map those in. We would also similarly make
> sure all fields are identifiable. We would then (possibly, didn't
> think this through) add proxy attributes for each field/relation that
> isn't available in the dataset, and set this as a LazyLookup. Upon
> calling this attribute, it would do an SQL query for this single
> attribute, and throw a warn() signal to make the developer aware (as
> you most likely don't want to do this).

If I've understood you correctly, it sounds like you are proposing the
same thing as:

http://code.djangoproject.com/ticket/5420

Yours,
Russ Magee %-)

David Cramer

May 6, 2008, 11:49:21 PM
to django-d...@googlegroups.com
Sort of, although I'm going to go against Adrian on the hide() method (I'd rather be explicit than implicit).
--
David Cramer
Director of Technology
iBegin
http://www.ibegin.com/

Gary Wilson Jr.

May 23, 2008, 2:29:43 PM
to django-d...@googlegroups.com
David Cramer wrote:
> I'd like to present my concept for partial models, which would be an
> attempt to replace the use of .values() returning a dictionary
> (although .values() still has uses if you don't actually want an
> instance). Keep in mind, the way I'm presenting this would keep #17
> working :)
>
> values, values_tuple, and partial models, in my opinion, should all be
> replaced with one method call (what if there's a values_list or
> values_set, seems ugly). This would become a new values method:
>
> values(results=[object|tuple|dict]) or something similar
>
> Upon returning the object, you would get proxies, which held a model
> instance in it.

However, one of the benefits of values() returning a dict is that you
avoid the more expensive model instance creation when you don't need it.
So I lean more towards something like #5420 rather than changing what
values() returns.

And, if you would rather be explicit in the positive tone, then maybe a
show() method to complement the hide() would satisfy you.

Gary

David Cramer

May 23, 2008, 2:51:32 PM
to django-d...@googlegroups.com
IMO show() and hide() are extremely ugly. And I think .values() is becoming ugly with the addition of values_tuple or whatever it's called. I don't see a real good reason to clutter the namespace even more than it already is. I'd rather have .values(type=dict) or something similar.

Gary Wilson Jr.

May 23, 2008, 4:18:52 PM
to django-d...@googlegroups.com
David Cramer wrote:
> IMO show() and hide() are extremely ugly. And I think .values() is becoming
> ugly with the addition of values_tuple or whatever it's called. I don't see
> a real good reason to clutter the namespace even more than it already is.
> I'd rather have .values(type=dict) or something similar.

Sorry, for some reason I completely skipped over the type switching in
the values() call. I do agree that it would be better than having
separate methods for each.

I wonder if we would also need to support the people who want to exclude
fields:

Model.objects.values('field1', 'field2', exclude_fields=[...], type=...)
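
To make the call shape above concrete, here is a toy sketch of how a single values() with a "type" switch and an exclude list might behave. This is only one possible reading of the proposal, not Django's API: `ALL_FIELDS`, `ROWS`, and this standalone `values` function are hypothetical, and using `SimpleNamespace` for `type=object` is a placeholder for whatever partial model object would actually be returned.

```python
from types import SimpleNamespace

# Hypothetical in-memory "table" standing in for database rows.
ALL_FIELDS = ("id", "headline", "body")
ROWS = [
    {"id": 1, "headline": "First", "body": "long text"},
    {"id": 2, "headline": "Second", "body": "more text"},
]


def values(*fields, exclude_fields=(), type=dict):
    """Toy stand-in for the proposed values(); the names are assumptions."""
    if type not in (dict, tuple, object):
        raise ValueError("unsupported result type: %r" % (type,))
    # No explicit fields means "all fields", minus anything excluded.
    selected = [f for f in (fields or ALL_FIELDS) if f not in exclude_fields]
    results = []
    for row in ROWS:
        if type is dict:
            results.append({f: row[f] for f in selected})
        elif type is tuple:
            results.append(tuple(row[f] for f in selected))
        else:
            results.append(SimpleNamespace(**{f: row[f] for f in selected}))
    return results


print(values("id", "headline", type=tuple))
print(values(exclude_fields=["body"]))
```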

Has a discussion of something like the "type" keyword argument been
brought up before? The only two threads [1][2] I found about
valuelist() and value_list() don't mention the idea.

Gary

[1]
http://groups.google.com/group/django-developers/browse_frm/thread/22b44f4eafaf956a/
[2]
http://groups.google.com/group/django-developers/browse_frm/thread/4c7ba291577e6e73/

David Cramer

May 23, 2008, 4:29:54 PM
to django-d...@googlegroups.com
Nope, it's just something I was throwing around.

What would exclude do in that example? I feel it should be explicit rather than implicit (although I do see the reason for implicit calls where you don't want to return text/blob fields, but explicit is always better).

Ken Arnold

May 23, 2008, 4:33:23 PM
to Django developers
On May 23, 2:29 pm, "Gary Wilson Jr." <gary.wil...@gmail.com> wrote:
> However, one of the benefits of values() returning a dict is that you
> avoid the more expensive model instance creation when you don't need it.

I wouldn't be so quick to assume that creating model instances (or
ducks that look like them) is so bad. I did some quick-and-dirty
profiling (Python 2.5.2, Pentium M) for 3 scenarios: making a normal
class instance, making a dict, and making a class with __slots__. I
put two data fields on each. The results, after running timeit several
times to get stable results, are neck-and-neck:

normal class instance: 1.34 microseconds per instantiation
dict: 1.04 us
class with __slots__: 1.07 us
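
The benchmark code itself was not posted, so the following is only a guess at what such a measurement could look like, using the standard timeit module. The class names `Plain` and `Slotted` and the helper `benchmark` are made up for this sketch, and the numbers it prints will vary by machine and Python version.

```python
import timeit


class Plain:
    def __init__(self, a, b):
        self.a = a
        self.b = b


class Slotted:
    __slots__ = ("a", "b")

    def __init__(self, a, b):
        self.a = a
        self.b = b


def make_dict(a, b):
    return {"a": a, "b": b}


def benchmark(number=100_000):
    """Return microseconds per construction for each candidate."""
    candidates = {
        "normal class instance": Plain,
        "dict": make_dict,
        "class with __slots__": Slotted,
    }
    results = {}
    for label, factory in candidates.items():
        # Take the best of several runs to reduce noise.
        seconds = min(timeit.repeat(lambda: factory(1, 2),
                                    number=number, repeat=5))
        results[label] = seconds / number * 1e6
    return results


if __name__ == "__main__":
    for label, usec in benchmark().items():
        print("%s: %.3f us" % (label, usec))
```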

Granted, I'm not creating model instances. The model __init__ is
expensive, but unnecessary in this case *if* you avoid dispatching
signals. (Just call ModelClass.__new__(ModelClass).) So theoretically
we could make full model instances in barely more time than making
dicts.
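
For reference, here is a small sketch of the __new__ trick on an ordinary class. `Expensive` and `build_without_init` are hypothetical names, and a real model class may rely on state that its __init__ sets up, so treat this as an illustration of the mechanism only.

```python
class Expensive:
    init_calls = 0

    def __init__(self, **fields):
        # Stands in for a costly model __init__ (signals, defaults, ...).
        Expensive.init_calls += 1
        self.__dict__.update(fields)


def build_without_init(cls, row):
    """Create an instance of cls from a row dict without calling __init__."""
    obj = cls.__new__(cls)     # allocates the instance, skips __init__
    obj.__dict__.update(row)   # fill attributes straight from the row
    return obj


article = build_without_init(Expensive, {"id": 1, "headline": "Hi"})
print(article.headline, Expensive.init_calls)
```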

There's a little complexity in what to do about items you haven't
retrieved. Pseudocode to solve that:

    class LazyLookupDescriptor(object):
        def __init__(self, name):
            self.name = name

        def __get__(self, instance, owner):
            # ... and cache the result on the instance.
            return do_real_db_lookup(instance, self.name)

        # def __set__ ... you get the idea.

    def values_returning_models():
        # Would need to ensure abstract, maybe add __slots__, etc.
        tmpclass = ModelBase(name, (model_class,), attrs)
        tmpclass_new = tmpclass.__new__
        for field in hidden_fields:
            setattr(tmpclass, field, LazyLookupDescriptor(field))
        for row in db_iter:
            obj = tmpclass_new(tmpclass)
            obj.__dict__.update(row)  # or something like that
            yield obj

Alternately, the __get__ could set instance.__class__ to the real
model class and opportunistically fill in all of the rest of the data
(and log an info message that we're doing that). Pulling in selected
attributes from related objects is going to be more painful, though...
dict easily wins there.
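
The __class__-swapping alternative could look roughly like the sketch below. All names here (`Article`, `PartialArticle`, `UpgradeOnAccess`, `load_full_row`, the `real_class` attribute) are hypothetical, and the database fetch is faked with a function returning a dict.

```python
import logging

logger = logging.getLogger("partial_models")


class Article:
    """Stand-in for the real model class."""

    def get_absolute_url(self):
        return "/articles/%d/" % self.id


def load_full_row(pk):
    """Hypothetical stand-in for fetching every column of one row."""
    return {"id": pk, "headline": "Hello", "body": "Full text"}


class UpgradeOnAccess:
    """Non-data descriptor: fires only while the field is absent from
    the instance __dict__."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        logger.info("loading remaining fields for pk=%s", instance.id)
        for key, value in load_full_row(instance.id).items():
            # Keep values that were already present on the instance.
            instance.__dict__.setdefault(key, value)
        # From now on the object behaves as a full instance.
        instance.__class__ = owner.real_class
        return instance.__dict__[self.name]


class PartialArticle(Article):
    real_class = Article
    body = UpgradeOnAccess("body")  # "body" was not selected


partial = PartialArticle.__new__(PartialArticle)
partial.__dict__.update({"id": 7, "headline": "Hi"})
print(type(partial).__name__, partial.body, type(partial).__name__)
```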

But when I tried using the model __new__ and then setting attributes,
it took 2.04 us; I can't figure out why it's that much slower.

As for signals, maybe we do these optimizations only if no signal
handlers are registered.

Hope this is helpful; apologies if this has all been brought up before
and I missed it.
-Ken

David Cramer

May 23, 2008, 5:01:13 PM
to django-d...@googlegroups.com
I would say it's not so much about the speed, but the memory overhead that is caused by creating these objects.
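
One way to check the memory side of this would be a measurement along the following lines, using the standard tracemalloc module. The classes and the `bytes_per_object` helper are invented for this sketch, the figure includes the list slot that holds each object, and results depend on the Python version.

```python
import tracemalloc


class Plain:
    def __init__(self, a, b):
        self.a = a
        self.b = b


class Slotted:
    __slots__ = ("a", "b")

    def __init__(self, a, b):
        self.a = a
        self.b = b


def make_dict(a, b):
    return {"a": a, "b": b}


def bytes_per_object(factory, count=10_000):
    """Approximate bytes allocated per object built by factory."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    objs = [factory(1, 2) for _ in range(count)]  # keep them alive
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    assert len(objs) == count
    return (after - before) / count


if __name__ == "__main__":
    for label, factory in [("normal class instance", Plain),
                           ("dict", make_dict),
                           ("class with __slots__", Slotted)]:
        print("%s: ~%.0f bytes" % (label, bytes_per_object(factory)))
```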