Reducing queries for FK/M2M lookups? Equivalent to select_related, or prefetch_related?

717 views
Skip to first unread message

Victor Hooi

unread,
Nov 11, 2012, 8:15:40 PM11/11/12
to django-...@googlegroups.com
Hi,

I'm wondering what's the best way to reduce the number of queries for looking up FK's?

As suggested by the docs, I'm not using Full=True.

However, that me, I currently need to make a separate query for each FK field in my model.

For example, say I have a resource, "Application", which links to several other tables - "Collocation", "Host", "Binary", etc. (there's quite a few others which are domain-specific).

If I look up a list of applications - say 20 of them - I then need to look up the separate collocations, hosts, binaries, and other tables associated with them - resulting in several hundred queries.

I'm using Request sessions (http://docs.python-requests.org/en/latest/user/advanced/#session-objects), but that's still a lot.

I noticed that I can use Sets (http://django-tastypie.readthedocs.org/en/latest/interacting.html#selecting-a-subset-of-resources) but that's still only searching across a single table.

Are there any other tricks that I can use? Is there some equivalent of select_related or prefetch_related that makes sense for Tasty-Pie?

Cheers,
Victor

Nick Doyle

unread,
Nov 16, 2012, 12:40:16 AM11/16/12
to django-...@googlegroups.com
Hi Victor
I learnt how to do this recently (also from this group)
You just set queryset = select_related
You can then use full=True with no additional DB hits too - very nice!

An example:

class PurchaseOptionResource(ModelResource):
    product_type = fields.ForeignKey('core.api.ProductTypeResource', 'product_type', full=True)
    class Meta:
        queryset = PurchaseOption.objects.select_related('product_type').all()

Results in one query with a join as you would expect from Django.
You might already guess - you can also use prefetch_related in this way for non-FK relationships.
Nice eh!

Victor Hooi

unread,
Nov 16, 2012, 2:28:53 AM11/16/12
to django-...@googlegroups.com
Hi,

Nick - Thanks for the pointer. That's a neat trick.

However, I wonder if there's any way to do this without using full=True? Or are you required to use that if you want to use the select_related/prefetch_related behaviour?

I thought part of the point of Tasty-Pie is to make everything accessible via Resource URIs - I'm wondering if there's a way to pre-pull all of the related objects down so they're ready without needing to hit the database, but still have them accessible at the URIs.

Cheers,
Victor

Gordon Wrigley

unread,
Nov 16, 2012, 4:39:40 AM11/16/12
to django-...@googlegroups.com
I am also currently optimizing DB queries. On top of the select_related/prefetch_related for full=True stuff, I have this for ToOne full=False, something similar could probably be done for ToMany

FakeModel = collections.namedtuple("FakeModel", "pk")
class YPToOneField(fields.ToOneField):
    def dehydrate(self, bundle):
        """ Patched to not load the related model when we only want a URI
        note: this breaks support for callables and __ for non inlined resources.
        """
        if not self.full:
            pk = getattr(bundle.obj, self.attribute + "_id", None)
            if pk is not None:
                foreign_obj = FakeModel(pk)
            else:
                foreign_obj = None

            if not foreign_obj:
                if not self.null:
                    raise ApiFieldError("The model '%r' has an empty attribute '%s' and doesn't allow a null value." % (previous_obj, attr))

                return None

            self.fk_resource = self.get_related_resource(foreign_obj)
            fk_bundle = Bundle(obj=foreign_obj, request=bundle.request)
            return self.dehydrate_related(fk_bundle, self.fk_resource)
        else:
            return super(YPToOneField, self).dehydrate(bundle)

I've also nuked the paginator entirely:

class YPDummyPaginator(object):
    def __init__(self, request_data, objects, resource_uri=None, limit=None, offset=0, max_limit=1000, collection_name='objects'):
        self.objects = objects
        self.collection_name = collection_name

    def page(self):
        return {
            self.collection_name: self.objects,
        }

For tracking things down I'm inserting this into the log config:

        'django.db.backends': {
            'handlers': ['console'],
            'level': 'DEBUG'
        },

And this into models.py:

if 0:
    def log_stack(orig):
        def patch(*args, **argv):
            print
            import traceback
            import sys
            traceback.print_stack(file=sys.stdout)
            print
            return orig(*args, **argv)
        return patch

    from django.db.backends.util import CursorDebugWrapper
    CursorDebugWrapper.execute = log_stack(CursorDebugWrapper.execute)
    CursorDebugWrapper.executemany = log_stack(CursorDebugWrapper.executemany)

And that's where I'm at today.

Gordon Wrigley

unread,
Nov 16, 2012, 5:02:42 AM11/16/12
to django-...@googlegroups.com
Also, there is a downside to the prefetch/related thing, tastypie uses the same queryset when it needs the object for a modification, either on the object or an object that has it as a ToOne field, so all that prefetch and related stuff happens then as well, I haven't found a way around that yet.
Reply all
Reply to author
Forward
0 new messages