ModelSerializer.flatten()


Steve W

Apr 8, 2016, 2:16:10 PM
to Django REST framework
I've been working on a method to modify an existing ModelSerializer so as to 'flatten' it, i.e. serialize it as a 'flat' table without the overhead of hitting and serializing its related tables.


Use Case: A deeply nested ModelSerializer that hits 20 tables on multiple databases, one of which is on another continent.
When serializing a single instance for a 'retrieve' request, collect and serialize all of the related objects and return one amazing collection of data (even if it takes a couple of seconds).

However, when serializing a queryset of many (potentially thousands of) records for a 'list' request, only hit the first table, serializing ONLY its native fields and any Foreign Key IDs. Quickly serialize and return the entire queryset.
No need to declare a second serializer against the same model. No worries about the N+1 problem (N>20,000 in this example), no need to employ prefetch_related or select_related (and their resultant cost). No need to rely on pagination for acceptable performance.

This would be more-or-less equivalent to declaring a second, bare-bones serializer on the same model that just looks like:

class MyFlatSerializer(MySerializer):
    # Inherit from MySerializer.Meta so that Meta.model is still defined
    class Meta(MySerializer.Meta):
        fields = [f.name for f in MySerializer.Meta.model._meta.fields if not f.is_relation]


After a few hours, one small classmethod with two lines of code does most of what I need.

I have a working prototype here:

 
@classmethod
def flatten(cls):
    """Flatten a ModelSerializer so that only the native table is queried.

    Ideal for 'list' operations to avoid the N+1 problem, prefetch_related or select_related, and pagination.

    """
    # TODO: Maintain foreign key IDs in the fields
    cls.flat = True
    cls.Meta.fields = [f.name for f in cls.Meta.model._meta.fields if not f.is_relation]
    cls._declared_fields = {}

There are some flat=True declarations in a couple of __init__ methods that aren't currently used.
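
One thing to watch with the prototype: it assigns to cls.Meta.fields and cls._declared_fields on the serializer class itself, so every serializer created later in the same process (including the one used for 'retrieve') would be flat too. A rough sketch of a variant that returns a new flat subclass and leaves the original alone follows. FakeField, FakeOptions, FakeModel and the name flattened() are made-up stand-ins so the snippet runs without Django; only Meta.model._meta.fields and is_relation are taken from the code above.

```python
from dataclasses import dataclass


@dataclass
class FakeField:
    """Hypothetical stand-in for a Django model field."""
    name: str
    is_relation: bool = False


class FakeOptions:
    """Hypothetical stand-in for Model._meta; only exposes .fields."""
    def __init__(self, fields):
        self.fields = fields


class FakeModel:
    _meta = FakeOptions([
        FakeField("id"),
        FakeField("title"),
        FakeField("author", is_relation=True),
    ])


class MySerializer:
    """Stand-in for the nested ModelSerializer."""
    _declared_fields = {"author": "nested author serializer"}

    class Meta:
        model = FakeModel
        fields = ["id", "title", "author"]

    @classmethod
    def flattened(cls):
        """Return a flat subclass; cls itself is not modified."""
        flat_meta = type("Meta", (cls.Meta,), {
            "fields": [f.name for f in cls.Meta.model._meta.fields
                       if not f.is_relation],
        })
        flat_cls = type("Flat" + cls.__name__, (cls,), {"Meta": flat_meta, "flat": True})
        # Assigned after creation: as far as I understand, DRF's serializer
        # metaclass recomputes _declared_fields while the class is being built.
        flat_cls._declared_fields = {}
        return flat_cls


FlatSerializer = MySerializer.flattened()
print(FlatSerializer.Meta.fields)  # ['id', 'title']
print(MySerializer.Meta.fields)    # ['id', 'title', 'author']
```

With something like this, 'list' could use MySerializer.flattened() while 'retrieve' keeps using MySerializer unchanged.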

Then, in a viewset, all I have to do is:


from rest_framework import viewsets
from rest_framework.response import Response


class MyViewSet(viewsets.ModelViewSet):
    """Example class showing the use of ModelSerializer.flatten() for list queries"""

    def retrieve(self, request, *args, **kwargs):
        "retrieve and serialize a single model instance, including all related tables"
        instance = self.get_object()
        serializer = MySerializer(instance)
        return Response(serializer.data)

    def list(self, request, flatten=False):
        "quickly serialize an arbitrarily large queryset by first flattening the serializer"
        queryset = MyModel.objects.all()

        MySerializer.flatten()
        serializer = MySerializer(queryset, many=True)
        return Response(serializer.data)


Performance (time to return the serializer.data object for 1000 records)
Before Flattening: 65 seconds
After Flattening: 0.2 seconds

TODOs:
As noted, the code currently throws away Foreign Key IDs present in the flat table, when it should actually automatically make them into PrimaryKeyRelatedField or HyperlinkedRelatedField.
Needs to honor the Meta.fields and Meta.exclude options.
I need to get a better understanding of the _get_declared_fields method, and to learn if the Meta.depth option could be employed (i.e. by setting it to zero?).
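
A possible direction for the first two TODOs, as a sketch only. As far as I understand, a ModelSerializer at depth 0 already maps a forward foreign key named in Meta.fields to a PrimaryKeyRelatedField (HyperlinkedModelSerializer uses HyperlinkedRelatedField), so keeping the relation names from _meta.fields instead of filtering them out may be enough to preserve the IDs. The helper below only computes the list of names; FakeField and flat_field_names are hypothetical and stand in for the real model fields so that the snippet runs on its own.

```python
from dataclasses import dataclass


@dataclass
class FakeField:
    """Hypothetical stand-in for an entry of Model._meta.fields."""
    name: str
    is_relation: bool = False


def flat_field_names(model_fields, fields=None, exclude=None):
    """Pick field names for a flat serializer.

    Keeps every field that lives on the model's own table (native columns
    and forward foreign keys), then applies Meta.fields / Meta.exclude.
    Names that are not on the table, such as reverse relations, drop out.
    """
    names = [f.name for f in model_fields]
    if fields is not None and fields != "__all__":
        names = [n for n in names if n in fields]
    if exclude:
        names = [n for n in names if n not in exclude]
    return names


model_fields = [
    FakeField("id"),
    FakeField("title"),
    FakeField("author", is_relation=True),
    FakeField("publisher", is_relation=True),
]

# "comments" would be a reverse relation, so it is not in model_fields.
print(flat_field_names(model_fields, fields=["id", "title", "author", "comments"]))
print(flat_field_names(model_fields, exclude=["publisher"]))
```

Both calls print ['id', 'title', 'author']. In the prototype, the result would be assigned to Meta.fields in place of the is_relation filter.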

It would be nice to have flatten() be a decorator for view methods:
@flatten
def list():
...
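
One way such a decorator might look, as a sketch under assumptions: it swaps the view's serializer_class for a flat one for the duration of the call and restores it afterwards. serializer_class is the usual attribute on DRF generic views; flattened(), NestedSerializer and MyView are hypothetical stand-ins so the example runs without Django.

```python
import functools


def flatten(view_method):
    """Run a view method with a flat serializer class, then restore it."""
    @functools.wraps(view_method)
    def wrapper(self, *args, **kwargs):
        original = self.serializer_class
        # flattened() is a hypothetical classmethod returning a flat subclass
        self.serializer_class = original.flattened()
        try:
            return view_method(self, *args, **kwargs)
        finally:
            self.serializer_class = original
    return wrapper


class NestedSerializer:
    """Stand-in for the deeply nested serializer."""
    flat = False

    @classmethod
    def flattened(cls):
        return type("Flat" + cls.__name__, (cls,), {"flat": True})


class MyView:
    """Stand-in for a viewset; only serializer_class matters here."""
    serializer_class = NestedSerializer

    def retrieve(self):
        return self.serializer_class.flat

    @flatten
    def list(self):
        return self.serializer_class.flat


view = MyView()
print(view.list())      # True: list ran with the flat serializer
print(view.retrieve())  # False: the original serializer is back
```

Because the swap happens on the view instance and is undone in a finally block, other methods on the same view keep the full serializer.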

Any JavaScript framework could then stand ready to start collecting fully-serialized individual instances on-demand or behind-the-scenes.


Any thoughts or insights?  Thanks.

Steve Walker
Data Researcher
Sharp Laboratories of America

Steve W

Apr 8, 2016, 2:18:12 PM
to Django REST framework
(Hit POST while still editing.)

The example list method should read:
    def list(self, request, flatten=False):
        "quickly serialize an arbitrarily large queryset by first flattening the serializer"
        queryset = MyModel.objects.all()
        if flatten: