Add a defer=True option for model fields


Dan Davis

Feb 19, 2019, 12:43:24 PM
to Django developers (Contributions to Django itself)
I have a developer who stores the binary copy of a file in his table. In ColdFusion, this was acceptable, because he was writing every query by hand and could simply exclude that field. With the Django ORM it is a bit of a problem, because the ORM selects every field on the model by default. The primary table he uses is just for the file, and has file_name, file_type, and file_size columns plus a BinaryField.

The problem is that he has a database-level view that incorporates this field, and it may be that he needs to keep this because other schemas in our big-office Oracle use the view as an exported synonym.

What I advised him to do was to take the BinaryField out of the database-level view, to protect the ORM from reading these large files into memory, as in:

    [obj for obj in LicensesDBView.objects.all()]

Or, if he cannot do that, to simply defer the field:

    [obj for obj in LicensesDBView.objects.defer('scanned_license').all()]

I was not sure whether to tell him to implement a ModelManager with a get_queryset() method that defers the field, but it made me wonder whether we should have a concept of an "initially deferred" field.
That is, a field that starts out deferred and can be pulled back into the SELECT with values(), with only(), or with a defer() call that cancels prior deferrals. "Initially deferred" fields would probably need a new queryset method, such as "nodefer", which would work somewhat like only() but without restricting loading to just the named fields. Alternatively, defer() could accept a syntax like defer('-scanned_license') to cancel the earlier deferral of that field.

I'm afraid I probably don't understand all the implications of this feature, so I thought I'd bring it up on the list before filing any sort of issue. It's likely this has been discussed before; I cannot always do a historical search, especially when ancient history may not reflect today's read on the issue.

Dan Davis

Feb 19, 2019, 1:02:50 PM
to Django developers (Contributions to Django itself)

What I mean by the below:
> I was not sure whether to tell him to implement a ModelManager with a get_queryset() method that defers the field,

Of course this works, but I'm not going to maintain this code, and that sort of sophistication creates a need for more sophisticated maintenance.


--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/ee5c04e5-69d6-42f9-95ff-c01d553b24c1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Josh Smeaton

Feb 20, 2019, 6:23:17 AM
to Django developers (Contributions to Django itself)
There is a ticket for this one already, filed 4 years ago by me :)


There are a few options described, but I think `defer=True` was winning out. I don't think we considered an `undefer`, but a `defer(None)` would fix that. Once that API is built, then we can consider getting LOB fields to defer themselves in specific situations.

Adam Johnson

Feb 20, 2019, 7:33:03 AM
to django-d...@googlegroups.com
Dan, as to solving your problem with the Django of today: if you don't mention a field in the Django model definition, Django won't select it. So you can map the same table to two Django models and use "vertical partitioning": when you query one model, the other's columns aren't selected unless you add them with select_related():

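from django.db import models

class LicensesDBView(models.Model):
    file_name = models.TextField(primary_key=True)
    file_type = ...
    file_size = ...

    class Meta:
        managed = False
        db_table = 'licenses_db_view'

class LicensesDBViewBlob(models.Model):
    # Same table, same key column; only this model maps the large column.
    file_name = models.OneToOneField(
        LicensesDBView,
        primary_key=True,
        on_delete=models.DO_NOTHING,  # on_delete is required since Django 2.0
        db_column='file_name',  # otherwise the column would be file_name_id
    )
    file_data = models.BinaryField()

    class Meta:
        managed = False
        db_table = 'licenses_db_view'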
HTH,

Adam





--
Adam