Update returning


Tom Carrick

Jan 26, 2021, 8:27:54 AM
to django-d...@googlegroups.com
Hi,

I found myself with a use-case today of wanting to return some data from an update, as I want to make sure I return exactly what is in the database without making an extra query.

I've found https://code.djangoproject.com/ticket/28682 and agree with the resolution there.

I suppose there is a way to do this in a backwards compatible way, something like:

Foo.objects.update(["id", "name"], name="Rob")

But it's very ugly. But how about a new method:

Foo.objects.update_returning(["id", "name"], name="Rob")

Doesn't seem quite so bad. There's also a possibility of something like:

Foo.objects.update_returning(updates={"name": "Rob"}, returning=["id", "name"])

I'd expect it to return a list of dicts. I'm not sure what's best, if anything. It could be it's a bit too niche, but it is sometimes quite useful.

Tom

Adam Johnson

Jan 26, 2021, 9:00:01 AM
to django-d...@googlegroups.com
I think we could do the most logical:

QuerySet.objects.update(name="Rob", returning=["id", "name"])

There is a precedent for small backwards incompatible changes like this, for example when "named" was added to "values_list()". However maybe backwards compatibility is worth keeping. We can be backwards compatible by special-casing models with a field called 'returning', in which case the field is updated. This blocks the 'returning' functionality but users can always rename a field.
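
A rough, framework-free sketch of that special-casing, to make the rule concrete. The helper name `split_update_kwargs` and the plain set of field names are assumptions for illustration only; nothing like this exists in Django:

```python
def split_update_kwargs(field_names, /, **kwargs):
    """Hypothetical helper: decide what a `returning` keyword means for a model.

    `field_names` stands in for the names of the model's fields.
    """
    updates = dict(kwargs)
    returning = None
    # Only treat `returning` as the new option when no field has that name;
    # otherwise it stays an ordinary field update, as it is today.
    if "returning" in updates and "returning" not in field_names:
        returning = list(updates.pop("returning"))
    return updates, returning


# Model without a `returning` field: the keyword selects the returned columns.
plain = split_update_kwargs({"id", "name"}, name="Rob", returning=["id", "name"])

# Model with a `returning` field: the keyword keeps updating that field.
clash = split_update_kwargs({"id", "returning"}, returning="x")
```

The second call shows the trade-off: on such a model the new functionality is simply unavailable until the field is renamed.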

I'd rather not add a new method or otherwise change the signature of update() more radically.

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAHoz%3DMaDz7VLXqRk01D_bUOLN%3D5TBiqVrw-BzyLogeaTMtXM4g%40mail.gmail.com.


--
Adam

Florian Apolloner

Jan 26, 2021, 9:59:12 AM
to Django developers (Contributions to Django itself)
Not that I am completely convinced that the following is a good idea; but what about:

QuerySet.objects.update(name="Rob").values("id", "name")

On second thought I'd like a .returning() more than .values(), but I am not sure it makes sense to add a new function just for the sake of a small backwards-compatibility concern.

> I'd expect it to return a list of dicts.

Currently .update returns the number of rows affected I think. This information should still be there even if returning extra data.

Adam Johnson

Jan 26, 2021, 11:26:02 AM
to django-d...@googlegroups.com
> Not that I am completely convinced that the following is a good idea; but what about:
> QuerySet.objects.update(name="Rob").values("id", "name")

That's not possible since update() directly performs the update - it's not lazy in any way. It could be done in the other order like `QuerySet.objects.values("id", "name").update(name="Rob")` but I don't see the necessity to define "returning" fields in a chainable manner.



--
Adam

Florian Apolloner

Jan 26, 2021, 12:36:10 PM
to Django developers (Contributions to Django itself)
On Tuesday, January 26, 2021 at 5:26:02 PM UTC+1 Adam Johnson wrote:
>> Not that I am completely convinced that the following is a good idea; but what about:
>> QuerySet.objects.update(name="Rob").values("id", "name")
>
> That's not possible since update() directly performs the update - it's not lazy in any way. It could be done in the other order like `QuerySet.objects.values("id", "name").update(name="Rob")` but I don't see the necessity to define "returning" fields in a chainable manner.

Ha, not sure what I was thinking. In the very next sentence I noted that update() returns something, but it didn't occur to me that this would break chaining. My bad.

I looked around further and `get_or_create` has the nice workaround of being able to use `defaults__exact` if a field clashes with the `defaults` keyword. Sadly we do not have that option here. Truth be told, I do not think that many people have fields called `returning`.

charettes

Jan 26, 2021, 11:54:42 PM
to Django developers (Contributions to Django itself)
If we were to change the update signature from (**updates) to (updates=None, *, returning=None, **kwargs) the `returning` collision could be avoided by doing update({"foo": "bar"}, returning=["id", "foo"]) like Tom is suggesting.

I think that's the best option here if we want to elegantly add support for this feature while maintaining backward compatibility. Something along the lines of

def update(self, updates=None, *, returning=None, **kwargs):
    # Accept the updates either as a positional dict or as keyword arguments, never both.
    if updates and kwargs:
        raise TypeError('updates must be specified either through the first positional argument or through kwargs')
    if updates is None:
        updates = kwargs
    ...

I guess we could force the usage of positional `updates` if `returning` is specified to prevent any silent breakages as well.
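
A minimal standalone sketch of that stricter rule, written as a plain function so it runs without Django. The name `resolve_update_args` and the error messages are hypothetical, and this is only one way the check could be arranged:

```python
def resolve_update_args(updates=None, *, returning=None, **kwargs):
    """Hypothetical helper: normalise the arguments of the proposed signature.

    Returns a (updates, returning) pair.
    """
    if updates is not None and kwargs:
        raise TypeError(
            "pass updates either as the first positional argument or as kwargs, not both"
        )
    if returning is not None and updates is None:
        # Stricter rule: `returning` is only honoured next to a positional dict.
        # A legacy call such as update(returning=5) on a model that has a
        # `returning` field then fails loudly instead of changing meaning silently.
        raise TypeError(
            "returning requires updates to be passed as the first positional argument"
        )
    if updates is None:
        updates = kwargs
    return updates, returning


new_style = resolve_update_args({"name": "Rob"}, returning=["id", "name"])
old_style = resolve_update_args(name="Rob")
```

Existing keyword-only calls keep working unchanged, while any call that mixes keyword updates with `returning` raises `TypeError`.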

The usage of `returning` brings another set of questions though. Since UPDATE statements are not ordered, RETURNING data has little value without the primary key associated with the updated rows. Maybe the return value of `returning=[f1, ..., fn]` should be a dict mapping the primary key to a list of returning values.

e.g.

Post.objects.create(id=1, score=41)
Post.objects.update({"score": F("score") + 1}, returning=["score"])
-> {1: [42]}
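
As a rough illustration of that shape, assuming the database hands back rows as tuples with the primary key in the first column (the helper `key_by_pk` and the sample rows are made up for this sketch):

```python
def key_by_pk(rows):
    """Map each row's primary key (first column) to a list of the remaining values."""
    return {pk: list(values) for pk, *values in rows}


# Simulated RETURNING rows in the form (id, score); not real database output.
rows = [(1, 42), (2, 7)]
result = key_by_pk(rows)
```

Because the result is keyed by primary key, the order in which the database returns the rows no longer matters to the caller.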

Cheers,
Simon

Tom Carrick

Jan 27, 2021, 4:39:05 AM
to django-d...@googlegroups.com
Simon, you give me too much credit, that is a step beyond what I'd thought of :) It looks good to me.

Why not a dict of dicts or perhaps a dict of namedtuples instead? I think a list might be a bit annoying to map back to the requested fields.

Maybe I will try to put a proof of concept together...

Tom


Florian Apolloner

Jan 27, 2021, 4:45:43 AM
to Django developers (Contributions to Django itself)
Hi Simon,

On Wednesday, January 27, 2021 at 5:54:42 AM UTC+1 charettes wrote:
> I think that's the best option here if we want to elegantly add support for this feature while maintaining backward compatibility. Something along the lines of ...

That is certainly an interesting approach. It kinda breaks the "there should be one way of doing things" rule, but…

> The usage of `returning` brings another set of questions though. Since UPDATE statements are not ordered, RETURNING data has little value without the primary key associated with the updated rows. Maybe the return value of `returning=[f1, ..., fn]` should be a dict mapping the primary key to a list of returning values.

I am not sure I like that. For things where you update just one row and want to know the new values, the primary key doesn't make much sense. Granted, for multiple rows it would maybe be easier to have it automatically keyed by the pk, but always returning something (the pk) without having an option to disable it seems kinda wrong to me. Not sure what the best option would be.

Cheers,
Florian

Tom Carrick

May 12, 2021, 8:18:50 AM
to django-d...@googlegroups.com
Apologies, I had totally forgotten about this. I'm still interested in working on it, but I'm still not sure about a few things.

I've been thinking about the return value a bit. I can foresee cases where you wouldn't want the id returned.  You might want the user to update something by slug, username, or some other identifier without revealing the IDs. Of course the user could reformat the return value however they like, but I don't see a reason to ask for something that isn't necessary.

So I think a list of some kind of object (namedtuple or dict probably) makes the most sense to me. As for also adding the count, I am not sure. The return value would then be e.g. (1, [<data>]). I'm guessing this count would remain the number of matched rows, rather than the updated ones. I am not sure whether RETURNING only gives back rows that were actually modified; the Postgres docs are at least unclear on this. If the two are always going to be the same, I'm not sure there is much reason for returning the count when len(return_value) will do.

I'm also not really sure on the data structure though. Namedtuples make the most sense to me but a dict might be useful for those wanting to shove this directly into JsonResponse, without needing _asdict(), for example.
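
To make the trade-off concrete, here is a small sketch using the standard library's `json.dumps` as a stand-in for JsonResponse. The `Row` type and its sample values are invented for illustration:

```python
import json
from collections import namedtuple

# Hypothetical row type for an update returning "slug" and "score".
Row = namedtuple("Row", ["slug", "score"])
rows = [Row("hello-world", 42)]

# Attribute access is what makes namedtuples pleasant in Python code.
first_score = rows[0].score

# Serialized directly, a namedtuple becomes a JSON array and loses its field names.
as_arrays = json.dumps(rows)

# Converting with _asdict() first keeps the field names in the JSON output.
as_objects = json.dumps([row._asdict() for row in rows])
```

So namedtuples are convenient on the Python side, but anyone serializing the result needs the extra `_asdict()` step that plain dicts would avoid.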

Cheers,
Tom
