Update returning


Tom Carrick

Jan 26, 2021, 8:27:54 AM
to django-d...@googlegroups.com
Hi,

I found myself with a use-case today of wanting to return some data from an update, as I want to make sure I return exactly what is in the database without making an extra query.

I've found https://code.djangoproject.com/ticket/28682 and agree with the resolution there.

I suppose there is a way to do this in a backwards compatible way, something like:

Foo.objects.update(["id", "name"], name="Rob")

But it's very ugly. But how about a new method:

Foo.objects.update_returning(["id", "name"], name="Rob")

Doesn't seem quite so bad. There's also a possibility of something like:

Foo.objects.update_returning(updates={"name": "Rob"}, returning=["id", "name"])

I'd expect it to return a list of dicts. I'm not sure what's best, if anything. It could be it's a bit too niche, but it is sometimes quite useful.

Tom

Adam Johnson

Jan 26, 2021, 9:00:01 AM
to django-d...@googlegroups.com
I think we could do the most logical:

QuerySet.objects.update(name="Rob", returning=["id", "name"])

There is a precedent for small backwards-incompatible changes like this, for example when "named" was added to "values_list()". However, backwards compatibility may be worth keeping. We can stay backwards compatible by special-casing models with a field called 'returning', in which case the field is updated. This blocks the 'returning' functionality for those models, but users can always rename the field.
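
A minimal sketch of the special-casing described above, in plain Python rather than Django internals. The helper `split_update_kwargs` and its `field_names` argument are hypothetical names introduced only for illustration: `returning` is treated as an option unless the model has a field with that name.

```python
def split_update_kwargs(field_names, **kwargs):
    """Separate field updates from a hypothetical ``returning`` option.

    ``field_names`` stands in for the model's field names. If the model
    has a field called "returning", the keyword is treated as a plain
    field update, which keeps existing code working.
    """
    updates = dict(kwargs)
    returning = None
    if "returning" in updates and "returning" not in field_names:
        returning = list(updates.pop("returning"))
    return updates, returning


# Model without a "returning" field: the keyword selects the returned columns.
print(split_update_kwargs({"id", "name"}, name="Rob", returning=["id", "name"]))
# Model with a "returning" field: the keyword is an ordinary field update.
print(split_update_kwargs({"id", "returning"}, returning=True))
```

The trade-off is the one noted above: a model that does have a `returning` field can never reach the new option through this keyword.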

I'd rather not add a new method or otherwise change the signature of update() more radically.

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAHoz%3DMaDz7VLXqRk01D_bUOLN%3D5TBiqVrw-BzyLogeaTMtXM4g%40mail.gmail.com.


--
Adam

Florian Apolloner

Jan 26, 2021, 9:59:12 AM
to Django developers (Contributions to Django itself)
Not that I am completely convinced that the following is a good idea; but what about:

QuerySet.objects.update(name="Rob").values("id", "name")

On second thought I'd like a .returning() more than .values(), but I am not sure if it makes sense to add a new function just for the sake of a small bit of backwards compatibility.

> I'd expect it to return a list of dicts.

Currently .update returns the number of rows affected I think. This information should still be there even if returning extra data.

Adam Johnson

Jan 26, 2021, 11:26:02 AM
to django-d...@googlegroups.com
> Not that I am completely convinced that the following is a good idea; but what about:
> QuerySet.objects.update(name="Rob").values("id", "name")

That's not possible since update() directly performs the update - it's not lazy in any way. It could be done in the other order like `QuerySet.objects.values("id", "name").update(name="Rob")` but I don't see the necessity to define "returning" fields in a chainable manner.
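
To illustrate why chaining after update() cannot work, here is a toy stand-in written in plain Python. `ToyQuerySet` is a hypothetical class, not Django code; it only mimics the two properties mentioned above: the write happens immediately and the return value is a row count.

```python
class ToyQuerySet:
    """Toy stand-in for a queryset; not Django code."""

    def __init__(self, rows):
        self.rows = rows

    def update(self, **updates):
        # The write happens right here, before any chained call could run.
        for row in self.rows:
            row.update(updates)
        # Like Django's update(), return a plain row count.
        return len(self.rows)


rows = [{"id": 1, "name": "Bob"}, {"id": 2, "name": "Al"}]
result = ToyQuerySet(rows).update(name="Rob")
print(type(result).__name__)      # an int, not a queryset
print(hasattr(result, "values"))  # nothing to chain .values() onto
```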



--
Adam

Florian Apolloner

Jan 26, 2021, 12:36:10 PM
to Django developers (Contributions to Django itself)
On Tuesday, January 26, 2021 at 5:26:02 PM UTC+1 Adam Johnson wrote:
> > Not that I am completely convinced that the following is a good idea; but what about:
> > QuerySet.objects.update(name="Rob").values("id", "name")
>
> That's not possible since update() directly performs the update - it's not lazy in any way. It could be done in the other order like `QuerySet.objects.values("id", "name").update(name="Rob")` but I don't see the necessity to define "returning" fields in a chainable manner.

Ha, not sure what I was thinking. In the sentence below that suggestion I noted that update() returns something, but I didn't realize that this would break chaining. My bad.

I looked around further, and `get_or_create` has the nice workaround of being able to use `defaults__exact` if a field clashes with the `defaults` keyword. Sadly we do not have that option here. Truth be told, I do not think that many people have fields called `returning`.

charettes

Jan 26, 2021, 11:54:42 PM
to Django developers (Contributions to Django itself)
If we were to change the update signature from (**updates) to (updates=None, *, returning=None, **kwargs) the `returning` collision could be avoided by doing update({"foo": "bar"}, returning=["id", "foo"]) like Tom is suggesting.

I think that's the best option here if we want to elegantly add support for this feature while maintaining backward compatibility. Something along the lines of

def update(self, updates=None, *, returning=None, **kwargs):
    # Accept the updates either as a dict or as keyword arguments, not both.
    if updates and kwargs:
        raise TypeError('updates must be specified either through the first positional argument or through kwargs')
    if updates is None:
        updates = kwargs
    ...

I guess we could force the usage of positional `updates` if `returning` is specified to prevent any silent breakages as well.
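
A standalone sketch of the argument handling proposed above, including the stricter variant that requires the dict form whenever `returning` is given. `normalize_update_args` is a hypothetical helper name and the error messages are illustrative; a real implementation would live in `QuerySet.update()` and might differ.

```python
def normalize_update_args(updates=None, *, returning=None, **kwargs):
    """Return (updates, returning) for the proposed update() signature."""
    if updates is not None and kwargs:
        raise TypeError(
            "updates must be given either as the first positional argument "
            "or as keyword arguments, not both"
        )
    if returning is not None and updates is None:
        # Stricter variant: an old-style call such as update(returning=True)
        # on a model with a "returning" field fails loudly instead of being
        # silently reinterpreted as the new option.
        raise TypeError("returning requires updates as the first positional argument")
    if updates is None:
        updates = kwargs
    return dict(updates), returning


print(normalize_update_args(name="Rob"))
print(normalize_update_args({"foo": "bar"}, returning=["id", "foo"]))
```

With this check, existing keyword-only calls keep working unchanged, and only calls that mention `returning` have to switch to the dict form.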

The usage of `returning` brings up another set of questions though. Since the rows affected by an UPDATE are not ordered, RETURNING data has little value without the primary key associated with the updated rows. Maybe the return value of `returning=[f1, ..., fn]` should be a dict mapping the primary key to a list of the returned values.

e.g.

Post.objects.create(id=1, score=41)
Post.objects.update({"score": F("score") + 1}, returning=["score"])
-> {1: [42]}
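
A small sketch of that shape in plain Python. It assumes the database hands back rows as tuples with the primary key first, as if the query had been `RETURNING id, score`; `key_by_pk` is a hypothetical helper name.

```python
def key_by_pk(rows):
    """Map each row's primary key to the list of remaining returned values.

    Each row is assumed to be a tuple whose first item is the primary key.
    """
    return {pk: list(values) for pk, *values in rows}


rows = [(2, 17), (1, 42)]  # arrival order is not guaranteed
print(key_by_pk(rows))
```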

Cheers,
Simon

Tom Carrick

Jan 27, 2021, 4:39:05 AM
to django-d...@googlegroups.com
Simon, you give me too much credit, that is a step beyond what I'd thought of :) It looks good to me.

Why not a dict of dicts or perhaps a dict of namedtuples instead? I think a list might be a bit annoying to map back to the requested fields.
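
For comparison, a sketch of the namedtuple variant, again in plain Python and under the same assumption that each returned row is `(pk, *values)`. `rows_by_pk` and the `Row` type name are hypothetical.

```python
from collections import namedtuple


def rows_by_pk(fields, rows):
    """Key namedtuple rows by primary key.

    ``fields`` are the requested field names; each row is assumed to be
    ``(pk, *values)`` with the values in the same order as ``fields``.
    """
    Row = namedtuple("Row", fields)
    return {pk: Row(*values) for pk, *values in rows}


result = rows_by_pk(["score", "title"], [(1, 42, "Hello")])
print(result[1].score)      # access by field name instead of list index
print(result[1]._asdict())  # or convert to a dict when needed
```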

Maybe I will try to put a proof of concept together...

Tom


Florian Apolloner

Jan 27, 2021, 4:45:43 AM
to Django developers (Contributions to Django itself)
Hi Simon,

On Wednesday, January 27, 2021 at 5:54:42 AM UTC+1 charettes wrote:
> I think that's the best option here if we want to elegantly add support for this feature while maintaining backward compatibility. Something along the lines of ...

That is certainly an interesting approach. It kinda breaks the "there should be one way of doing things" rule, but…

> The usage of `returning` brings up another set of questions though. Since the rows affected by an UPDATE are not ordered, RETURNING data has little value without the primary key associated with the updated rows. Maybe the return value of `returning=[f1, ..., fn]` should be a dict mapping the primary key to a list of the returned values.

I am not sure I like that. For cases where you update just one row and want to know the new values, the primary key doesn't make much sense. Granted, for multiple rows it would maybe be easier to have the result automatically keyed by the pk, but always returning something (the pk) without having an option to disable it seems kinda wrong to me. Not sure what the best option would be.

Cheers,
Florian

Tom Carrick

May 12, 2021, 8:18:50 AM
to django-d...@googlegroups.com
Apologies, I had totally forgotten about this. I'm still interested in working on it, but I'm still not sure about a few things.

I've been thinking about the return value a bit. I can foresee cases where you wouldn't want the id returned.  You might want the user to update something by slug, username, or some other identifier without revealing the IDs. Of course the user could reformat the return value however they like, but I don't see a reason to ask for something that isn't necessary.

So I think a list of some kind of object (namedtuple or dict probably) makes the most sense to me. As for also adding the count, I am not sure. The return value would then be e.g. (1, [<data>]). I'm guessing this count would remain the number of matched rows, rather than the updated ones. I am not sure whether RETURNING only gives back rows that were actually modified; the Postgres docs are unclear on this, at least. If the two are always going to be the same, I'm not sure there is much reason for returning the count when len(return_value) will do.

I'm also not really sure on the data structure though. Namedtuples make the most sense to me but a dict might be useful for those wanting to shove this directly into JsonResponse, without needing _asdict(), for example.
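
A short sketch of the serialization difference, using the standard library's `json` module as a stand-in for JsonResponse (which may behave differently in detail). The `Row` type and the sample data are made up for illustration.

```python
import json
from collections import namedtuple

Row = namedtuple("Row", ["slug", "name"])
rows = [Row("rob", "Rob"), Row("al", "Al")]

# A namedtuple serializes like a plain tuple, so the field names are lost.
as_tuples = json.dumps(rows)
# Converting with _asdict() first keeps the field names in the output.
as_dicts = json.dumps([row._asdict() for row in rows])

print(as_tuples)
print(as_dicts)

# The (count, data) shape discussed above; here the count equals len(data).
result = (len(rows), rows)
```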

Cheers,
Tom


Aivars Kalvāns

Sep 25, 2023, 12:44:37 PM
to Django developers (Contributions to Django itself)
Hi!

I want to implement these changes and I have a PR in the ticket https://code.djangoproject.com/ticket/32406
At the moment I have a new `update_returning` method, but I can easily replace it with `(updates=None, *, returning: bool = None, **kwargs)` if you decide to add the functionality to the existing method instead. I did a search on GitHub and found only a single project with `returning` as a model field.
However the returned value in my implementation is a `QuerySet` and I can do `.get()`, `.only()`, `.defer()` and `.values()` or `.values_list()` on that. Mainly because my use case is updating and refreshing the model in a single database operation. The ticket has more examples. What do you think, do you see any issues with this approach?

Plamedi klj

Oct 3, 2023, 5:38:37 PM
to django-d...@googlegroups.com

Tom Carrick

Oct 7, 2023, 11:46:16 AM
to django-d...@googlegroups.com
Hi Aivars,

Since we spoke yesterday I've been thinking about this...

I don't really see the value in returning a QuerySet. There are only a limited number of options that make sense at this point, and even those are tough to justify. Like what would happen if you do `Foo.objects.update(x=1, returning=["x"]).values("y")`?

I do see value in returning the model instances, though I feel there is a potential footgun here where you can update a field, but not return it, so you'd get the old value in the instance. Something like:

`Foo.objects.update(x=1, y=2, returning=["x"])`

Now you have updated y, but the model instance that gets returned still has the old value. Maybe this is fine with a warning in the docs.
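
The footgun can be shown with a toy model in plain Python, where one dict plays the database row and another the in-memory instance. `update_returning` here is a hypothetical helper for this illustration only, not the method from the PR.

```python
def update_returning(db_row, instance, updates, returning):
    """Apply ``updates`` to ``db_row``, then refresh only ``returning`` fields.

    ``db_row`` plays the database row and ``instance`` the in-memory model
    instance; both are plain dicts in this toy model.
    """
    db_row.update(updates)
    for field in returning:
        instance[field] = db_row[field]
    return instance


db_row = {"x": 0, "y": 0}
instance = dict(db_row)  # in-memory copy loaded before the update
update_returning(db_row, instance, {"x": 1, "y": 2}, returning=["x"])
print(db_row)    # the "database" has both new values
print(instance)  # y was not returned, so the instance keeps the old y
```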

I think regardless of what we do here, we should stick with the proposed and roughly agreed upon API of having an extra argument to `update()` rather than creating a new QuerySet method.

Cheers,
Tom


aivars....@gmail.com

Oct 7, 2023, 1:20:49 PM
to django-d...@googlegroups.com
Hi!

I considered making `returning: bool` a flag indicating that we are returning data. That would make it `Foo.objects.update(x=1, y=2, returning=True)` and avoid some footguns. Or even a new function (`update_returning`?), because I have mixed feelings about different return types based on parameters.

Returning fields other than the ones updated is a valid case IMO: we can update some records and return the PKs of the updated records for logging instead of having just the number of rows updated. Something like `.values_list('pk', flat=True)` is useful then.







--
Aivars

Tom Carrick

Oct 8, 2023, 5:45:38 AM
to django-d...@googlegroups.com
I think it's okay to return something else based on a parameter. This is already done for e.g. values_list(flat=True) and values_list(named=True).
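
As a rough illustration of that precedent, here is a toy function in plain Python whose return type depends on its flags. `toy_values_list` is a hypothetical imitation operating on a list of dicts, not Django's implementation, and its error handling is simplified.

```python
from collections import namedtuple


def toy_values_list(rows, *fields, flat=False, named=False):
    """Toy imitation of how values_list() varies its return type."""
    if flat and named:
        raise TypeError("'flat' and 'named' can't be used together.")
    if flat and len(fields) != 1:
        raise TypeError("'flat' needs exactly one field in this toy version.")
    if flat:
        return [row[fields[0]] for row in rows]
    if named:
        Row = namedtuple("Row", fields)
        return [Row(*(row[f] for f in fields)) for row in rows]
    return [tuple(row[f] for f in fields) for row in rows]


data = [{"id": 1, "name": "Rob"}]
print(toy_values_list(data, "id", "name"))              # list of tuples
print(toy_values_list(data, "id", flat=True))           # list of bare values
print(toy_values_list(data, "id", "name", named=True))  # list of namedtuples
```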

While the bool addresses the obvious footgun I think it also loses a lot of flexibility. If you have a field modified by a pre-update trigger or a generated field that uses a field you updated, you wouldn't be able to see these without another query, as I understand it.
