[Django] #33647: bulk_update silently truncating values for size limited fields


Django

Apr 15, 2022, 9:11:04 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: nobody
Type: | Status: new
Uncategorized |
Component: Database | Version: 4.0
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
On the postgres backend, `bulk_update` passes overlong values for size-limited
fields along without raising any exception, silently truncating the value
instead.

Repro:
{{{
#!div style="font-size: 80%"
Code highlighting:
{{{#!python
# some model to repro
from django.db import models

class TestModel(models.Model):
    name = models.CharField(max_length=32)

# in the shell
>>> from bulk_test.models import TestModel
>>> tm=TestModel(name='hello')
>>> tm.save()
>>> tm.name
'hello'
>>> tm.name='m'*100
>>> tm.save() # good, raises:
...
django.db.utils.DataError: value too long for type character varying(32)

>>> TestModel.objects.all().values('name')
<QuerySet [{'name': 'hello'}]>
>>> TestModel.objects.all().update(name='z'*100) # good, raises as well:
...
django.db.utils.DataError: value too long for type character varying(32)

>>> TestModel.objects.all().values('name')
<QuerySet [{'name': 'hello'}]>
>>> TestModel.objects.bulk_update([tm], ['name']) # not raising, instead truncating:
1
>>> TestModel.objects.all().values('name')
<QuerySet [{'name': 'mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm'}]>
}}}
}}}

I'm not sure whether this is intended/expected behavior, but it is inconsistent
with `.save` and `.update`, which both raise here. I only tested the postgres
backend for this; it may apply to other size-limiting databases as well
(sqlite itself is not affected, as it does not limit values).

If this is intended, it may be a good idea to at least document the
slightly different behavior, so users are aware of it and can prepare
their code to avoid silent truncation with follow-up errors. A better way
would probably be to fix `bulk_update` to spot value overflows and raise, but I am
not sure if that's feasible.
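
Until the behavior changes, a pre-check before calling `bulk_update` is one way to avoid the silent truncation. The following is only a minimal sketch: `Row` and `find_overlong` are hypothetical names made up for illustration, and the limits are passed in by hand. In a real project the limits could presumably be read from `Model._meta.get_field(name).max_length`.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Stand-in for a model instance (hypothetical, for illustration only)."""
    pk: int
    name: str


def find_overlong(objs, limits):
    """Return (pk, field, length, limit) for every value exceeding its limit.

    `limits` maps a field name to its maximum length.
    """
    problems = []
    for obj in objs:
        for field, limit in limits.items():
            value = getattr(obj, field)
            if value is not None and len(value) > limit:
                problems.append((obj.pk, field, len(value), limit))
    return problems


rows = [Row(1, "hello"), Row(2, "m" * 100)]
overlong = find_overlong(rows, {"name": 32})
print(overlong)
```

An empty result would mean the batch is safe to hand to `bulk_update`; anything else can be raised or logged by the caller.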

--
Ticket URL: <https://code.djangoproject.com/ticket/33647>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Apr 15, 2022, 1:25:48 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* type: Uncategorized => Bug
* stage: Unreviewed => Accepted


Comment:

I managed to reproduce. This is due to
`requires_casted_case_in_updates=True` on the Postgres backend, which does a
silent `::varchar(2)` cast on the `CASE` statement.
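
For readers wondering why the cast path behaves differently from a plain assignment: PostgreSQL raises when an overlong value is stored into a `varchar(n)` column, but an explicit cast to `varchar(n)` truncates without an error. The pure-Python model below only imitates those two rules; `assign_varchar`, `cast_varchar` and the local `DataError` class are made-up stand-ins, not Django or PostgreSQL APIs.

```python
class DataError(Exception):
    """Stand-in for django.db.utils.DataError in this pure-Python model."""


def assign_varchar(value: str, n: int) -> str:
    """Model storing a value into a varchar(n) column.

    Overflow is an error unless the excess consists only of spaces,
    in which case the excess is trimmed.
    """
    if len(value) > n and value[n:].strip(" "):
        raise DataError(f"value too long for type character varying({n})")
    return value[:n]


def cast_varchar(value: str, n: int) -> str:
    """Model an explicit value::varchar(n) cast: overflow is cut off silently."""
    return value[:n]


long_name = "m" * 100
truncated = cast_varchar(long_name, 32)  # the CASE ... ::varchar(n) path
stored = assign_varchar(truncated, 32)   # no error, the value already fits
print(len(stored))
```

Because the cast runs before the assignment, the column check never sees the overlong value, which matches the repro in the ticket description.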

{{{#!python
diff --git a/tests/queries/test_bulk_update.py b/tests/queries/test_bulk_update.py
index bc252c21c6..f7244aab72 100644
--- a/tests/queries/test_bulk_update.py
+++ b/tests/queries/test_bulk_update.py
@@ -3,7 +3,7 @@
 from django.core.exceptions import FieldDoesNotExist
 from django.db.models import F
 from django.db.models.functions import Lower
-from django.db.utils import IntegrityError
+from django.db.utils import DataError, IntegrityError
 from django.test import TestCase, override_settings, skipUnlessDBFeature

 from .models import (
@@ -259,6 +259,14 @@ def test_ipaddressfield(self):
                     CustomDbColumn.objects.filter(ip_address=ip), models
                 )

+    def test_charfield_constraint(self):
+        article = Article.objects.create(
+            name="a" * 20, created=datetime.datetime.today()
+        )
+        article.name = "b" * 50
+        with self.assertRaises(DataError):
+            Article.objects.bulk_update([article], ["name"])
+
     def test_datetime_field(self):
         articles = [
             Article.objects.create(name=str(i), created=datetime.datetime.today())
}}}

We'll need to find an elegant way to cast to `varchar` instead of
`varchar(N)`.

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:1>

Django

Apr 16, 2022, 5:50:26 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by jerch):

Replying to [comment:1 Simon Charette]:
> ... this is due to `requires_casted_case_in_updates=True` on the
> Postgres backend does a silent `::varchar(2)` cast on the `CASE`
> statement.

Then only postgres is affected here? (From the code it seems that other
backends don't set this flag...)

For postgres the next question would be whether other data types with
constraints are affected as well (basically any type that allows narrowing
by `type(???)` notation), or whether this is a varchar-only edge case. From
https://www.postgresql.org/docs/current/datatype.html possible candidates
for `type != type(???)` behavior are:
- bit
- bit varying
- character
- character varying
- interval
- numeric
- time
- timestamp

Imho django uses most of these for some field type (besides bit/bit
varying?).

In general the broader "super" type with no constraints can be derived in
postgres like this:


{{{
#!div style="font-size: 80%"
Code highlighting:

{{{#!sql
postgres=# select 'varchar(5)'::regtype;
regtype
-------------------
character varying
(1 row)
}}}
}}}

Maybe it is enough to apply the super type to the cast in that line
https://github.com/django/django/blob/a1e4e86f923dc8387b0a9c3025bdd5d096a6ebb8/django/db/models/query.py#L765?
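
On the Python side, a rough string-level analogue of the `::regtype` lookup would be to drop the parenthesized modifier from the column type before building the cast. This is just a sketch under the assumption that the type arrives as a string such as `varchar(32)`; `strip_type_params` is a hypothetical helper, not an existing Django function.

```python
import re

# Matches a parenthesized type modifier such as "(32)" or "(10, 2)",
# together with any whitespace directly in front of it.
_TYPE_PARAMS = re.compile(r"\s*\([^()]*\)")


def strip_type_params(db_type: str) -> str:
    """Return the unconstrained 'super' type for a parametrized SQL type name."""
    return _TYPE_PARAMS.sub("", db_type)


for name in ("varchar(32)", "numeric(10, 2)", "timestamp(6) with time zone", "text"):
    print(name, "->", strip_type_params(name))
```

Whether widening is actually safe for every listed type (e.g. `numeric` or `interval`) is a separate question that this sketch does not answer.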

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:2>

Django

Apr 18, 2022, 11:58:11 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
Changes (by Simon Charette):

* cc: Simon Charette (added)


Comment:

> Then only postgres is affected here?

Yep. If you set this feature flag to `False` on
[https://github.com/django/django/blob/6723a26e59b0b5429a0c5873941e01a2e1bdbb81/django/db/backends/postgresql/features.py#L52
the Postgres backend] and run the `queries.test_bulk_update` tests, you'll be
able to see why it was added in the first place.

> In general the broader "super" type with no constraints can be derived
> in postgres like this:

> ...

> Maybe it is enough to apply the super type to the cast in that line

That could be one approach, yes, though we'd likely need to adapt `Cast` to
allow for such usage. Not sure what the argument should be named,
maybe `generic`, defaulting to `False`? Not sure what
`generic=True` would mean in the case of `Cast(expr,
ArrayField(CharField(max_length=20)), generic=True)`: would it be
`::varchar[]` or `::array`?

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:3>

Django

Apr 19, 2022, 12:12:04 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by Simon Charette):

An alternative approach might be to commit to dropping the whole
`CASE`/`WHEN` wrapping altogether on backends that support it, as described
in #29771.

We know the underlying expression construction approach performs poorly
(#31202), and the following form doesn't suffer from the type inference issue
we're experiencing here:

{{{#!sql
UPDATE test_model SET name = v.name
FROM (VALUES
    (1, 'aaaaaaaaaa'),
    (2, 'bbbbbbbbbb')
) AS v(id, name)
WHERE test_model.id = v.id
}}}
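
To make the shape of that statement concrete, here is a minimal sketch that assembles it with placeholders for a single field. `quote_ident` and `build_update_from_values` are hypothetical helpers, not ORM code; the `%s` placeholder style is an assumption, and non-text columns may need explicit casts inside the `VALUES` list, which this sketch does not handle.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_update_from_values(table, pk, field, rows):
    """Build an UPDATE ... FROM (VALUES ...) statement for one field.

    `rows` is a sequence of (pk_value, field_value) pairs. Returns the SQL
    string with %s placeholders and the flat parameter list.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    values_sql = ", ".join(["(%s, %s)"] * len(rows))
    params = [item for row in rows for item in row]
    t, p, f = quote_ident(table), quote_ident(pk), quote_ident(field)
    sql = (
        f"UPDATE {t} SET {f} = v.{f} "
        f"FROM (VALUES {values_sql}) AS v({p}, {f}) "
        f"WHERE {t}.{p} = v.{p}"
    )
    return sql, params


sql, params = build_update_from_values(
    "test_model", "id", "name", [(1, "aaaaaaaaaa"), (2, "bbbbbbbbbb")]
)
print(sql)
print(params)
```

Since the new value reaches the column through a plain assignment rather than a `::varchar(N)` cast, an overlong string should raise instead of being truncated.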

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:4>

Django

Apr 19, 2022, 4:54:44 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by jerch):

Replying to [comment:4 Simon Charette]:


> An alternative approach might to commit to dropping the whole
> `CASE`/`WHEN` wrapping altogether on backends that support it as described
> in #29771.

Hmm, yes, perfwise I totally agree - the UPDATE FROM VALUES variants are
much better for pumping tons of individual values. I already tried to
address that pattern in https://github.com/netzkolchose/django-fast-update,
with string formatting atm. The runtime numbers there speak for
themselves. But for a serious integration in the ORM there are some
obstacles to overcome:
- f-expressions won't work anymore (at least not without big workarounds)
- profound ORM integration needs serious rework on the update sql compiler
(or even a separate one just for the update + values pattern)
- depends on recent db engines (+distinction of mysql8 vs. mariadb)

Regarding f-expressions - idk if that's a biggie: `bulk_update` always
occurred to me as an interface to pump individual values rather than doing
column trickery with it, so I would not mind if that interface does not
support f-expressions anymore. But that's just me; if people insist on
using f-expressions here as well, this would need a workaround of unknown
complexity.

Imho the second point can be done, it just needs someone with time and
enough dedication (well I can try that, but would need serious help, as I
lack deeper knowledge of ORM internals).

The last point is more tricky - how to deal with older or incompatible
database engines here? Create a fallback (current implementation)? Or just
confront users with "nope, not supported here, get a newer/compatible db
engine"? It also raises the question of where to park the actual
implementation - while the ORM itself could blueprint the UPDATE FROM
VALUES pattern in ORM style, the backends would have to translate it into
their very own style, or even substyles for mysql (mysql8 != mariadb
here).

I guess that such a ground-shaking change to django would need some sort
of consensus first?

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:5>

Django

Apr 25, 2022, 2:58:07 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by jerch):

Replying to [comment:4 Simon Charette]:
> An alternative approach might to commit to dropping the whole
> `CASE`/`WHEN` wrapping altogether on backends that support it as described
> in #29771.
>
> We know the underlying expression construction approach performs poorly
> #31202 and the following form doesn't suffer for the type inference issue
> we're experiencing here
>
> {{{#!sql
> UPDATE test_model SET name = v.name
> FROM (VALUES
> (1, 'aaaaaaaaaa'),
> (2, 'bbbbbbbbbb')
> ) AS v(id, name)
> WHERE test_model.id = v.id
> }}}

Can you give me a pointer on how to get a discussion about that rolling, or
whom to contact? I already wrote about my implementation of that idea 3
months ago, no response from anyone. Then I put everything into a neat
package with extensive tests, no response (besides some 3rd person giving
really helpful feedback). I even wrote a mailing list request about it
yesterday to discuss some details - no response either (though it's still
pretty fresh). I really don't know what else to do. Could it be that no one is
actually interested in a revamped bulk_update implementation in django? Or
is django development known to have a very slow pace / being in
maintenance mode mostly? I don't want to blame anyone - could all be me
pushing the wrong buttons, but I've never faced it to that degree in any
other OSS project.

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:6>

Django

Apr 26, 2022, 2:37:44 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by Carlton Gibson):

[https://groups.google.com/g/django-developers/c/jIGj1KKuaDM/m/4EiKPV5XAgAJ Link to the mailing list thread]

Hi Jörg, thanks for the input here. Sorry you're feeling frustrated.

> Could it be that no one is actually interested in a revamped bulk_update
> implementation in django? Or is django development known to have a very
> slow pace / being in maintenance mode mostly?

So there's three points there:

* I suspect it's not **lots** of people who are directly vested, but there
are a number of regular contributors to the ORM (Simon included) and I'd
imagine this is a topic of interest, but, as you've already pointed out in
your mailing list post, there are several tradeoffs to consider, and it'll
need some thought. Folks have limited bandwidth: that doesn't entail no
interest. I hope that's clear.
* Django **does** have a slow pace. That's OK. After 16+ years, that's
proven to be one of its strengths. It's a big project, with a lot of
surface area, and (again) folks have limited bandwidth. It's one reason
why third-party packages, such as the one you've done, are a good way to
go, as they allow a faster pace, and a sandbox to work on issues.
* Despite the slow pace, Django is in anything but maintenance mode: you
need only look at the release notes over the last few major releases to
see that new features are constantly being worked on and delivered. If you
zoom out from any particular issue, I contend, the development pace is
actually quite rapid for a project of Django's size and maturity (despite
being "slow" on the surface.)

We're currently heads-down working towards the feature freeze for Django
4.1, so there is no chance (really) of this getting addressed for that. That
leaves a realistic opportunity to discuss it for Django 4.2, and if you're
keen, and the technical questions can be resolved, there's no reason it
couldn't get in for that. If we miss that, then the next one... Again
zooming out, it soon fades that it took x-cycles to get any particular
feature work completed.

Looking at the timestamps on the discussion here, not much time has passed
between comments. I'd suggest a little patience, and working on the
third-party implementation to resolve any outstanding issues in that time. If
it's **ready™** following up on the mailing list thread may be appropriate
to let folks know they can give it a try.

I hope that all makes sense, and helps anchor expectations.
[https://www.commitstrip.com/en/2014/05/07/the-truth-behind-open-source-apps/ There's a nice comic here which I always try to keep in mind].

Kind Regards,

Carlton

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:7>

Django

Apr 26, 2022, 5:03:11 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by jerch):

Hey Carlton,

thanks for the heads-up. I didn't mean to sound like a drama queen, sorry
if it came across that way. Also I am not eager to push my ideas through at any
price, since I could be totally on the wrong track, simply for reasons
I've overlooked. So it is more about getting feedback at all, whether
things go in the right direction or not.

What I've learned from >20ys OSS contributions - giving ppl early feedback
helps to keep them engaged, and lowers the risk of time consuming dead end
implementations (time on both ends, the implementer and the
reviewers/maintainers), esp. when the bigger picture needs to be addressed
(API changes, bigger codebase changes involved at several places). I know
that django is a very big codebase with lots of legacy, which imho makes
it even harder for someone from outside to get involved. This for sure is
a balancing act to maintain. Ofc slow pace is not a bad thing - in fact I
like django for not buying every shiny new idea in town, as it gives very
solid development experience (using django myself since version 0.8).
At this point I wonder if the separated issue tracker in trac vs. repo in
github might be part of a communication/transfer issue? (At least for me
as maintainer of several projects, github/gitlab made conceptual
discussions and overall communications a lot easier than back in
SVN/mailing list times...)

I hope I did not derail this issue too much. :)

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:8>

Django

Apr 26, 2022, 9:44:45 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by Carlton Gibson):

Hi Jörg.

> ...giving ppl early feedback helps to keep them engaged, and lowers the
> risk of time consuming dead end implementations (time on both ends, the
> implementer and the reviewers/maintainers), esp. when the bigger picture
> needs to be addressed (API changes, bigger codebase changes involved at
> several places)

Sure. I'm not sure the 11 days since you opened the ticket is that much
for folks to come to a view. If your workaround is performing well for
you, that's good input. Otherwise you may need a little patience, though I
see some input on the mailing list... 🙂

> ...if the separated issue tracker in trac...

That's out of scope for this one I'm afraid 😅. (There are mailing list
discussions about it, but it's not trivial, not least because of the
history in Trac... As I say, out of scope for here... 😬)

Thanks.

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:9>

Django

Apr 26, 2022, 1:12:00 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by jerch):

Replying to [comment:9 Carlton Gibson]:

> ... I'm not sure the 11 days since you opened the ticket is that much
> for folks to come to a view. If your workaround is performing well for
> you, that's good input. Otherwise you may need a little patience, though I
> see some input on the mailing list... 🙂

Well this ticket here is only loosely related and much younger (found the
issue while doing some bulk_update tests). Ah whatever, will just wait on
the mailing list input...

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:10>

Django

Jul 14, 2023, 8:48:30 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: danhamilt
Type: Bug | Status: assigned

Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by danhamilt):

* owner: nobody => danhamilt
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:11>

Django

Jul 14, 2023, 7:24:08 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: (none)
Type: Bug | Status: new

Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Dan Hamilton):

* owner: Dan Hamilton => (none)
* status: assigned => new


--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:12>

Django

Aug 12, 2023, 2:16:18 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: Priyank
| Panchal
Type: Bug | Status: assigned

Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Priyank Panchal):

* owner: (none) => Priyank Panchal


* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:13>

Django

Sep 10, 2023, 5:20:16 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by David Sanders):

Hi Priyank,

Just checking: are you still interested in working on this?

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:14>

Django

Sep 16, 2023, 1:21:09 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: (none)
Type: Bug | Status: new
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Priyank Panchal):

* owner: Priyank Panchal => (none)


* status: assigned => new


--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:15>

Django

Sep 16, 2023, 1:23:52 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by Priyank Panchal):

Hello, I have attempted to address this issue, and I've identified the
problem. It appears that the problem arises whenever the CAST() function
is executed. It seems that changing these parameters to False is not an
option. Additionally, the issue with SQLite occurs when all characters are
stored in the database. What would be the best approach to resolve this
problem?

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:16>

Django

Oct 14, 2023, 11:45:07 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: Akash
| Kumar Sen
Type: Bug | Status: assigned

Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Akash Kumar Sen):

* owner: (none) => Akash Kumar Sen


* status: new => assigned

* has_patch: 0 => 1


Comment:

Sorry, I missed your comment and accidentally created a patch. Let's
connect if you need help with some other ORM ticket.

Patch - https://github.com/django/django/pull/17363

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:17>

Django

Oct 14, 2023, 1:32:50 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: Akash
| Kumar Sen
Type: Bug | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* has_patch: 1 => 0


Comment:

Replying to [comment:17 Akash Kumar Sen]:


> Sorry I missed your comment and accidentally created a patch, Let's
> connect if you need help in some other ORM ticket.
>
> Patch - https://github.com/django/django/pull/17363

We cannot regress `Cast()` to avoid truncating values in `bulk_update()`.

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:18>

Django

Oct 14, 2023, 1:47:20 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by Akash Kumar Sen):

I don't think we are regressing `Cast()` here. The query generated is:
{{{
UPDATE "queries_article" SET "name" = (CASE WHEN ("queries_article"."id" =
1) THEN bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb WHEN
("queries_article"."id" = 2) THEN
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb ELSE NULL END)::varchar
WHERE "queries_article"."id" IN (1, 2)
}}}
It is just generating the super type, i.e. `varchar` instead of
`varchar(20)`, in the case of a `CharField`.

If you can explain a little further that would be great @felixxm

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:19>

Django

Oct 14, 2023, 2:12:18 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields

Comment (by Mariusz Felisiak):

> If you can explain a little further that would be great Mariusz

Explained in PR.

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:20>

Django

Oct 15, 2023, 8:23:32 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: Akash
| Kumar Sen
Type: Bug | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Akash Kumar Sen):

* has_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:21>

Django

Oct 15, 2023, 12:30:51 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: Akash
| Kumar Sen
Type: Bug | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* needs_better_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:22>

Django

Oct 15, 2023, 12:45:50 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: Akash
| Kumar Sen
Type: Bug | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Akash Kumar Sen):

I have checked your GitHub comment. Do you have any suggestions in mind?
Initially I went for updating the compiler itself, but that seems to be a
much more tedious task. I have one more hacky idea, which is as follows:
- Introduce a new database function named {{{CastSuperType}}} that will
always cast to the super type for every possible argument.
- Like varchar for varchar(20) and similar equivalents for all the other
fields that support casting.

- I also thought of having a different {{{SQLUpdateCompiler}}} compiler
for PostgreSQL, but as the UpdateQuery
(https://github.com/Akash-Kumar-Sen/django/blob/bulk_update/django/db/models/sql/subqueries.py#L48)
code is shared I am having a hard time doing that.
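
The "super type" mapping described above can be sketched in a few lines. This is only an illustration, assuming that the super type is the column type with its length or precision modifier removed; `super_type` is a hypothetical helper and not part of Django.

```python
import re

# Hypothetical helper, not a Django API: matches a parenthesised type
# modifier such as "(20)" or "(10, 2)".
_MODIFIER = re.compile(r"\s*\([^()]*\)")


def super_type(db_type: str) -> str:
    """Return db_type without its length/precision modifier."""
    return _MODIFIER.sub("", db_type).strip()


if __name__ == "__main__":
    for column_type in ("varchar(20)", "numeric(10, 2)", "integer"):
        print(column_type, "->", super_type(column_type))
```

Casting a parameter to the unconstrained type leaves the length check to the assignment into the column, which is where `.save()` and `.update()` already raise `DataError` in the original report.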

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:23>

Django

Oct 15, 2023, 11:23:27 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: Akash
| Kumar Sen
Type: Bug | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* needs_better_patch: 1 => 0
* has_patch: 1 => 0


Comment:

Replying to [comment:23 Akash Kumar Sen]:


> I have checked your GitHub comment. Do you have any suggestions in mind?
> Initially I went for updating the compiler itself, but that seems to be
> a much more tedious task. I have one more hacky idea, which is as
> follows:
> - Introduce a new database function named {{{CastSuperType}}} that will
> always cast to the super type for every possible argument.
> - Like varchar for varchar(20) and similar equivalents for all the other
> fields that support casting.
>
> - I also thought of having a different {{{SQLUpdateCompiler}}} compiler
> for PostgreSQL, but as the UpdateQuery
> (https://github.com/Akash-Kumar-Sen/django/blob/bulk_update/django/db/models/sql/subqueries.py#L48)
> code is shared between the databases I am having a hard time doing that.

This is a tricky issue to solve, and we cannot move forward with a stub
solution just for this reason. I suspect that we will need to revisit
`bulk_update()` to make it work properly, but I don't have any specific
advice.

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:24>

Django

Oct 15, 2023, 11:31:17 PM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: Akash
| Kumar Sen
Type: Bug | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Akash Kumar Sen):

Following the approach mentioned by Simon in
[https://code.djangoproject.com/ticket/33647#comment:3 comment 3] would be
reasonable, I think.

> That could be one approach yes, we'd likely need to adapt `CAST` to
> allow for such usage though. Not sure of what the argument should be named
> though, maybe `generic` which defaults to `False`? Not sure what
> `generic=True` would mean in the case of
> `Cast(expr, ArrayField(CharField(max_length=20)), generic=True)`: would
> it be `::varchar[]` or `::array`?
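
One possible reading of `generic=True` for the array case is sketched below. It assumes the array column type is spelled `varchar(20)[]` and that the generic form drops only the size modifier while keeping the `[]` suffix, which gives `::varchar[]`. `cast_placeholder` and its `generic` flag are hypothetical names used for illustration; neither exists in Django.

```python
import re


def cast_placeholder(db_type: str, generic: bool = False) -> str:
    """Build a PostgreSQL-style cast for a query parameter placeholder.

    Hypothetical sketch: with generic=True every "(...)" modifier is
    removed from the type name, and any array suffix "[]" is kept.
    """
    if generic:
        db_type = re.sub(r"\s*\([^()]*\)", "", db_type)
    return f"%s::{db_type}"


if __name__ == "__main__":
    print(cast_placeholder("varchar(20)[]"))
    print(cast_placeholder("varchar(20)[]", generic=True))
```

Under this reading the element type stays visible to the database, which the alternative `::array` spelling would not provide.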

--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:25>

Django

Apr 9, 2024, 4:23:08 AM
to django-...@googlegroups.com
#33647: bulk_update silently truncating values for size limited fields
-------------------------------------+-------------------------------------
Reporter: jerch | Owner: Akash
| Kumar Sen
Type: Bug | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Craig de Stigter):

related: #35362
--
Ticket URL: <https://code.djangoproject.com/ticket/33647#comment:26>