#36748: Postgres UNNEST optimisation of bulk_create cannot handle fields with
`get_placeholder`.
-------------------------------------+-------------------------------------
Reporter: Chris Wesseling | Type: Bug
Status: new | Component: Database layer (models, ORM)
Version: 5.2 | Severity: Normal
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
To allow fields to define a placeholder other than `"%s"`,
`BaseSQLInsertCompiler.assemble_as_sql`
(https://github.com/django/django/blob/97acd4d2f92eef8c285bac070d437bf0fd52e071/django/db/models/sql/compiler.py#L1769)
does:
{{{
get_placeholders = [getattr(field, "get_placeholder", None) for field in fields]
}}}
and calls those functions with `(value, compiler, connection)` to generate
the placeholders.
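
A minimal sketch of that mechanism, in plain Python without importing Django. `PlainField`, `CastField` and `row_placeholders` are hypothetical stand-ins of mine; only the `getattr` lookup and the `(value, compiler, connection)` call shape come from the code quoted above.

```python
class PlainField:
    """Hypothetical field without get_placeholder: the default "%s" applies."""


class CastField:
    """Hypothetical field whose placeholder adds a cast."""

    def get_placeholder(self, value, compiler, connection):
        return "%s::int4range"


def row_placeholders(fields, row, compiler=None, connection=None):
    # Same lookup as quoted above: None means "use the default placeholder".
    get_placeholders = [getattr(field, "get_placeholder", None) for field in fields]
    return [
        "%s" if get_placeholder is None
        else get_placeholder(value, compiler, connection)
        for get_placeholder, value in zip(get_placeholders, row)
    ]


print(row_placeholders([PlainField(), CastField()], (1, "[1,5)")))
```

The point of the sketch is only that the placeholder is computed per field and per value, not once per column.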
These alternative placeholders are not always UNNESTable, e.g. `"%s::{}"`
for arrays and ranges, which we tried to work around by filtering:
{{{
# Fields that don't use standard internal types might not be
# unnest'able (e.g. array and geometry types are known to be
# problematic).
or any(
(field.target_field if field.is_relation else field).get_internal_type()
not in self.connection.data_types
for field in fields
)
}}}
More generally, since `get_placeholder` can be a function of `value`, we
can't UNNEST for all `fields, value_rows`: doing so breaks the "one
placeholder fits all values" assumption.
Therefore, if any field in `fields` has `get_placeholder`, we should fall
back to the default, unoptimised implementation.
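
To illustrate why a value-dependent placeholder breaks that assumption, here is a small sketch in plain Python. `ExtensionField`, `column_placeholders` and the function name `my_ext_func` are made up for illustration; the NULL pass-through is an assumed behaviour, not something taken from a real field.

```python
class ExtensionField:
    """Hypothetical field whose placeholder depends on the value."""

    def get_placeholder(self, value, compiler, connection):
        # Assumed behaviour for illustration: NULLs pass through untouched,
        # other values are wrapped in a made-up extension function.
        if value is None:
            return "%s"
        return "my_ext_func(%s)"


def column_placeholders(field, values):
    """Collect the distinct placeholders one column needs across all rows."""
    return {field.get_placeholder(value, None, None) for value in values}


placeholders = column_placeholders(ExtensionField(), ["a", None, "b"])
# More than one distinct placeholder: no single placeholder can stand in
# for the whole column, which is what UNNEST would need.
print(len(placeholders) > 1)
```

Even when every row happens to yield the same placeholder, the compiler cannot know that without calling `get_placeholder` for each value.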
In fact the addition of
{{{
# Field.get_placeholder takes value as an argument, therefore the
# resulting placeholder might be dependent on the value.
# UNNEST requires a single placeholder to "fit all values" in
# the array.
or any(hasattr(field, "get_placeholder") for field in fields)
}}}
before the existing internal_type filter handles all cases in the test
suite and never reaches the internal_type check, because every affected
field in the tests also defines `get_placeholder`. I don't know whether
this will always be the case, so I suggest keeping the existing filter too.
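
A rough sketch of how the two filters could combine, written as a standalone function so it runs without Django. `must_skip_unnest`, `StubField` and `StubPlaceholderField` are hypothetical names, and `data_types` stands in for `self.connection.data_types`; the real condition lives inside the Postgres insert compiler and has other clauses not shown here.

```python
def must_skip_unnest(fields, data_types):
    """Return True if the UNNEST fast path should be skipped (sketch only)."""
    return (
        # Proposed check: any get_placeholder may depend on the value.
        any(hasattr(field, "get_placeholder") for field in fields)
        # Existing check: non-standard internal types may not be unnest'able.
        or any(
            (field.target_field if field.is_relation else field).get_internal_type()
            not in data_types
            for field in fields
        )
    )


class StubField:
    """Hypothetical minimal field exposing only what the check needs."""

    is_relation = False

    def __init__(self, internal_type):
        self.internal_type = internal_type

    def get_internal_type(self):
        return self.internal_type


class StubPlaceholderField(StubField):
    def get_placeholder(self, value, compiler, connection):
        return "my_ext_func(%s)"  # made-up extension function


data_types = {"IntegerField": "integer"}  # assumed subset, for illustration
print(must_skip_unnest([StubField("IntegerField")], data_types))
print(must_skip_unnest([StubPlaceholderField("IntegerField")], data_types))
```

The second call models the case described in this ticket: the internal type is a standard one, so only the `get_placeholder` check catches it.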
I bumped into this regression when upgrading to 5.2: the call to a
Postgres extension function in such a placeholder, on a field whose db_type
is present in `self.connection.data_types`, no longer happened for bulk
inserts.
--
Ticket URL: <https://code.djangoproject.com/ticket/36748>