[Django] #35936: Speeding up Postgres bulk_create by using unnest

20 views
Skip to first unread message

Django

unread,
Nov 24, 2024, 10:11:33 PM11/24/24
to django-...@googlegroups.com
#35936: Speeding up Postgres bulk_create by using unnest
-------------------------------------+-------------------------------------
Reporter: Simon Charette | Type:
| Cleanup/optimization
Status: new | Component: Database
| layer (models, ORM)
Version: 5.1 | Severity: Normal
Keywords: postgres bulk | Triage Stage:
create unnest | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Per [https://forum.djangoproject.com/t/speeding-up-postgres-bulk-create-
by-using-unnest/36508/1 this forum discussion] there seems to be a
consensus and peer validation of performance benefits of using `unnest`
for the bulk insertion of literal values on Postgres.
--
Ticket URL: <https://code.djangoproject.com/ticket/35936>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Nov 24, 2024, 10:11:43 PM11/24/24
to django-...@googlegroups.com
#35936: Speeding up Postgres bulk_create by using unnest
-------------------------------------+-------------------------------------
Reporter: Simon Charette | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: postgres bulk | Triage Stage:
create unnest | Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* has_patch: 0 => 1
* needs_better_patch: 0 => 1
* owner: (none) => Simon Charette
* status: new => assigned

--
Ticket URL: <https://code.djangoproject.com/ticket/35936#comment:1>

Django

unread,
Nov 25, 2024, 1:32:45 AM11/25/24
to django-...@googlegroups.com
#35936: Speeding up Postgres bulk_create by using unnest
-------------------------------------+-------------------------------------
Reporter: Simon Charette | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: postgres bulk | Triage Stage: Accepted
create unnest |
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by David Sanders):

* stage: Unreviewed => Accepted

--
Ticket URL: <https://code.djangoproject.com/ticket/35936#comment:2>

Django

unread,
Nov 27, 2024, 10:59:29 PM11/27/24
to django-...@googlegroups.com
#35936: Speeding up Postgres bulk_create by using unnest
-------------------------------------+-------------------------------------
Reporter: Simon Charette | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: postgres bulk | Triage Stage: Accepted
create unnest |
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* needs_better_patch: 1 => 0

--
Ticket URL: <https://code.djangoproject.com/ticket/35936#comment:3>

Django

unread,
Dec 9, 2024, 5:02:12 PM12/9/24
to django-...@googlegroups.com
#35936: Speeding up Postgres bulk_create by using unnest
-------------------------------------+-------------------------------------
Reporter: Simon Charette | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: postgres bulk | Triage Stage: Ready for
create unnest | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/35936#comment:4>

Django

unread,
Dec 11, 2024, 7:56:30 AM12/11/24
to django-...@googlegroups.com
#35936: Speeding up Postgres bulk_create by using unnest
-------------------------------------+-------------------------------------
Reporter: Simon Charette | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: postgres bulk | Triage Stage: Ready for
create unnest | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce <42296566+sarahboyce@…>):

* resolution: => fixed
* status: assigned => closed

Comment:

In [changeset:"a16eedcf9c69d8a11d94cac1811018c5b996d491" a16eedc]:
{{{#!CommitTicketReference repository=""
revision="a16eedcf9c69d8a11d94cac1811018c5b996d491"
Fixed #35936 -- Used unnest for bulk inserts on Postgres when possible.

This should make bulk_create significantly faster on Postgres when
provided
only literal values.

Thanks James Sewell for writing about this technique, Tom Forbes for
validating the performance benefits, David Sanders and Mariusz Felisiak
for the review.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35936#comment:6>

Django

unread,
Dec 11, 2024, 8:03:07 AM12/11/24
to django-...@googlegroups.com
#35936: Speeding up Postgres bulk_create by using unnest
-------------------------------------+-------------------------------------
Reporter: Simon Charette | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: postgres bulk | Triage Stage: Ready for
create unnest | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce <42296566+sarahboyce@…>):

In [changeset:"2638b75554d2624dca3062a8da113a47f855f2a2" 2638b755]:
{{{#!CommitTicketReference repository=""
revision="2638b75554d2624dca3062a8da113a47f855f2a2"
Refs #35936 -- Avoided field placeholder lookup for each value inserted.

By building the list of placeholders for each inserted fields once it
doesn't have to be looked up for each inserted rows twice.

The query_values_10000.benchmark.QueryValues10000.time_query_values_10000
ASV
benchmark showed a 5% speed up for 10k items on SQLite for a single field
insertion. Larger performance gains are expected when more fields are
involved.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35936#comment:5>

Django

unread,
Jan 20, 2025, 8:15:41 AM1/20/25
to django-...@googlegroups.com
#35936: Speeding up Postgres bulk_create by using unnest
-------------------------------------+-------------------------------------
Reporter: Simon Charette | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: postgres bulk | Triage Stage: Ready for
create unnest | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce <42296566+sarahboyce@…>):

In [changeset:"22fc151bb86a553d84c62d7effd289356e9b6c6c" 22fc151b]:
{{{#!CommitTicketReference repository=""
revision="22fc151bb86a553d84c62d7effd289356e9b6c6c"
Fixed #36107 -- Adjusted UNNEST bulk_create strategy to opt-out sized
arrays.

The array fields opt-out heuristic failed to account for sized arrays.

Note that we keep relying on db_type as opposed to performing an
ArrayField
instance check against the column's field as there could be other
implementations of model fields that use Postgres arrays to store the
optimization must be disabled for all of them.

Refs #35936.

Thanks Claude Paroz for the report and test.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35936#comment:7>

Django

unread,
Jan 20, 2025, 8:17:34 AM1/20/25
to django-...@googlegroups.com
#35936: Speeding up Postgres bulk_create by using unnest
-------------------------------------+-------------------------------------
Reporter: Simon Charette | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: postgres bulk | Triage Stage: Ready for
create unnest | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce <42296566+sarahboyce@…>):

In [changeset:"ad6bca92a87e72dc7e5572f792463ba33f4da37d" ad6bca92]:
{{{#!CommitTicketReference repository=""
revision="ad6bca92a87e72dc7e5572f792463ba33f4da37d"
[5.2.x] Fixed #36107 -- Adjusted UNNEST bulk_create strategy to opt-out
sized arrays.

The array fields opt-out heuristic failed to account for sized arrays.

Note that we keep relying on db_type as opposed to performing an
ArrayField
instance check against the column's field as there could be other
implementations of model fields that use Postgres arrays to store the
optimization must be disabled for all of them.

Refs #35936.

Thanks Claude Paroz for the report and test.

Backport of 22fc151bb86a553d84c62d7effd289356e9b6c6c from main.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35936#comment:8>
Reply all
Reply to author
Forward
0 new messages