[Django] #24203: Optimisation: adding multiple fields to same model should attempt to run single ALTER TABLE statement

41 views
Skip to first unread message

Django

unread,
Jan 22, 2015, 1:39:23 PM1/22/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
--------------------------------------+-----------------------
Reporter: peterlauri | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Migrations | Version: 1.7
Severity: Normal | Keywords: migration
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+-----------------------
Origin: https://groups.google.com/d/msg/django-users/DDekrNgXH2U/k1un-
e8CbnMJ

When adding multiple fields to a Model the makemigrate and migrate does
not attempt to create one ALTER TABLE statement, but rather one per field
added. This can cause slow migrations of large tables.

I propose to optimise to attempt to merge multiple AddField to one single
ALTER TABLE request when the migrate is run.

My real world case the db backend was MySQL, here is PG example of 2nd
migration db output after adding two fields to one model.

{{{
Applying play.0002_auto_20150122_1825...DEBUG ALTER TABLE "play_play"
ADD COLUMN "field2" varchar(100) NULL; (params [])
DEBUG (0.000) ALTER TABLE "play_play" ADD COLUMN "field2" varchar(100)
NULL; args=[]
DEBUG ALTER TABLE "play_play" ALTER COLUMN "field2" DROP DEFAULT; (params
[])
DEBUG (0.000) ALTER TABLE "play_play" ALTER COLUMN "field2" DROP DEFAULT;
args=[]
DEBUG ALTER TABLE "play_play" ADD COLUMN "field3" varchar(100) NULL;
(params [])
DEBUG (0.000) ALTER TABLE "play_play" ADD COLUMN "field3" varchar(100)
NULL; args=[]
DEBUG ALTER TABLE "play_play" ALTER COLUMN "field3" DROP DEFAULT; (params
[])
DEBUG (0.000) ALTER TABLE "play_play" ALTER COLUMN "field3" DROP DEFAULT;
args=[]
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24203>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jan 22, 2015, 2:34:36 PM1/22/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: 1.7
Severity: Normal | Resolution:

Keywords: migration | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by peterlauri):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Comment:

Some ideas: https://github.com/peterlauri/django/pull/1

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:1>

Django

unread,
Jan 29, 2015, 3:54:30 PM1/29/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: 1.7
Severity: Normal | Resolution:

Keywords: migration | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by timgraham):

Could you give some performance numbers for comparison?

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:2>

Django

unread,
Jan 29, 2015, 11:51:09 PM1/29/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: 1.7
Severity: Normal | Resolution:

Keywords: migration | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by peterlauri):

Replying to [comment:2 timgraham]:


> Could you give some performance numbers for comparison?

I can provide this when I'm back from vacation in two weeks (13th of
February), I don't have access to the db for the moment.

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:3>

Django

unread,
Feb 1, 2015, 10:53:52 AM2/1/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master

Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by MarkusH):

* cc: MarkusH (added)
* version: 1.7 => master
* stage: Unreviewed => Someday/Maybe


Comment:

On PG it is recommended (in the docs iirc, but can't find a link) to add a
column with `null=True`, promote the row values for existing rows and drop
the `not null` constraint. This way adding the column is instant and thus
no need to join multiple `AddField` operations.

On MySQL <5.6 the `ADD COLUMN` operations are expensive in any case
independent from `NULL` or `NOT NULL` afaik.

On SQLite, well, it's SQLite. It's slow anyways and doesn't support adding
multiple columns at once.

I have no clue about Oracle's behavior.

That said, I'm not sure this is really necessary but keep the issue open
for your feedback.

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:4>

Django

unread,
Mar 4, 2015, 5:05:50 AM3/4/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by simonpercivall):

Replying to [comment:2 timgraham]:
> Could you give some performance numbers for comparison?

I can give performance numbers for a MySQL installation.

Doing an ALTER TABLE on a table with 120 columns and containing approx.
20M rows takes 2–3 hours. Combining several ALTER TABLES into one takes no
longer time.

The simple math here is that three Django migrations would take 6–9 hours,
whereas the combined version would take 2–3 hours.

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:5>

Django

unread,
Mar 4, 2015, 9:01:55 AM3/4/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by peterlauri):

Replying to [comment:2 timgraham]:
> Could you give some performance numbers for comparison?

This is not from production, it contains ~500k rows. I will try it out on
PG as well, just need to prep some data.

{{{
mysql> select count(*) from thetable;
+----------+
| count(*) |
+----------+
| 489484 |
+----------+
1 row in set (0.07 sec)

mysql> ALTER TABLE thetable ADD COLUMN field2 varchar(100) NULL;
Query OK, 489484 rows affected (2 min 29.05 sec)
Records: 489484 Duplicates: 0 Warnings: 0

mysql> ALTER TABLE thetable ADD COLUMN field3 varchar(100) NULL;
Query OK, 489484 rows affected (2 min 10.82 sec)
Records: 489484 Duplicates: 0 Warnings: 0

mysql> ALTER TABLE thetable ADD COLUMN field4 varchar(100) NULL, ADD
COLUMN field5 varchar(100) NULL;
Query OK, 489484 rows affected (2 min 24.92 sec)
Records: 489484 Duplicates: 0 Warnings: 0

mysql>
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:6>

Django

unread,
Mar 4, 2015, 9:22:16 AM3/4/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by peterlauri):

Replying to [comment:2 timgraham]:
> Could you give some performance numbers for comparison?

I prepped a DB with some fields and 500k rows. The alter table statements
as in my original description executed each in few milliseconds. No
significant improvement to execute them in a single statement.

{{{
ballongen=# select count(*) from testtable;
count
--------
500000
(1 row)

Time: 34.261 ms
ballongen=# ALTER TABLE "testtable" ADD COLUMN "field2" varchar(100) NULL;
ALTER TABLE
Time: 56.649 ms
ballongen=# ALTER TABLE "testtable" ALTER COLUMN "field2" DROP DEFAULT;
ALTER TABLE
Time: 2.563 ms
ballongen=# ALTER TABLE "testtable" ADD COLUMN "field3" varchar(100) NULL;
ALTER TABLE
Time: 1.165 ms
ballongen=# ALTER TABLE "testtable" ALTER COLUMN "field3" DROP DEFAULT;
ALTER TABLE
Time: 0.240 ms
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:7>

Django

unread,
Mar 27, 2015, 9:46:35 AM3/27/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by adamchainz):

+1

I spend a large amount of time rewriting the SQL output from django
migrations to combine the ALTER TABLE statements for each table into one,
just so they can pass on our production database (MySQL, 30GB, hundreds of
millions of rows).

Actually I normally use [http://www.percona.com/doc/percona-toolkit/2.2
/pt-online-schema-change.html pt-online-schema-change] for tables of any
size, since it's 0 downtime, but it's still a pain to rewrite all the SQL
into a single ALTER.

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:8>

Django

unread,
Mar 27, 2015, 9:58:52 AM3/27/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by peterlauri):

Please note that in my comment #7
(https://code.djangoproject.com/ticket/24203#comment:7) I forgot to
mention this was for PostgreSQL.

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:9>

Django

unread,
Jun 10, 2015, 9:19:58 AM6/10/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by ambivalentno):

* cc: ambivalentno (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:10>

Django

unread,
Jun 13, 2015, 7:49:16 AM6/13/15
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: peterlauri | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by ambivalentno):

Regarding timings:
InnoDB table with 2 mln entries (but lots of big TextFields).
It takes ~3 hours to add a single field, and also ~3 hours to add multiple
fields in case they're grouped via single sql statement. Actually, most of
time it takes to perform a migration mysql copies data to temporary table.

--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:11>

Django

unread,
Oct 25, 2016, 2:19:17 PM10/25/16
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: Peter Lauri | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Doug Harris):

* cc: dharris+django@… (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:12>

Django

unread,
Jan 24, 2017, 10:24:15 AM1/24/17
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: Peter Lauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Ed Morley):

* cc: emorley@… (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:13>

Django

unread,
Oct 22, 2019, 12:43:20 PM10/22/19
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: Peter Lauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Phil Krylov):

* cc: Phil Krylov (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:14>

Django

unread,
Apr 9, 2020, 5:01:03 PM4/9/20
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: Peter Lauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Adam (Chainz) Johnson):

* cc: Adam (Chainz) Johnson (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:15>

Django

unread,
Aug 10, 2022, 1:12:54 PM8/10/22
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: Peter Lauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: dev

Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by elonzh):

* cc: elonzh (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:16>

Django

unread,
Aug 10, 2022, 1:13:16 PM8/10/22
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: Peter Lauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: dev
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by elonzh):

* cc: elonzh (removed)


--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:17>

Django

unread,
Jan 31, 2023, 2:24:19 PM1/31/23
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: Peter Lauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: dev
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by elonzh):

* cc: elonzh (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:18>

Django

unread,
Dec 11, 2024, 11:19:11 PM12/11/24
to django-...@googlegroups.com
#24203: Optimisation: adding multiple fields to same model should attempt to run
single ALTER TABLE statement
-------------------------------------+-------------------------------------
Reporter: Peter Lauri | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: dev
Severity: Normal | Resolution:
Keywords: migration | Triage Stage:
| Someday/Maybe
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Pratyush Mittal):

Yeah, clubbing will save multiple `copy to tmp table` steps on large
tables.

I have a MySQL table with ~20mln rows. I increased the decimal places for
6 columns. This took 45 minutes which would have taken 5 minutes if the
alter statements were clubbed.
--
Ticket URL: <https://code.djangoproject.com/ticket/24203#comment:19>
Reply all
Reply to author
Forward
0 new messages