Content types shouldn't be created on post_migrate signal

599 views
Skip to first unread message

Marcin Nowak

unread,
Oct 3, 2018, 7:01:37 AM10/3/18
to Django developers (Contributions to Django itself)
Hello.

There is a huge issue with content types framework. The data is loaded after every migration (post_migrate signal) automatically, and this approach is against db data consistency.
The biggest problem is with data migrations, which are applied at the first time (on clean database). Any data migration which depends on some content types will not insert the data.

The sequence looks like this:
  1. create db
  2. insert into auth_permission ...  inner join django_content_type ...
  3. post migrate: insert into django_content_type ...

Two things are wrong:
  • automatic creation of content types
  • creation of content types after running migrations
Solution:
  • creation of new app should add very first migration, which will add entry to the content types
  • create_contentypes handler should be removed from post_migrate signal
Any data migration, which depends on some content types, should be declared with proper dependency. This will guarantee existence of the required content type entry.


BR,
Marcin


Adam Johnson

unread,
Oct 3, 2018, 7:26:25 AM10/3/18
to django-d...@googlegroups.com
Wouldn't a workaround be to call create_contenttypes() in a RunPython in your app's migration before inserting the data which depends on the content types?

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/8869b1cb-2b92-4e05-823a-92e72308fc23%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Adam

Marcin Nowak

unread,
Oct 3, 2018, 7:47:16 AM10/3/18
to Django developers (Contributions to Django itself)
]
Wouldn't a workaround be to call create_contenttypes() in a RunPython in your app's migration before inserting the data which depends on the content types?


Thanks, but as a workaround you can do almost everything. My post is about proposal of fixing the issue.

BTW: RunPython() is another thing, which can break your migrations, and should not be used (especially not by Django internally), because it relies on the application layer. I discussed about it some time ago.


BR,
Marcin

Aymeric Augustin

unread,
Oct 4, 2018, 3:44:21 PM10/4/18
to django-d...@googlegroups.com
Hello Marcin,

I assume you're writing to this list because you would like other Django contributors to cooperate in order to fix this issue. Starting with "There is a huge issue with content types framework" isn't a good way to motivate them.

Speaking for myself, I would be more eager to investigate if you skipped the hyperbole and remained neutral, for example "I'm facing an issue with the content types framework".

Many of your messages on this mailing list contain strongly negative or passive-aggressive comments. They get in the way of communicating your ideas. You'd have more success if you managed to write in a positive style.

I think the issue itself is valid. I may have hit it before and worked around it, likely be executing a subset of migrations to trigger creation of content types, then executing the rest of the migrations. Django could probably do better.

Best regards,

-- 
Aymeric.



Marcin Nowak

unread,
Oct 4, 2018, 4:36:36 PM10/4/18
to Django developers (Contributions to Django itself)
Hi Aymeric.

Thank you for your reply. 

Unfortunately you wrote mostly about me or my writing style, not about the issue.
I disagree with your opinion about my comments being passive or aggressive. 
I'm always writing about a piece of code, functionality, design/architecture or bug. I never criticised a person directly.

 
Starting with "There is a huge issue with content types framework" isn't a good way to motivate them.


But there is an issue with content types framework (not with it's authors). 

 
Speaking for myself, I would be more eager to investigate if you skipped the hyperbole and remained neutral, for example "I'm facing an issue with the content types framework".


Sorry hearing that. I'm not native English speaker. 

For me there is almost no difference with these two sentences, except that first is about the affected thing ("there is an issue with") and second is more about me ("I have a problem"). 
Both are valid, I think. I have a problem which is caused by CT framework's design or issue (in fact it comes from a historical reason, before migrations era).
 
You'd have more success if you managed to write in a positive style.

I don't think that my style is unpleasant. If it is - sorry for that. 
I'm always trying to describe the problem and give some proposals. 
But I'll try to improve this.
  
I think the issue itself is valid. I may have hit it before and worked around it, likely be executing a subset of migrations to trigger creation of content types, then executing the rest of the migrations. Django could probably do better.


I'll generate CTs in very first migration. This will be a workaround, of course. 
But because Django uses migrations internally, CT's should be added to the database in a same way.

And RunPython() can be problematic. I did something similar with Liquibase' executeCommand, which was calling management command. And of course it caused problems after changing app layer. 
I'm very conservative about databases and managing them, I think that there must be complete separation between db and app layer (in the context of managing dbs), because app layer is changing frequently (more often than dbs). Mixing both worlds will cause troubles. Always.

Kind Regards,
Marcin

Arthur Rio

unread,
Oct 4, 2018, 5:09:02 PM10/4/18
to Marcin Nowak, django-d...@googlegroups.com
BTW: RunPython() is another thing, which can break your migrations, and should not be used (especially not by Django internally), because it relies on the application layer.

How else can you do a data migration? There is no `migrations.InsertIntoTable`, the only other way currently would be to run a `migrations.RunSql` query which may look different based on the database used.

Two things are wrong:
  • automatic creation of content types

Why is it wrong to automatically create a content type? Content type is an opt-in feature since you can remove it from `INSTALLED_APPS`.

  • creation of content types after running migrations

That’s the only real problem for me. Maybe using a signal for `migrations.CreateModel` (e.g. post_create_model), instead of using `post_migrate` would fix it, but there may be other scenarios I’m not thinking about.


Solution:
  • creation of new app should add very first migration, which will add entry to the content types

How would you handle creating a model later on? Or if `django.contrib.contenttypes` is only added later on to `INSTALLED_APPS`?


Regards,

Arthur

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.

Marcin Nowak

unread,
Oct 4, 2018, 7:47:15 PM10/4/18
to Django developers (Contributions to Django itself)

Hi Arthur.
BTW: RunPython() is another thing, which can break your migrations, and should not be used (especially not by Django internally), because it relies on the application layer.

How else can you do a data migration? There is no `migrations.InsertIntoTable`,


You're right. That's why "Insert" should be (in my opinion) introduced. Together with "migrations.Delete".

The problem is that data migration based on app layer (python objects, ie. Models and Managers here) will cause troubles after some time (when app is changing).
In the other words - you cannot rely on your app layer when doing database changes. You should never do that, especially for projects requiring LTS.

RunPython() should be used only when you cannot do anything else. In such case you must accept all consequences. I'm not against RunPython(), but against doing data migrations using it.

The problem with hypothetical "Insert" operation is with mapping field types. Insert cannot use Models directly (app layer is changing over a time), but should "know" how to map arguments (python types, values) to a database literals. It can be achieved by introducing field mapping for a every insert or per migration file (something like "model freeze", but only for used fields).

Also Insert should not accept variables calculated in the app layer (within a migration) - it should contain only plain/direct data. But using Python as a language for migrations, will be hard to avoid such possibility. This is important, because database management should not rely on app layer. Using variables (i.e. from settings) would result in inconsistent data changes between different environments.

the only other way currently would be to run a `migrations.RunSql` query which may look different based on the database used.


RunSQL is not the solution for db agnostic operations. It may be only used within a project, because db engine used changes rarely (due to the nature and importance of the data and relational databases, and systems dependent on the db).
But RunSQL is a handful operation, because SQL is a natural language for db management. I'm doing many raw sqls in migrations.
 
Two things are wrong:
  • automatic creation of content types

Why is it wrong to automatically create a content type?


Maybe "automatic" is not a proper word. They should be created automatically, but should be applied "at the right time".
 

Content type is an opt-in feature since you can remove it from `INSTALLED_APPS`.


I know, but it is required by contrib.auth. I saw no project depending on something else, so CT is optional.. but theoretically ;)
 
  • creation of content types after running migrations

That’s the only real problem for me. Maybe using a signal for `migrations.CreateModel` (e.g. post_create_model), instead of using `post_migrate` would fix it, but there may be other scenarios I’m not thinking about.


Because signals are related to the app layer and  mixing app and db layers together is generally wrong (sorry for wording, please read as "not so good"), changing one signal to another is not a good solution.

Before I wrote about the issue, I was doing diagnosis and I was thinking about the issue for a while. So I think that the best place for applying data changes is a migration.
Django can detect model changes (adding and removals of models, too), so it can generate series of inserts or deletes in the right place, to run them at the right time.

Or if `django.contrib.contenttypes` is only added later on to `INSTALLED_APPS`?


After adding contrib.contenttypes, Django should check existence of django_content_type table. If exists, Django should only check for data changes and generate series of inserts/deletes. If not, Django should generate inserts for all models. If CT is removed later, Django should remove all CT data .


CT opt-in should be connected to a signal related to the creation of migration files (Autodetector?), not to a signal related to their execution. I.e. pre/post_autodection signals should be introduced.


Kind Regards,

Marcin

Arthur Rio

unread,
Oct 5, 2018, 12:31:14 PM10/5/18
to Marcin Nowak, django-d...@googlegroups.com
Hey Marcin,

The problem is that data migration based on app layer (python objects, ie. Models and Managers here) will cause troubles after some time (when app is changing).
In the other words - you cannot rely on your app layer when doing database changes. You should never do that, especially for projects requiring LTS.

In theory I understand the idea, but in practice, migrations, the ORM and the content type model are all part of Django and I don’t really see how the app changing could cause troubles. Do you have an example in mind?

Maybe "automatic" is not a proper word. They should be created automatically, but should be applied "at the right time".

Ok, that we agree on!

CT opt-in should be connected to a signal related to the creation of migration files (Autodetector?), not to a signal related to their execution. I.e. pre/post_autodection signals should be introduced.

I think we agree that the solution would be some sort of signal triggered when a model creation/deletion is detected (in the `makemigrations` phase as opposed to `migrate`) so that some code is added to the generated migration. The use of a signal is important to keep things decoupled and flexible.

The “some code is added to the migration” part still needs to be determined, either in the shape of insert/delete statements or a RunPython leveraging the ORM. In both cases, the values to insert (Adding a model) or to lookup for delete (Removing a model) are just 2 strings, the app label and the model class name.

After adding contrib.contenttypes, Django should check existence of django_content_type table. If exists, Django should only check for data changes and generate series of inserts/deletes. If not, Django should generate inserts for all models. If CT is removed later, Django should remove all CT data .

It’s a good idea but I don’t know offhand how we can keep migrations and content type decoupled to do that (especially the removal).

Finally, I also think the concept could be extended to the permission model which faces similar issues.


Regards

--

Arthur


--

You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.

Arthur Rio

unread,
Oct 5, 2018, 1:28:06 PM10/5/18
to Marcin Nowak, django-d...@googlegroups.com
For some reason the text is white… Here it is in black:

Hey Marcin,

The problem is that data migration based on app layer (python objects, ie. Models and Managers here) will cause troubles after some time (when app is changing).
In the other words - you cannot rely on your app layer when doing database changes. You should never do that, especially for projects requiring LTS.

In theory I understand the idea, but in practice, migrations, the ORM and the content type model are all part of Django and I don’t really see how the app changing could cause troubles. Do you have an example in mind?

Maybe "automatic" is not a proper word. They should be created automatically, but should be applied "at the right time".

Ok, that we agree on!

CT opt-in should be connected to a signal related to the creation of migration files (Autodetector?), not to a signal related to their execution. I.e. pre/post_autodection signals should be introduced.

I think we agree that the solution would be some sort of signal triggered when a model creation/deletion is detected (in the `makemigrations` phase as opposed to `migrate`) so that some code is added to the generated migration. The use of a signal is important to keep things decoupled and flexible.

The “some code is added to the migration” part still needs to be determined, either in the shape of insert/delete statements or a RunPython leveraging the ORM. In both cases, the values to insert (Adding a model) or to lookup for delete (Removing a model) are just 2 strings, the app label and the model class name.

After adding contrib.contenttypes, Django should check existence of django_content_type table. If exists, Django should only check for data changes and generate series of inserts/deletes. If not, Django should generate inserts for all models. If CT is removed later, Django should remove all CT data .

It’s a good idea but I don’t know offhand how we can keep migrations and content type decoupled to do that (especially the removal).

Finally, I also think the concept could be extended to the permission model which faces similar issues.


Regards

--

Arthur

Arthur Rio

unread,
Oct 8, 2018, 12:15:29 AM10/8/18
to Django developers (Contributions to Django itself)
Hey Marcin,

I hope you had a good week-end. After looking through the codebase to get more familiar with how `pre_migrate` and `post_migrate` are currently used, I thought we could simply have the same sort of signal for `post_makemigrations`,
where a third party could get a list of the changes and the plan and append operations to the generated file (probably adding itself as dependency too). However, it seems to be far from trivial given the related
I was surprised to see that the `RenameModel` was handled by `ContentType` using the `pre_migrate` hook rather than happening in the `makemigrations` phase, maybe someone can share some context?

I really think this is a real problem and something that could get improved but to do so it would require fixing the bigger issue.


"As for the makemigrations hooks I think it would require a lot of thought and
efforts to get right. Right now the auto-detector is a black box that deals with
dependencies and model state deltas resolution."

If now is the time to take a stab at it, I'd be happy to help as much as I can.

Regards

--
Arthur

To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.

Petter Strandmark

unread,
Oct 13, 2018, 2:56:25 PM10/13/18
to Django developers (Contributions to Django itself)
I encountered a similar issue recently, but with auth permissions.

Arthur Rio

unread,
Apr 13, 2021, 6:41:40 PM4/13/21
to Django developers (Contributions to Django itself)
Hi everyone 👋🏻

After a long period of inactivity I'm back at it with the help of Taylor.

As a reminder: 

After some experiments and discussions it felt like while the implementation might solve the initial problem, it's a bit under the hood and will be hard to debug if something goes wrong.
The idea was to inject operations at the time of running `migrate`.

So... we went back to the idea of having hooks during the `makemigrations` process instead, so that the operations would be written to the migration files, which would make it more explicit and less risky. Here is a first draft of how it would look like: https://github.com/django/django/pull/14229.

1. We know that the `makemigrations` process is complicated, so before we invest more time down that path, is there something obvious we might be missing?
2. What do you think of this approach with hooks (pre and post "add_operation")?
3. Do you think it would be a useful feature for other third party apps as well (not just content types and permissions)?

Thank you for your input, stay safe

Arthur & Taylor
Reply all
Reply to author
Forward
0 new messages