Proposal: Permit __file__-less migrations packages

114 views
Skip to first unread message

William Schwartz

unread,
Dec 22, 2020, 11:28:32 AM12/22/20
to Django developers (Contributions to Django itself)
Django developers,

The migration loader currently loads migrations only from apps whose migrations package has a __file__ attribute. The reason for this check, given both in the source code comments and in discussions among maintainers, is to discourage the use of PEP-420 namespace packages.

Without affecting that behavior, I propose to permit migrations run from Python environments that do not provide even regular packages with a __file__ attribute. (This falls under the general umbrella of #30950, which describes these environments a little.) The required change is just a few lines.

If this proposal is something you would consider, I will submit a Trac ticket laying out further details, including citations to Python documentation that justifies my approach. I will simultaneously submit a PR with the change and a test. Documentation changes are not likely necessary, but I can add some if desired.

I would love for this feature to land in 3.2 before the feature-freeze date, and can do all the work for it if a Django committer can review and merge the PR.

Best,
William Schwartz

Adam Johnson

unread,
Dec 22, 2020, 11:55:01 AM12/22/20
to django-d...@googlegroups.com
I've hit several problems with PEP-420 namespace packages in the last few months, which inspired me to create flake8-no-pep420: https://pypi.org/project/flake8-no-pep420/ . In summary, I found pep-420 namespace packages are not checked by Django's test runner, coverage.py, or mypy (with its bonus flag --namespace-packages). I imagine other code tools fail to import them too.

I would be inclined to stick with the previous decision unless you can prove there's a great reason to support migrations in namespace packages -- other than a distaste for blank __init__.py files. I can't imagine you actually want to split one app's migrations into several separate file system directories, do you?

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/1c2cb975-0f78-4ef9-a025-303dcb17a5aen%40googlegroups.com.


--
Adam

William Schwartz

unread,
Dec 22, 2020, 11:59:36 AM12/22/20
to Django developers (Contributions to Django itself)
As I said in my original post, I do not propose to change the behavior of Django with respect to namespace packages. My proposal only makes the check for them more specific so that we permit regular packages run from Python environments that do not set __file__.

Mariusz Felisiak

unread,
Dec 22, 2020, 12:23:19 PM12/22/20
to Django developers (Contributions to Django itself)
Hi,

    I can only repeat my arguments from the ticket. I don't see a strong reason to add an extra dependency for this. Django 4.0 will drop support for Python 3.6 and Python 3.7, so we can reconsider this ticket when developing Django 4.0+.

Best,
Mariusz

William Schwartz

unread,
Dec 22, 2020, 1:02:09 PM12/22/20
to Django developers (Contributions to Django itself)
I now see that my original post was too vague. I will more thoroughly explain myself below, including the actual patch (because it's that short!). My proposal will not change Django's behavior at all for anyone not using "freezers" such as PyOxidizer, cx_Freeze, Nuitka, PyInstaller, and the like. No changes to namespace package support. No changes to dependencies. In particular I mentioned #30950 only in its capacity as a tracking ticket, but my proposal supports a very narrow subset of that ticket's goals.

To filter out namespace packages, django.db.migrations.loader.MigrationLoader.load_disk currently rejects any module missing a __file__ attribute (including __file__ is None). Because __file__ is an optional attribute of modules not provided by some freezers, just testing for __file__ casts too wide a net. Fortunately, namespace packages have other differences with regular packages that can be detected at runtime. One of those differences is that their __loader__ is importlib._bootstrap_external._NamespaceLoader, but that is undocumented and specific to CPython. However, a difference that is documented in the Python language specification is the type of __path__:

> Namespace packages do not use an ordinary list for their __path__ attribute. They instead use a custom iterable type....

So here is the patch that I propose. For brevity, I am excluding the test, but it's only another 17 lines.

diff --git a/django/db/migrations/loader.py b/django/db/migrations/loader.py
index 95a5062ec9..fd3a92f779 100644
--- a/django/db/migrations/loader.py
+++ b/django/db/migrations/loader.py
@@ -88,15 +88,19 @@ class MigrationLoader:
                     continue
                 raise
             else:
-                # Empty directories are namespaces.
-                # getattr() needed on PY36 and older (replace w/attribute access).
-                if getattr(module, '__file__', None) is None:
-                    self.unmigrated_apps.add(app_config.label)
-                    continue
                 # Module is not a package (e.g. migrations.py).
                 if not hasattr(module, '__path__'):
                     self.unmigrated_apps.add(app_config.label)
                     continue
+                # Empty directories are namespaces. Namespace packages have no
+                # __file__ and don't use a list for __path__. See
+                # https://docs.python.org/3/reference/import.html#namespace-packages
+                if (
+                    getattr(module, '__file__', None) is None and
+                    not isinstance(module.__path__, list)
+                ):
+                    self.unmigrated_apps.add(app_config.label)
+                    continue
                 # Force a reload if it's already loaded (tests need this)
                 if was_loaded:
                     reload(module)

I hope the narrow nature of this patch is more acceptable than what you initially thought I was proposing. I am happy to answer further questions.

Thanks for taking the time to engage with me about this!

Mariusz Felisiak

unread,
Dec 22, 2020, 2:28:41 PM12/22/20
to Django developers (Contributions to Django itself)
Hi,

    Thanks for extra details. It's not about the number of LOC. I have the impression that more changes are needed to support "freezers" , see the previous discussion about PyOxidizer. Are we sure this is the only change that is needed? and how we can avoid regressions and don't break this in the future?

Best,
Mariusz

William Schwartz

unread,
Dec 22, 2020, 5:27:53 PM12/22/20
to Django developers (Contributions to Django itself)
Mariusz,

Thanks for your response!

> I have the impression that more changes are needed to support "freezers".... Are we sure this is the only change that is needed?

Django will indeed require changes beyond today's proposal for full freezer support. I am pursuing an incremental approach here—apologies for being unclear on this point. My intention for this proposal, like the one in #32177 whose PR you merged, is to remove one failure mode for frozen Django apps without adversely affecting any other users.

I led with this proposal rather than others that would make progress on freezer support because it makes Django more correct and compliant with Python's import API without affecting any existing users. That is, I believe the patch strictly improves Django regardless of PyOxidizer.

(If there's interest, I would be happy to start a new thread to explain what I patched/overrode in Django to get my Django app working in PyOxidizer.)

> how we can avoid regressions and don't break this in the future?

My PR will add the following test case to tests/migrations/test_loader.py:LoaderTests.

    def test_loading_package_without__file__(self):
        """Migrations can be found even if the package's __file__ is undefined."""
        from migrations import test_migrations
        # __file__ == __spec__.origin or the latter is None and former undefined
        file = test_migrations.__file__
        origin = test_migrations.__spec__.origin
        has_location = test_migrations.__spec__.has_location
        try:
            del test_migrations.__file__
            test_migrations.__spec__.origin = None
            test_migrations.__spec__.has_location = False
            self.test_load()
        finally:
            test_migrations.__file__ = file
            test_migrations.__spec__.origin = origin
            test_migrations.__spec__.has_location = has_location


No changes to the existing LoaderTests.test_loading_namespace_package are required.

William Schwartz

unread,
Dec 28, 2020, 5:14:45 PM12/28/20
to Django developers (Contributions to Django itself)
In the hope that I have allayed this group's initial concerns, I have created ticket #32302 and pull request GH-13820. If you still have reservations, I am happy to discuss this matter further. Getting this merged in time for Django 3.2 would be a huge relief!
Reply all
Reply to author
Forward
0 new messages