Feature proposal: Allow shadowing of abstract fields

Marten Kenbeek

unread,

Feb 6, 2015, 5:48:55 PM2/6/15

to django-d...@googlegroups.com

Hey,

While fiddling with django I noticed it would be trivial to add support for shadowing fields from abstract base models. As abstract fields don't exist in the database, you simply have to remove the check that reports name clashes for abstract models, and no longer override the field.

This would allow you to both override fields from third-party abstract models (e.g. change the username length of AbstractUser), and to override fields of common abstract models on a per-model basis. Previously you'd have to copy the original abstract model just to change a single field, or alter the field after the model definition. The first approach violates the DRY principle, the second approach can introduce subtle bugs if the field is somehow used before you changed it (rare, but it can happen).

This patch adds the feature. It seems to work correctly when using the models and creating/applying migrations, and passes the complete test suite.

What do you guys think about this feature?

Russell Keith-Magee

unread,

Feb 6, 2015, 7:17:10 PM2/6/15

to Django Developers

Hi Marten,

I can understand the motivation, and the approach you've taken makes sense. It definitely strikes me as a much better alternative to tickets like #24288 than a setting.

https://code.djangoproject.com/ticket/24288

The DRY issue isn't a huge problem for me - it's certainly ungainly to not have a clean way to override an abstract field, but I'm not sure I see a particularly Pythonic alternative.

My only real concern is the potential for accidentally overriding something. This isn't especially likely on a field like "username" on a User model - but a more esoteric field, like "is_staff" would be easy to add twice, because it might not be obvious that the abstract model is providing the field that you're overriding. Some sort of safety catch (e.g., requiring override_abstract=True in an overriding definition) might be a way around this.

So - I think it's worth kicking around; I'm definitely interested in hearing if others have a similar reaction.

As far as a final patch goes, we'd definitely need some tests to validate that the model has, in fact, been overridden. I understand this is probably a proof-of-concept patch - I'm just flagging it in case you thought "it passes the current test suite" would be enough.

Yours,

Russ Magee %-)

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/f846521f-8a16-41c7-9998-2bea7c93b3dd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Marten Kenbeek

unread,

Feb 6, 2015, 9:33:12 PM2/6/15

to django-d...@googlegroups.com

Hi Russ,

I can see your point on accidentally overriding fields, though I'm not sure I agree. In any other case, such as normal attributes and methods, there is no safety net for overriding attributes either. Any model that would be affected by this change would also raise an error on import without the patch, so existing functional code won't be affected. On the other hand, a new parameter for the field wouldn't be that much of a hassle to implement or work with. I'd be interested to hear what others think about this.

There are more than a few questions on stack overflow that expect this behaviour, even if the docs specifically mention that it won't work. If users intuitively try to override fields in this manner, I think it would be reasonable to allow this without any explicit arguments.

We can always restrict what you can override, at least as the default behaviour, e.g. so that you can only use subclasses of the original field. That would make code that depends on the abstract field less prone to bugs, though subtle bugs can still happen if you e.g. override the max length of a field.

This was indeed just a proof-of-concept. That remark was meant more like "it doesn't appear to break everything".

Marten

Op vrijdag 6 februari 2015 23:48:55 UTC+1 schreef Marten Kenbeek:

Aron Podrigal

unread,

Feb 7, 2015, 11:26:27 PM2/7/15

to django-d...@googlegroups.com

Hi,

I also think so, that overriding a field should not need any special protection.

But how would you deal with inherited managers on the class that might refer to some overridden field that has changed?

Loïc Bistuer

unread,

Feb 8, 2015, 2:19:57 AM2/8/15

to django-d...@googlegroups.com

That's what we've done in Django 1.7 for Form/ModelForm (#19617, #8620), and I completely agree that it should be possible to shadow fields from abstract models. The docs even suggest that this may be possible in the future; quoting the relevant section: "this [shadowing] is not permitted for attributes that are Field instances (at least, not at the moment)."

For consistency with forms - and because we've put a great deal of thinking into it - the implementation should be: you can shadow a field with another field, or you can remove a field using None as a sentinel value (see #22510 for more details).

I believe we have a safety net in the form of migrations, if you accidentally shadow a field, migrations will pick it up, so it's unlikely to go unnoticed.

I'd be quite happy to shepherd this patch.

Unit tests should cover those scenarios:

- An abstract model redefines or removes some fields from a parent abstract model.
- A concrete model redefines or removes some fields from a parent abstract model.
- Ensure that the standard MRO resolution applies when you do Concrete(AbstractA, AbstractB) with AbstractA redefining a field from AbstractB.

The last case may be tricky because if I remember well ModelBase iterates through the MRO in the wrong direction (i.e. bases should be iterated over in reverse).

We also need some tests to show how migrations behave.

--
Loïc

> --
> You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
> To post to this group, send email to django-d...@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.

> To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/072149e1-c794-446d-957b-f5fc5df87096%40googlegroups.com.

Marten Kenbeek

unread,

Feb 8, 2015, 3:09:41 PM2/8/15

to django-d...@googlegroups.com

The general reaction seems to be a yes if we can work out the kinks, so I went ahead and created a ticket: https://code.djangoproject.com/ticket/24305

@Aron That's a good question. One option would be to lock certain fields, so that they can't be changed if they are an integral part of the model. That would be a simple solution, but that won't help for existing code that doesn't lock the fields. It won't break existing code, but it won't protect for errors either. The opt-in version (i.e. an 'unlock' attribute) would lock many fields which would otherwise be completely safe to overwrite.

Another option would be more elaborate "requirements" for a manager or some methods, i.e. allow the manager to specify the necessary class of a certain field or a minimum length. If the modeldoesn't meet the requirements, the manager or some of the methods will not be inherited. While it allows for more control, this option would greatly increase the complexity of the patch and requires more from the developers of custom managers. It can also cause issues when the requirements aren't up-to-date with the manager's methods.

We could also say that it is the users responsibility and don't provide special protection, in line with the fields themselves, but I guess that this would generally be more problematic for custom managers. It can also cause silent bugs when the manager's methods don't work as intended but won't raise an exception either, which is not a good idea imho.

I think the locking approach would be the easiest and most pragmatic method. I think it's still - in part - the users responsibility to confirm that a field can be overridden. The Django documentation could, where applicable, document the requirements on fields that can be overridden, i.e. that an AbstractUser's username must be a CharField (which isn't necessarily true, just an example).

@Loïc The bases are indeed traversed in the reversed order.

Op zondag 8 februari 2015 08:19:57 UTC+1 schreef Loïc Bistuer:

Marten Kenbeek

unread,

Feb 9, 2015, 4:25:22 PM2/9/15

to django-d...@googlegroups.com

I'd like some additional opinions on the following:

Say you have the following classes:

class Parent(models.Model):
    cousin_set = models.CharField(max_length=100)

    class Meta:
        abstract = True

class Child(Parent):
    pass

class Cousin(models.Model):
    child = models.ForeignKey(Child)

Obviously, the `cousin_set` field inherited from `Parent` clashes with the reverse relation from `Cousin`.

I can see the following options:

1) Allow the reverse object descriptor to overwrite the field.

2) Only allow the reverse object descriptor to overwrite the field if it is explicitly set to None on the target model.

3) Only allow the reverse object descriptor to overwrite the field if the foreignkey/m2m/o2o field itself has a flag set to explicitly override.

1) is consistent with the behaviour of local fields, but I think it will be problematic if other models can silently overwrite a field. 3) would still allow other models to affect local fields, but at least it has a fail-safe that prevents you from accidentally overriding fields. 2) would only allow the inheriting model itself to change which fields it inherits.

Problems caused by option 1) would be hard to debug when you don't know which code overrides your field, so I wouldn't do that. I think 2) would be the cleanest and most consistent way. Only local fields would override parent fields, but the sentinel value `None` would remove the field and free the name for reverse relations. I can also see the advantage of 3) over 2) when you don't have access to the model on the other side. However, I don't know enough about foreign key internals to know if 3) is at all feasible. What happens e.g. when only the target is loaded in a migration? Would it pick up that the remote foreign key overrides a local field? As adding reverse relations is a lazy, or at least delayed operation afaik, would it still be save to rely on that to override fields?

I believe my current plans for the patch would automatically create situation 2 without any extra work. The field would no longer exist on the child class when the reverse relation is added. Option 3) would require an additional patch to the code that adds the reverse relationship, but it allows for some extra options.

Any input? Additional options are also welcome.

Op zondag 8 februari 2015 21:09:41 UTC+1 schreef Marten Kenbeek:

Aron Podrigal

unread,

Feb 9, 2015, 8:28:58 PM2/9/15

to django-d...@googlegroups.com

Why should this be treated differently than the general behavior when realted_names clash when you have more than one foreign key to the same model? So as one would normally do

set the related_name explicitly to some other value.

setting the field to None is just the way of removing a field and has nothing more special related to the auto created reverse field descriptor.

about option 3 I didn't quite understand. can you explain a bit more?

Marten Kenbeek

unread,

Feb 10, 2015, 8:36:53 AM2/10/15

to django-d...@googlegroups.com

Hi Aron,

With option 3, this would work:

class Cousin(models.Model):
    child = models.ForeignKey(Child, allow_override=True)

but it would raise an error without `allow_override`.

I think my question really is: should we treat reverse relations as first-class fields, or not? If we do, 1) would be the logical choice but can cause problems. 3) would deal with those problems in a way and still allow a relation field from another model to override a local field. If we don't, that kind of implies 2). Removing the field with `None` would implicitly allow a reverse relation to take its place, but that reverse relation in and of itself would not be allowed to override any abstract fields.

This is just an example. In real code this would more likely happen with an explicit `related_name`. Renaming the related name and living with an unused `cousin_set` attribute is the current solution, but with this feature proposal it would be nice to also have a way to override the inherited field and use it for a reverse relation instead.

Op dinsdag 10 februari 2015 02:28:58 UTC+1 schreef Aron Podrigal:

Marten Kenbeek

unread,

Feb 12, 2015, 10:43:42 AM2/12/15

to django-d...@googlegroups.com

Current status:

Models in general seem to be working.
Trying to override a concrete (or explicitly locked) field with an abstract field will keep the concrete field, but not raise any errors. This occurs when inheriting from (AbstractBase, ConcreteBase).

I might still change this to raise an error, it migth cause a few headaches when fields don't appear while they're first in the mro.

Trying to override field.attname (e.g. foo_id) will fail checks unless you explicitly set foo = None.
Trying to override an abstract field with a reverse relation will fail checks unless you explicitly set foo = None.

What is still needed:

? More extensive model tests. During development a few more bugs popped up. I've written tests for them, but there might be some more.
Explicit migration tests. Existing migration tests pass, but don't include the new scenario's.
Documentation.

My WIP branch can be found at https://github.com/knbk/django/compare/ticket_24305.

Op dinsdag 10 februari 2015 14:36:53 UTC+1 schreef Marten Kenbeek:

Loïc Bistuer

unread,

Feb 12, 2015, 11:59:58 AM2/12/15

to django-d...@googlegroups.com

Hi Marten,

Sorry for the late response, it's been a busy week. Thanks for working on this, I'll try to have a look at your branch this weekend.

In the meantime I'll leave some comments below.

> On Feb 12, 2015, at 22:43, Marten Kenbeek <marte...@gmail.com> wrote:
>
> Current status:
> • Models in general seem to be working.
> • Trying to override a concrete (or explicitly locked) field with an abstract field will keep the concrete field, but not raise any errors. This occurs when inheriting from (AbstractBase, ConcreteBase).
> • I might still change this to raise an error, it migth cause a few headaches when fields don't appear while they're first in the mro.

This should definitely be an error. We recently added checks so two concrete parents can't have clashing fields, this is similar.

> • Trying to override field.attname (e.g. foo_id) will fail checks unless you explicitly set foo = None.

> • Trying to override an abstract field with a reverse relation will fail checks unless you explicitly set foo = None.

That's good.

> What is still needed:
> • ? More extensive model tests. During development a few more bugs popped up. I've written tests for them, but there might be some more.
> • Explicit migration tests. Existing migration tests pass, but don't include the new scenario's.
> • Documentation.
> My WIP branch can be found at https://github.com/knbk/django/compare/ticket_24305.
>

> Op dinsdag 10 februari 2015 14:36:53 UTC+1 schreef Marten Kenbeek:
> Hi Aron,
>
> With option 3, this would work:
>
> class Cousin(models.Model):
> child = models.ForeignKey(Child, allow_override=True)
>
> but it would raise an error without `allow_override`.
>
> I think my question really is: should we treat reverse relations as first-class fields, or not? If we do, 1) would be the logical choice but can cause problems. 3) would deal with those problems in a way and still allow a relation field from another model to override a local field. If we don't, that kind of implies 2). Removing the field with `None` would implicitly allow a reverse relation to take its place, but that reverse relation in and of itself would not be allowed to override any abstract fields.
>
> This is just an example. In real code this would more likely happen with an explicit `related_name`. Renaming the related name and living with an unused `cousin_set` attribute is the current solution, but with this feature proposal it would be nice to also have a way to override the inherited field and use it for a reverse relation instead.
>
>
> Op dinsdag 10 februari 2015 02:28:58 UTC+1 schreef Aron Podrigal:
> Why should this be treated differently than the general behavior when realted_names clash when you have more than one foreign key to the same model? So as one would normally do
> set the related_name explicitly to some other value.
> setting the field to None is just the way of removing a field and has nothing more special related to the auto created reverse field descriptor.
> about option 3 I didn't quite understand. can you explain a bit more?

+1 to what Aron said.

>
> On Monday, February 9, 2015 at 4:25:22 PM UTC-5, Marten Kenbeek wrote:
> I'd like some additional opinions on the following:
>
> Say you have the following classes:
>
> class Parent(models.Model):
> cousin_set = models.CharField(max_length=100)
>
> class Meta:
> abstract = True
>
> class Child(Parent):
> pass
>
> class Cousin(models.Model):
> child = models.ForeignKey(Child)
>
> Obviously, the `cousin_set` field inherited from `Parent` clashes with the reverse relation from `Cousin`.
>
> I can see the following options:
> 1) Allow the reverse object descriptor to overwrite the field.

No this would be very confusing.

> 2) Only allow the reverse object descriptor to overwrite the field if it is explicitly set to None on the target model.

Yes but as Aron pointed out we aren't specifically "allowing" the descriptor to override the field, it's just that the field was already removed (through None) so the reverse descriptor isn't clashing with anything.

> 3) Only allow the reverse object descriptor to overwrite the field if the foreignkey/m2m/o2o field itself has a flag set to explicitly override.

Let's not do that, it's clunky, I also don't like the locking mechanism mentioned above.

> To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/6e3e363f-bdc6-410b-b145-76085051e053%40googlegroups.com.

Marten Kenbeek

unread,

Feb 13, 2015, 9:28:53 AM2/13/15

to django-d...@googlegroups.com

Hi Loïc,

Thanks for the response. I will change my code to generate errors in the case of abstract fields shadowing concrete fields.

At the moment, the locking mechanism of fields is pretty much the core of this patch. It is explicitly set to `True` when fields are added to a concrete model, and it allows to easily check if a field can be changed. Without it, a proper, clean way would be needed to check the origin of the field. If for example an abstract model inheriting from a concrete model adds some fields, you should be able to shadow the new fields, but not the inherited fields. Making it internal would be clunky with the way cloning and serializing currently works.

I'm not too content with this anyway. As the default has to be `False` (for an opt-in behaviour for abstract fields, rather than an opt-out behaviour), this unnecessarily clutters migrations and other serialized values with `locked=True` for concrete classes. However, it's not trivial to consistently determine if a field is in any way inherited from a concrete class. I will look into a clean method to determine this.

I don't think we should remove it entirely, unless we provide another mechanism to prevent field changes for some abstract fields. E.g. django-polymorphic completely depends on the `polymorphic_ctype` field. That might be a rare name to use, but it shows that some code does depends on abstract fields. With the locking mechanism, we move the responsibility of ensuring the right behaviour of that field to the package developer rather than the website developer. I think a fail hard, fail fast approach is better for these situations than going along with the possibility of subtle or not-so-subtle bugs.

Op donderdag 12 februari 2015 17:59:58 UTC+1 schreef Loïc Bistuer:

Marten Kenbeek

unread,

Feb 16, 2015, 9:35:09 AM2/16/15

to django-d...@googlegroups.com

Okay, as things were getting more and more complicated I sat down for a while to get back to the essence of this patch. The result [1] is much, much cleaner and works perfectly. I'll start working on documentation as well.

[1] https://github.com/knbk/django/tree/ticket_24305_alternative

Op vrijdag 13 februari 2015 15:28:53 UTC+1 schreef Marten Kenbeek:

Loïc Bistuer

unread,

Feb 16, 2015, 10:26:30 AM2/16/15

to django-d...@googlegroups.com

Hi Marten,

As mentioned in my previous email I don't recommend an implementation based on locks.

Please look at how Form and ModelForm support inheritance and shadowing without it.

--
Loïc

> To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/621a3fc4-a90a-475c-ae12-97c980458bdb%40googlegroups.com.

Marten Kenbeek

unread,

Feb 16, 2015, 11:34:45 AM2/16/15

to django-d...@googlegroups.com

Hi Loïc,

The implementation itself no longer depends on locks. I wish it was as simple as form fields, but the idea is the same. The first commit (https://github.com/knbk/django/commit/7ac5b58587ea2a153766d1601965734731609cdf) in fact leaves out the lock feature. It's still an extra option in the second commit, to explicitly lock otherwise replaceable fields. I still think it's a good addition, but if you and others think it's not, I'll happily leave it out.

Slightly (but not entirely) unrelated:

At the moment, a lot of clashing fields don't throw an error in `ModelBase.__new__`, but rather return an error in `Model.check`. With backwards-compatibility in mind, what's the policy on changing this to throw an error in `ModelBase.__new__` instead? In both cases the models don't work, but it would change where and how the error is "raised". There's hardly any consistency either: if a new field clashes with a directly inherited field, it raises a `FieldError`, but when the clashing field is in a grandparent, or a directly inherited field clashes with another inherited field, there's no exception, just an error in `Model.check`.

I noticed this because I needed a lot of checks in my original patch to let some errors fall through and be caught by `Model.check` instead. I wouldn't mind looking into ways to refactor this, probably for a brand-new, unrelated patch, but I'm a bit hesitant about the errors. If we have to maintain the exact current behaviour, those extra checks for fall-through errors would more or less defeat the purpose of refactoring in the first place.

Marten

Op maandag 16 februari 2015 16:26:30 UTC+1 schreef Loïc Bistuer:

Aron Podrigal

unread,

Jul 24, 2015, 6:59:56 PM7/24/15

to Django developers (Contributions to Django itself)

Bumping up on this again, what are the plans for moving this ahead.

Tim Graham

unread,

Jul 24, 2015, 7:44:50 PM7/24/15

to Django developers (Contributions to Django itself), ar...@guaranteedplus.com

Loic indicated the latest approach isn't ideal and said he might work on the issue.

https://github.com/django/django/pull/4184

Podrigal, Aron

unread,

Jul 24, 2015, 8:03:59 PM7/24/15

to Django developers (Contributions to Django itself)

What was the approach Loic has planned? As Marten had an implementation here [1] just without the locking functionality.
Is all the work related to virtual fields done yet?

[1] https://github.com/knbk/django/commit/7ac5b58587ea2a153766d1601965734731609cdf

Tim Graham

unread,

Jul 24, 2015, 8:29:28 PM7/24/15

to Django developers (Contributions to Django itself), ar...@guaranteedplus.com

Oh, I guess Marten never got around to sending an updated pull request with that commit.

I don't know of any plans for work on virtual fields.

Reply all

Reply to author

Forward