[GSoC] Revamping validation framework and merging django-secure once again

1,660 views
Skip to first unread message

Christopher Medrela

unread,
Jun 3, 2013, 6:29:00 PM6/3/13
to django-d...@googlegroups.com
Hello!

I'm really happy that I've been accepted as a Google Summer of Code student. I
will work at revamping validation framework and merging django-secure this
summer. I created this thread to start discussion (especially about public
API) and get some feedback as well as to share progress. Any comments are
welcomed!

You can find my proposal as a gist here [1], but I'm going to describe my
ideas in short here.


My proposal consists of two parts: the first one is revamping validation
framework. Don't be confused with form validation - I mean validation of
(mainly) apps and models triggered before launching server (or directly by
`python manage.py validate` command). The main drawback of the existing
framework is that it's monolithic and therefore hard to customize. The second
part of my project is merging django-secure using the new refactored
framework. That will be a proof that the framework is customizable.

Let's start with validation. The validation is done every time you run your
Django application. It prevents from misconfigurations (for example missing
`upload_to` attribute of FieldFile) or invalid models (a model with a `id`
field without `primary_key` set to True). Some apps need custom validation,
for example admin app need to check all registered ModelAdmin instances. The
new validation framework will cover some other use cases. For example, your
apps will be able to check dependencies.

Let's start with validating a field. Consider existing FieldFile field. It
requires an `upload_to` attribute. How could you check if it's set using the
new framework?

class FieldFile(Field):
    (... lots of stuff ...)

    def validate_upload_to(self):
        if not self.upload_to:
            yield Error(obj=self, msg="required 'upload_to' attribute",
                explanation="Django need to know the directory where uploaded files should be stored.",
                hint="Add \"upload_to = 'path/to/directory/on/server'\" to model %(model)s in file %(model_file)s.")    

All methods starting with `validate_` will be called automagically at
validation stage. You don't have to trigger it manually. The methods shouldn't
have any parameters (excluding `self`). They should return list of errors and
warnings or they should yield each of them (like in the example).

You can see a new `Error` structure with the following fields: `obj`, `msg`,
`explanation` and `hint`. The last two attributes are separated from `msg` in
order to force Django developers to give its users really useful,
user-friendly error messages with a list of solutions and explanation of a
problem. I think that Django should be an intelligent framework that make a
conversation with its users.

The new framework will introduce a concept of warnings which are really
similar to errors expect that they allow you to run server. A warning have the
same fields: `obj`, `msg`, `explanation` and `hint`. Warnings will be used
while merging django-secure.

Not only fields can be validated, but also: models, managers and apps. In case
of models, a developer has to write a class method, because we validate a
model which is a class, not an instance. In case of apps, you can create
`validation.py` file which should contain functions.

After refactoring validation framework, I will focus on merging django-secure.
django-secure is a set of tests improving security of django applications. I
won't implement new features, that would be only fitting it to the new
framework. I will publish proposal of its new public API in the future.

Feel free to comment and criticize!

Carl Meyer

unread,
Jun 3, 2013, 7:00:27 PM6/3/13
to django-d...@googlegroups.com
Hi Christopher,

Overall, this looks like a great project and I look forward to the more
flexible validation hooks. As the author of django-secure, I do have one
concern with the implementation plan:

Django's existing validation checks are intended to be run in
development, as a debugging aid, but not in production (to avoid any
slowdown). Thus, by default they are run prior to many management
commands (such as syncdb, runserver), but are generally not run on
production deployments.

The checks in django-secure, on the other hand, are intended to be run
only under your production configuration, and are mostly useless/wrong
in development. Since runserver doesn't handle HTTPS, most people don't
use HTTPS in development, which means you can't set e.g.
SESSION_COOKIE_SECURE to True in development or your sessions will
break, which means that django-secure check will fail; same goes for
most of the other checks. Running django-secure's checks every time you
syncdb or runserver in development would be at best a waste of time and
at worst extremely annoying.

So I think the validation framework you build has to incorporate some
kind of distinction between these two types of validation, which have
very different purposes and should be run at different times:
development/debugging validation and production-configuration
validation. I'm not sure exactly what form this distinction should take:
perhaps validators are simply tagged as one type or the other and then
there are two different management commands?

Interested in your thoughts,

Carl

Christopher Medrela

unread,
Jun 11, 2013, 4:29:03 AM6/11/13
to django-d...@googlegroups.com
Hello!

*Making distinction between form validation and static validation.* I named
the process of static checks of apps, models etc. "validation". Unfortunately,
novices may confuse it with form validation. So I propose to rename it to
"verification". Functions/methods/classmethods (referred to as just functions)
starting with `validate_` will start with `verify_` now. And functions
discovering all other verification functions will be called `verify` instead
of `validate_all`. This is shorter than `validate_` (only 6 letters) and there
is no risk of confusing novices. I considered alternatives: "check" is too
general and "static_check" as well as "static_validate" are too long.

*Disabling/enabling parts of verification*
DEBUG setting is a hint about the environment; if you are working in
development environment, then it's likely that DEBUG is set to True; and you
have DEBUG set to False in production, I hope. By default, `verify_*`
functions will be run only when DEBUG is set to True. django-secure checks
will be decorated with new `run_only_in_production` decorator and will run
only when DEBUG is set to False.

However, I'm not sure if this is the best solution. There are some
disadvantages, i. e. I can imagine people saying "what to do if I want to run
security checks while DEBUG is set True?". Of course, they can set DEBUG to
False, but it's only a workaround. I afraid that it's only tip of the iceberg
-- we will probably need more control over which checks to run.

An another solution, much more flexible, is to pass around a "filter function"
which takes a `verify_*` function and says if the function should be called. A
drawback is that this forces developers to add an argument to every `verify_*`
function even though they don't want it. In order to avoid that we can
introspect the function and check if the function requires `filter` argument
and in that case call it without any argument, otherwise call it passing the
filter function as `filter` argument, but this is hacky, ugly and
non-pythonic; and if anybody wants to rewrite `verify` function (which collect
all `verify_` functions and run them), they need to know about that trick and
how to use it.

So we can split `verify_` functions into two groups: one which does the checks
and doesn't require the filter function and another one which finds all
`verify_` functions and run them. An example of the former is
`verify_upload_to_attribute` of `FieldFile`; and an example of the latter is
`verify` function. The latter requires the filter function as an argument. We
need to come out with name pattern for the former (I have no idea now, I will
propose something tomorrow). This is the most flexible solution, but it
increases complexity.

I prefer the last, most flexible solution. 

*Breaking points.* As Russell suggested me during private conversation, I will
try to merge my branch as often as possible. That should increase chance of
success. There is one obvious breaking points: after finishing revamping
validation.

Merging while working on the framework shouldn't be too hard. That is because
I will refactor bottom-up and keep all code and tests running and because I
won't touch any public API. Actually, merging can be done after every week.

First question: Django 1.6 is being released now and there is no 1.7
branch yet; will be there any 1.7 branch where I can merge to?

Another issue is if we can merge a branch when documentation is not finished.
I will create a new doc page at 3rd week and I will be completing it to the
end of 8th week. I guess that it's not allowed to merge code with partial doc.
A solution may be to merge code without doc and merge the doc only at the end.

My schedule didn't change, but I will paste it here:
  1. Rewriting tests of field validation (1 week)
  2. Rewriting field validation (1 week)
  3. Writing documentation (mainly overview) (0.5 week)
  4. Tests, implementation and documentation of models validation (1 week)
  5. Tests, implementation and documentation of apps validation (1 week)
  6. Rewriting validation of custom User model (0.5 week)
  7. Tests, implementation and documentation of manager validation (0.5 week)
  8. Rewriting validation of AdminModel and its tests (1 week)
  9. Rewriting mechanism of triggering validation framework (1 week)
  10. Finishing documentation (0.5 week)
I would like to make the first merging after the first week (rewriting tests of
field validation), because it will make me familiar with the process of
merging and will make next merging (which will be much bigger) easier. The
first week is only refactoring, no doc, no new features, so I guess it can be
merged.

If doc is not a problem then we can merge after 3. task (field validation
implemented) and after 6. task (models and app validation implemented). And of
course, in the end.

If it comes to the second part of my project (merging django-secure), it's
obvious there is only one merging point. :)

*Some changes to public API.* On last Friday (7 June 2013) I gave a small talk
about my project at pykonik (a meeting of Python programmers in Kraków where I
live). I was given some feedback.

I changed fields of Error and Warning classes. Previously I proposed splitting
error message into `msg` (short error message), `explanation` (why is some
situation disallowed?) and `hint` (list of solutions). We agreed that
`explanation` is redundant, so now I propose only two fields: `msg` and
`hint`. `hint` will be optional, so you can set it to None, but you have to do
it explicitly:

    Error("something is broken", hint=None)

I hope that this will force developers to thing carefully if they cannot give
any useful hint and that it will improve quality of error messages.

*django-securityDuring the meeting, somebody referred to django-security [1], 
and I will have a look at it soon. Implementing some features of django-security
may be a good idea if I finish merging django-secure earlier.

Russell Keith-Magee

unread,
Jun 12, 2013, 8:24:50 PM6/12/13
to django-d...@googlegroups.com
Hi Chris,

On Tue, Jun 11, 2013 at 4:29 PM, Christopher Medrela <chris....@gmail.com> wrote:
Hello!

*Making distinction between form validation and static validation.* I named
the process of static checks of apps, models etc. "validation". Unfortunately,
novices may confuse it with form validation. So I propose to rename it to
"verification". Functions/methods/classmethods (referred to as just functions)
starting with `validate_` will start with `verify_` now. And functions
discovering all other verification functions will be called `verify` instead
of `validate_all`. This is shorter than `validate_` (only 6 letters) and there
is no risk of confusing novices. I considered alternatives: "check" is too
general and "static_check" as well as "static_validate" are too long.

I can't say I'm completely in love with 'verify' as a name, but I can't say I've got any better options either. We definitely need to get away from "validate". I agree that "check" is a bit too generic, and appending something to 'check' is only going to make it unwieldy (although there's a certain beauty in having a generic name like 'check' for a generic 'checking' feature)

Lets go with verify for the moment; if the process of building this whole framework reveals another workable name, we'll be a search-and-replace away from a fix.
This is starting to sounds a bit like YAGNI (You Aint Gonna Need It) to me. I can see the flexibility that you're aiming to introduce, but frankly, I'm having difficulty thinking of a real world situation where this flexibility would actually need to be used.

There's an obvious "am I in production?" difference that needs to be accounted for, but I think the DEBUG=True/False should be enough to cover that. There's an edge case where I need to turn DEBUG off in development to test some particular feature, so there may be a need to have an option on the 'verify' command to ignore errors, or at least 'don't stop when you see an error'. The Django-secure approach of settings to silence specific errors may also be an approach here; if there's going to be a lot of these errors, collapsing 'acceptable errors' into a single setting might be an idea (analogous to the configuration of PyLint)

Adding a whole dynamic filtering framework for verification checks seems excessive to me unless you've got a specific use case in mind.

If we really want to protect ourselves from the future, we can take the same approach that model and form save() methods use -- define the official prototype to be verify(**kwargs). That way we retain the flexibility to introduce flags at some later date
 
*Breaking points.* As Russell suggested me during private conversation, I will
try to merge my branch as often as possible. That should increase chance of
success. There is one obvious breaking points: after finishing revamping
validation.

Merging while working on the framework shouldn't be too hard. That is because
I will refactor bottom-up and keep all code and tests running and because I
won't touch any public API. Actually, merging can be done after every week.

First question: Django 1.6 is being released now and there is no 1.7
branch yet; will be there any 1.7 branch where I can merge to?

We'll fork the Django 1.6 release branch once the beta is released (in theory, at some point in the next couple of weeks). At that point the Django 1.6 branch will be frozen for new features, and the 1.7 branch will be where new features land.
Merging incomplete features isn't an option, and having incomplete documentation counts as an incomplete feature. 

I'm also wary of the fact that we may need to change some architectural features during the development process. I was interested in the interim merge approach because having that granularity can help focus development effort onto deliverable chunks; however, it sounds like it will be difficult to take this approach for your project. We can assess this as we go, but lets work on the assumption that we're not going to do a mid-project merge.
 
*Some changes to public API.* On last Friday (7 June 2013) I gave a small talk
about my project at pykonik (a meeting of Python programmers in Kraków where I
live). I was given some feedback.

I changed fields of Error and Warning classes. Previously I proposed splitting
error message into `msg` (short error message), `explanation` (why is some
situation disallowed?) and `hint` (list of solutions). We agreed that
`explanation` is redundant, so now I propose only two fields: `msg` and
`hint`. `hint` will be optional, so you can set it to None, but you have to do
it explicitly:

    Error("something is broken", hint=None)

I hope that this will force developers to thing carefully if they cannot give
any useful hint and that it will improve quality of error messages.

I agree that having a message *and* a reason is overkill. I also like the idea of having an explicit API that encourages developers to provide a hint. My question would be whether that hint should be a string, or a list. If it's a list, you're encouraging the developer to think about multiple options for fixing the problem (e.g., rename the field *or* add a related_name), and having that structure in the error message itself gives you some flexibility in how to display the hints (e.g., you can print them one per line).
 
One extra thing we may need to add is an error identifier. If we're going to have a generic PyLint-like method to silence specific errors/warnings, it's important that we have a way to easily identify specific error types. PyLint assigns each rule an identifier (e.g., E121 - "Continuation line indentation is not a multiple of 4"), and if you decide that rule isn't important, you can  put "E121" in your error exclusion list.

*django-securityDuring the meeting, somebody referred to django-security [1], 
and I will have a look at it soon. Implementing some features of django-security
may be a good idea if I finish merging django-secure earlier.


I'm not familiar with django-security either; from a quick inspection, it seems to be more about security related features (such as password expiry) rather than security validation checks. Lets leave this on the todo pile for the moment. If we get to the end of the project and by some miracle we're ahead of schedule, we can look at this again. I'm sure we won't have a shortage of interesting things we could add as features.

Yours,
Russ Magee %-)

 

Christopher Medrela

unread,
Jun 26, 2013, 3:32:00 PM6/26/13
to django-d...@googlegroups.com
Hello!

I've deleted old `gsoc2013-verification` branch. Follow the new
`gsoc2013-checks` branch [1].


What did I do? I've rewritten field tests living in
`django.tests.invalid_models` package [2], inside `tests.py` file [3]. I've
renamed `invalid_models.invalid_models` submodule into `old_invalid_models`.
I've created new `invalid_models` submodule; it contains models required by the
rewritten tests.


Now I'm working on rewritting field checks. Now all checks are done in
`get_validation_errors` function [4]. They will be moved to relevant `Field`
subclasses during the current week. I've introduced `Warning` and `Error`
classes inside `django.core.checks` package.


Now I'm taking exams at university. I will have last exam on Friday (I hope), so
next week it will be easier to focus on GSoC and I will work full time.

Good-bye!

Christopher Medrela

unread,
Jul 3, 2013, 5:45:30 PM7/3/13
to django-d...@googlegroups.com
I'm finishing making field tests green. I'm a bit late, because the schedule
says that I should finish this job by the end of the previous week. My excuse is
that I had exams in the previous week so I couldn't focus on GSoC. Fortunately,
I passed all exams and now I can work full time on GSoC and catch up. This week
I will finish field tests (there are still 5 red tests and documentation needs to
written) and I will start working on model tests.

Today I had a conversation with Russell and we decided to simplify some issues.
For example, `check_*` methods won't yield errors -- they will just return list
of errors. What's more, `check` method will call all `check_*` methods
explicitly (now it uses introspection to find all methods starting with
`check_`).

Christopher Medrela

unread,
Jul 12, 2013, 7:45:09 AM7/12/13
to django-d...@googlegroups.com
Finally, I've rewritten all model and field tests and checks. Some new tests
were written. First draft of checking framework documentation was written. All
tests passes (at least at my computer). I've rebased my branch
(`gsoc2013-checks` [1]) against master. The branch is ready to deep review.


I've created pull request [2] for ticket [#19934]. The patch clarifies that
importing `django.utils.image` cannot result in ImportError. I've also created
ticket [#20735] about confusing ManyToManyField documentation. There is a pull
request too [3]. Both pull requests were merged.


1. I wrote some tests [4] where nose tests generators [5] would be really
helpful. I tried to emulate them. Is the code I wrote a good approach?


2. Two m2m through the same model are forbidden. The new error message is:

applabel.modelname: Two m2m relations through the same model.
The model has two many-to-many relations through the intermediary Membership
model, which is not permitted.
(No hint.)

Is there any reasonable hint we can provide except for suggesting duplicating
the intermediary model?

3. I've deleted `app_errors` [6] because it wasn't used anywhere (inside
Django). Was it left by accident or was there a reason for that?


4. I've renamed DatabaseValidation.validate_field to check_field and changed its
arguments. They were: `errors` -- ModelErrorCollection [7], `opts` -- meta of
the model of the field and `f` -- the field; The method returned None. Now there
is only `field` argument and the method returns list of errors. It's not public
API, but we should go through deprecation process.

How can we do that regarding changed arguments? We can call old `validate_field`
method where the backend specific validation is triggered [8]. By default, the
old method will call `check_field` method. The old one need to transform list of
errors into strings and add them to `errors` list. The problem is that at [8] we
need to convert the strings again to errors. That cannot be done in a graceful
way. Is there any other way to achieve deprecation?

Christopher Medrela

unread,
Jul 15, 2013, 8:17:40 AM7/15/13
to django-d...@googlegroups.com
Progress: I've implemented manager checks.

The schedule for the next 4 weeks is:

1. Manager checks (ref 4.1.7)(done)(0.5 week).
2. Entry point registering (ref 4.1.5) & rewriting mechanism of triggering
   checking framework (ref 4.1.9)(1.5 week).
3. Moving custom User model checks (ref 4.1.6)(0.5 week).
4. Rewriting AdminModel checks and tests (ref 4.1.8)(1 week).
5. Polishing doc (ref 4.1.10)(0.5 week).

(ref 4.1.x are references to my original proposal [1])


Let's talk about public API to register your own validation stuff. There will
be `register(callable)` function in `django.core.checks`. It will return
nothing. The callables will be stored in a global variable. They should obey
the same contract as `check` methods: they allow for `**kwargs` and return
list of errors and warnings. All registered callables will be called during
checking phase (i. e. when you type `manage.py validate` or before running
server).

This API allows us to register, among other things, app-specific checks. But
it's not necessary to write an app in order to do some checks.

`register` function will have two optional parameters: `run_in_production`
(default: False) and `run_in_development` (default: True) that let you specify
when the callable should be called. Most checks should be performed only in
development environment (which is equivalent to DEBUG==True). Security checks
makes sense only in production environment (<==> DEBUG==False).

I would be happy if I could hear your opinion!

Russell Keith-Magee

unread,
Jul 17, 2013, 4:08:40 AM7/17/13
to django-d...@googlegroups.com
Hi Chris,

On Fri, Jul 12, 2013 at 7:45 PM, Christopher Medrela <chris....@gmail.com> wrote:
Finally, I've rewritten all model and field tests and checks. Some new tests
were written. First draft of checking framework documentation was written. All
tests passes (at least at my computer). I've rebased my branch
(`gsoc2013-checks` [1]) against master. The branch is ready to deep review.


This is looking good! If you issue a pull request for this branch, I'll add some detailed comments there. When you make the pull request, make sure you add a comment noting that this is a work in progress associated with the GSoC, and isn't a candidate for merging (yet!).
 
I've created pull request [2] for ticket [#19934]. The patch clarifies that
importing `django.utils.image` cannot result in ImportError. I've also created
ticket [#20735] about confusing ManyToManyField documentation. There is a pull
request too [3]. Both pull requests were merged.


1. I wrote some tests [4] where nose tests generators [5] would be really
helpful. I tried to emulate them. Is the code I wrote a good approach?


I can see what you're doing here, but it's probably not the best approach. As implemented, there are two problems:

 1) there's going to be a redundant call to setUp and tearDown
 2) you only get one opportunity for a failure in all the accessor_check tests.

Given that you're only dealing with 6 tests (2 targets x 3 relatives) here, I'd be inclined to manually roll them out. When we've got Python 3.4 as a minimum dependency, we can start using subunit tests [1], but until then, we can live with a small amount of manual code rollout.

 
2. Two m2m through the same model are forbidden. The new error message is:

applabel.modelname: Two m2m relations through the same model.
The model has two many-to-many relations through the intermediary Membership
model, which is not permitted.
(No hint.)

Is there any reasonable hint we can provide except for suggesting duplicating
the intermediary model?

The hint is optional, so there's not much point having a hint unless it's actually providing useful information that isn't implicit in the error message. You can't have 2 m2m relations through the same intermediate model. The fix - don't have 2 m2m relations through the same intermediate model. 

The only practical advice we could give here would be to duplicate the intermediate model, but that doesn't strike me as something that is likely to be correct advice under most circumstances, so I'm hesitant to provide it as default advice.
 
3. I've deleted `app_errors` [6] because it wasn't used anywhere (inside
Django). Was it left by accident or was there a reason for that?


I'm not aware of anywhere that it is being used; I'm guessing it exists for historical reasons, but I have no idea what those are. If the tests still pass when you've removed this code, I see no reason to keep it around.
 
4. I've renamed DatabaseValidation.validate_field to check_field and changed its
arguments. They were: `errors` -- ModelErrorCollection [7], `opts` -- meta of
the model of the field and `f` -- the field; The method returned None. Now there
is only `field` argument and the method returns list of errors. It's not public
API, but we should go through deprecation process.

How can we do that regarding changed arguments? We can call old `validate_field`
method where the backend specific validation is triggered [8]. By default, the
old method will call `check_field` method. The old one need to transform list of
errors into strings and add them to `errors` list. The problem is that at [8] we
need to convert the strings again to errors. That cannot be done in a graceful
way. Is there any other way to achieve deprecation?


I'd suggest that we ship both APIs -- validate_field() unmodified to support older codebases (but internally documented to indicate that we're moving away from that approach), and a new check_field() method that does things the New Way.

As a migration aid, the default implementation of check_field() could invoke validate_field() and just convert the output into the new format -- it should be possible to retrieve all the required arguments from the data you already have available, so it's just a matter of taking the 'list of error strings' and converting it into a 'List of Error objects'. 

If a third party developer wants to provide richer support (including warnings, hints etc) then they can write a check_field() implementation in parallel to the existing validate_fields() call.

Yours,
Russ Magee %-)

Russell Keith-Magee

unread,
Jul 17, 2013, 4:20:47 AM7/17/13
to django-d...@googlegroups.com
On Mon, Jul 15, 2013 at 8:17 PM, Christopher Medrela <chris....@gmail.com> wrote:
Progress: I've implemented manager checks.

The schedule for the next 4 weeks is:

1. Manager checks (ref 4.1.7)(done)(0.5 week).
2. Entry point registering (ref 4.1.5) & rewriting mechanism of triggering
   checking framework (ref 4.1.9)(1.5 week).
3. Moving custom User model checks (ref 4.1.6)(0.5 week).
4. Rewriting AdminModel checks and tests (ref 4.1.8)(1 week).
5. Polishing doc (ref 4.1.10)(0.5 week).

(ref 4.1.x are references to my original proposal [1])


Let's talk about public API to register your own validation stuff. There will
be `register(callable)` function in `django.core.checks`. It will return
nothing. The callables will be stored in a global variable. They should obey
the same contract as `check` methods: they allow for `**kwargs` and return
list of errors and warnings. All registered callables will be called during
checking phase (i. e. when you type `manage.py validate` or before running
server).

I think you're on the right track here.

The only other viable approach I can think of would be a setting (SYSTEM_CHECKS) that is a list of functions to be invoked. However, this means installing a new app that contained a set of checks would mean modifying multiple settings (INSTALLED_APPS, SYSTEM_CHECKS, and possibly others for templates etc)

This is also true of the registration approach -- the only difference is where you put the call to register. However, looking into the medium-term future, app refactor will eventually land, and at that point, we'll have a natural point to put all these registrations -- the app definition itself would be able to register the additional checks. 

To that end, I think a registration based approach will ultimately be better. And it isn't something that most users will be exposed to anyway -- all the pre-existing checks can be pre-registered, and admin can do it's registration as part of the autodiscover process.

This API allows us to register, among other things, app-specific checks. But
it's not necessary to write an app in order to do some checks.

`register` function will have two optional parameters: `run_in_production`
(default: False) and `run_in_development` (default: True) that let you specify
when the callable should be called. Most checks should be performed only in
development environment (which is equivalent to DEBUG==True). Security checks
makes sense only in production environment (<==> DEBUG==False).

I would be happy if I could hear your opinion!

I don't think we need to worry about 'in production' level checks at time of registration. 

The default behaviour should always be to run all checks when ./manage.py check is invoked. There are some very specific checks (such as security checks) that only make sense in production, but they're the exception, not the rule. These checks could be skipped using a decorator when they're defined - essentially:

def only_in_production(fn):
    def _dec(*args, **kwargs):
        if settings.DEBUG == False:
            return fn(*args, **kwargs)
        return []
    return _dec

@only_in_production
def security_checks(…):
    …

This decorator could be applied to the security check, so it becomes a no-op when it is executed in development. This can be determined by the check itself - I can't think of any circumstance where a user registering a check would need to have control over whether a check would be executed at all.

Yours,
Russ Magee %-)

Christopher Medrela

unread,
Jul 17, 2013, 5:16:51 AM7/17/13
to django-d...@googlegroups.com
I've created a pull request [1] to make deep review easier. I've rolled out tests we were talking about and reverted `validate_field` -- there exist both `check_field` and `validate_field`, the first calling the second.

Christopher Medrela

unread,
Jul 23, 2013, 4:17:22 PM7/23/13
to django-d...@googlegroups.com
Progress: I've implemented registering entry points. Now there is
`django.core.checks.register(callable)` function. There is no
`run_in_development` and `run_in_production` arguments. I've also rewritten
mechanism of triggering checking framework -- `BaseCommand.validate` calls
`django.core.checks.run_checks` instead of `get_validation_errors` (which was
deleted) now.

Questions:

1. BaseCommand.validate have an `app` optional argument to run validation of a
particular app (if app==None then all checks are performed). Unfortunately, with
the current checking framework, we are not able to run checks for a particular
app. The attribute isn't used anywhere in Django except for three tests [1] and
I don't think that this is heavily used by third party commands. So I propose to
deprecate this attribute if it's possible.


2a. Warnings are printed with bold, yellow foreground now (errors use red
color). Is it a good choice?

2b. The rules of formatting error messages are that the error message (that may
be multiline) is followed by the hint in the next line, i. e.:

    app_label.model_label: Error message.
    Error message can be multilined. In that case, the first line should be a short description and the rest should be a long description.
    HINT: Hints are printed in a new line.

If hint is missing then there is no last line. If the invalid object is a field
or a manager, then the error message starts with
`app_label.model_label.field_or_manager_name: `. If it's neither a model, a
manager nor a field, then '?: ' is printed. Do you have any opinion about this
style?

Now I will focus on moving custom user model checks from Model class to auth app
(that should be easy), then rewriting admin checks and finally finishing first
iteration (mainly polishing docs).

Russell Keith-Magee

unread,
Jul 23, 2013, 11:40:09 PM7/23/13
to django-d...@googlegroups.com
On Wed, Jul 24, 2013 at 4:17 AM, Christopher Medrela <chris....@gmail.com> wrote:
Progress: I've implemented registering entry points. Now there is
`django.core.checks.register(callable)` function. There is no
`run_in_development` and `run_in_production` arguments. I've also rewritten
mechanism of triggering checking framework -- `BaseCommand.validate` calls
`django.core.checks.run_checks` instead of `get_validation_errors` (which was
deleted) now.

Questions:

1. BaseCommand.validate have an `app` optional argument to run validation of a
particular app (if app==None then all checks are performed). Unfortunately, with
the current checking framework, we are not able to run checks for a particular
app. The attribute isn't used anywhere in Django except for three tests [1] and
I don't think that this is heavily used by third party commands. So I propose to
deprecate this attribute if it's possible.


What's the technical cause of needing to drop this feature? Is it because checks are no longer guaranteed to be "app specific"? Or is there some other cause?

At the end of the day, I don't especially mind if we have to drop this as a feature for the new checks framework. Checks really are something that should be run in their entirety, and if you get any errors, then . However, I can see how some people might have use for running a subset of checks against a subset of the project. If we need to drop this feature, I want to make sure we're not dropping it for a good reason, not just because it would be harder to implement. 
 
2a. Warnings are printed with bold, yellow foreground now (errors use red
color). Is it a good choice?

Sounds good to me.
 
2b. The rules of formatting error messages are that the error message (that may
be multiline) is followed by the hint in the next line, i. e.:

    app_label.model_label: Error message.
    Error message can be multilined. In that case, the first line should be a short description and the rest should be a long description.
    HINT: Hints are printed in a new line.

If hint is missing then there is no last line. If the invalid object is a field
or a manager, then the error message starts with
`app_label.model_label.field_or_manager_name: `. If it's neither a model, a
manager nor a field, then '?: ' is printed. Do you have any opinion about this
style?
 
Sounds good to me. My only additional comment would be to indent the 2nd+ lines slightly, so that you can easily identify the first lines of a large number of multi-line errors/warnings. It might also be worth having different indentation levels for continuation, and for starting the hint. For example:

app_label.model_label: Error message.
        Error message can be multilined. In that case, the first line should be 
        a short description and the rest should be a long description.
    HINT: Hints are printed in a new line. If hints can be multiline,
        then they should also be indented to a new level
 

Yours,
Russ Magee %-)

Shai Berger

unread,
Jul 24, 2013, 8:12:39 PM7/24/13
to django-d...@googlegroups.com
Hi Christopher,

While you're dealing with model validation, I wonder if you can take a look at
this little example -- a minor failure in the current model validation:

class General(models.Model):
name = models.CharField(max_length=30)

class Special(General):
pass

class SpecialDetail(models.Model):
parent = models.ForeignKey(Special)
target = models.ForeignKey(General)


Model validation fails (as it should), but the error message is

specialdetail: accessor for field 'parent' clashes with
'Special.specialdetail_set'

when in fact the clash is with 'General.specialdetail_set' generated by the
'target' field (and inherited by Special). Of course, when isolated like this,
it is very easy to spot where the real problem is, but when the models have a
little more content in them, this can be (and actually was) quite perplexing.

Thanks,
Shai.

Russell Keith-Magee

unread,
Jul 24, 2013, 8:37:06 PM7/24/13
to django-d...@googlegroups.com
Hi Shai,

Could I get you to open this as a ticket so that it isn't forgotten?

Russ %-)


--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
For more options, visit https://groups.google.com/groups/opt_out.



Shai Berger

unread,
Jul 27, 2013, 5:33:13 PM7/27/13
to django-d...@googlegroups.com
On Thursday 25 July 2013 08:37:06 Russell Keith-Magee wrote:
>
> Could I get you to open this as a ticket so that it isn't forgotten?
>
https://code.djangoproject.com/ticket/20814

Thanks,
Shai.

Christopher Medrela

unread,
Aug 2, 2013, 4:50:22 AM8/2/13
to django-d...@googlegroups.com
Unfortunately, I'm a bit late. I didn't suspected that polishing code after
review takes so much time. Lots of my work from last Wednesday was small
improvements, but there are some vital changes:

The API will be consistent with API of logging module or message framework.
Being consistent means that system check framework will be familiar to lots of
developers. The change is that every message has its `level` now -- one of
CRITICAL, ERROR, WARNING, INFO or DEBUG integer values. Django users may add
their own levels (like SUPER_CRITICAL) since `level` is an integer.

This change affected output format. All messages are sorted by level (and then
by source of problem), so critical errors are first on the list and there is no
opportunity to miss an important error in the middle of long list of warnings.

There is some stuff that still need to be done as part of reviewing, i. e.
deprecating `BaseCommand.validate` in favour of `check` or adding tests for
`GenericForeignKey`. After finishing polishing I will focus on moving custom
User model checks to auth apps, and then on rewritting admin checks.

Note that there is a lot of discussion on the pull request [1].

Christopher Medrela

unread,
Aug 6, 2013, 4:03:05 PM8/6/13
to django-d...@googlegroups.com
I'm still working at polishing after reviewing. I've deprecated
`requires_model_validation` and `validate`. I've started at adding tests for
contenttype fields: `GenericForeignKey` and `GenericRelation`.

I've updated gsoc2013-checks-review branch [1]. Now it's the same as
gsoc2013-checks branch [2]. I will push new work to the latter while the former
will remain unchanged. I'm working at contenttype tests in
gsoc2013-checks-contenttypes [3] branch. The work is not finished yet and there
are some failing tests.

@Shai Berger: Thank you for creating the ticket. I'm sorry that I procrastinated accepting it -- I finally did it and proposed a patch. [4]


Questions:

1. Output formatting. We decided that every error/warning will take one line plus
additional one for a hint if it's provided. The justification is that a Django
user may type "grep HINT" to filter all hints. But now I think it's unpractical
since the lines with hints doesn't say which object is invalid. So we can: (1)
put hint in the same line as the error message or (2) change the format to sth
like this:

    applabel.modellabel: Error message.
    applabel.modellabel:         HINT: Hint.

2. Is it allowed to use `GenericRelation` pointing to a model if the model lacks
`GenericForeignKey`?

3. I've added unicode_literals import to django/core/management.py but this
affected `BaseCommand.verbosity` default value. In order not to break commands,
I left the attribute as a bytestring [5]. Changing it to an unicode breaks some admin_scipts tests, i. e. `CommandTypes.test_app_command` expects in stdout:

    ... options=[('pythonpath', None), ... ('verbosity', '1')]

while the command prints:

    ... options=[('pythonpath', None), ... ('verbosity', u'1')]

Russell Keith-Magee

unread,
Aug 6, 2013, 10:04:55 PM8/6/13
to django-d...@googlegroups.com
On Wed, Aug 7, 2013 at 4:03 AM, Christopher Medrela <chris....@gmail.com> wrote:
I'm still working at polishing after reviewing. I've deprecated
`requires_model_validation` and `validate`. I've started at adding tests for
contenttype fields: `GenericForeignKey` and `GenericRelation`.

I've updated gsoc2013-checks-review branch [1]. Now it's the same as
gsoc2013-checks branch [2]. I will push new work to the latter while the former
will remain unchanged. I'm working at contenttype tests in
gsoc2013-checks-contenttypes [3] branch. The work is not finished yet and there
are some failing tests.

@Shai Berger: Thank you for creating the ticket. I'm sorry that I procrastinated accepting it -- I finally did it and proposed a patch. [4]


Questions:

1. Output formatting. We decided that every error/warning will take one line plus
additional one for a hint if it's provided. The justification is that a Django
user may type "grep HINT" to filter all hints. But now I think it's unpractical
since the lines with hints doesn't say which object is invalid. So we can: (1)
put hint in the same line as the error message or (2) change the format to sth
like this:

    applabel.modellabel: Error message.
    applabel.modellabel:         HINT: Hint.

For my money, the label on the second line isn't needed. I know I gave grep as a use case, but in retrospect, that's probably isn't as useful as you'd think. The reason to have the newline break with an indent is so that theres a visual cue for any hint. It makes the hint stand out. Putting a prefix on that line obscures this.

applabel.modellabel: Error message
    HINT: the hint
 
2. Is it allowed to use `GenericRelation` pointing to a model if the model lacks
`GenericForeignKey`?

In theory, I suppose you could, as long as there was a pair of object and content type fields. However, in practice, I don't think this is a very likely use case. I'd be completely comfortable if you confirmed the existence of a GenericForeignKey. Raising this as a warning might be a good approach here.

3. I've added unicode_literals import to django/core/management.py but this
affected `BaseCommand.verbosity` default value. In order not to break commands,
I left the attribute as a bytestring [5]. Changing it to an unicode breaks some admin_scipts tests, i. e. `CommandTypes.test_app_command` expects in stdout:

    ... options=[('pythonpath', None), ... ('verbosity', '1')]

while the command prints:

    ... options=[('pythonpath', None), ... ('verbosity', u'1')]

I can see what you're doing here, but I'm not sure it's needed. u'1' == '1' under Python 2.7, and under Python 3 it's all unicode anyway. 

Yours,
Russ Magee %-)

Christopher Medrela

unread,
Aug 14, 2013, 12:26:39 PM8/14/13
to django-d...@googlegroups.com
Progress.

- Deprecated `requires_model_validation` flag and `validate` method (both
  `BaseCommand` members) in favour of new `requires_system_checks` flag and
  `check` method.

- Finished working at `contenttypes` tests.

- Improved code thanks to Preston Holmes comments. Deleted dead code and added
  some new tests to improve code coverage.

It'd be nice to have checks of clashes between GenericForeignKey and
accessors. I didn't implemented it because little time left and I need to
hurry up.

When it was easy, I've added new tests to improve code coverage. However, there
are still some tests that would be nice to have:

- Test for raising an error while using an unique varchars longer than 255
  characters under MySQL backend. [1]

- Test for `ImageField`. When `Pillow` is not installed and `ImageField` is in
  use, an error should be raised. This should be tested. [2]

- Test for raising warning/ImproperlyConfigured in `BaseCommand.__init__`. [3]


Filtering and silencing errors/warnings. We need two features: ability
to run only a subset of checks (aka filtering errors) and ability to silence
some errors.

Silencing is easy. Every warning will have a unique name like "W027". The format
of the name is letter "W" followed by a unique number. The system check
framework is open ended and third-party apps can register its own checks. For
warnings raised by these apps, "Wnumber.applabel" (e.g. "W001.myapp") style will
be used. Of course, everything can be changed and I'm open to yours suggestions,
so feel free to comment and criticize it.

There will be a new setting called `SILENCED_WARNINGS`. If you want to silence a
warning, you put its name in this setting, e.g.:

    SILENCED_ERRORS = [
        'W027',
    ]

Only light messages (warnings, infos and debugs) can be silenced.

Running only a subset of check is a more complex task. We can associate every
check with a set of tags (that can be done while registering checks). To run
only checks tagged "mytag" or "mysecondtag", you need to type:

    manage.py check mytag mysecondtag

However, we would like to run checks of an app (or a set of apps):

    manage.py check auth admin

This is hard, because checks are no longer app-specific. I propose to solve this
by passing an `apps` optional keyword argument to every check function. The
function should only validate specified apps (or all apps if apps==None). The
only problem is how to determine if we deal with a tag or app name? Consider
this command:

    manage.py check auth mytag

This should run all checks tagged "mytag". Only messages for `auth` app should
be reported. I propose to collect all tags while registering check functions and
if a string is one of them, then interpret it as a tag, otherwise assume it's an
app name (and check if an app with given name exists).

Problems/questions.

1. We decided to mimic message and logging framework, so every error is tagged
with a level. Its value is an integer indicating how important and serious is
a message. There are five predefined values: CRITICAL, ERROR, WARNING, INFO,
DEBUG. However, Preston Holmes noticed that this is unpractical because we
need only errors and warnings. I think we should discuss again if it's worth
mimicing these frameworks.

2. On the pull request we started discussing names. "System check" is better than
"check" but it suggest that it's connected with hardware. Preston proposed
"Startup checks". I don't have strong opinion. To be honest, I don't thing
"startup checks" is much more better than "system checks" so I will leave it as
it is, but I'm still open to suggestions and I would like to see yours opinion.

3. There are some problems with moving custom User model checks. Their first source
was that some tests override `INSTALLED_APPS` setting but don't override list of
registered checks. So checks of `auth` app were registered (because the app was
imported by some other tests), but this app wasn't installed. This ended in an
error, because checks of `auth` try to load User model which is, by default,
`auth.User` and `auth` is not installed. I've solved this by overriding list of
registered checks (I've introduced `override_system_checks` decorator).

However, there is still one red test -- `admin_scripts.test_complex_app` [4].
The problem is that this test spawns a new Django process and the decorator
cannot affect this new process. This test installs two apps (`complex_app` and
`simple_app`) and doesn't install `auth` app. However, `auth` app is imported,
because `complex_app` imports `admin` app which imports `auth` app. The
simplest solution is just to add `auth` app to `INSTALLED_APPS`.


This shows a drawback of the current registration approach. Every time an app
is imported but not installed, its checks are registered, but they shouln't
be. I think we should rethink how checks should be registered. An alternative
can be Russell's solution -- create a new setting `SYSTEM_CHECKS` containing
list of all checks. This has one small drawback -- when you want to install an
app with custom checks, you need to put it in both `INSTALLED_APPS` and
`SYSTEM_CHECKS`; but we can overcome it by implicit extending `SYSTEM_CHECKS`
with app-specific checks. So, I can write:

    INSTALLED_APPS = [
        'auth',
    ]

And all checks of `auth` app will be automagically registered. Of course, we
need some public API for retrieving app checks. We can reuse the existing
registering mechanism and add `app` argument to `checks.register` function
indicating that this one particular check should be run only if app `app` is
installed.

4. In my last post I mentioned that there is a problem with
`BaseCommand.verbosity`. I tried to convert its default value ('1') into an
unicode, but I failed. This test command [5] prints its arguments to stdout,
so the output is something like this:

    ... options=[('pythonpath', None), ... ('verbosity', '1')]

But when you change `verbosity` from a bytestring to an unicode then the output
is (only under Python 2):

    ... options=[('pythonpath', None), ... ('verbosity', u'1')]

Russell proposed to replace `sorted(options.items())` in [5] with:

    sorted((text_type(k), text_type(v)) for k, v in options.items())

This is a solution, but it makes tests less strict. There is no way to
distinguish between `None` value and "None" string -- both are printed in the
same way.

I spent some time to find a better approach but with no success. The least
insane way is to check if the object to print is an unicode and if it is,
print it in the Python 3 style (without starting "u"); otherwise print it
normally.

Note that this is a more general problem. I've hit the same problem when I tried
to write a test for `CheckMessage.__repr__` method. I wrote:

    def test_repr(self):
        e = Error("Error", hint="Hint", obj=None)
        expected = "<CheckMessage: level=40, msg='Error', hint='Hint', obj=None>"
        self.assertEqual(repr(e), expected)

But under Python 2, `expected` variable should be:

        expected = "<CheckMessage: level=40, msg=u'Error', hint=u'Hint', obj=None>"


5. I've deleted one filter in [6]. I think it's useless. If
`self.related.get_accessor_name` is None, then `self.rel.is_hidden()` is true,
so this is a dead if. Am I right?


6. When `ForeignKey.foreign_related_fields` is a list of more than one element? If
it's always a one-element list, then we don't need lines 1443-1457 in
`ForeignKey._check_unique_target`. [7]


7. It looks like the code I've deleted in this commit [8] was dead, because if you
refer to an inherited field, an FieldDoesNotExist exception is raised by
`get_field` method. Is that right?


8. I've added a new check. If you're using a `GenericRelation` but there is no
`GenericForeignKey` on the target model, an warning is raised. This check was
implemented in this commit [9]. It uses `vars` builtin function to check if the
target model has a `GenericForeignKey`. This is ugly, but I don't see a better
approach.


Schedule. There is little time to the end of GSoC -- only 5 weeks and I will
work 50% of full speed after September 6, because I'm going on holiday.

- Rewriting admin checks (1 week).
- Implementing filtering and silencing errors (1 week).
- Merging django-secure (2 weeks?!).
- Polishing doc, last review (1 week).

I will implement filtering and silencing errors before merging django-secure
because these are a must if we want to merge django-secure.

Russell Keith-Magee

unread,
Aug 15, 2013, 8:45:35 PM8/15/13
to django-d...@googlegroups.com
Hi Christopher,

On Wed, Aug 14, 2013 at 11:26 PM, Christopher Medrela <chris....@gmail.com> wrote:
Progress.

- Deprecated `requires_model_validation` flag and `validate` method (both
  `BaseCommand` members) in favour of new `requires_system_checks` flag and
  `check` method.

- Finished working at `contenttypes` tests.

- Improved code thanks to Preston Holmes comments. Deleted dead code and added
  some new tests to improve code coverage.

It'd be nice to have checks of clashes between GenericForeignKey and
accessors. I didn't implemented it because little time left and I need to
hurry up.

This one is entirely understandable; it's a complex area, but it's an area that isn't currently being validated. If someone wants to contribute validation for this once this GSoC project is over, it would make a nice contribution. It would be helpful if you log this sort of known omission as a bug, and flag the bug as an "easy pickings" -- that means when we get to a sprint and a developer who is new to Django contribution is looking for something to do, we can point them at this problem and suggest they take a look.
 
When it was easy, I've added new tests to improve code coverage. However, there
are still some tests that would be nice to have:

- Test for raising an error while using an unique varchars longer than 255
  characters under MySQL backend. [1]

- Test for `ImageField`. When `Pillow` is not installed and `ImageField` is in
  use, an error should be raised. This should be tested. [2]

- Test for raising warning/ImproperlyConfigured in `BaseCommand.__init__`. [3]



Same goes for these three -- if you can raise a ticket for these testing omissions, we can get someone else to look at them at a later date. 
 
Filtering and silencing errors/warnings. We need two features: ability
to run only a subset of checks (aka filtering errors) and ability to silence
some errors.

Silencing is easy. Every warning will have a unique name like "W027". The format
of the name is letter "W" followed by a unique number. The system check
framework is open ended and third-party apps can register its own checks. For
warnings raised by these apps, "Wnumber.applabel" (e.g. "W001.myapp") style will
be used. Of course, everything can be changed and I'm open to yours suggestions,
so feel free to comment and criticize it.

There will be a new setting called `SILENCED_WARNINGS`. If you want to silence a
warning, you put its name in this setting, e.g.:

    SILENCED_ERRORS = [
        'W027',
    ]

I'd suggest SILENCED_SYSTEM_CHECKS here -- in your example here you've already used SILENCED_WARNINGS and SILENCED_ERRORS, which points out the potential for confusion. Including the full name is also helpful, just to make clear that we're not silencing HTTP500s (other any other errors) here.

As for the numbering -- I'd be inclined to use "myapp.W123", not the other way around -- put the namespace in front of the error code.

Only light messages (warnings, infos and debugs) can be silenced.

Running only a subset of check is a more complex task. We can associate every
check with a set of tags (that can be done while registering checks). To run
only checks tagged "mytag" or "mysecondtag", you need to type:

    manage.py check mytag mysecondtag

However, we would like to run checks of an app (or a set of apps):

    manage.py check auth admin

This is hard, because checks are no longer app-specific. I propose to solve this
by passing an `apps` optional keyword argument to every check function. The
function should only validate specified apps (or all apps if apps==None). The
only problem is how to determine if we deal with a tag or app name? Consider
this command:

    manage.py check auth mytag

This should run all checks tagged "mytag". Only messages for `auth` app should
be reported. I propose to collect all tags while registering check functions and
if a string is one of them, then interpret it as a tag, otherwise assume it's an
app name (and check if an app with given name exists).
 
I'd be inclined to use command line arguments for the tags here -- you don't want to have ambiguity between the "mytag" tag and the mytag app.

./manage.py check mytag --check=mytag --check=upgrade --check=security

(not completely sure I like the --check name itself, but I hope the point is clear)

Problems/questions.

1. We decided to mimic message and logging framework, so every error is tagged
with a level. Its value is an integer indicating how important and serious is
a message. There are five predefined values: CRITICAL, ERROR, WARNING, INFO,
DEBUG. However, Preston Holmes noticed that this is unpractical because we
need only errors and warnings. I think we should discuss again if it's worth
mimicing these frameworks.

I disagree -- We may not *currently* have a use for CRITICAL or INFO, but I can see how they might be useful . For example, a missing ALLOWED_HOSTS setting isn't just an error - it's a serious problem that needs to be addressed. It's worth drawing *major* attention to problems of this nature. 

I can see how INFO might also be useful for advisories -- for example, warning someone that a particular combination of settings may lead to poor performance (e.g., db_index turned off on a foreign key).

Even if there wasn't a need for these levels, the basic concept behind the error framework strikes me as "close enough" to logging and messages that there's an implicit documentation benefit from following the same API design. 
 
2. On the pull request we started discussing names. "System check" is better than
"check" but it suggest that it's connected with hardware. Preston proposed
"Startup checks". I don't have strong opinion. To be honest, I don't thing
"startup checks" is much more better than "system checks" so I will leave it as
it is, but I'm still open to suggestions and I would like to see yours opinion.

My objection to "system checks" isn't as strong as Preston's; I could certainly live with "startup checks" as a name. There's a point at which this becomes a bikeshed argument, though.
I can see the problem you're referring to, but in practice, I don't think it will be that big an issue. 

The problem you describe is only an issue where:

 * An app has checks
 * The app library has been imported somewhere
 * The app *isn't* in INSTALLED_APPS

I'm having difficulty thinking of a situation where this is a problem in practice. In the case of auth, I can't think of any use case that would allow contrib.auth to be imported, but NOT be in INSTALLED_APPS -- if you're using *any* of auth -- even if you're using a custom User model -- you need to have auth in INSTALLED_APPS.

Looking slightly longer term, once App Refactor lands, registering checks is something that we can do as part of app configuration, which will remove the problem entirely. 

The specific problem with the admin_scripts test strikes me as something that is a problem with the test, not the concept.

4. In my last post I mentioned that there is a problem with
`BaseCommand.verbosity`. I tried to convert its default value ('1') into an
unicode, but I failed. This test command [5] prints its arguments to stdout,
so the output is something like this:

    ... options=[('pythonpath', None), ... ('verbosity', '1')]

But when you change `verbosity` from a bytestring to an unicode then the output
is (only under Python 2):

    ... options=[('pythonpath', None), ... ('verbosity', u'1')]

Russell proposed to replace `sorted(options.items())` in [5] with:

    sorted((text_type(k), text_type(v)) for k, v in options.items())

This is a solution, but it makes tests less strict. There is no way to
distinguish between `None` value and "None" string -- both are printed in the
same way.

I spent some time to find a better approach but with no success. The least
insane way is to check if the object to print is an unicode and if it is,
print it in the Python 3 style (without starting "u"); otherwise print it
normally.

How about using repr instead of text_type:
 
>>> repr(1)
'1'
>>> repr('1')
"'1'"
>>> repr(u'1')
"u'1'"
>>> repr(None)
'None'
>>> repr('None')
"'None'"
 
Note that this is a more general problem. I've hit the same problem when I tried
to write a test for `CheckMessage.__repr__` method. I wrote:

    def test_repr(self):
        e = Error("Error", hint="Hint", obj=None)
        expected = "<CheckMessage: level=40, msg='Error', hint='Hint', obj=None>"
        self.assertEqual(repr(e), expected)

But under Python 2, `expected` variable should be:

        expected = "<CheckMessage: level=40, msg=u'Error', hint=u'Hint', obj=None>"


Why not just have 2 tests -- a Python 2 test and a Python 3 test. Depending on the internal complexity, you can either use @skipIf decorators, or a "if sys.version" check internal to the test - I suspect the latter will probably be more appropriate.
 
5. I've deleted one filter in [6]. I think it's useless. If
`self.related.get_accessor_name` is None, then `self.rel.is_hidden()` is true,
so this is a dead if. Am I right?


I'm fairly certain you're correct on this. 
 
6. When `ForeignKey.foreign_related_fields` is a list of more than one element? If
it's always a one-element list, then we don't need lines 1443-1457 in
`ForeignKey._check_unique_target`. [7]



It's more than one element when you have a Generic Foreign Key, or any other 'combined' key. So, this is checking that your FK is pointing at something that is actually unique *and* can be the target of a foreign key.
 
7. It looks like the code I've deleted in this commit [8] was dead, because if you
refer to an inherited field, an FieldDoesNotExist exception is raised by
`get_field` method. Is that right?


I'm not sure I understand -- get_field() will also return fields from an inherited model. 

This may point to a missing test -- a check that both the fields in a unique_together or index_together are local, non-m2m fields.

8. I've added a new check. If you're using a `GenericRelation` but there is no
`GenericForeignKey` on the target model, an warning is raised. This check was
implemented in this commit [9]. It uses `vars` builtin function to check if the
target model has a `GenericForeignKey`. This is ugly, but I don't see a better
approach.


Hrm. I don't really like this, but I'm not sure I have a better option. A better approach would be to have GFKs turn up in get_fields, but it isn't your responsibility to fix the internal problems of Generic Foreign Keys. If we have to live with this, then we should at the very least document it as a FIXME, pointing at the underlying problem with _meta handling of GFKs.
 
Schedule. There is little time to the end of GSoC -- only 5 weeks and I will
work 50% of full speed after September 6, because I'm going on holiday.

- Rewriting admin checks (1 week).
- Implementing filtering and silencing errors (1 week).
- Merging django-secure (2 weeks?!).
- Polishing doc, last review (1 week).

I will implement filtering and silencing errors before merging django-secure
because these are a must if we want to merge django-secure.

Sounds good to me!

Yours
Russ Magee %-)

Russell Keith-Magee

unread,
Aug 16, 2013, 9:37:39 PM8/16/13
to django-d...@googlegroups.com
On Fri, Aug 16, 2013 at 7:45 AM, Russell Keith-Magee <rus...@keith-magee.com> wrote:

8. I've added a new check. If you're using a `GenericRelation` but there is no
`GenericForeignKey` on the target model, an warning is raised. This check was
implemented in this commit [9]. It uses `vars` builtin function to check if the
target model has a `GenericForeignKey`. This is ugly, but I don't see a better
approach.


Hrm. I don't really like this, but I'm not sure I have a better option. A better approach would be to have GFKs turn up in get_fields, but it isn't your responsibility to fix the internal problems of Generic Foreign Keys. If we have to live with this, then we should at the very least document it as a FIXME, pointing at the underlying problem with _meta handling of GFKs.
 
 
Michal Petruca just mailed the django-dev list with some discussion about changes he wants to make in his own GSoC project, which drew my attention to something I'd forgotten about.

Does _meta.virtual_fields -- contain the information you need? It looks like it should.

Russ %-)

Christopher Medrela

unread,
Aug 21, 2013, 4:43:31 PM8/21/13
to django-d...@googlegroups.com
Progress:

- Converted `BaseCommand.verbosity` from bytestring into an unicode.

- Integrated compatibility checks.

- Deprecated "validate" command. This command delegates to "check" command now.
  Changed "check" command -- it performs all system checks, including model
  validation and compatibility checks.

- Added test for check of uniqueness of field combination under ForeignObject.

Now I'm working at rewriting admin checks (at gsoc2013-checks-admin branch [1]).

Christopher Medrela

unread,
Aug 27, 2013, 5:16:59 PM8/27/13
to django-d...@googlegroups.com
1. One of my questions left unanswered on the pull request [1] (I mean this one
about documentation and `__str__` use.).

2. I've finished rewriting admin checks. I've renamed `admin_validation` to
`admin_checks`. I would like you to have a deep look at `fk_name` and
`exclude` checks [2] as well as `inline_formsets` tests [3] (especially error
messages).

2a. "applabel.modellabel[.fieldname]" is the format of error messages for models
and fields. How should the one for admin look like? I propose
"applabel.ModelAdminClassName".

2b. BaseModelAdmin, ModelAdmin and InlineModelAdmin classes and their checks
live in separated files. `options.py` defines these classes, but checks lives
in `checks.py`. We want to have these two issues separated, otherwise class
definitions would become too long. Python does not support open classes, so we
cannot just add some `_check_*` methods in `checks.py` to existing classes
defined in `options.py`.

The current approach is that `check` method is defined in `options.py`, but it
delegates to appropriate functions in `checks.py` [4]. Yes, I use functions --
there is no need to have validator class, because we don't need to store
anything in its instances. However, the code is really ugly. I wonder if there
is any better approach.

3. I've created a new ticket [#20974] about lack of mysql-specific checks.

4. I've rebased my branch against master.

Russell Keith-Magee

unread,
Aug 27, 2013, 10:37:22 PM8/27/13
to Django Developers
Hi Chris,

On Wed, Aug 28, 2013 at 5:16 AM, Christopher Medrela <chris....@gmail.com> wrote:
1. One of my questions left unanswered on the pull request [1] (I mean this one
about documentation and `__str__` use.).
 
I've left a comment on the pull request, I've given the same comment in a previous message on this thread. Short version; I think there's potential for other levels to be used.

(that said.. .I get the feeling I think I might have commented on the wrong comment -- it wasn't completely clear from the link you provided)

2. I've finished rewriting admin checks. I've renamed `admin_validation` to
`admin_checks`. I would like you to have a deep look at `fk_name` and
`exclude` checks [2] as well as `inline_formsets` tests [3] (especially error
messages).
 
I'm not sure I follow what you've done with the exclude and fk_name checks here -- it looks like you've added some new checks, but I can't make out exactly why those checks are needed.

I'm also not wild about the fact you've changed forms code along the way.

2a. "applabel.modellabel[.fieldname]" is the format of error messages for models
and fields. How should the one for admin look like? I propose
"applabel.ModelAdminClassName".

This sounds reasonable to me; the only other idea I could suggest would be to include a '.admin' namespace in there, so it's clear that the error is part of an admin registration, but I'm not convinced that's needed.
 
2b. BaseModelAdmin, ModelAdmin and InlineModelAdmin classes and their checks
live in separated files. `options.py` defines these classes, but checks lives
in `checks.py`. We want to have these two issues separated, otherwise class
definitions would become too long. Python does not support open classes, so we
cannot just add some `_check_*` methods in `checks.py` to existing classes
defined in `options.py`.

The current approach is that `check` method is defined in `options.py`, but it
delegates to appropriate functions in `checks.py` [4]. Yes, I use functions --
there is no need to have validator class, because we don't need to store
anything in its instances. However, the code is really ugly. I wonder if there
is any better approach.

Yes, there is. It's called object orientation and classes :-)

The existing validation code already does this fairly elegantly - the only difference for your code is that you need to return multiple errors, not just raise an exception. I'm not sure why you've tried to re-invent this particular wheel.
 
Russ %-)

Christopher Medrela

unread,
Sep 3, 2013, 7:25:05 AM9/3/13
to django-d...@googlegroups.com
1. Progress: I've made improvements to admin checks. I've also finished
implementing filtering and silencing checks. I've rebased my branch against
master again.

2. I'm afraid that there is too little time to merge django-secure. September 16 is
suggested 'pencils down' date, so there are only less than two weeks (12 days) +
one buffer week, because firm 'pencils down' date is September 23. Merging
django-secure in such little time is actually impossible -- we must through out
this task from schedule. I really apologize for that.

My plan is that this week I will work at documentation (at this moment there is
only a draft). I will also try to implement as much "nice to have" things as
possible. These are:
  • Writing tests which would be nice to have. I mean three tests:
    •   [#20974],
    •   test for raising error when ImageField is in use but Pillow is not installed,
    •   test for raising error in `BaseCommand.__init__`.
  • Moving out checks performed in `__init__`.
  • Checking clashes between accessors and GenericForeignKeys.
  • Checks for grouped `choices`.
The second week and the backup week is for deep review. Regarding the amount
of comments I got during the first review, I guess I need *at least* one week
for deep review.


3. As I said, I've implemented both filtering and silencing system checks.

You can run only a subset of checks like that:

    ./manage.py check auth -t tag1 --tag tag2

This command will run only these checks which are labeled with `tag1` or `tag2`
tags and only `auth` app will be validated. If there is no app name, all apps
are checked. If there is no tag name, all checks are performed.

Your check function can be labeled in this way:

    from django.core.checks import register, tag

    @tag('mytag')
    def my_check_function(apps, **kwargs):
        # apps is a list of apps that should be validated; if None, all apps
        # should be checked.
        return []

    register(my_check_function)

The `tag` decorator works only for entry-point functions, e.g. these one passed
to `register` function. It doesn't work for checks
functions/methods/classmethods which are called by entry-point functions
(directly or indirectly).

To silence a specific error/warning, you need to append its `id` to
SILENCED_SYSTEM_CHECKS setting. The `id` could be any string, but we use the
following convension: "E001" for core errors, "W002" for core warnings,
"applabel.E001" for errors raised by an custom app (including contrib apps, i.e.
"sites.E001"). The error/warning number is unique to an app, e.g. "sites.E001",
"admin.E001" and "E001" are all allowed, but "E001" and "W001" are not OK. You
should use "E001" and "W002". To create an error/warning with given id, pass it
as a keyword argument:

    Error('Message', hint=None, obj=invalid_object, id='myapp.E001')

4. More important changes in code:

- Introduced RAW_SETTINGS_MODULE [1]. I use it in compatibility checks to test
  if `TEST_RUNNER` setting was overriden [2].

- Introduced `BaseAppCache.get_models_from_apps` method [3]. This method returns
  all models of given set of apps (or of all apps if None is passed) and is used
  in `default_checks.py` [4].

- Admin changes are backward compatible. Admin validators are deprecated in
  favour of admin checks, i.e. `BaseValidator` is deprecated in favour of
  `BaseModelAdminChecks`. `BaseValidator` is an almost empty class now.
  `ModelAdmin.validator` class attribute is deprecated in favour of new `checks`
  attribute. If an ModelAdmin defines `validator`, we are not ignoring it.


5. I've left one comment on the pull request. (I mean this one about moving
registering admin checks to `autodiscover`.)

Russell Keith-Magee

unread,
Sep 4, 2013, 7:53:08 AM9/4/13
to Django Developers
Hi Christopher,

On Tue, Sep 3, 2013 at 9:25 PM, Christopher Medrela <chris....@gmail.com> wrote:
1. Progress: I've made improvements to admin checks. I've also finished
implementing filtering and silencing checks. I've rebased my branch against
master again.

Excellent - I'll take look and get you some feedback.
 
2. I'm afraid that there is too little time to merge django-secure. September 16 is
suggested 'pencils down' date, so there are only less than two weeks (12 days) +
one buffer week, because firm 'pencils down' date is September 23. Merging
django-secure in such little time is actually impossible -- we must through out
this task from schedule. I really apologize for that.

This is obviously disappointing, but it's better to deliver something complete than a half-attempt. If we can get the core framework into a good state, merging django-secure is a self-contained task that we can address as a follow-up commit.

Also -- the GSoC will come to an end, but that doesn't mean your contributions to Django have to… :-)
 
My plan is that this week I will work at documentation (at this moment there is
only a draft). I will also try to implement as much "nice to have" things as
possible. These are:
  • Writing tests which would be nice to have. I mean three tests:
    •   [#20974],
    •   test for raising error when ImageField is in use but Pillow is not installed,
    •   test for raising error in `BaseCommand.__init__`.
  • Moving out checks performed in `__init__`.
  • Checking clashes between accessors and GenericForeignKeys.
  • Checks for grouped `choices`.
The second week and the backup week is for deep review. Regarding the amount
of comments I got during the first review, I guess I need *at least* one week
for deep review.

This sounds like a reasonable plan -- it's a bunch of relatively easy individual tests, each of which will individually improve the current situation, but each of which could also be omitted if time becomes a factor.
 
When you're ready for another deep review, let me know and I'll run over the code again.


3. As I said, I've implemented both filtering and silencing system checks.

You can run only a subset of checks like that:

    ./manage.py check auth -t tag1 --tag tag2

This command will run only these checks which are labeled with `tag1` or `tag2`
tags and only `auth` app will be validated. If there is no app name, all apps
are checked. If there is no tag name, all checks are performed.

Your check function can be labeled in this way:

    from django.core.checks import register, tag

    @tag('mytag')
    def my_check_function(apps, **kwargs):
        # apps is a list of apps that should be validated; if None, all apps
        # should be checked.
        return []

    register(my_check_function)

The `tag` decorator works only for entry-point functions, e.g. these one passed
to `register` function. It doesn't work for checks
functions/methods/classmethods which are called by entry-point functions
(directly or indirectly).

I've just bounced this off Aymeric and he pointed out something interesting. We have a precedent for this in another area -- template tags. You can use a decorator to register a template tag, and that register decorator can also take arguments to control tag naming, etc.

Once he pointed this out, it seemed obvious that this is what we should be doing here, too -- after all, the use case is almost identical (registering a function with a central repository, with some optional arguments associated with the registration. 

So - instead of having a @tag decorator, we should be modifying register to be a decorator, taking optional arguments that lets you specify tags. Depending on how easy the internals for this work, we could do either: 

@register('mytag')
@register('othertag')
def my_check_function(apps, **kwargs):
    …

or 

@register('mytag', 'othertag')
def my_check_function(apps, **kwargs):
    …
 
And, of course, if you don't want to use decorator syntax, you can invoke the decorator directly.

To silence a specific error/warning, you need to append its `id` to
SILENCED_SYSTEM_CHECKS setting. The `id` could be any string, but we use the
following convension: "E001" for core errors, "W002" for core warnings,
"applabel.E001" for errors raised by an custom app (including contrib apps, i.e.
"sites.E001"). The error/warning number is unique to an app, e.g. "sites.E001",
"admin.E001" and "E001" are all allowed, but "E001" and "W001" are not OK. You
should use "E001" and "W002". To create an error/warning with given id, pass it
as a keyword argument:

    Error('Message', hint=None, obj=invalid_object, id='myapp.E001')

Looks good to me.
 
4. More important changes in code:

- Introduced RAW_SETTINGS_MODULE [1]. I use it in compatibility checks to test
  if `TEST_RUNNER` setting was overriden [2].

Have a look at the internals of the diffsettings management command -- I'm not sure RAW_SETTINGS_MODULE is needed.
 
- Introduced `BaseAppCache.get_models_from_apps` method [3]. This method returns
  all models of given set of apps (or of all apps if None is passed) and is used
  in `default_checks.py` [4].

I'm not sure I follow why this is needed -- or why it isn't just duplicating functionality from loading.get_models()?
 
- Admin changes are backward compatible. Admin validators are deprecated in
  favour of admin checks, i.e. `BaseValidator` is deprecated in favour of
  `BaseModelAdminChecks`. `BaseValidator` is an almost empty class now.
  `ModelAdmin.validator` class attribute is deprecated in favour of new `checks`
  attribute. If an ModelAdmin defines `validator`, we are not ignoring it.


Looks good on first inspection. 

5. I've left one comment on the pull request. (I mean this one about moving
registering admin checks to `autodiscover`.)

I've commented on the PR.

Yours,
Russ Magee %-)

Christopher Medrela

unread,
Sep 11, 2013, 5:39:43 PM9/11/13