I've just uploaded a first draft at a patch introducing logging into
Django [1]. I'm calling for feedback on this patch.
[1] http://code.djangoproject.com/attachment/ticket/12012/t12012-alpha1.diff
This patch is heavily drawn from the work that Vinay Sajip has done in
this area, but it also includes some significant changes to his most
recent patches.
The patch introduces two settings:
LOGGING_CONFIG - a callable that should be used to configure the
logging machinery
LOGGING - the configuration data to pass to the configuration function.
By default, LOGGING_CONFIG points to logging.config.dictConfig, which
means that LOGGING should be a dict-style logging configuration (as
introduce by Python 2.7). A copy of dictConfig is included in utils
for backwards compatibility. However, by changing LOGGING_CONFIG, you
could use any configuration scheme you like.
In addition, a new convention is introduced. If a Django app includes
a 'startup.py' as a sibling module to models.py, tests.py etc, then
this module will be loaded *exactly once* during the startup of the
app. It will be loaded *after* the settings have been initialized, but
*before* models.py has been imported.
The patch includes the changes to the new project settings.py to show
what LOGGING would look like in practice. This logging configuration
puts everything to console, except for errors in the request process,
which are mailed to the admin. This means redirecting 500 errors to
something other than your email box is now a 1 line configuration
item.
The patch also includes logging entries for everywhere in the code
where a 4XX or 5XX error is raised. 4XX's are raised as warnings;
5XX's as errors.
Lastly, there is a separate logger that logs, at the debug level,
every SQL statement that is executed. If you add a handler for
django.db.backends that catches the DEBUG level, you will get this
output.
There are no docs because I'm waiting for the design to be finalized
before I start writing; there are no tests because this is something
that is a little difficult to test.
At this point, I'm calling for feedback, particularly on the following:
* logging config as the last stage of settings loading. Is this the
right place? Can anyone think of a better place?
* The default logging configuration. Have I got the
propagate/override options right for sensible defaults (both in global
and new-project settings)?
* The logging calls that have been introduced. Is this too much or
too little? I'd rather err on the side of caution here (we can always
add extra logging later), but I'd like suggestions on anything else
people think should be logged. Also, are they logged at the right
level (e.g., is a 404 a warning?).
* The startup.py convention (and implementation). Is this necessary?
Does it address (or not address) any other prominent "on startup" use
cases? There's also a crossover here with Arthur's GSoC work; is the
startup code sufficiently important that we need it anyway, or can we
defer until the app refactor lands?
So - have at it. I'm really excited by this; so let me know everywhere
I've messed up so that we can get this into trunk.
Russ %-)
Great stuff! A cursory glance shows there isn't anything to log debug
output from url resolution.. something I think I need to add.. I'll
busy a ticket for it :)
D
> --
> You received this message because you are subscribed to the Google Groups "Django developers" group.
> To post to this group, send email to django-d...@googlegroups.com.
> To unsubscribe from this group, send email to django-develop...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
>
>
I'm not sure I completely follow - what exactly are you looking to
have logged? The pattern that was matched by the request?
As for raising a ticket - don't worry. About that at this point. For
the moment, just keep he discussion on the ticket; if we decide that
this is needed, we'll roll it into the commit for the main ticket. If
you have a patch, just stick it on #12012.
Yours,
Russ Magee %-)
On Saturday, September 25, 2010, David P. Novakovic
Actually what I'm used to do in my application is to have a logging
configuration in the settings.py livel and
import logging
logging.getLogger(__name__)
at the module level, so each module gets its own logger. In the
settings level and via a local_settings file one can seat each logging
level. Adding a startup.py at the application level could save me some
lines of code so I'm +1 on it :). For me it's also a a good idea as
removes some decisions about "where I have to put that code that needs
to be loaded once".
In a logging system I think is important to differentiate between
logging in development where we neeed a console log and logging in
production where we need a file log as a minimum, and I don't see how
can be easily managed unless the Django logging module has additional
configuration to add the default logging output.
The Django logging module should also show how "the standard way" to
create loggers for the Django applications, and I don't see an easy
way to do it. Perhaps it's just a matter of documenting it, I don't
know.
So I'd like to add 2 main doubts to the thread
1. How the logging process could be standardized for all Django apps?
2. How can we change the logging setting from
development/preproducction to a production configuration just changing
some sort of local_setting?
We could do some profile tests to see if
http://docs.python.org/library/logging.html?highlight=logging#logging.Logger.isEnabledFor
it's worth to be added in the code.
One last doubt, perhaps offtopic I have read logging module is slow,
good enough for 90% of applications but for the rest. Actually is
enough for me, but I can't see to deal with log on big systems
Regards,
--
Antoni Aloy López
Blog: http://trespams.com
Site: http://apsl.net
To expand on this a bit, the refactored app cache from Arthur's GSoC branch introduces a dedicated place for app related information to live in, the app class. With the dotted app path given to a new setting called APP_CLASSES it will be picked up during startup by the app cache, creating app instances.
Given the fact that the current implementation of the app cache does quite a few hoops to make it possible to implicitely import the models modules by iterating over INSTALLED_APPS, I'd like to propose to rather move the logging initialization to the app class initialization, instead of introducing another place of implicite importing -- to prevent having ambiguous locations for app specific code execution.
Only problem is that the inclusion of the refactored app cache into trunk wasn't planned for 1.3, which is why I suggest we recommend using the management.py for optional logging initialization in 1.3 and add app-level logging hooks in 1.4 when the new app cache lands.
Jannis
Yes, there is, and the logging calls I've included in the patch on
#12012 make use of it. All the logging calls take an 'extra' parameter
which can be used to insert relevant context into the logging call,
and can be used in a custom formatter. In all the logging calls
registering 4XX and 5XX messages, I've included the current request as
extra context.
Yours,
Russ Magee %-)
> Hi,
>
> On Sep 25, 1:25 pm, Jannis Leidel <jan...@leidel.info> wrote:
>> Given the fact that the current implementation of the app cache does quite a few hoops to make it possible to implicitely import the models modules by iterating over INSTALLED_APPS, I'd like to propose to rather move the logging initialization to the app class initialization, instead of introducing another place of implicite importing -- to prevent having ambiguous locations for app specific code execution.
>
> I agree, the only problem I see (not having looked at the app refactor
> branch yet + having no idea in which direction it is heading) is that
> if the app class is allowed to import other stuff. Cause this might
> easily end in models (or whatever) being imported and as a result we
> can't guarantee that it would happen as early as startup.py (though if
> someone imports critical stuff there we might get the same unwanted
> results).
Not sure I understand the problem, the app instances are initialized by the app cache, at the same time as the proposed logging patch would import the startup modules.
> Btw: Is there a wiki page describing exactly what the branch
> solves currently and which direction it will be heading?
Arthur wrote multiple status mails, please search the list archives.
Jannis
Well naturally you can't log what happens in the settings-file itself
:) I currently import logging at the end of settings.py but just
before importing local settings files, so that I can log the specifics
of the local settings.
HM
Is there a known exploit inherent in the use of threadlocals? No.
Django's position is entirely defensive. The potential for security
issues associated with threadlocals is entirely a function of how they
are used. Threadlocals are a vector by which it is possible, either
intentionally or accidentally, to allow data to leak from one request
to another. This is the starting point of an exploit. Since avoiding
this exploit requires you to actively look for potential ways that
your threadlocal usage can be abused, Django has made an architectural
decision to discourage their use.
On top of that, there is the purely architectural argument:
threadlocals are a global variable by any other name. They increase
coupling in the systems in which they are used. If an engineer came to
you and proposed a design that relied upon the use of global
variables, they would be soundly ridiculed; but for some reason,
calling that same global variable a "threadlocal" makes it socially
acceptable.
It is entirely possible to build a robust, testable web system without
the use of threadlocals. You just need to be thorough in your design
work.
> Also - even if Django itself doesn't use thread-locals, can you be
> sure that any third-party library you are using doesn't do so under
> the covers?
Truth be told, Django *does* use threadlocals in a couple of very
specific circumstances. These uses have been extensively audited to
make sure there is no exploit hidden in them.
It is possible to use threadlocals in a safe way. If people want to
pursue these architectural options, they are free to do so -- there is
nothing in Django actively hobbling the use of threadlocals. It's just
not a practice that we (as a project) are going to recommend.
Yours,
Russ Magee %-)
On Sat, 2010-09-25 at 14:16 +0800, Russell Keith-Magee wrote:
> At this point, I'm calling for feedback, particularly on the following:
>
> * logging config as the last stage of settings loading. Is this the
> right place? Can anyone think of a better place?
In the context of running a Django app, this means it will definitely
get called before the first request is dispatched, due to the need to
find the URLconf. From the context of command line apps, it is less
predictable in general, but I guess the command line app has to be
responsible for ensuring that logging gets set up first.
Anyway, this seems good to me, and I can't think of a better place. For
the probably rare case of putting logging calls in your settings.py
(like Hanne Moa does), there is nothing to stop you setting up logging
yourself inside settings.py, which Hanne must presumably do already.
The only problem is that logging could be set up twice. For this case,
I guess we could support some convention like 'LOGGING_CONFIG = None' to
stop Django configuring logging altogether and add a note in the docs.
This might also be useful for the command line app situation.
> * The default logging configuration. Have I got the
> propagate/override options right for sensible defaults (both in global
> and new-project settings)?
I don't really have much experience with the logging module, but they
seemed sensible to me.
> * The logging calls that have been introduced. Is this too much or
> too little? I'd rather err on the side of caution here (we can always
> add extra logging later), but I'd like suggestions on anything else
> people think should be logged. Also, are they logged at the right
> level (e.g., is a 404 a warning?).
These all look good to me, and cover the obvious cases. Warnings for
4XX codes seems good, since they could be a sign of something "going
wrong" - dead links or some kind of attack - while not necessarily being
an actual problem in the code, and it is useful to filter that category.
> * The startup.py convention (and implementation). Is this necessary?
> Does it address (or not address) any other prominent "on startup" use
> cases? There's also a crossover here with Arthur's GSoC work; is the
> startup code sufficiently important that we need it anyway, or can we
> defer until the app refactor lands?
I'm not sure what use cases this specifically targets. It seems that it
depends on 'AppCache._populate' being called. From what I can tell,
that isn't guaranteed to be called until some database query is made, or
it is called by an app like the admin. Without the admin I can create
an app that never populates AppCache but can serve responses.
I don't see anything that depends on this startup proposal though, so
I'd propose dropping that part until the other options have been
discussed.
Regards,
Luke
--
"Christ Jesus came in to the world to save sinners" (1 Timothy 1:15)
Luke Plant || http://lukeplant.me.uk/
This might be useful for switching over to the official logging
system, easily turning it on and off.
I've been using the logging system since before it was added to
python. There used to be problems with it hiding some exceptions but
that was in some hideously complex threads, multicast and
ldap-disaster, so it might have been something else hiding those
exceptions. I haven't seen any such in the duct tape-and-chewing gum
setup I use in my django projects right now.
If anyone is looking for inspiration: most modules longer than twenty
lines starts with
import logging
_LOG = logging.getLogger(__name__)
_LOG.info(__name__)
.. so I know that they imported correctly on the production-server.
I've also on occasion used decorators to log state on entry and exit
of views and functions, in addition to liberal sprinkling through code
that really ought to be refactored. I even add it to django itself in
order to understand what's going on.
HM
Sounds reasonable to me. I'll add handling for the None case.
>> * The startup.py convention (and implementation). Is this necessary?
>> Does it address (or not address) any other prominent "on startup" use
>> cases? There's also a crossover here with Arthur's GSoC work; is the
>> startup code sufficiently important that we need it anyway, or can we
>> defer until the app refactor lands?
>
> I'm not sure what use cases this specifically targets. It seems that it
> depends on 'AppCache._populate' being called. From what I can tell,
> that isn't guaranteed to be called until some database query is made, or
> it is called by an app like the admin. Without the admin I can create
> an app that never populates AppCache but can serve responses.
>
> I don't see anything that depends on this startup proposal though, so
> I'd propose dropping that part until the other options have been
> discussed.
The use case for logging is completely customized, programatic logging
configuration. If you look at the settings.py that Vinay uploaded to
#12012, you can see an example of what he had in mind. From my
tinkering with the patch, this isn't a particularly compelling use
case; a custom logging config function should be able to handle this
much better.
The larger use case is to introduce a consistent place to do per-app
configuration, like registering signal handlers. However, the
app-cache refactor work from Arthur has a much more consistent story
on this. I think your suggested approach may be right -- drop the
startup proposal from logging (which doesn't strictly need it), and
let Arthur's work pick it up.
Yours,
Russ Magee %-)
> So - have at it. I'm really excited by this; so let me know everywhere
> I've messed up so that we can get this into trunk.
Sorry, missed this before. A couple of comments...
Logging config as last stage of settings loading is where is seems to
have ended up in our apps at the moment anyway; I'd agree with whoever
it was who said that it should be made possible/simple to force the
config to happen earlier in settings (and then not happen twice), as
depending on the complexity of the settings, it may be necessary to log
from it.
I'm worried by the use of "warning" for all 4xx statuses. I think it
still makes sense to use the "original" syslog level definitions[*] as a
guide, and on there I'd suggest that some 4xx statuses would merit
"Info", some "Notice", and maybe one or two "Warning". "Notice" not
being included in Python's default logging, I guess that means I'd split
them between "Info" and "Warning".
My view is that the main use of these logs to me is to help me see when
someone is doing Bad Things (or trying to) to my system. I would be
wanting anything that indicated a targeted exploration of my server to
show up as "Warning", and anything that's most likely a random script
kiddie to be "Info". That certainly puts 404 in as "Info"; I see so many
hits looking for e.g. poorly-configured phpmyadmin installations, that
404s would swamp anything that I really needed to be looking at.
[*] Which are, according to 'man syslog' on this box:
LOG_EMERG system is unusable
LOG_ALERT action must be taken immediately
LOG_CRIT critical conditions
LOG_ERR error conditions
LOG_WARNING warning conditions
LOG_NOTICE normal, but significant, condition
LOG_INFO informational message
LOG_DEBUG debug-level message
Cheers,
Nick
--
Nick Phillips / +64 3 479 4195 / nick.p...@otago.ac.nz
# these statements are my own, not those of the University of Otago
> * The default logging configuration. Have I got the
> propagate/override options right for sensible defaults (both in global
> and new-project settings)?
I just noticed that the project template settings and the global
settings are different. The global settings use a null handler for debug
logs, while the project template config would send them to stderr. I
can see the value of having debug calls appear on the console while
running the development server, but most people will take the template
settings.py as a good starting point for deployment. Since we have
already defined DEBUG in the template settings.py, how about changing
the project template to have this:
LOGGING = {
'version': 1,
'disable_existing_loggers': False,
'handlers': {
'null': {
'level':'DEBUG',
'class':'django.utils.log.NullHandler',
},
'console':{
'level':'DEBUG',
'class':'logging.StreamHandler',
},
'mail_admins': {
'level': 'ERROR',
'class': 'django.utils.log.AdminEmailHandler'
}
},
'loggers': {
'django': {
'handlers':[DEBUG and 'console' or 'null'],
'propagate': True,
'level':'INFO',
},
'django.request':{
'handlers': ['mail_admins'],
'level': 'ERROR',
'propagate': True,
},
}
}
That will give a settings.py that is sensible for both development and
deployment.
Luke
--
"Cross country skiing is great if you live in a small country."
(Steven Wright)
Luke Plant || http://lukeplant.me.uk/
On Tue, Sep 28, 2010 at 9:19 AM, Nick Phillips
<nick.p...@otago.ac.nz> wrote:
> I'm worried by the use of "warning" for all 4xx statuses. I think it
> still makes sense to use the "original" syslog level definitions[*] as a
> guide, and on there I'd suggest that some 4xx statuses would merit
> "Info", some "Notice", and maybe one or two "Warning". "Notice" not
> being included in Python's default logging, I guess that means I'd split
> them between "Info" and "Warning".
>
> My view is that the main use of these logs to me is to help me see when
> someone is doing Bad Things (or trying to) to my system. I would be
> wanting anything that indicated a targeted exploration of my server to
> show up as "Warning", and anything that's most likely a random script
> kiddie to be "Info". That certainly puts 404 in as "Info"; I see so many
> hits looking for e.g. poorly-configured phpmyadmin installations, that
> 404s would swamp anything that I really needed to be looking at.
I'm split on this myself but I think making all 400 level responses warnings
would keep things consistent and help find potential security issues easier.
Something like
5xx = error
4xx = warning
3xx = info
2xx (>300) = debug
would be very easy to understand and the desired request logging is easy
to set up using logging levels.
--
Ian
I don't see how a 302 because someone posted something is any less debug thann the 200 to serve thhhe get.
Bikesheddinngly yours,
Alex
On Sep 28, 2010 11:45 AM, "Ian Lewis" <ianm...@gmail.com> wrote:
Hi,
On Tue, Sep 28, 2010 at 9:19 AM, Nick Phillips
<nick.p...@otago.ac.nz> wrote:
> I'm worried by ...
I'm split on this myself but I think making all 400 level responses warnings
would keep things consistent and help find potential security issues easier.
Something like
5xx = error
4xx = warning
3xx = info
2xx (>300) = debug
would be very easy to understand and the desired request logging is easy
to set up using logging levels.
--
Ian
http://www.ianlewis.org/
--
You received this message because you are subscribed to the Google Groups "Django developers" g...
Making all 4xx a Warning is a bad idea. When you're writing a RESTful
API, it's common to use these status codes the way they were intended
to be used. For example, if the user makes a POST add a comment, but
the form data is invalid a RESTful API won't return a 2xx, because the
request failed. The only reasonable codes are in in 4xx range. It's
not uncommon for users to badly fill out forms, so getting warnings
about it would just flood the log.
The logging level should be based on the cause (like CSRF validation
failure) not solely on the response's status code.
Regards,
Łukasz Rekucki
Hrm. I'm not sure I agree.
The reason for having different settings between global_settings and
template settings is support for old projects. If you've got a legacy
project, you won't have a LOGGING setting, so the logging needs to be
identical to what is produced right now -- which is to say, emails for
server errors, but everything else to the bitbucket.
I can see the value in adding the null handler as part of the project
template, but I'm not sure about the "DEBUG and 'console' or 'null'"
logic. To my mind, 'just write to the console' is a reasonable
default, regardless of whether you're in testing or production; in
practice, if you're in production, you're going to need to tweak
propagation, level, and probably push the handlers to something
better, too. Making the default as simple as possible (i.e., just say
console) makes more sense to me than trying to presuppose how a log
will be used in production.
That said, it's a bit of a moot point: there isn't anything written to
the 'django' logger at this point in time. In practice, the value of
the setting is pretty much irrelevant for the moment. 'null' is as
good as any, as long as it's clear that this is what is happening.
For completeness, we should probably add a logger entry for
'django.db.backends' that pushes everything to null.
So - here's a revised proposal, which can be shared between both
global and project template settings:
LOGGING = {
'version': 1,
'disable_existing_loggers': False,
'handlers': {
'null': {
'level':'DEBUG',
'class':'django.utils.log.NullHandler',
},
'console':{
'level':'DEBUG',
'class':'logging.StreamHandler',
},
'mail_admins': {
'level': 'ERROR',
'class': 'django.utils.log.AdminEmailHandler'
}
},
'loggers': {
'django': {
'handlers':['null'],
'propagate': True,
'level':'INFO',
},
'django.db.backends':{
'handlers': [],
'level': 'INFO',
'propagate': True,
},
'django.request':{
'handlers': ['mail_admins'],
'level': 'ERROR',
'propagate': True,
},
}
}
Yours,
Russ Magee %-)
Here's the list of 4XX responses that we currently raise:
400 Bad Request
403 Forbidden (due to permissions and CSRF)
404 Not Found
405 Method Not Allowed
410 Gone
412 Precondition Failed
These all strike me as messages appropriate for a warning -- they're
all slightly concerning indications that you're either under some sort
of attack, or at the very least that your users are having a bad
experience on your site.
This includes 404 -- manually entered URLs and annoying PHP hackbots
notwithstanding, your site shouldn't be generating 404s. If it is, you
should be investigating. The only argument I can see for 404 as an
INFO message is the prevalence; given that a 404 is often generated
without being a concern, it makes sense to make them easy to filter
out. However, IMHO, unilaterally ignoring 404s would be just as bad as
having too many. On top of that, any halfway decent log analysis tool
can filter these messages on a per-status code or per-URL basis.
In short, I'm not convinced it's worth making a special case of 404.
Yours
Russ Magee %-)
It's important to note that I've deliberately *not* included logging
calls for 3XX and 2XX responses. Django's logging isn't intended as a
replacement for Apache/webserver logging; it's there as a supplement.
The warning/error condidtions are the occasions when you're likely to
want/need the extra request data to work out what went wrong, hence
the logging calls. 3XX and 2XX calls are normal operation, so they
shouldn't require any special handling above and beyond what the
webserver does.
Yours,
Russ Magee %-)
> These all strike me as messages appropriate for a warning -- they're
> all slightly concerning indications that you're either under some sort
> of attack, or at the very least that your users are having a bad
> experience on your site.
>
> This includes 404 -- manually entered URLs and annoying PHP hackbots
> notwithstanding, your site shouldn't be generating 404s. If it is, you
> should be investigating. The only argument I can see for 404 as an
> INFO message is the prevalence; given that a 404 is often generated
> without being a concern, it makes sense to make them easy to filter
> out. However, IMHO, unilaterally ignoring 404s would be just as bad as
> having too many. On top of that, any halfway decent log analysis tool
> can filter these messages on a per-status code or per-URL basis.
FWIW I agree with everything you say up to this point - it's just the
conclusion that I differ on; I'd prefer to have the ability to use the
logging config to send the 404s somewhere else. Not to ignore them, just
to be able to deal with them separately as simply as possible. Since
their issue is buried within the framework and can't easily be
overridden, I think their ubiquity does justify the the lower priority.
However, as Alex pointed out, this is essentially bikeshedding. So,
since you're the one doing the work, at this point I'll shut up and wish
you well with it :-)
I think I may be able to keep everyone happy here. With a couple of
modifications, this is something that can be handled as a logging
filter. All we need to do is pass in the status code as part of the
extra data that is part of the log message; then you could write a
filter to transform the log however you wanted -- for example:
class Info404Filter(logging.Filter):
def filter(self, record):
if record.status_code == 404:
record.levelno = logging.INFO
record.levelname = logging.getLevelName(logging.INFO)
return True
would, if installed, convert all 404 messages into INFO level.
Yours,
Russ Magee %-)
> "Django developers" group.
> To post to this group, send email to django-d...@googlegroups.com.
> To unsubscribe from this group, send email to
> django-develop...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-developers?hl=en.
>
--
Ian
For the history and meaning of the term, see here:
Yours,
Russ Magee %-)