Proposal for 1.2: built-in logging with django.core.log


Simon Willison

Sep 17, 2009, 4:25:12 AM
to Django developers
I think we should add logging to Django in version 1.2, implemented as
a light-weight wrapper around the Python logging module
(django.core.log maybe?) plus code to write errors to the Apache error
log under the mod_python handler and environ['wsgi.errors'] under WSGI
(meaning mod_wsgi will write to the Apache error log as well).

Benefits of logging as a core Django service
============================================

Adding logging to Django core would provide the following benefits:

1. We'll be able to de-emphasise the current default "e-mail all
errors to someone" behaviour, which doesn't scale at all well.

2. Having logging in the core framework will mean people start
actually using it, which will make it easier for people to debug their
Django apps. Right now adding "print" statements to a Django app is a
common debugging technique, but it's messy (you have to remember to
take them out again) and error prone - some production environments
throw errors if an app attempts to write to stdout. It's also not
obvious - many developers are surprised when I show them the
technique.

3. Logging in Django core rather than a 3rd party app will encourage
reusable applications to log things in a predictable, standard way.

4. 3rd party debugging tools such as the debug toolbar will be able to
hook in to Django's default logging behaviour. This could also lead to
plenty of additional 3rd party innovation - imagine a tool that looks
out for logged SQL that took longer than X seconds, or one that groups
together similar log messages, or streams log messages to IRC...

5. Built-in support for logging reflects a growing reality of modern
Web development: more and more sites have interfaces with external web
service APIs, meaning there are plenty of things that could go wrong
that are outside the control of the developer. Failing gracefully and
logging what happened is the best way to deal with 3rd party problems
- much better than throwing a 500 and leaving no record of what went
wrong.

6. Most importantly from my point of view, when a sysadmin asks where
Django logs errors in production we'll have a good answer for them!

7. As a general rule, I believe you can never have too much
information about what's going on with your web application. I've
never thought to myself "the problem with this bug is I've got too
much information about it". As for large log files, disk space is
cheap - and pluggable backends could ensure logs were sensibly
rotated.

Places logging would be useful
==============================

- Unhandled exceptions that make it up to the top of the Django stack
(and would cause a 500 error to be returned in production)
- The development web server could use logging for showing processed
requests (where currently these are just printed to stdout).
- Failed attempts at signing in to the admin could be logged, making
security audits easier.
- We could replace (or complement) django.db.connection.queries with a
log of executed SQL. This would make the answer to the common question
"how do I see what SQL is being executed" much more obvious.
- Stuff that loads things from INSTALLED_APPS could log what is being
loaded, making it much easier to spot and debug errors caused by code
being incorrectly loaded.
- Likewise, the template engine could log which templates are being
loaded from where, making it easier to debug problems stemming from an
incorrectly configured TEMPLATE_DIRS setting.
- We could use logging to address the problems with the template
engine failing silently - maybe some template errors (the ones more
likely to be accidental than just people relying on the fail-silent
behaviour deliberately) should be logged as warnings.

Most of the above would be set to a low log level which by default
would not be handled, displayed or stored anywhere (logging.info or
similar). Maybe "./manage.py runserver --loglevel=info" could cause
such logs to be printed to the terminal while the development server
is running.

Problems and challenges
=======================

1. The Python logging module isn't very nicely designed - its Java
heritage shines through, and things like logging.basicConfig behave in
unintuitive ways (if you call basicConfig twice the second call
silently has no effect). This is why I suggest wrapping it in our
own higher level interface.

2. There may be some performance overhead, especially if we replace
mechanisms like django.db.connection.queries with logging. This should
be negligible: here's a simple benchmark:

# ("hello " * 100) gives a 600 char string, long enough for a SQL statement
>>> import timeit, logging
>>> t = timeit.Timer('logging.info("hello " * 100)', 'import logging')
>>> t.timeit(number=100) # one hundred statements
0.00061702728271484375
>>> t.timeit(number=1000000) # one million statements
6.458014965057373

That's 0.0006 of a second overhead for a page logging 100 SQL
statements. The performance overhead will go up if you attach a
handler, but that's fine - the whole point of a framework like
'logging' is that you can log as much as you like but only act on
messages above a certain logging level.
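
A minimal sketch of that last point, using only the standard library. The logger name `benchmark.sketch` and the `CountingHandler` class are made up for illustration; the handler only counts the records that reach it, so it shows that calls below the logger's level never get as far as a handler.

```python
import logging


class CountingHandler(logging.Handler):
    """Illustrative handler that counts the records delivered to it."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def emit(self, record):
        self.count += 1


logger = logging.getLogger("benchmark.sketch")  # hypothetical logger name
logger.propagate = False          # keep the demo isolated from the root logger
logger.setLevel(logging.WARNING)  # only act on WARNING and above
handler = CountingHandler()
logger.addHandler(handler)

for _ in range(100):
    # Below the WARNING threshold: rejected by the level check,
    # so no record is created and no handler runs.
    logger.info("hello " * 100)

logger.warning("slow query")      # at the threshold: delivered to the handler

print(handler.count)
```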

3. We risk people using logging where signals would be more
appropriate.

4. We might go too far, and make Django a "noisy" piece of software
which logs almost everything that happens within it. Let's be tasteful
about this.

5. People might leave logging on, then find their server disk has
filled up with log files and caused their site to crash.

6. Logging levels are confusing - what exactly is the difference
between warn, info, error, debug, critical and fatal? We would need to
document this and make decisions on which ones get used for what
within the framework.

What would it look like?
========================

Here's what I'm thinking at the moment (having given the API very
little thought). In your application code:

from django.core import log
# Log to the default channel:
log.debug('Retrieving RSS feed from %s' % url)
# Log to a channel specific to your app:
log.debug('Retrieving RSS feed from %s' % url, channel='myapp.rss')
try:
    feed = httpfetch(url, timeout=3)
except socket.timeout:
    log.info('Timeout fetching feed %s' % url)

In settings.py:

MIDDLEWARE_CLASSES = (
    ...
    'django.middleware.LogErrorsToWSGI',  # write exceptions to wsgi.errors
)
LOGGING_MIDDLEWARE_LEVEL = 'info'

# If you want custom log handlers - not sure how these would interact with
# channels and log levels yet
LOG_HANDLERS = (
    'django.core.log.handlers.LogToDatabase',
    'django.core.log.handlers.LogToEmail',
)

What do people think? I'd be happy to flesh this out in to a full spec
with running code.

Mat Clayton

Sep 17, 2009, 5:37:42 AM
to django-d...@googlegroups.com
+1 for this, and another random thought which doesn't do your long post justice: what are everyone's thoughts about log aggregation, i.e. taking logs from X app servers and combining them into a single location, something like Facebook's Scribe? I assume this could be built as a separate log handler. It would be nice for the guys who need this functionality to be able to achieve it easily, though it's probably not right for core.

Mat
--
Matthew Clayton | Founder/CEO
Wakari Limited

twitter http://www.twitter.com/matclayton

email m...@wakari.co.uk
mobile +44 7872007851

skype matclayton

Horst Gutmann

Sep 17, 2009, 6:03:07 AM
to django-d...@googlegroups.com
Definitely a +1 from me.

-- Horst

Ivan Sagalaev

Sep 17, 2009, 6:53:57 AM
to django-d...@googlegroups.com
Hi Simon,

Simon Willison wrote:
> 1. We'll be able to de-emphasise the current default "e-mail all
> errors to someone" behaviour, which doesn't scale at all well.

In a recent thread[1] on a similar topic Russell also emphasized that
we should improve the documentation about doing logging.

> 3. Logging in Django core rather than a 3rd party app will encourage
> reusable applications to log things in a predictable way, standard
> way.

Talking about predictable and standard way I want to be sure that we
don't break existing logging.

I.e. we at Yandex now have many reusable Django apps that don't set up
loggers themselves but just log things into named loggers and expect
them to be set up by a project that uses them.

What I gather from your proposal is that you want the same model ("an
app logs, a project sets up") plus a nice declarative syntax in
settings.py instead of boring creation of handlers, formatters and
loggers. Right?

> - We could replace (or complement) django.connection.queries with a
> log of executed SQL. This would make the answer to the common question
> "how do I see what SQL is being executed" much more obvious.

In the thread that I was referring to[1] we kind of agreed on using a
signal there. Then hooking a logger onto the signal is simple.

> 5. People might leave logging on, then find their server disk has
> filled up with log files and caused their site to crash.

We had this problem with standard logging. Then we switched to a
RotatingFileHandler, which wasn't very good because its behavior
is simplistic and is not controllable by admins through an interface
they know, namely logrotate. Setting up logrotate also wasn't without
problems. When it rotates a file it should let the app know about it, and
it uses SIGHUP for that. However, this instantly terminates Django's
flup-based FastCGI server, which we use.

Now we've settled on a WatchedFileHandler ported from the Python 2.6
logging module. It watches for a change in the file's device or inode and
doesn't require SIGHUP to pick up a new file. Maybe we should port it to
Django and use it as a default handler for logging to the file system.
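
A small sketch of that behaviour, assuming a POSIX system and a Python version that ships `logging.handlers.WatchedFileHandler`. The logger name and the temporary paths are made up for illustration, and the `os.rename` call stands in for what logrotate would do.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

logger = logging.getLogger("rotation.sketch")  # hypothetical logger name
logger.propagate = False
logger.setLevel(logging.INFO)
handler = logging.handlers.WatchedFileHandler(log_path)
logger.addHandler(handler)

logger.info("before rotation")

# Simulate logrotate moving the live file aside. No signal is sent.
os.rename(log_path, log_path + ".1")

# On the next emit the handler notices the path no longer points at the
# file it has open, and reopens log_path as a fresh file.
logger.info("after rotation")
handler.close()

with open(log_path + ".1") as rotated, open(log_path) as current:
    print(rotated.read(), current.read())
```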

> 6. Logging levels are confusing - what exactly is the difference
> between warn, info, error, debug, critical and fatal? We would need to
> document this and make decisions on which ones get used for what
> within the framework.

Maybe we can just leave it for users to decide. It depends so much on
how much an app actually wants from logging.

The only standard thing I can think of is to have DEBUG = True imply
level = logging.DEBUG (which includes everything more severe). DEBUG =
False would then imply logging.INFO. What do you think?
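
As a rough sketch of that rule (the helper `level_for` and the logger name `myproject` are hypothetical, and the plain `DEBUG` variable stands in for the Django setting):

```python
import logging


def level_for(debug):
    """Return the logging threshold implied by a DEBUG-style flag."""
    return logging.DEBUG if debug else logging.INFO


DEBUG = True  # stands in for settings.DEBUG

project_logger = logging.getLogger("myproject")  # hypothetical name
project_logger.setLevel(level_for(DEBUG))

# With DEBUG on, debug() calls pass the level check; with it off they would not.
print(project_logger.isEnabledFor(logging.DEBUG))
```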

> # If you want custom log handlers - not sure how these would interact with
> # channels and log levels yet
> LOG_HANDLERS = (
>     'django.core.log.handlers.LogToDatabase',
>     'django.core.log.handlers.LogToEmail',
> )

This is a hard problem really. Most handlers require a different set of
arguments: file names, email credentials, system idents for SysLog, etc.
Also there should be different formatters. For example, there's no point
wasting space in a syslog message on a timestamp since syslog tracks it
itself...
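
To make the difficulty concrete, here is a sketch of two handlers on one logger, each with its own constructor arguments and its own formatter. The logger name and paths are made up, and the in-memory stream is only a stand-in for a syslog-style destination (a real `SysLogHandler` would take an address rather than a stream).

```python
import io
import logging
import os
import tempfile

logger = logging.getLogger("handlers.sketch")  # hypothetical logger name
logger.propagate = False
logger.setLevel(logging.INFO)

# File destination: needs a path, and wants its own timestamp.
log_path = os.path.join(tempfile.mkdtemp(), "app.log")
file_handler = logging.FileHandler(log_path)
file_handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

# Stand-in for a syslog-style destination: needs a stream here, and
# leaves the timestamp out because the receiver would add one itself.
syslog_like = io.StringIO()
stream_handler = logging.StreamHandler(syslog_like)
stream_handler.setFormatter(
    logging.Formatter("%(name)s: %(levelname)s %(message)s"))

logger.addHandler(file_handler)
logger.addHandler(stream_handler)

logger.info("cache warmed")
file_handler.close()

print(syslog_like.getvalue())
```

A flat tuple of dotted class paths, as in the quoted LOG_HANDLERS, has nowhere to carry these per-handler arguments and formats.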

[1]:
http://groups.google.com/group/django-developers/browse_frm/thread/9d0992e800cf7d68#

Russell Keith-Magee

Sep 17, 2009, 10:04:27 AM
to django-d...@googlegroups.com

No disagreement here with any of these assertions.

In the absence of specifics, this makes me a little bit nervous. The
Python logging interface may be very Java-heavy and complex, but it is
a thoroughly known quantity, and it houses a lot of features.

I've seen several attempts to wrap Java loggers in a "nicer"
interface, and every one of them ended up hobbling some of the power
features of the logger. There is also the issue of our wrapper playing
nicely with the loggers already being used in the wild.

I'm also not entirely convinced that the answer here isn't just
documentation. The documentation for log4j has historically been
pretty awful, and while Python's documentation is an improvement, it
could certainly be better IMHO. Good documentation for how to use
logging in the context of Django could go a long way.

> 3. We risk people using logging where signals would be more
> appropriate.

This may be a better way to approach the problem - more details below.

Details notwithstanding, I'm +1 to the idea of adding logging to the
core framework - or, at least, making it easier to use logs for
reporting internal state and error conditions instead of email.

As for likely roadblocks: I've been led to believe that Adrian has
objections to framework-level logging. I have no idea as to the nature
of his objection, but ticket #5415 indicates that he is (or has been,
historically) in favor of adding signals that could be used for
logging or debugging purposes.

Yours,
Russ Magee %-)

Russell Keith-Magee

Sep 17, 2009, 10:04:32 AM
to django-d...@googlegroups.com
On Thu, Sep 17, 2009 at 6:53 PM, Ivan Sagalaev
<man...@softwaremaniacs.org> wrote:
>
> Hi Simon,
>
> Simon Willison wrote:
>> 1. We'll be able to de-emphasise the current default "e-mail all
>> errors to someone" behaviour, which doesn't scale at all well.
>
> In a recent thread[1] on a similar topic Russel has also emphasized that
> we should improve documentation about doing logging.

To clarify - I think that documentation is the very least we should
do. As your comments indicate, there are a lot of things you need to
do in order to get logging right, so we should at the very least
provide some documentation on how to do it right.

Yours,
Russ Magee %-)

Andrew Gwozdziewycz

Sep 17, 2009, 10:33:55 AM
to django-d...@googlegroups.com
On Thu, Sep 17, 2009 at 10:04 AM, Russell Keith-Magee
<freakb...@gmail.com> wrote:

> As for likely roadblocks: I've been led to believe that Adrian has
> objections to framework-level logging. I have no idea as to the nature
> of his objection, but ticket #5415 indicates that he is (or has been,
> historically) in favor of adding signals that could be used for
> logging or debugging purposes.

I'm in favor of the signals approach as part of the core framework. In
addition, a django.contrib.logging that provided a generic solution, good
for the 80 or 90% case, would really rock.

--
http://www.apgwoz.com

Eric Florenzano

Sep 17, 2009, 1:40:19 PM
to Django developers
On Sep 17, 1:25 am, Simon Willison <si...@simonwillison.net> wrote:
> 1. We'll be able to de-emphasise the current default "e-mail all
> errors to someone" behaviour, which doesn't scale at all well.

I'm a big fan of this proposal, for exactly this reason.

+1

Thanks,
Eric Florenzano

SeanOC

Sep 17, 2009, 2:38:34 PM
to Django developers
+1 on the logging proposal. The stock python logging module is
definitely a bit of a finicky and confusing creature, especially for
people coming to Python for the first time with Django.

-0 on the signals-based approach. I would be wary of the potential
performance overhead of replacing logging with signals and/or
implementing logging in certain high-traffic areas with signals. It
would definitely be interesting to see some proof of concept
performance tests (i.e. what Simon did with the handlerless logging)
before going too far down that path.

-Sean O'Connor

apollo13

Sep 17, 2009, 4:32:08 PM
to Django developers
On Sep 17, 10:25 am, Simon Willison <si...@simonwillison.net> wrote:
> That's 0.0006 of a second overhead for a page logging 100 SQL
> statements. The performance overhead will go up if you attach a
> handler, but that's fine - the whole point of a framework like
> 'logging' is that you can log as much as you like but only act on
> messages above a certain logging level.
I wouldn't worry about performance. Lately I had a project doing
massive logging, which resulted in a performance loss, but replacing
my logging functions with simple no-op lambdas (e.g. replacing
django.core.log.warn with lambda *args, **kwargs: None if the
log level is error) gave me quite a performance boost.

Aside from that: +1 on the proposal.

Simon Willison

Sep 17, 2009, 4:41:26 PM
to Django developers
On Sep 17, 4:04 pm, Russell Keith-Magee <freakboy3...@gmail.com>
wrote: 
> I've seen several attempts to wrap Java loggers in a "nicer"
> interface, and every one of them ended up hobbling some of the power
> features of the logger. There is also the issue of our wrapper playing
> nicely with the loggers already being used in the wild.

I should clarify - by "lightweight wrapper" I basically mean a pre-
configured log setup and a standard place to import the logger from -
and maybe a tiny bit of syntactic sugar if it will make the common
case more palatable. I'm mostly just interested in making the logging
module an encouraged technique within the Django world. It should
definitely play nicely with any already-existing logging code.

Graham Dumpleton

Sep 17, 2009, 6:50:22 PM
to Django developers


On Sep 17, 6:25 pm, Simon Willison <si...@simonwillison.net> wrote:
> I think we should add logging to Django in version 1.2, implemented as
> a light-weight wrapper around the Python logging module
> (django.core.log maybe?) plus code to write errors to the Apache error
> log under the mod_python handler and environ['wsgi.errors'] under WSGI
> (meaning mod_wsgi will write to the Apache error log as well).

It isn't necessarily practical to use environ['wsgi.errors'] as that
exists only for life of that request. Thus, anything done at time of
module imports or in background threads wouldn't have access to it.
You are better off just using sys.stderr.
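
A minimal sketch of that suggestion with the standard logging module: a process-wide handler writing to sys.stderr, which is available at import time and in background threads because it does not depend on any request. The logger name `myapp.startup` and the format string are made up for illustration.

```python
import logging
import sys

# One handler for the whole process, independent of any request's environ.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(
    logging.Formatter("%(levelname)s %(name)s: %(message)s"))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# Works during module import or in a background thread.
startup_logger = logging.getLogger("myapp.startup")  # hypothetical name
startup_logger.info("loaded at import time, outside any request")
```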

Graham

Eric Holscher

Sep 18, 2009, 12:21:12 PM
to django-d...@googlegroups.com
I have looked into logging before for another project, and I found that SQLAlchemy's support seemed to be a pretty good model to follow. They define all of their loggers under the sqlalchemy namespace, and then you can configure different handlers and levels for different things[1]:

import logging

logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
logging.getLogger('sqlalchemy.orm.unitofwork').setLevel(logging.DEBUG)

I think that this would be necessary to have in Django, so that for instance, I could listen to the django.orm logs, and not the django.http, or listen to them with different handlers/levels.
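
A sketch of how that could look with the standard module. The `django.orm` and `django.http` logger names are hypothetical (they are the ones used as examples above, not loggers Django defines here), and `ListHandler` is a made-up handler that just collects what it receives.

```python
import logging


class ListHandler(logging.Handler):
    """Illustrative handler that keeps (logger name, message) pairs."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.name, record.getMessage()))


collected = ListHandler()

base = logging.getLogger("django")  # hypothetical namespace root
base.propagate = False
base.addHandler(collected)
base.setLevel(logging.WARNING)      # default for everything under "django"

# Turn up verbosity for one subsystem only.
logging.getLogger("django.orm").setLevel(logging.DEBUG)

logging.getLogger("django.orm").debug("SELECT 1")         # kept
logging.getLogger("django.http").debug("GET /")           # dropped
logging.getLogger("django.http").warning("slow response")  # kept

print(collected.messages)
```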

Their implementation[2] is a little confusing to me, but I think that having some prior art like this will allow us to better understand what we need, and how to accomplish it, so I thought I would throw it out there.

1: http://www.sqlalchemy.org/docs/05/dbengine.html#configuring-logging
2: http://www.sqlalchemy.org/trac/browser/sqlalchemy/trunk/lib/sqlalchemy/log.py

--
Eric Holscher
Web Developer at The World Company in Lawrence, Ks
http://ericholscher.com

Simon Willison

Sep 18, 2009, 4:58:53 PM
to Django developers
On Sep 18, 6:21 pm, Eric Holscher <eric.holsc...@gmail.com> wrote:
> I have looked into Logging before for another project, and I found that
> SQLAlchemy's support seemed to be a pretty good model to follow. They define
> all of their loggers under the sqlalchemy namespace, and then you can
> configure different handlers for different things[1]:
>
> I think that this would be necessary to have in Django, so that for
> instance, I could listen to the django.orm logs, and not the django.http, or
> listen to them with different handlers/levels.

Yes, absolutely - this looks like exactly the right model.

Vinay Sajip

Sep 29, 2009, 4:36:16 AM
to Django developers


On Sep 17, 9:25 am, Simon Willison <si...@simonwillison.net> wrote:
> Problems and challenges
> =======================
>
> 1. The Python logging module isn't very nicely designed - its Java
> heritage shines through, and things like logging.basicConfig behave in
> unintuitive ways (if you call basicConfig twice the second call fails
> silently but has no effect). This is why I suggest wrapping it in our
> own higher level interface.

Simon, I'm the author of Python's logging package. Sorry for the delay
in replying, I've been away from this list awhile. I think the "Java
heritage shines through" is just FUD. basicConfig's behaviour is fully
documented here:

http://docs.python.org/library/logging.html#logging.basicConfig

Including the fact that it sometimes (by design) has no effect.

There are a lot of people for whom logging just means writing to a
file, and that's why they have difficulty understanding why logging is
designed as it is. I would suggest you take a quick look at

http://plumberjack.blogspot.com/2009/09/python-logging-101.html

and then tell me why you think Python logging isn't well designed for
its purpose. You can do basic logging with two lines of setup (one
line if you ignore the import):

import logging
logging.basicConfig(level=logging.DEBUG, filename='/path/to/my/log',
                    format='%(asctime)s %(message)s')

and then

logging.getLogger(__name__).debug("Just checking this works")

Not too sure where the Java heritage is there, or where the hard part
is.

>
> 2. There may be some performance overhead, especially if we replace
> mechanisms like django.connection.queries with logging. This should be
> negligible: here's a simple benchmark:
>
> # ("hello " * 100) gives a 600 char string, long enough for a SQL statement
> >>> import timeit, logging
> >>> t = timeit.Timer('logging.info("hello " * 100)', 'import logging')
> >>> t.timeit(number=100) # one hundred statements
> 0.00061702728271484375
> >>> t.timeit(number=1000000) # one million statements
> 6.458014965057373
>
> That's 0.0006 of a second overhead for a page logging 100 SQL
> statements. The performance overhead will go up if you attach a
> handler, but that's fine - the whole point of a framework like
> 'logging' is that you can log as much as you like but only act on
> messages above a certain logging level.

A quick-and-dirty measurement showed me that a logging call (which
writes to file) takes on the order of 57 microseconds, which can be
reduced to around 50 microseconds if you forego collecting stack
frame, thread and process information. Not too shabby, though perhaps
not appropriate for extremely high-performance use cases. I hasten to
add, it's not a scientific benchmark.

>
> 3. We risk people using logging where signals would be more
> appropriate.
>

They're for entirely different purposes so I can't imagine this will
happen too often.

> 4. We might go too far, and make Django a "noisy" piece of software
> which logs almost everything that happens within it. Let's be tasteful
> about this.
>

One thing about the logging design (which perhaps makes it appear
complicated) is that developers can shape the logging in such a way
that the verbosity in different parts can be turned on and off pretty
much at will, and even without restarting the server in some cases.
So, with a little care in how things are arranged, this needn't
happen.

> 5. People might leave logging on, then find their server disk has
> filled up with log files and caused their site to crash.
>
> 6. Logging levels are confusing - what exactly is the difference
> between warn, info, error, debug, critical and fatal? We would need to
> document this and make decisions on which ones get used for what
> within the framework.

DEBUG: Detailed information, of no interest when everything is working
well but invaluable when diagnosing problems.
INFO: Affirmations that things are working as expected, e.g. "service
has started" or "indexing run complete". Often ignored.
WARNING: There may be a problem in the near future, and this gives
advance warning of it. But the application is able to proceed
normally.
ERROR: The application has been unable to proceed as expected, due to
the problem being logged.
CRITICAL: This is a serious error, and some kind of application
meltdown might be imminent.
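
For reference, a short sketch showing how these five names map onto the numeric thresholds in the standard module. WARN and FATAL are aliases for WARNING and CRITICAL rather than separate levels.

```python
import logging

names = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
levels = [(name, getattr(logging, name)) for name in names]

for name, value in levels:
    print(name, value)

# The two "extra" names are aliases, not additional levels.
print(logging.WARN == logging.WARNING, logging.FATAL == logging.CRITICAL)
```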

I'll be happy to clarify further if needed.

> What would it look like?
> ========================
>
> Here's what I'm thinking at the moment (having given the API very
> little thought). In your application code:
>
> from django.core import log

I'm not sure it's a good idea to have a wrapper, as I don't believe
it's needed and may restrict the level of control you typically need
to have over logging. You'll not be convinced by my just saying so -
therefore, I'll be happy to work with you to understand what (in
specific areas, including at the module level) you're trying to
achieve, and explaining the best way to achieve it.

> # Log to the default channel:
> log.debug('Retrieving RSS feed from %s' % url)
> # Log to a channel specific to your app:
> log.debug('Retrieving RSS feed from %s' % url, channel='myapp.rss')
> try:
>     feed = httpfetch(url, timeout=3)
> except socket.timeout:
>     log.info('Timeout fetching feed %s' % url)
>
> In settings.py:
>
> MIDDLEWARE_CLASSES = (
>     ...
>     'django.middleware.LogErrorsToWSGI', # write exceptions to wsgi.errors
> )
> LOGGING_MIDDLEWARE_LEVEL = 'info'
>
> # If you want custom log handlers - not sure how these would interact with
> # channels and log levels yet
> LOG_HANDLERS = (
>     'django.core.log.handlers.LogToDatabase',
>     'django.core.log.handlers.LogToEmail',
> )
>
> What do people think? I'd be happy to flesh this out in to a full spec
> with running code.

Please email me and I will be happy to help with this.

Regards,

Vinay Sajip

Vinay Sajip

Sep 29, 2009, 4:39:49 AM
to Django developers


On Sep 17, 10:37 am, Mat Clayton <m...@wakari.co.uk> wrote:
> +1 for this, another random thought which doesn't do your long post justice.
> But what are everyone's thoughts about log aggregation, taking logs from X
> app servers and combining them into a single location, something like
> Facebook's Scribe. I assume this could be built in as a separate log
> handler, but it would be nice for the guys who need this functionality to be
> able to achieve it easily, probably not right for core though.

Mat, logging's design already allows you to do this aggregation - you
can have logging events collected in multiple locations. See the
Python documentation here for pointers:

http://docs.python.org/library/logging.html#sending-and-receiving-logging-events-across-a-network

Regards,

Vinay Sajip

Vinay Sajip

Sep 29, 2009, 4:53:16 AM
to Django developers


On Sep 17, 11:53 am, Ivan Sagalaev <man...@softwaremaniacs.org> wrote:
> Talking about predictable and standard way I want to be sure that we
> don't break existing logging.
>
> I.e. we at Yandex now have many reusable Django apps that don't setup
> loggers themselves but just log things into named loggers and expect
> them to be setup by a project that uses them.

That's normal. The pattern I use is:

In each module which needs to use logging, instantiate a module-global
logger using

logger = logging.getLogger(__name__)

and log to it in the module's code. The configuration happens in
settings.py.
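
Put together, the pattern might look like the sketch below. The function `fetch_feed` and the URL are made up for illustration, and the `basicConfig` call stands in for whatever configuration the project would do in settings.py.

```python
import logging

# App side: one module-global logger, named after the module.
logger = logging.getLogger(__name__)


def fetch_feed(url):
    """Hypothetical app function that logs but configures nothing."""
    logger.debug("Retrieving RSS feed from %s", url)
    return url


# Project side (e.g. in settings.py): decide where records go and how
# verbose they are. The app code above does not change.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(name)s %(levelname)s %(message)s",
)

fetch_feed("http://example.com/feed")
```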

> What I gather from your proposal is that you want the same model ("an
> app logs, a project setups") plus a nice declarative syntax in
> settings.py instead of boring creation of handlers, formatters and
> loggers. Right?
>

Actually you don't need much in settings.py, and Django doesn't need
to grow any code of its own to "wrap" logging. You can either
configure logging programmatically (for which I use basicConfig, in
simple setups) or using a configuration file (ConfigParser-based,
fully documented in Python docs) for more complex setups.

> > - We could replace (or complement) django.connection.queries with a
> > log of executed SQL. This would make the answer to the common question
> > "how do I see what SQL is being executed" much more obvious.
>

Yes, I do this with a patched CursorDebugWrapper. You can direct the
SQL to a separate file which contains only the SQL events and not
other logging events.
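
The patch itself isn't shown here, so the following is only a sketch of the routing idea, not the CursorDebugWrapper change. The logger name `myproject.sql`, the `execute` stand-in and the file path are all hypothetical.

```python
import logging
import os
import tempfile

sql_log_path = os.path.join(tempfile.mkdtemp(), "sql.log")

sql_logger = logging.getLogger("myproject.sql")  # hypothetical name
sql_logger.setLevel(logging.DEBUG)
sql_logger.propagate = False  # keep SQL out of the project's other logs

sql_handler = logging.FileHandler(sql_log_path)
sql_handler.setFormatter(logging.Formatter("%(message)s"))
sql_logger.addHandler(sql_handler)


def execute(sql, params=()):
    """Stand-in for a cursor wrapper's execute(): log the statement."""
    sql_logger.debug("%s; params=%r", sql, params)


execute("SELECT * FROM auth_user WHERE id = %s", (1,))

# Events from other loggers never reach the SQL file.
logging.getLogger("myproject.views").warning("not an SQL event")

sql_handler.close()
with open(sql_log_path) as f:
    print(f.read())
```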

>
> We had this problem with standard logging. Then we switched to a
> RotatingFileHandler which wasn't very good however because its behavior
> is simplistic and is not controllable by admins with an interface that
> they know, namely logrotate. Setting up logrotate also wasn't without
> problems. When it rotates a file it should let an app know about it and
> it uses SIG_HUP for that. However this instantly terminates Django's
> flup-based FastCGI server which we use.
>
> Now we've settled on a WatchedFileHandler ported from Python 2.6 logging
> module. It watches for file descriptor change and doesn't require
> SIG_HUP to pick up a new file. May be we should port it to Django and
> use it as a default handler for logging to file system.

Why "port it to Django"? Do you mean, copy it into Django? I'm not
sure it should be the default - not everybody uses logrotate. I'd
leave this sort of decision for code in settings.py.

Regards,


Vinay Sajip

Vinay Sajip

Sep 29, 2009, 5:09:21 AM
to Django developers


On Sep 17, 3:04 pm, Russell Keith-Magee <freakboy3...@gmail.com>
wrote:
> In the absence of specifics, this makes me a little bit nervous. The
> Python logging interface may be very Java-heavy and complex, but it is
> a thoroughly known quantity, and it houses a lot of features.

See my comment about "Java-heavy" being FUD in my reply to Simon's
initial post.

>
> I've seen several attempts to wrap Java loggers in a "nicer"
> interface, and every one of them ended up hobbling some of the power
> features of the logger. There is also the issue of our wrapper playing
> nicely with the loggers already being used in the wild.
>

Absolutely agree. Wrapping is the wrong way to go, and not even
needed. I use logging with Django all the time and see no need to have
any special code in Django to support logging. Where necessary, I've
patched my Django to include logging statements.

> I'm also not entirely convinced that the answer here isn't just
> documentation. The documentation for log4j has historically been
> pretty awful, and while Python's documentation is an improvement, it
> could certainly be better IMHO. Good documentation for how to use
> logging in the context of Django could go a long way.
>

I'm working with Doug Hellmann (PyMOTW) to try and improve the layout
of the logging documentation in Python. I'm not asking for patches
(though it would be nice), but if you can give *specific* criticisms
(e.g. what you think is missing, or unclear) then that will focus our
efforts.

>
> Details notwithstanding, I'm +1 to the idea of adding logging to the
> core framework - or, at least, making it easier to use logs for
> reporting internal state and error conditions instead of email).
>

You can have your cake and eat it. It's perfectly feasible in Python
logging to send only certain events to nominated email addresses (all
configurable at run-time, so emails can be turned on/off, sent to
different/additional destinations etc.) as well as e.g. logging
tracebacks to file for the same events.
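
A sketch of that arrangement, assuming a mail server on localhost. The addresses, subject, logger name and path are all made up, and no mail is actually sent in this example because nothing at ERROR level is logged.

```python
import logging
import logging.handlers
import os
import tempfile

logger = logging.getLogger("errors.sketch")  # hypothetical logger name
logger.propagate = False
logger.setLevel(logging.INFO)

# Every event at INFO or above goes to the file.
log_path = os.path.join(tempfile.mkdtemp(), "errors.log")
file_handler = logging.FileHandler(log_path)
file_handler.setLevel(logging.INFO)

# Only ERROR and above would also be mailed out.
mail_handler = logging.handlers.SMTPHandler(
    mailhost="localhost",              # assumed mail server
    fromaddr="server@example.com",     # placeholder address
    toaddrs=["admins@example.com"],    # placeholder address
    subject="Application error",
)
mail_handler.setLevel(logging.ERROR)

logger.addHandler(file_handler)
logger.addHandler(mail_handler)

# Below ERROR: written to the file, skipped by the mail handler.
logger.warning("disk usage high")
```

Calling `logger.exception(...)` inside an `except` block would send the same event, traceback included, to both destinations.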

> As for likely roadblocks: I've been led to believe that Adrian has
> objections to framework-level logging. I have no idea as to the nature
> of his objection, but ticket #5415 indicates that he is (or has been,
> historically) in favor of adding signals that could be used for
> logging or debugging purposes.
>

They (logging and signals) are two different things. Python logging
allows you to consider the dimensions "What happened?", "Where did it
happen?", "How important is it?" and "Who wants to know?"
intelligently, and in particular it treats "What happened" and "Who
wants to know?" orthogonally. You get a lot of ways of getting
information to *people* whereas signals is more about letting *code*
know what's going on.

Unfortunately, a lot of people have got the impression that Python
logging is "Java-like" and "not Pythonic" just because I acknowledged
some good ideas in log4j. It's not as if Python people have a monopoly
on good ideas, is it? This "Java heritage" perception sometimes leads
to prejudice against logging, for no good reason that I can see. Of
course there might be grievances - for example, people complain about
"slow". As a single logging call which just has a file handler is of
the order of some tens of microseconds, I don't know how bad this is -
what do we compare against? The Tornado webserver (used by FriendFeed)
is a high-performance solution which uses Python logging. SQLAlchemy
uses Python logging. They are careful to consider performance and as a
consequence logging doesn't present a problem in practice.

Regards,

Vinay Sajip

Vinay Sajip

Sep 29, 2009, 5:10:25 AM9/29/09
to Django developers


On Sep 17, 3:04 pm, Russell Keith-Magee <freakboy3...@gmail.com>
wrote:
> To clarify - I think that documentation is the very least we should
> do. As your comments indicate, there are a lot of things you need to
> do in order to get logging right, so we should at the very least
> provide some documentation on how to do it right.
>
As I posted in an earlier response to Simon, I'm happy to help with
this.

Regards,

Vinay Sajip

Vinay Sajip

Sep 29, 2009, 5:15:00 AM9/29/09
to Django developers


On Sep 17, 9:41 pm, Simon Willison <si...@simonwillison.net> wrote:
> I should clarify - by "lightweight wrapper" I basically mean a pre-
> configured log setup and a standard place to import the logger from -

There's no "the logger". Each module should have its own logger; this
allows you to control the verbosity of logging at least at the module
level, and potentially with finer granularity than that.
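
A short sketch of what per-module loggers buy you, using made-up module names ("myproject.orders", "myproject.payments") purely for illustration. In real code each module would simply call logging.getLogger(__name__):

```python
import logging

# Stand-ins for what logging.getLogger(__name__) would return inside
# the hypothetical modules myproject/orders.py and myproject/payments.py.
orders_log = logging.getLogger("myproject.orders")
payments_log = logging.getLogger("myproject.payments")

# Quieten the whole package: children inherit the parent's level.
logging.getLogger("myproject").setLevel(logging.WARNING)

# Turn up verbosity for one module only while debugging it.
payments_log.setLevel(logging.DEBUG)
```

Because logger names form a dotted hierarchy, finer granularity is just a longer name, for example a child logger such as "myproject.payments.sql" with its own level.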

> and maybe a tiny bit of syntactic sugar if it will make the common
> case more palatable. I'm mostly just interested in making the logging
> module an encouraged technique within the Django world. It should
> definitely play nicely with any already-existent logging code.

+1. There are already patterns for doing this which work and involve
no need for django.contrib.log or similar, and I'll happily work with
the core devs to make this happen.

Regards,

Vinay Sajip

Russell Keith-Magee

Sep 29, 2009, 8:53:06 AM9/29/09
to django-d...@googlegroups.com
On Tue, Sep 29, 2009 at 5:09 PM, Vinay Sajip <vinay...@yahoo.co.uk> wrote:
>
> On Sep 17, 3:04 pm, Russell Keith-Magee <freakboy3...@gmail.com>
> wrote:
>> In the absence of specifics, this makes me a little bit nervous. The
>> Python logging interface may be very Java-heavy and complex, but it is
>> a thoroughly known quantity, and it houses a lot of features.
>
> See my comment about Java-heavy being FUD in Simon's initial post.

First off - let me reinforce that I'm in your camp here - I like
Python's logger, and I think we should be adding logging to Django.
Any hesitation I have expressed is mostly a function of institutional
inertia, especially with regards to Adrian's historical position on
logging.

However, I would point out that IMHO, FUD is an accurate description
of the state of play - though probably not in the way you meant.

Python's logging api _looks_ a lot like log4j in parts. This is at
least partially because there's a limit to how many ways you can
express 'log.debug()' before you start to copy. However, as a result,
there's a lot of Fear, Uncertainty and Doubt as to whether a framework
that apparently has Java heritage is going to be any good in Python.
Don't forget that a lot of us (myself included) got into writing
Python to get away from the stupidities of the Java world. Those scars
are deep, and aren't going away in a hurry. Speaking personally, log4j
is responsible for a lot of those scars, due in no small part to the
abysmal documentation for that project.

>> I'm also not entirely convinced that the answer here isn't just
>> documentation. The documentation for log4j has historically been
>> pretty awful, and while Python's documentation is an improvement, it
>> could certainly be better IMHO. Good documentation for how to use
>> logging in the context of Django could go a long way.
>>
>
> I'm working with Doug Hellmann (PyMOTW) to try and improve the layout
> of the logging documentation in Python. I'm not asking for patches
> (though it would be nice), but if you can give *specific* criticisms
> (e.g. what you think is missing, or unclear) then that will focus our
> efforts.

My comment was actually directed at Django's documentation, which is
currently silent on the issue of logging - and probably shouldn't be.

However, since you're interested in feedback, my suggestion would be
to look at every defense you've made of logging in this thread (and
any other threads where you've had similar arguments), and work out
why the current docs have allowed those viewpoints to be established
as apparent fact. Some examples:

* Acknowledge that there is some Java heritage, but point out that
this doesn't mean it's a bad thing, and that there is a lot that
_isn't_ Java based about Python's logger.

* Highlight the important architectural picture. As you noted in
another reply - the logger and the handler are quite separate, and
this gives a lot of power. However, the existence and significance of
that architectural separation isn't really a major feature of the
current docs. At present, the architectural bits are buried inside API
discussion, but understanding this architecture is important if you're
going to understand why logging works the way it does, and understand
that logging isn't just putting lines into a file.

* Make the simple example actually simple. IMHO, a single-file simple
logging example is good for exactly 2 things:
- showing how to configure the simplest possible case of logging
- explaining the "why don't I have any output" problem.
Tasks like configuring the logger to use a rotating file handler are
important, but can wait for much later - once issues of basic usage
and architecture have been established.

* Better examples of how logging works in the real world. All the
examples focus on single file projects. Most of the complexities I've
had with logging stem from how to use it in a multiple-file project,
yet as far as I can make out, there is very little discussion of how
logging should be used in a real multiple-file project.
- Should I have one logger instance per module? One per conceptual "task"?
- You've used "logging.getLogger(__name__)" in this thread, but this
pattern isn't mentioned once in the docs. Is this best practice, or a
quick-and-dirty hack?
- When I have multiple loggers across multiple files, how do I
configure logging? Should I be putting logging.config.fileConfig() at
the start of every python file, or should I put the logging config
into a single python file somewhere that configures logging, and
import that module as needed?

>> Details notwithstanding, I'm +1 to the idea of adding logging to the
>> core framework - or, at least, making it easier to use logs for
>> reporting internal state and error conditions instead of email.
>>
>
> You can have your cake and eat it. It's perfectly feasible in Python
> logging to send only certain events to nominated email addresses (all
> configurable at run-time, so emails can be turned on/off, sent to
> different/additional destinations etc.) as well as e.g. logging
> tracebacks to file for the same events.

Agreed.

>> As for likely roadblocks: I've been led to believe that Adrian has
>> objections to framework-level logging. I have no idea as to the nature
>> of his objection, but ticket #5415 indicates that he is (or has been,
>> historically) in favor of adding signals that could be used for
>> logging or debugging purposes.
>>
>
> They (logging and signals) are two different things. Python logging
> allows you to consider the dimensions "What happened?", "Where did it
> happen?", "How important is it?" and "Who wants to know?"
> intelligently, and in particular it treats "What happened" and "Who
> wants to know?" orthogonally. You get a lot of ways of getting
> information to *people* whereas signals is more about letting *code*
> know what's going on.

Granted, although the two aren't mutually exclusive. After all, the
way you let someone know "what happened" is with code; it isn't hard
to think of a setup where we emit a signal everywhere that we might
want to log, and then attach a logging signal handler to those signals.

I'm not suggesting that this would be a good architecture - merely a
possible one.

> Unfortunately, a lot of people have got the impression that Python
> logging is "Java-like" and "not Pythonic" just because I acknowledged
> some good ideas in log4j. It's not as if Python people have a monopoly
> on good ideas, is it? This "Java heritage" perception sometimes leads
> to prejudice against logging, for no good reason that I can see.

An idea isn't bad just because it comes from the Java world, but in
fairness - the Java world does have a history of producing some pretty
dumb ideas. "Based on a Java API" isn't generally a compliment in the
Python world :-)

I understand entirely the frustration of having a project perceived
"the wrong way" by the public - Django Evolution has been stuck with a
"magic" moniker for reasons that I can't begin to fathom. However, at
the end of the day, you can't blame the community for their
perceptions. One or two people might accidentally misunderstand
something, but when it starts happening systemically, you need to
start looking inward for the cause.

> Of
> course there might be grievances - for example, people complain about
> "slow". As a single logging call which just has a file handler is of
> the order of some tens of microseconds, I don't know how bad this is -
> what do we compare against? The Tornado webserver (used by FriendFeed)
> is a high-performance solution which uses Python logging. SQLAlchemy
> uses Python logging. They are careful to consider performance and as a
> consequence logging doesn't present a problem in practice.

For the record, speed actually isn't one of my major concerns.

Yours,
Russ Magee %-)

Waylan Limberg

Sep 29, 2009, 9:00:42 AM9/29/09
to django-d...@googlegroups.com

The hard part is that basicConfig only works like that back to Python
2.4, yet Django supports 2.3. When I added logging to Python-Markdown,
this was the hardest part: figuring out how to configure logging so
that it works in 2.3 as well. The documentation is not exactly helpful
in that regard.

In fact, it was for this very reason that we added our own wrapper
around logging. It didn't seem reasonable for our users to go through
the same pain that we did. Sure we got a few things wrong at first,
but with the help of a few people in the community we worked those out
and our wrapper seems to work ok now. Yes - ok - I get the sense it
could be better.

Ever since then, any mention of logging leaves a bad taste in my
mouth. Perhaps if I was working only in 2.6 or such, this wouldn't be
an issue, but we have promised support back to 2.3.

Of course, it is possible that I'm missing something obvious.

--
----
\X/ /-\ `/ |_ /-\ |\|
Waylan Limberg

Russell Keith-Magee

Sep 29, 2009, 9:06:48 AM9/29/09
to django-d...@googlegroups.com
...

> Of course, it is possible that I'm missing something obvious.

As luck would have it, you are :-)

Django 1.2 will drop formal support for Python 2.3.

Yours,
Russ Magee %-)

Ivan Sagalaev

Sep 29, 2009, 9:25:33 AM9/29/09
to django-d...@googlegroups.com
Vinay Sajip wrote:
> Actually you don't need much in settings.py, and Django doesn't need
> to grow any code of its own to "wrap" logging. You can either
> configure logging programmatically (for which I use basicConfig, in
> simple setups) or using a configuration file (ConfigParser-based,
> fully documented in Python docs) for more complex setups.

Thanks! Didn't know that. However see my further comment.

>> Now we've settled on a WatchedFileHandler ported from Python 2.6 logging
>> module. It watches for file descriptor change and doesn't require
>> SIG_HUP to pick up a new file. Maybe we should port it to Django and
>> use it as a default handler for logging to file system.
>
> Why "port it to Django"? Do you mean, copy it into Django? I'm not
> sure it should be the default - not everybody uses logrotate. I'd
> leave this sort of decision for code in settings.py.

Using WatchedFileHandler is a safe default because it works like
FileHandler but doesn't break with logrotate. I don't know of any
disadvantages of WatchedFileHandler compared with the old FileHandler. So I
don't think there's much value in giving people this choice in settings
because non-default behavior will be rare (and still possible anyway).

One of the reasons why I propose Django's own settings structure for
logging is because we can choose better defaults for logging and have
more compact syntax for them. Standard Python logging configuration has
a noticeable gap between very simplistic basicConfig which configures
only a root channel and a verbose imperative definition of handler
objects, formatter objects and logger objects. I've found that my usage
of logging inevitably falls in between: I often need a few logging
channels but I almost never, say, reuse handler objects between them.

Here's a variant of a simple config that I had in mind lately:

LOGGING = {
    'errors': {
        'handler': 'django.logging.FileLogger',  # copy of WatchedFileHandler
        'filename': '...',
        'level': 'debug',
    },
    'maintenance': {
        'handler': 'logging.handlers.HTTPHandler',
        'host': '...',
        'url': '....',
        'format': '....'
    },
}

Top-level keys are logger names. Values are dicts describing handlers.
These dicts have several keys that Django knows about:

- 'handler': a handler class, given as a dotted path and imported like
any other class named by a string in settings

- 'level': a level keyword that is translated into logging.* constants.
This is done to not make users import logging by hand.

- 'format': a format string for the logging.Formatter object. We can
have a more sensible default for this than the one in Python logging. Or
not :-)

These keys are pop'd out of the dict and the rest is used as **kwargs to
the handler class instantiation.

Django's default setup may look like this:

LOGGING = {
    '': {'handler': 'logging.StreamHandler'}
}

This has the advantage of always configuring a root logger, which avoids
the infamous warning from Python logging when a logger doesn't have any
handlers defined. Users wanting to configure all the logging themselves
can switch this off with `LOGGING = {}`.
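
A rough sketch of how such a LOGGING dict could be applied. The function name configure_logging is hypothetical, and attaching 'level' to the logger (rather than to the handler) is an assumption, since the description above leaves that open:

```python
import importlib
import logging


def configure_logging(config):
    """Apply a {logger_name: handler_description} mapping (illustrative only)."""
    for logger_name, description in config.items():
        options = dict(description)  # copy, so the settings dict is not mutated

        # Pop the keys the scheme knows about.
        handler_path = options.pop("handler")
        level_name = options.pop("level", None)
        format_string = options.pop("format", None)

        # Import the handler class from its dotted path.
        module_name, class_name = handler_path.rsplit(".", 1)
        handler_class = getattr(importlib.import_module(module_name), class_name)

        # Whatever is left is passed to the handler as keyword arguments.
        handler = handler_class(**options)
        if format_string is not None:
            handler.setFormatter(logging.Formatter(format_string))

        logger = logging.getLogger(logger_name)  # "" names the root logger
        if level_name is not None:
            # Assumption: 'debug' maps to logging.DEBUG and is set on the logger.
            logger.setLevel(getattr(logging, level_name.upper()))
        logger.addHandler(handler)
```

With this sketch, configure_logging({'': {'handler': 'logging.StreamHandler'}}) would give the root logger a stream handler, and an empty dict would configure nothing.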

Ivan Sagalaev

Sep 29, 2009, 9:35:36 AM9/29/09
to django-d...@googlegroups.com
Ivan Sagalaev wrote:
> Standard Python logging configuration has
> a noticeable gap between very simplistic basicConfig which configures
> only a root channel and a verbose imperative definition of handler
> objects, formatter objects and logger objects.

Forgot one thing. As it stands now Django has this "historic" behavior
where it imports and executes the settings module twice. This results in
breakage when you set up loggers and handlers by hand. We now circumvent
this by having a helper function that memoizes configured loggers and
calling it from settings.py. With a declarative config we can hide this
inside Django and not scare users.
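
For illustration, a helper of that kind might look roughly like this. The names setup_logger_once and _configured are invented here; the point is only that the helper lives in a module of its own, so its state survives settings.py being executed a second time:

```python
import logging

_configured = set()  # names of loggers that have already been set up


def setup_logger_once(name, handler_factory, level=logging.INFO):
    """Attach a handler to the named logger on the first call only."""
    logger = logging.getLogger(name)
    if name not in _configured:
        logger.setLevel(level)
        logger.addHandler(handler_factory())
        _configured.add(name)
    return logger
```

Calling it twice with the same name leaves the logger with a single handler, so each message is written once rather than twice.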

Vinay Sajip

Sep 29, 2009, 10:05:55 AM9/29/09
to Django developers

On Sep 29, 2:25 pm, Ivan Sagalaev <man...@softwaremaniacs.org> wrote:
>
> Using WatchedFileHandler is a safe default because it works like
> FileHandler but doesn't break with logrotate. I don't know of any
> disadvantages of WatchedFileHandler compared with the old FileHandler. So I
> don't think there's much value in giving people this choice in settings
> because non-default behavior will be rare (and still possible anyway).
>

It's similar to Django's support for, say, simplejson. I think it's
reasonable for Django to alias WatchedFileHandler so that it's
available either bound to logging's own implementation (in
sufficiently recent versions of Python) or else a copy in Django's own
code. Then people can use it if they want to, even in older Python
versions.

I have no big problem with a configuration scheme such as you suggest
- if it's felt that a lot of Django users are not Python-savvy enough
and need some hand-holding, then you'd perhaps need something like
this. I usually configure the logging system using its own
configuration file format (ConfigParser based, and supported by the
stdlib so no additional Django code required) or using YAML (where
it's already being used for other configuration, and when having a
PyYAML dependency is not a problem.) Either way it's declarative and
not too painful. My reservation with Django's own take on it is simply
that it goes against TOOWTDI and the Zen of Python, a little at least.
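
For readers who have not seen the stdlib's own format, here is a small sketch of it. The section contents are an illustrative minimum, and the function name configure_from_ini is made up for this example:

```python
import io
import logging
import logging.config

# Illustrative configuration in the stdlib's ConfigParser-based format.
LOGGING_INI = """\
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=simple

[logger_root]
level=INFO
handlers=console

[handler_console]
class=StreamHandler
level=DEBUG
formatter=simple
args=(sys.stderr,)

[formatter_simple]
format=%(asctime)s %(name)s %(levelname)s %(message)s
"""


def configure_from_ini(text=LOGGING_INI):
    # fileConfig accepts a filename or a file-like object; a real project
    # would more likely pass the path of a .ini file on disk.
    logging.config.fileConfig(io.StringIO(text), disable_existing_loggers=False)
```

Nothing here is Django-specific: the call could sit in settings.py or in any module that runs once at start-up.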

Regards,

Vinay Sajip

Vinay Sajip

Sep 29, 2009, 10:09:03 AM9/29/09
to Django developers


On Sep 29, 2:35 pm, Ivan Sagalaev <man...@softwaremaniacs.org> wrote:
> Forgot one thing. As it stands now Django has this "historic" behavior
> when it imports and executes settings module twice. This results in
> breakage when you set up loggers and handlers by hand. We now circumvent
> this by having a helper function that memoizes configured loggers and
> call it from settings.py. Having a declarative config we can hide this
> inside of Django and not scare users.

This is where the way basicConfig works actually helps - it's a no-op if
you call it again. (It wasn't done that way for use with Django
particularly - just to make life easier for newbies and casual users.)
Sure, an internal function would hide this from users.

Regards,

Vinay Sajip

Vinay Sajip

Sep 29, 2009, 10:59:45 AM9/29/09
to Django developers


On Sep 29, 1:53 pm, Russell Keith-Magee <freakboy3...@gmail.com>
wrote:
>
> First off - let me reinforce that I'm in your camp here - I like
> Python's logger, and I think we should be adding logging to Django.
> Any hesitation I have expressed is mostly a function of institutional
> inertia, especially with regards to Adrian's historical position on
> logging.
>

That's great. I agree about adding logging to Django, and would like
to help if I can. It would be good to understand what underlies
Adrian's historical position on logging.

> However, I would point out that IMHO, FUD is an accurate description
> of the state of play - though probably not in the way you meant.
>
> Python's logging api _looks_ a lot like log4j in parts. This is at
> least partially because there's a limit to how many ways you can
> express 'log.debug()' before you start to copy. However, as a result,
> there's a lot of Fear, Uncertainty and Doubt as to whether a framework
> that apparently has Java heritage is going to be any good in Python.
> Don't forget that a lot of us (myself included) got into writing
> Python to get away from the stupidities of the Java world. Those scars
> are deep, and aren't going away in a hurry. Speaking personally, log4j
> is responsible for a lot of those scars, due in no small part to the
> abysmal documentation for that project.
>

We're on the same page, I think. Python logging's similarity to log4j is, I
feel, skin deep. Just as our having features in common with monkeys
and apes doesn't *make* us monkeys and apes, so also with Python
logging and log4j. Compare and contrast: log4j is around 160 source
files and 16K SLOC, whereas Python logging is 3 source files and under
1500 SLOC! Notice the order of magnitude difference. Functionally,
Python logging pretty much provides the same functionality as log4j,
but it's a lot simpler internally. Python logging is *not* a port of
log4j, is written in as Pythonic a way as I know how (given that it
was written when it was), and got a lot of peer review from the smart
people on python-dev before going in (and got changed here and there
to satisfy concerns raised during the review process). Nevertheless,
FUD (and I think I did mean it in that sense - Fear, Uncertainty and
Doubt) needs to be allayed. I'll be happy to try and do this; please feel
free to ask any specific questions or make any specific criticisms and
I'll do my best to deal with them.

> My comment was actually directed at Django's documentation, which is
> currently silent on the issue of logging - and probably shouldn't be.

Right.

> However, since you're interested in feedback, my suggestion would be
> to look at every defense you've made of logging in this thread (and
> any other threads where you've had similar arguments), and work out
> why the current docs have allowed those viewpoints to be established
> as apparent fact. Some examples:
>
> * Acknowledge that there is some Java heritage, but point out that
> this doesn't mean it's a bad thing, and that there is a lot that
> _isn't_ Java based about Python's logger.
>

That's easier to do when people raise specific points, rather than
talk about Java heritage in an arm-waving way, as if it's an offshoot
of the Black Death ;-)

If you want an example of Python written in the Java style, look at
Apache QPid. Python logging ain't that.

> * Highlight the important architectural picture. As you noted in
> another reply - the logger and the handler are quite separate, and
> this gives a lot of power. However, the existence and significance of
> that architectural separation isn't really a major feature of the
> current docs. At present, the architectural bits are buried inside API

That's because the docs are really pitched mainly as a reference
guide.

> discussion, but understanding this architecture is important if you're
> going to understand why logging works the way it does, and understand
> that logging isn't just putting lines into a file.
>

I've recently created a blog about Python logging, where I talk about
logging from first principles and try to show why the design of Python
logging is as it is. It's not perfect, but it's a start.

http://plumberjack.blogspot.com/2009/09/python-logging-101.html

> * Make the simple example actually simple. IMHO, a single-file simple
> logging example is good for exactly 2 things:
> - showing how to configure the simplest possible case of logging
> - explaining the "why don't I have any output" problem.

I'm not sure what you're getting at. Sometimes, those two things are
all that people want to know at that time.

> Tasks like configuring the logger to use a rotating file handler are
> important, but can wait for much later - once issues of basic usage
> and architecture have been established.

Sure, and I hope with Doug Hellmann's input we can get the Python
logging docs to be laid out in a more logical order and imbued with
more clarity. Doug's PyMOTW series sets a high bar for documentation
quality - concise, yet clear.

> * Better examples of how logging works in the real world. All the
> examples focus on single file projects. Most of the complexities I've
> had with logging stem from how to use it in a multiple-file project,
> yet as far as I can make out, there is very little discussion of how
> logging should be used in a real multiple-file project.

Partly because every system is different. I agree, it can't hurt to
add some more complex examples.

> - Should I have one logger instance per module? One per conceptual "task"?
> - You've used "logging.getLogger(__name__)" in this thread, but this
> pattern isn't mentioned once in the docs. Is this best practice, or a
> quick-and-dirty hack?

True, it should perhaps be documented as a best practice, but I tend to
be wary of being too prescriptive.

> - When I have multiple loggers across multiple files, how do I
> configure logging? Should I be putting logging.config.fileConfig() at
> the start of every python file, or should I put the logging config
> into a single python file somewhere that configures logging, and
> import that module as needed?
>

This is all very good feedback and I will be thinking about how to
address this in the docs.

> > They (logging and signals) are two different things. Python logging
> > allows you to consider the dimensions "What happened?", "Where did it
> > happen?", "How important is it?" and "Who wants to know?"
> > intelligently, and in particular it treats "What happened" and "Who
> > wants to know?" orthogonally. You get a lot of ways of getting
> > information to *people* whereas signals is more about letting *code*
> > know what's going on.
>
> Granted, although the two aren't mutually exclusive. After all, the
> way you let someone know "what happened" is with code; it isn't hard
> to think of a setup where we emit a signal everywhere that we might
> want to log, and then attach a logging signal handler to those signals.
>
> I'm not suggesting that this would be a good architecture - merely a
> possible one.
>

I agree - it's possible, but perhaps not the most simple or intuitive.
If you can log something directly from where you are, why send a
signal to some other code which, while processing that signal, will
log something for you?

>
> An idea isn't bad just because it comes from the Java world, but in
> fairness - the Java world does have a history of producing some pretty
> dumb ideas. "Based on a Java API" isn't generally a compliment in the
> Python world :-)

I said based on their ideas, not based on their code. I didn't even
look at the log4j code when implementing Python logging. But as you
said, logger.debug("Message") is not quintessential-Java-and-a-million-
miles-from-Python.

>
> I understand entirely the frustration of having a project perceived
> "the wrong way" by the public - Django Evolution has been stuck with a
> "magic" moniker for reasons that I can't begin to fathom. However, at
> the end of the day, you can't blame the community for their
> perceptions. One or two people might accidentally misunderstand
> something, but when it starts happening systemically, you need to
> start looking inward for the cause.
>

Agreed. People have prejudices, you just have to work away at removing
misconceptions.

>
> For the record, speed actually isn't one of my major concerns.
>

Thanks for all the feedback. It's good to get so much after a long
time in cold turkey ;-)

Regards,

Vinay Sajip

Simon Willison

Sep 29, 2009, 1:07:17 PM9/29/09
to Django developers
On Sep 29, 9:36 am, Vinay Sajip <vinay_sa...@yahoo.co.uk> wrote:
> Simon, I'm the author of Python's logging package. Sorry for the delay
> in replying, I've been away from this list awhile. I think the "Java
> heritage shines through" is just FUD.

Hi Vinay,

Thanks for dropping by. Firstly, I'm sorry for disparaging your
work. I mis-spoke when I said the logging module wasn't "nicely