Logging format decision

24 views
Skip to first unread message

Ivan Sagalaev

unread,
Jan 9, 2010, 3:45:02 PM1/9/10
to django-d...@googlegroups.com
Hello everyone!

There was a discussion here some time ago about logging in Django that
led to the ticket 12012 [1] where Simon has done most (all?) of the
implementation.

There is still a disagreement on the format of the configuration dict
for it. Simon has implemented it from my ad-hoc proposal [2]. Vinay
Sajip has an alternative suggestion [3]. I think a core-dev ruling is
due here.

To state my own opinion, I sill like my format better because it defines
each logging channel in one place without needing to separately define
formatters and handlers. I understand that it makes it less powerful but
I'd rather have a very simple format here -- we still can configure
logging imperatively in advanced cases.

I'll be happy to do some docs and tests for the ticket once it's decided
(as far as I can see Simon's branch lack them).

[1]: http://code.djangoproject.com/ticket/12012
[2]:
http://groups.google.com/group/django-developers/msg/7452fac3ae3b8175
(look for item 4)
[3]: http://mail.python.org/pipermail/python-dev/2009-October/092782.html

Russell Keith-Magee

unread,
Jan 13, 2010, 8:57:38 PM1/13/10
to django-d...@googlegroups.com
On Sun, Jan 10, 2010 at 4:45 AM, Ivan Sagalaev
<man...@softwaremaniacs.org> wrote:
> Hello everyone!
>
> There was a discussion here some time ago about logging in Django that led
> to the ticket 12012 [1] where Simon has done most (all?) of the
> implementation.
>
> There is still a disagreement on the format of the configuration dict for
> it. Simon has implemented it from my ad-hoc proposal [2]. Vinay Sajip has an
> alternative suggestion [3]. I think a core-dev ruling is due here.

Hi Ivan,

Sorry for taking so long to reply to this. Logging was on my todo list
for 1.2, but unfortunately, I didn't have enough time to drive the
logging proposal as well as the other features that I wanted to get
into 1.2.

Now that the alpha is out, the major feature freeze is in place, so
the opportunity for logging in 1.2 has passed. Although logging isn't
especially complicated in itself, we need to do it right. I don't want
to rush anything into trunk just so I can tick a box on a feature
list.

That said - I really do want to see logging in Django's trunk. If we
can sort out the details in the background and get a good
implementation ready, it could easily be the first feature committed
for v1.3.

> To state my own opinion, I sill like my format better because it defines
> each logging channel in one place without needing to separately define
> formatters and handlers. I understand that it makes it less powerful but I'd
> rather have a very simple format here -- we still can configure logging
> imperatively in advanced cases.

I like Vinay's suggestion for exactly the reason you dislike it. The
definition of a handler is independent of the loggers that use that
handler. Simon's example in (4) hides a lot of details behind ellipses
- if you expand those ellipses, you discover that there is a lot of
repeated typing of '/tmp/django-sql.log' etc if you want all your logs
to appear in the same file. Under Vinay's proposal, you define a
single file log handler, and instruct each of the loggers to use that
handler.

The other reason to like Vinay's format is that this is the format
he's trying to push into Python trunk. If Python is going to adopt a
configuration format, it makes sense that we should adhere to it. The
last thing we need is another NIH argument getting leveled at Django
by the wider Python community :-)

The other issue that I think still needs to be addressed is the
internal usage of logging by Django itself. The last time I looked at
it, Simon's branch provided the plumbing to support logging, but
didn't really make extensive use of that plumbing. In particular, the
logging framework should be able to supplement/replace anywhere that
mail_admins is currently being used, so that the current 'send an
email' error notification process can be configured to 'log the error
AND send an email' or 'just log the error'. If you want to help
advance this ticket, updating the patch to demonstrate how Django will
use logs internally will be an important step.

Yours,
Russ Magee %-)

Simon Willison

unread,
Jan 15, 2010, 8:47:28 AM1/15/10
to Django developers
On Jan 14, 1:57 am, Russell Keith-Magee <freakboy3...@gmail.com>
wrote:

> The other issue that I think still needs to be addressed is the
> internal usage of logging by Django itself.

My biggest hangup with the logging implementation was figuring out
exactly when the logging configuration code should actually run. The
problem is that Django's startup execution order is poorly defined -
stuff relating to settings is lazily evaluated the first time it's
needed, and there's no real documented place to put code that needs to
be executed once (the fact so many registration things end up being
called from urls.py is a good indicator that this is a problem).

This has put me off using signal handlers in my own code in the past,
and it also came up with the logging work.

I think Django needs a defined and documented execution order on
startup. I'm not entirely clear why we've avoided this in the past -
I've read http://www.b-list.org/weblog/2007/nov/05/server-startup/ but
it hasn't convinced me that a defined startup order for things like
registration patterns, signal handlers, logging configuration is a bad
thing.

Personally, I'd like to see this happen as part of the instantiation
of a single Django "site object" which handles all requests to the
server. We almost do this at the moment - if you look at the code in
http://code.djangoproject.com/browser/django/trunk/django/core/handlers/
each handler basically instantiates an object once which then has its
get_response() method called for every incoming request. Unfortunately
the time at which this object is instantiated seems to be pretty ad-
hoc - how and when it happens depends on if you're running under WSGI,
mod_python or the development server.

Defining Django's execution order is a pretty big job, but it would be
a great thing to achieve for 1.3. The obvious thing to tie it to would
be INSTALLED_APPS - it could be as simple as having an API promise
that the first thing Django does when it "starts" is to run through
each application in INSTALLED_APPS looking for and importing an
autoload.py module. I imagine we'll find that the time at which models
are registered is critical (what if autoload.py wants to use a model
that hasn't been loaded in to the AppCache yet?) and may need to do
more than one pass through INSTALLED_APPS. Plenty of details to figure
out.

Cheers,

Simon

Message has been deleted

Vinay Sajip

unread,
Jan 15, 2010, 10:08:01 AM1/15/10
to Django developers
On Jan 15, 1:47 pm, Simon Willison <si...@simonwillison.net> wrote:
> My biggest hangup with the logging implementation was figuring out
> exactly when the logging configuration code should actually run. The
> problem is that Django's startup execution order is poorly defined -
> stuff relating to settings is lazily evaluated the first time it's
> needed, and there's no real documented place to put code that needs to
> be executed once (the fact so many registration things end up being
> called from urls.py is a good indicator that this is a problem).
>
> This has put me off using signal handlers in my own code in the past,
> and it also came up with the logging work.
>
> I think Django needs a defined and documented execution order on
> startup. I'm not entirely clear why we've avoided this in the past -
> I've readhttp://www.b-list.org/weblog/2007/nov/05/server-startup/but
> it hasn't convinced me that a defined startup order for things like
> registration patterns, signal handlers, logging configuration is a bad
> thing.
>
> Personally, I'd like to see this happen as part of the instantiation
> of a single Django "site object" which handles all requests to the
> server. We almost do this at the moment - if you look at the code inhttp://code.djangoproject.com/browser/django/trunk/django/core/handlers/

> each handler basically instantiates an object once which then has its
> get_response() method called for every incoming request. Unfortunately
> the time at which this object is instantiated seems to be pretty ad-
> hoc - how and when it happens depends on if you're running under WSGI,
> mod_python or the development server.
>
> Defining Django's execution order is a pretty big job, but it would be
> a great thing to achieve for 1.3. The obvious thing to tie it to would
> be INSTALLED_APPS - it could be as simple as having an API promise
> that the first thing Django does when it "starts" is to run through
> each application in INSTALLED_APPS looking for and importing an
> autoload.py module. I imagine we'll find that the time at which models
> are registered is critical (what if autoload.py wants to use a model
> that hasn't been loaded in to the AppCache yet?) and may need to do
> more than one pass through INSTALLED_APPS. Plenty of details to figure
> out.

You make many valid points, Simon, but at least as far as the narrower
goal of setting up logging is concerned, I had a look at this back in
October and came up with a candidate solution which worked for me. I'd
be grateful if you took a look at it and highlight any drawbacks of
the approach. The one point we know must be reached before any request
handling takes place is the reading of settings.py, so I based my
approach on that. From what I can see, the approach I've outlined is
not just suitable for setting up logging but also doing model-related
stuff such as listening for e.g. class_prepared signals etc. from the
app loading machinery.

Here's a link to the October post:

http://groups.google.com/group/django-developers/msg/90b81d9f927d0bf0

Regards,

Vinay Sajip

Vinay Sajip

unread,
Jan 15, 2010, 7:17:50 PM1/15/10
to Django developers
Further to my earlier post - I've put the whole branch on Launchpad:

https://code.launchpad.net/~vinay-sajip/django/logging

There's nothing much there at the moment, apart from some code as a
proof of concept. Tested with the following code tacked on at the end
of a vanilla settings.py:

import logging

def init_logging():
#Configure root to write to file
logging.basicConfig(level=logging.DEBUG, filename='logtest.log',
filemode='w', format='%(asctime)s %
(levelname)-8s %(name)-13s %(message)s')
alogger = logging.getLogger("logtest")
alogger.setLevel(logging.INFO)
logger = logging.getLogger("logtest.foo")
logger.setLevel(logging.WARNING)
logger = logging.getLogger("logtest.bar")
logger.setLevel(logging.ERROR)
#Call dummy function in django.utils.log
from django.utils.log import configure_from_dict
configure_from_dict({'version' : '0'})
alogger.info("Called configure_from_dict (just to show we can)")

def test_logging():
from random import choice
levels = (logging.DEBUG, logging.INFO, logging.WARNING,
logging.ERROR, logging.CRITICAL)
loggers = ('', 'logtest', 'logtest.foo', 'logtest.bar')
for i in xrange(1000):
level = choice(levels)
logger = logging.getLogger(choice(loggers))
logger.log(level, "Message #%d", i + 1)


def listener(sender, *args, **kwargs):
logger = logging.getLogger("logtest")
logger.info("class_prepared listener called: %s", sender.__name__)

def pre_model_callback():
from django.db.models.signals import class_prepared
logger = logging.getLogger("logtest")
logger.info("Adding class_prepared listener ...")
class_prepared.connect(listener)

def post_model_callback():
from django.contrib.auth.models import User
logger = logging.getLogger("logtest")
try:
umsg = "%(username)s (%(first_name)s %(last_name)s)"
users = ", ".join([umsg % u.__dict__ for u in User.objects.all
()])
logger.info("ORM works: all users: %s" % users)
except:
logger.exception("Unable to get all users")

BOOTSTRAP_CALLBACKS = (
init_logging,
#test_logging,
)

PRE_MODEL_CALLBACKS = (
pre_model_callback,
)

POST_MODEL_CALLBACKS = (
post_model_callback,
)

Ivan Sagalaev

unread,
Jan 16, 2010, 4:17:30 PM1/16/10
to django-d...@googlegroups.com
Russell Keith-Magee wrote:
> That said - I really do want to see logging in Django's trunk. If we
> can sort out the details in the background and get a good
> implementation ready, it could easily be the first feature committed
> for v1.3.

Agreed!

> I like Vinay's suggestion for exactly the reason you dislike it. The
> definition of a handler is independent of the loggers that use that
> handler. Simon's example in (4) hides a lot of details behind ellipses
> - if you expand those ellipses, you discover that there is a lot of
> repeated typing of '/tmp/django-sql.log' etc if you want all your logs
> to appear in the same file. Under Vinay's proposal, you define a
> single file log handler, and instruct each of the loggers to use that
> handler.

Actually, I don't see this problem of repetition that you're describing
so I thought that we could use some examples closer to real world to
sort this out. May be I just don't see something obvious to you or have
misunderstood Vinay's schema. I will refer to two formats we're
discussing as "explicit handlers" (Vinay's) and "implicit handlers"
(what Simon has implemented). Better suggestions for naming are welcome :-).

1. The first use-case is the development environment where all you want
is everything logged in one file with DEBUG level.

"Implicit handlers":

LOGGING = {
'': {
'handler': 'django.logging.FileLogger',
'filename': '/home/user/myproject/all.log',
'level': 'debug',
},
}

"Explicit handlers":

LOGGING = {
'formatters': {
'all': '%(asctime)s %(name)-15s %(levelname)-8s %(message)s',
},
'handlers': {
'all': {
'class': 'django.logging.FileLogger',
'filename': '/home/user/myproject/all.log',
'formatter': 'all',
'level': 'debug',
},
},
'loggers': {
'': {
'handlers': ['all'],
},
},
}

2. A production environment. This actually what we have for one of our
services:

- a file for uncaught server errors that are logged by a function
listening to `got_request_exception` signal which uses 'exception' label
to log
- a file for maintenance messages that come from cron scripts that all
use 'maintenance.*' labels to log
- a file for everything else

"Implicit handlers":

_log_path = '/var/log/service/'

LOGGING = {
'exception': {
'handler': 'django.logging.FileLogger',
'filename': _log_path + 'exception.log',
'format': '%(asctime)s %(levelname)-8s %(message)s',
'level': 'warning',
'propagate': False,
},
'maintenance': {
'handler': 'django.logging.FileLogger',
'filename': _log_path + 'maintenance.log',
'format': '%(asctime)s %(name)-15s %(levelname)-8s
%(message)s',
'level': 'info',
'propagate': False,
},
'': {
'handler': 'django.logging.FileLogger',
'filename': _log_path + 'info.log',
'format': '%(asctime)s %(name)-15s %(levelname)-8s
%(message)s',
'level': 'info',
},
}

(Note: "propagate" is not implemented right now in Simon's code but is
quite easy to add).

"Explicit handlers":

_log_path = '/var/log/service/'

LOGGING = {
'formatters': {
'exception': '%(asctime)s %(levelname)-8s %(message)s',
'default': '%(asctime)s %(name)-15s %(levelname)-8s
%(message)s',
},
'handlers': {
'exception': {
'class': 'django.logging.FileLogger',
'filename': _log_path + 'exception.log',
'formatter': 'exception',
'level': 'warning',
},
'maintenance': {
'class': 'django.logging.FileLogger',
'filename': _log_path + 'maintenance.log',
'formatter': 'default',
'level': 'info',
},
'info': {
'class': 'django.logging.FileLogger',
'filename': _log_path + 'info.log',
'formatter': 'default',
'level': 'info',
}
},
'loggers': {
'exception': {'handlers': ['exception'], 'propagate': False},
'maintenance': {'handlers': ['maintenance'], 'propagate':
False},
'': {'handlers': ['info']},
},
}

In both examples "implicit handlers" give shorter code and, to me, are
more readable.

The main point that I'm trying to argue is that having handlers and
loggers *separate* is not needed in real world scenarios. If you want
your logging in several files you will have to define several handlers
and several loggers anyway. "Implicit handlers" format accept this fact
and couples both handler and logger definitions in one place.

The only thing that "explicit handlers" really buys you in a second
example is not having to repeat a default formatter string twice. And
this can easily be done for "implicit handlers" by sticking it to a temp
var (I would do it if I had more than just two repetitions:

_default_format = '...'

LOGGING = {
...
'format': _default_format,
...
'format': _default_format,
}

I perfectly understand that "implicit handlers" are less flexible. You
can't have different log levels for a logger and a handler for example.
Or you can't have one shared handler for two totally unrelated loggers
from your different apps. Or you can't have two handlers for one logger.
But the fact is that I, personally, never needed this flexibility in
practice. And we still allow people to do all these things anyway by
configuring logging directly.

> The other reason to like Vinay's format is that this is the format
> he's trying to push into Python trunk. If Python is going to adopt a
> configuration format, it makes sense that we should adhere to it. The
> last thing we need is another NIH argument getting leveled at Django
> by the wider Python community :-)

I understand this point. This is why I'm going to argue for "implicit
handlers" just to the point when I see that I made myself clear, not
until everyone agrees with me :-). May be consistency here is more
important than convenience.

However, may be we can work out a format that suit both parties. What I
want to pursue here is "simple thing should be simple, hard things
possible". Here's another ad-hoc attempt at having both things together:

LOGGING = {
'_handlers': {
'maintenance': { ... },
}
'app1': {'handlers': ['maintenance']},
'app2': {'handlers': ['maintenance']},
'exception': {
'handler': 'django.logging.FileLogger',
'filename': '/var/log/service/exception.log',
'format': '%(asctime)s %(levelname)-8s %(message)s',
'level': 'warning',
'propagate': False,
},
}

This is a variant of "implicit handlers" but with a special key
'_handlelrs' that defines named handlers. The same can be done for
'_formatters' etc. Parsing code is not incredibly complicated:

- find all named handlers and formatters, remove these parts from the dict
- treat everything else as loggers
- for each logger, check if its 'handler' or 'handlers' refer to a named
handler, use it; otherwise treat 'handler' as a class string
- same for formatters

What do you guys think?

Vinay Sajip

unread,
Jan 17, 2010, 4:58:58 AM1/17/10
to Django developers
On Jan 16, 9:17 pm, Ivan Sagalaev <man...@softwaremaniacs.org> wrote:
> The main point that I'm trying to argue is that having handlers and
> loggers *separate* is not needed in real world scenarios. If you want
> your logging in several files you will have to define several handlers
> and several loggers anyway. "Implicit handlers" format accept this fact
> and couples both handler and logger definitions in one place.

I don't agree with your statement that "having handlers and loggers
separate is not needed in real world scenarios". Maybe in your
experience you haven't needed it, but I certainly have on more than
one occasion. What I've found is that everyone's logging needs are
different, and the rationale for having handlers and loggers as
separate concepts is described in

http://plumberjack.blogspot.com/2009/09/python-logging-101.html

> I understand this point. This is why I'm going to argue for "implicit
> handlers" just to the point when I see that I made myself clear, not
> until everyone agrees with me :-). May be consistency here is more
> important than convenience.
>
> However, may be we can work out a format that suit both parties. What I

Well, I believe I've already covered this in PEP 391:

http://www.python.org/dev/peps/pep-0391/#user-defined-objects

You can instantiate stuff either using the default schema or use your
own factories to construct the logging objects you need. I would say,
if a simplifying layer is needed, it can be built on top of the
proposed PEP 391 changes, and perhaps some usage patterns would become
apparent over time to be so common as to justify putting in the core.
Although the PEP refers to "user-defined objects", you can use a
factory to instantiate any logging object, including instances of
existing logging classes.

And, don't forget the Zen of Python - "Explicit is better than
implicit" ;-)

Regards,

Vinay Sajip

Russell Keith-Magee

unread,
Jan 17, 2010, 5:40:31 AM1/17/10
to django-d...@googlegroups.com

I can't argue that your version is shorter - but the terseness is only
possible because of the example that you have chosen (and the features
you've chosen to not exploit).

You've got a 1-1 correspondence between loggers and handlers, which
misses the point that there can be multiple handlers per logger (e.g.,
log all critical messages to a file, AND email the sysadmin), and/or
multiple loggers per handler (e.g., errors from the view subsystem and
errors from the http subsystem should both be written to 'django.log')

You've also defined a production server that uses just two loggers -
exception and info - for the entire server. Again, this isn't
especially realistic - in practice, you have a large number of fine
grained loggers for different parts of the system. Vinay's reference
on Logging 101 gives a good coverage of how logging should be used.

>> The other reason to like Vinay's format is that this is the format
>> he's trying to push into Python trunk. If Python is going to adopt a
>> configuration format, it makes sense that we should adhere to it. The
>> last thing we need is another NIH argument getting leveled at Django
>> by the wider Python community :-)
>
> I understand this point. This is why I'm going to argue for "implicit
> handlers" just to the point when I see that I made myself clear, not until
> everyone agrees with me :-). May be consistency here is more important than
> convenience.

I think I understand what you're advocating - I just think you've
missed some very important use cases. If you restrict loggers to the
subset of possible usage that you show in your demos, then yes, your
format is sufficient and (arguably) preferable. However, if you
restricted yourself to that subset, you'd be missing a lot of the
benefits that Python logging module has to offer.

Yours,
Russ Magee %-)

Ivan Sagalaev

unread,
Jan 17, 2010, 9:49:51 AM1/17/10
to django-d...@googlegroups.com
Russell Keith-Magee wrote:
> You've also defined a production server that uses just two loggers -
> exception and info - for the entire server. Again, this isn't
> especially realistic - in practice, you have a large number of fine
> grained loggers for different parts of the system.

In fact, most of our production services use just two loggers: one for
exceptions and one for everything else. So that one with maintenance was
the most interesting :-). But see below...

> I think I understand what you're advocating - I just think you've
> missed some very important use cases.

And I do understand what you and Vinay are saying. I really don't miss
those use-cases. I intentionally try to avoid them in practice to not
complicate things.

But since you both feel that they are indeed important so be it, let's
make it this way.

Reply all
Reply to author
Forward
0 new messages