Simon Willison wrote:
> 1. We'll be able to de-emphasise the current default "e-mail all
> errors to someone" behaviour, which doesn't scale at all well.
In a recent thread[1] on a similar topic Russel has also emphasized that
we should improve documentation about doing logging.
> 3. Logging in Django core rather than a 3rd party app will encourage
> reusable applications to log things in a predictable way, standard
> way.
Talking about predictable and standard way I want to be sure that we
don't break existing logging.
I.e. we at Yandex now have many reusable Django apps that don't setup
loggers themselves but just log things into named loggers and expect
them to be setup by a project that uses them.
What I gather from your proposal is that you want the same model ("an
app logs, a project setups") plus a nice declarative syntax in
settings.py instead of boring creation of handlers, formatters and
loggers. Right?
> - We could replace (or complement) django.connection.queries with a
> log of executed SQL. This would make the answer to the common question
> "how do I see what SQL is being executed" much more obvious.
In the thread that I was referring to[1] we kind of agreed on using a
signal there. Then hooking a logger onto the signal is simple.
> 5. People might leave logging on, then find their server disk has
> filled up with log files and caused their site to crash.
We had this problem with standard logging. Then we switched to a
RotatingFileHandler which wasn't very good however because its behavior
is simplistic and is not controllable by admins with an interface that
they know, namely logrotate. Setting up logrotate also wasn't without
problems. When it rotates a file it should let an app know about it and
it uses SIG_HUP for that. However this instantly terminates Django's
flup-based FastCGI server which we use.
Now we've settled on a WatchedFileHandler ported from Python 2.6 logging
module. It watches for file descriptor change and doesn't require
SIG_HUP to pick up a new file. May be we should port it to Django and
use it as a default handler for logging to file system.
> 6. Logging levels are confusing - what exactly is the difference
> between warn, info, error, debug, critical and fatal? We would need to
> document this and make decisions on which ones get used for what
> within the framework.
May be we can just leave it for users to decide. It depends so much on
how much an app actually wants from logging.
The only standard thing I can think of is to have DEBUG = True imply
level = logging.DEBUG (which includes everything more sever). DEBUG =
False will imply logging.INFO then. What do you think?
> # If you want custom log handlers - not sure how these would interact
> with
> # channels and log levels yet
> LOG_HANDLERS = (
> 'django.core.log.handlers.LogToDatabase',
> 'django.core.log.handlers.LogToEmail',
> )
This is a hard problem really. Most handlers require different set of
arguments. File names, email credentials, system idents for SysLog etc.
Also there should be different formatters. For example there's no point
to waste space in a syslog message on a timestamp since syslog tracks it
itself...
[1]:
http://groups.google.com/group/django-developers/browse_frm/thread/9d0992e800cf7d68#
No disagreement here with any of these assertions.
In the absence of specifics, this makes me a little bit nervous. The
Python logging interface may be very Java-heavy and complex, but it is
a thoroughly known quantity, and it houses a lot of features.
I've seen several attempts to wrap Java loggers in a "nicer"
interface, and every one of them ended up hobbling some of the power
features of the logger. There is also the issue of our wrapper playing
nicely with the loggers already being used in the wild.
I'm also not entirely convinced that the answer here isn't just
documentation. The documentation for log4j has historically been
pretty awful, and while Python's documentation is an improvement, it
could certainly be better IMHO. Good documentation for how to use
logging in the context of Django could go a long way.
> 3. We risk people using logging where signals would be more
> appropriate.
This may be a better way to approach the problem - more details below.
Details notwithstanding, I'm +1 to the idea of adding logging to the
core framework - or, at least, making it easier to use logs for
reporting internal state and error conditions instead of email).
As for likely roadblocks: I've been led to believe that Adrian has
objections to framework-level logging. I have no idea as to the nature
of his objection, but ticket #5415 indicates that he is (or has been,
historically) in favor of adding signals that could be used for
logging or debugging purposes.
Yours,
Russ Magee %-)
To clarify - I think that documentation is the very least we should
do. As your comments indicate, there are a lot of things you need to
do in order to get logging right, so we should at the very least
provide some documentation on how to do it right.
Yours,
Russ Magee %-)
> As for likely roadblocks: I've been led to believe that Adrian has
> objections to framework-level logging. I have no idea as to the nature
> of his objection, but ticket #5415 indicates that he is (or has been,
> historically) in favor of adding signals that could be used for
> logging or debugging purposes.
I'm in favor of the signals approach as part of the core framework. In
addition, django.contrib.logging, which could provide a generic
solution, good for the
80 or 90% case would really rock.
First off - let me reinforce that I'm in your camp here - I like
Python's logger, and I think we should be adding logging to Django.
Any hesitation I have expressed is mostly a function of institutional
inertia, especially with regards to Adrian's historical position on
logging.
However, I would point out that IMHO, FUD is an accurate description
of the state of play - though probably not in the way you probably
meant.
Python's logging api _looks_ a lot like log4j in parts. This is at
least partially because there's a limit to how many ways you can
express 'log.debug()' before you start to copy. However, as a result,
there's a lot of Fear, Uncertainty and Doubt as to whether a framework
that apparently has Java heritage is going to be any good in Python.
Don't forget that a lot of us (myself included) got into writing
Python to get away from the stupidities of the Java world. Those scars
are deep, and aren't going away in a hurry. Speaking personally, log4j
is responsible for a lot of those scars, due in no small part to the
abysmal documentation for that project.
>> I'm also not entirely convinced that the answer here isn't just
>> documentation. The documentation for log4j has historically been
>> pretty awful, and while Python's documentation is an improvement, it
>> could certainly be better IMHO. Good documentation for how to use
>> logging in the context of Django could go a long way.
>>
>
> I'm working with Doug Hellmann (PyMOTW) to try and improve the layout
> of the logging documentation in Python. I'm not asking for patches
> (though it would be nice), but if you can give *specific* criticisms
> (e.g. what you think is missing, or unclear) then that will focus our
> efforts.
My comment was actually directed at Django's documentation, which is
currently silent on the issue of logging - and probably shouldn't be.
However, since you're interested in feedback, my suggestion would be
to look at every defense you've made of logging in this thread (and
any other threads where you've had similar arguments), and work out
why the current docs have allowed those viewpoints to be established
as apparent fact. Some examples:
* Acknowledge that there is some Java heritage, but point out that
this doesn't mean it's a bad thing, and that there is a lot that
_isn't_ Java based about Python's logger.
* Highlight the important architectural picture. As you noted in
another reply - the logger and the handler are quite separate, and
this gives a lot of power. However, the existence and significance of
that architectural separation isn't really a major feature of the
current docs. At present, the architectural bits are buried inside API
discussion, but understanding this architecture is important if you're
going to understand why logging works the way it does, and understand
that logging isn't just putting lines into a file.
* Make the simple example actually simple. IMHO, a single-file simple
logging example is good for exactly 2 things:
- showing how to configure the simplest possible case of logging
- explaining the "why don't I have any output" problem.
Tasks like configuring the logger to use a rotating file handler are
important, but can wait for much later - once issues of basic usage
and architecture have been established.
* Better examples of how logging works in the real world. All the
examples focus on single file projects. Most of the complexities I've
had with logging stem from how to use it in a multiple-file project,
yet as far as I can make out, there is very little discussion of how
logging should be used in a real multiple-file project.
- Should I have one logger instance per module? One per conceptual "task"?
- You've used "logging.getLogger(__name__)" in this thread, but this
pattern isn't mentioned once in the docs. Is this best practice, or a
quick-and-dirty hack?
- When I have multiple loggers across multiple files, how do I
configure logging? Should I be putting logging.config.fileConfig() at
the start of every python file, or should I put the logging config
into a single python file somewhere that configures logging, and
import that module as needed?
>> Details notwithstanding, I'm +1 to the idea of adding logging to the
>> core framework - or, at least, making it easier to use logs for
>> reporting internal state and error conditions instead of email).
>>
>
> You can have your cake and eat it. It's perfectly feasible in Python
> logging to send only certain events to nominated email addresses (all
> configurable at run-time, so emails can be turned on/off, sent to
> different/additional destinations etc.) as well as e.g. logging
> tracebacks to file for the same events.
Agreed.
>> As for likely roadblocks: I've been led to believe that Adrian has
>> objections to framework-level logging. I have no idea as to the nature
>> of his objection, but ticket #5415 indicates that he is (or has been,
>> historically) in favor of adding signals that could be used for
>> logging or debugging purposes.
>>
>
> They (logging and signals) are two different things. Python logging
> allows you to consider the dimensions "What happened?", "Where did it
> happen?", "How important is it?" and "Who wants to know?"
> intelligently, and in particular it treats "What happened" and "Who
> wants to know?" orthogonally. You get a lot of ways of getting
> information to *people* whereas signals is more about letting *code*
> know what's going on.
Granted, although the two aren't mutually exclusive. After all, the
way you let someone know "what happened" is with code; It isn't hard
to think of a setup where we emit a signal everywhere that we might
want to log, and then attach a logging signal handler to those signal.
I'm not suggesting that this would be a good architecture - merely a
possible one.
> Unfortunately, a lot of people have got the impression that Python
> logging is "Java-like" and "not Pythonic" just because I acknowledged
> some good ideas in log4j. It's not as if Python people have a monopoly
> on good ideas, is it? This "Java heritage" perception sometimes leads
> to prejudice against logging, for no good reason that I can see.
An idea isn't bad just because it comes from the Java world, but in
fairness - the Java world does have a history of producing some pretty
dumb ideas. "Based on a Java API" isn't generally a complement in the
Python world :-)
I understand entirely the frustration of having a project perceived
"the wrong way" by the public - Django Evolution has been stuck with a
"magic" moniker for reasons that I can't begin to fathom. However, at
the end of the day, you can't blame the community for their
perceptions. One or two people might accidentally misunderstand
something, but when it starts happening systemically, you need to
start looking inward for the cause.
> Of
> course there might be grievances - for example, people complain about
> "slow". As a single logging call which just has a file handler is of
> the order of some tens of microseconds, I don't know how bad this is -
> what do we compare against? The Tornado webserver (used by FriendFeed)
> is a high-performance solution which uses Python logging. SQLAlchemy
> uses Python logging. They are careful to consider performance and as a
> consequence logging doesn't present a problem in practice.
For the record, speed actually isn't one of my major concerns.
Yours,
Russ Magee %-)
The hard part is that basicConfig only works like that back to Python
2.4 yet Django supports 2.3. When I added logging to Python-Markdown,
this was the hardest part. Figuring out how to configure logging so
that it works in 2.3 as well. The documentation is not exactly helpful
in that regard.
In fact, it was for this very reason that we added our own wrapper
around logging. It didn't seem reasonable for our users to go through
the same pain that we did. Sure we got a few things wrong at first,
but with the help of a few people in the community we worked those out
and our wrapper seems to work ok now. Yes - ok - I get the sense it
could be better.
Ever since then, any mention of logging leaves a bad taste in my
mouth. Perhaps if I was working only in 2.6 or such, this wouldn't be
an issue, but we have promised support back to 2.3.
Of course, it is possible that I'm missing something obvious.
--
----
\X/ /-\ `/ |_ /-\ |\|
Waylan Limberg
As luck would have it, you are :-)
Django 1.2 will drop formal support for Python 2.3.
Yours,
Russ Magee %-)
Thanks! Didn't know that. However see my further comment.
>> Now we've settled on a WatchedFileHandler ported from Python 2.6 logging
>> module. It watches for file descriptor change and doesn't require
>> SIG_HUP to pick up a new file. May be we should port it to Django and
>> use it as a default handler for logging to file system.
>
> Why "port it to Django"? Do you mean, copy it into Django? I'm not
> sure it should be the default - not everybody uses logrotate. I'd
> leave this sort of decision for code in settings.py.
Using WatchedFileHandler is a safe default because it works as
FileHandler, just doesn't break with logrotate. I don't know of any
disadvantages of WatchedFileHandler before the old FileHandler. So I
don't think there's much value in giving people this choice in settings
because non-default behavior will be rare (and still possible anyway).
One of the reasons why I propose Django's own settings structure for
logging is because we can choose better defaults for logging and have
more compact syntax for them. Standard Python logging configuration has
a noticable gap between very simplistic basicConfig which configures
only a root channel and a verbose imperative definition of handler
objects, formatter objects and logger objects. I've found that my usage
of logging inevitably falls in between: I often need a few logging
channels but I almost never, say, reuse handler objects between them.
Here's a variant of a simple config that I had in mind lately:
LOGGING = {
'errors': {
'handler': 'django.logging.FileLogger', # WatchedFileLogger copy
'filename': '...',
'level': 'debug',
},
'maintenance': {
'handler': 'logging.handlers.HTTPHandler',
'host': '...',
'url': '....',
'format': '....'
},
}
Top-level keys are logger names. Values are dicts describing handlers.
These dicts have several keys that Django knows about:
- 'handler': a handler class. It's imported like any other stringified
classes in settings
- 'level': a level keyword that is translated into logging.* constants.
This is done to not make users import logging by hand.
- 'format': a format string for the logging.Formatter object. We can
have a more sensible default for this than the one in Python logging. Or
not :-)
These keys are pop'd out of the dict and the rest is used as **kwargs to
the handler class instantiation.
Django's default setup may look like this:
LOGGING = {
'': {'handler': 'logging.StreamHandler'}
}
This has an advantage of always configuring a root logger to avoid an
infamous warning from Python logging when the logger doesn't have any
handlers defined. Users wanting to configure all the logging themselves
may null-out this using `LOGGING = {}`.
Forgot one thing. As it stands now Django has this "historic" behavior
when it imports and executes settings module twice. This results in
breakage when you setup loggers and handlers by hand. We now circumvent
this by having a helper function that memoizes configured loggers and
call it from settings.py. Having a declarative config we can hide this
inside of Django and not scare users.