I would like to log the search queries that users run in my app to a file instead of
a database. What would be the safest way to do that, considering the
website might be accessed ~100+ times simultaneously from different processes?
Do I set up some kind of global logger?
Do I lock the file, then unlock it? Will data be lost on requests that
are waiting for the file to be unlocked?
How does the Apache log do it? How is that different from using flock and
saving to one file from multiple processes?
Should I set up the process to save to a dated file such as
search_query_20091105.txt, or save to one file and rotate the file at
midnight?
Thanks,
Lucas
--
Automotive Recall Database - See if your vehicle has a recall
http://lucasmanual.com/recall
--
Setup CalendarServer for your company.
http://lucasmanual.com/mywiki/CalendarServer
Good question and one I have been meaning to do some more research on myself.
> Do I set up some kind of global logger?
> Do I lock the file, then unlock it? Will data be lost on requests that
> are waiting for the file to be unlocked?
>
> How does the Apache log do it? How is that different from using flock and
> saving to one file from multiple processes?
From what I have been able to tell from my prior looks at the Apache code,
it doesn't use locking. It seems to rely on the ability of the operating
system to allow multiple processes to write to a file without loss of
data, by opening the file with O_APPEND.
I have tried to do some searching for any information which says what
sort of guarantees exist in regard to multiple processes writing to the same
file when using O_APPEND, but can't remember what I found about it.
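
A minimal sketch of the O_APPEND approach, for illustration only. The function name append_line is hypothetical, and the sketch assumes a local filesystem and short records. With O_APPEND the kernel repositions to end-of-file before each write, but that alone does not rule out short writes or problems on network filesystems such as NFS, so treat this as a starting point rather than a guarantee.

```python
import os


def append_line(path, line):
    """Append one record with a single write() on an O_APPEND descriptor."""
    data = (line.rstrip("\n") + "\n").encode("utf-8")
    # O_APPEND asks the kernel to seek to end-of-file before every write.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        # Hand the whole record to the kernel in one call, so it is not
        # split into several chunks by buffering in this process.
        written = os.write(fd, data)
        if written != len(data):
            raise OSError("short write while appending to %s" % path)
    finally:
        os.close(fd)
```

Each request would call append_line(path, query) once per logged search; no flock is taken in this sketch.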
One thing, and I don't know to what degree it matters, is that for
Apache the error log file is only opened once in the Apache
parent process and that open file descriptor is inherited across the
fork of the processes handling requests.
In other words, it is not the case that each process opens the log
file of its own accord, as would be the case in your situation.
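
To make the distinction concrete, here is a rough sketch of the "open once in the parent, inherit across fork" pattern. It is not Apache's code; the function name log_from_children is hypothetical, and it assumes a Unix system where os.fork is available.

```python
import os


def log_from_children(path, messages):
    """Open the log once in the parent, then fork one child per message."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    pids = []
    for msg in messages:
        pid = os.fork()
        if pid == 0:
            # Child: write through the inherited descriptor; it never
            # opens the file itself.
            try:
                os.write(fd, (msg + "\n").encode("utf-8"))
            finally:
                os._exit(0)
        pids.append(pid)
    # Parent: wait for every child, then close its own copy.
    for pid in pids:
        os.waitpid(pid, 0)
    os.close(fd)
```

A WSGI application does not normally control the fork, which is why each process ends up opening the file itself.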
> Should I set up the process to save to a dated file such as
> search_query_20091105.txt, or save to one file and rotate the file at
> midnight?
Rotating log files is a bit more complicated. In Apache one would use
a piped logger, but that is very much dependent on the file descriptor
being opened in the parent process and inherited as you only want one
instance of the process into which you are logging.
One possibility you could entertain is having a separate logging
process, with each web application process creating a socket
connection to it. The logging process then combines all the streams on a
line by line basis or, if you use message boundaries on logged
messages, interleaves them as whole messages.
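
A small sketch of such a logging process, using Python's standard socketserver module. The names LineCollector, LineHandler and send_query are hypothetical, and TCP with newline-delimited records is an assumption, one of several ways to do it. Because only the collector writes to the file, rotation could also be handled there.

```python
import socket
import socketserver
import threading


class LineHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # rfile yields one line at a time, so records from different
        # clients are interleaved only at line boundaries.
        for raw in self.rfile:
            if not raw.endswith(b"\n"):
                raw += b"\n"
            with self.server.write_lock:
                # Reopened per record for simplicity; a real collector
                # would probably keep the file open.
                with open(self.server.log_path, "ab") as f:
                    f.write(raw)


class LineCollector(socketserver.ThreadingTCPServer):
    """Single logging process: one thread per client connection."""
    allow_reuse_address = True

    def __init__(self, address, log_path):
        super().__init__(address, LineHandler)
        self.log_path = log_path
        self.write_lock = threading.Lock()


def send_query(address, query):
    """Client side: send one search query as a single line."""
    line = query.replace("\n", " ").encode("utf-8") + b"\n"
    with socket.create_connection(address) as conn:
        conn.sendall(line)
```

The collector would be started on its own with something like LineCollector(("127.0.0.1", 9020), "search_query.txt").serve_forever(), where the port and file name are placeholders.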
If you aren't recycling daemon processes on a regular basis, you could
just create a separate log file for each process, named after
mod_wsgi.process_group and the process ID, i.e.:
import mod_wsgi, os
filename = "%s:%s" % (mod_wsgi.process_group, os.getpid())
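
Combining that with the dated file name from the question might look like the sketch below. The helper query_log_name and its naming scheme are hypothetical; under mod_wsgi one would pass mod_wsgi.process_group as the first argument. Underscores are used instead of ":" only to keep the name portable across filesystems.

```python
import datetime
import os


def query_log_name(process_group, day=None, pid=None):
    """Build a per-day, per-process log file name."""
    if day is None:
        day = datetime.date.today()
    if pid is None:
        pid = os.getpid()
    return "search_query_%s_%s_%d.txt" % (
        day.strftime("%Y%m%d"), process_group, pid)
```

Calling this on every write means a process moves to a new file at midnight without any explicit rotation step; the per-process files can be concatenated later.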
Graham