mod_cache + mod_proxy excessive RAM usage workarounds


rebelvideo

May 18, 2008, 3:05:18 AM
to mod_cache

mod_proxy causes a few problems on a relatively busy server. It buffers
content in RAM, so if you are loading a few large files in parallel,
which is to be expected in a normal environment, it quickly consumes
RAM until the process is silently killed, leaving no clues to the
problem.

I am looking for alternatives....

Would I be correct in assuming that if there are multiple concurrent
requests for the same file and it is not already cached, mod_cache
will issue multiple concurrent requests to mod_proxy rather than
check whether the file has already been requested?

I have tried to disable mod_proxy altogether and instead fetch the
file via mod_staticfile over NFS (just by setting the correct
document root).

Unfortunately it appears mod_cache will not cache content that is
fetched via mod_staticfile.
Perhaps that is because mod_staticfile does not produce any Expires
headers, or because mod_cache is also disabled when mod_proxy is not
enabled? Or perhaps mod_staticfile simply bypasses all loaded modules
and writes directly to the client, leaving mod_cache with nothing to cache.

Is there a way I can force mod_cache to cache a file returned by
mod_staticfile?

Under what conditions will mod_cache refuse to cache a file (apart
from Expires headers, etc.)?

I have tried to determine what path a file travels through mod_proxy,
the purpose being to change the memory allocation to mmapped files.
Unfortunately lighttpd appears to run on callbacks and it's not obvious
(to me) where I can intercept the memory allocation routines to
replace them.

Perhaps if I wrote my own custom backend for mod_proxy_core?

Hongyu

May 18, 2008, 3:43:49 AM
to mod_...@googlegroups.com
There is a way to work around the problem, although it isn't elegant:

You can run 'curl http://website.cache.server/what_we_want_to_save_files' to make mod_cache cache the files manually.

I call this method "PUSH CACHING" :)

Chris Hamono

May 18, 2008, 7:25:43 AM
to mod_...@googlegroups.com

Unfortunately I can't do that. We have literally terabytes worth of files and the cache would quickly overflow. The reason for setting up mod_cache is to cache only the files that are required, and I don't know ahead of time which files are going to be wanted.

I had thought of setting up a FastCGI process to fetch the file into the cache on the very first request, but the first person to request the file would get a failed response, and unless I set up some sort of memcache to track every request I would have to "push cache" every request even if they were duplicates.

I have also noticed that some files simply refuse to cache, and I am really not sure why. That's why I hoped you could specify under what conditions mod_cache won't cache.

If I enable logs on the backend server I see the same file requested over and over, each time with a 200 response code and a valid Content-Length (debugging output for an example is below).

I am not really looking for elegant or one-line fixes, I am well past that now. I don't mind coding up an entire module. It's just that I am really not sure where to start.

A nice flow diagram of the way lighttpd processes requests would really help, particularly something that shows the order in which modules are called and how lighty deals with multiple modules that register the same handlers.

Tomorrow I will set up a test server and try to trace the entire process. Enabling debugging output on the current server sends an impossible amount of data to the logs and tends to confuse more than help.


Chris

=============================================================
debugging output from the backend server showing what is sent to mod_proxy_core
=============================================================
response.c.226: (trace) -- splitting Request-URI
response.c.227: (trace) Request-URI  : /source/video.wmv
response.c.228: (trace) URI-scheme   : http
response.c.229: (trace) URI-authority: 65.171.60.17
response.c.230: (trace) URI-path     : /source/video.wmv
response.c.231: (trace) URI-query    : (null)
response.c.285: (trace) -- sanitizing URI
response.c.286: (trace) URI-path     : /source/video.wmv
mod_access.c.138: (trace) -- handling file in mod_access
response.c.402: (trace) -- before doc_root
response.c.403: (trace) Doc-Root     : /mnt/rawfiles
response.c.404: (trace) Rel-Path     : /source/video.wmv
response.c.405: (trace) Path         : (null)
response.c.458: (trace) -- after doc_root
response.c.459: (trace) Doc-Root     : /mnt/rawfiles
response.c.460: (trace) Rel-Path     : /source/video.wmv
response.c.461: (trace) Path         : /mnt/rawfiles/source/video.wmv
response.c.480: (trace) -- logical -> physical
response.c.481: (trace) Doc-Root     : /mnt/rawfiles
response.c.482: (trace) Rel-Path     : /source/video.wmv
response.c.483: (trace) Path         : /mnt/rawfiles/source/video.wmv
response.c.501: (trace) -- handling physical path
response.c.502: (trace) Path         : /mnt/rawfiles/source/video.wmv
response.c.510: (trace) -- file found
response.c.511: (trace) Path         : /mnt/rawfiles/source/video.wmv
response.c.663: (trace) -- handling subrequest
response.c.664: (trace) Path         : /mnt/rawfiles/source/video.wmv
mod_access.c.199: (trace) -- handling file in mod_access
mod_staticfile.c.327: (trace) -- checking file for static file
mod_staticfile.c.352: (trace) -- handling file as static file
plugin.c.386: (trace) -- plugins_call_...: plugin 'staticfile' returns 2
response.c.675: (trace) -- subrequest finished
2008-05-18 00:28:10: (response.c.137) Response-Header:
HTTP/1.1 200 OK
Content-Type: source/x-ms-wmv
ETag: "-625156571"
Accept-Ranges: bytes
Last-Modified: Sun, 18 May 2008 06:13:32 GMT
Content-Length: 23231024
Date: Sun, 18 May 2008 07:28:10 GMT
Server: lighttpd/1.5.0


Hongyu

May 18, 2008, 8:11:43 AM
to mod_...@googlegroups.com
Another module can set con->write_cache_file = 1 to tell mod_cache to cache the response.

It isn't a good idea to modify mod_staticfile to set con->write_cache_file, because mod_cache passes the request to mod_staticfile on a cache hit, which would cause a mod_cache<->mod_staticfile loop.

It's better to write another lighttpd module that checks con->use_cache_file and sets con->write_cache_file = 1 when needed.

Please refer to the modified mod_proxy.c for more detail.
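
As a rough illustration of the check described above, here is a minimal sketch in C. The `connection` struct is a hypothetical stand-in that models only the two flags named in this thread, and `mark_for_caching` is an invented helper name. It is not lighttpd's real plugin interface, so the modified mod_proxy.c remains the place to see how the flags are actually used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for lighttpd's connection struct; only the two
 * flags mentioned in this thread are modelled here. */
typedef struct {
    int use_cache_file;   /* assumed: set when mod_cache serves a cache hit */
    int write_cache_file; /* assumed: asks mod_cache to store the response */
} connection;

/* Hypothetical helper: returns 1 if the response was marked for caching,
 * 0 if it was left alone because it already comes from the cache. */
static int mark_for_caching(connection *con) {
    assert(con != NULL);
    if (con->use_cache_file) {
        /* Cache hit: marking it again would risk the
         * mod_cache<->mod_staticfile loop described above. */
        return 0;
    }
    con->write_cache_file = 1;
    return 1;
}
```

The point of keeping this in a separate module, as suggested, is that mod_staticfile itself never touches the write flag.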

Chris Hamono

May 18, 2008, 7:25:52 PM
to mod_...@googlegroups.com
Excellent

That's exactly the sort of information I am looking for.

I can validate the source: if it's within the cache I can ignore it, and if it's external to the cache I can set con->write_cache_file = 1.

Thank you so much for the pointer, it saves me a lot of research.  :)


Chris


Chris Hamono

May 20, 2008, 9:39:48 AM
to mod_...@googlegroups.com
I am hoping for a little more help if possible, Hongyu.

I have successfully modded the files to write the local file to cache. It was a fair bit more complex than simply setting con->write_cache_file = 1 :(

But I managed to get it done.

My next problem is that I had to change mod_cache_handle_response_filter to write to the temporary file. FILE_CHUNKs were being ignored, so even after managing to set con->write_cache_file = 1 successfully, nothing was being cached.

I created a simple read/write loop to copy the file from the source (NFS) to the destination (temp file). This works quite well, except that I am concerned that while caching many large files this write loop will cause lighttpd to stall and not serve any requests until the loop is complete.

I noticed that mod_cache_handle_response_filter is called multiple times when transferring large files, so I could write only a part of the file in each of these calls, allowing lighty to process other requests.
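
To make the "part of the file per call" idea concrete, here is a minimal sketch in plain POSIX C. The `copy_state` struct, `copy_step` function and `COPY_CHUNK` size are all hypothetical names chosen for illustration. Nothing here is lighttpd code, and how it would hook into mod_cache_handle_response_filter is left open.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

#define COPY_CHUNK 65536 /* assumed per-call budget, tune to taste */

/* Hypothetical per-request copy state kept between calls. */
typedef struct {
    int src_fd; /* source file, e.g. on NFS */
    int dst_fd; /* temporary cache file */
    int done;   /* becomes 1 once the source reaches EOF */
} copy_state;

/* Copies at most COPY_CHUNK bytes and returns, so the caller can go
 * back to serving other requests. Returns the number of bytes copied,
 * 0 once the copy is finished, or -1 on error. A production version
 * would probably also retry on EINTR. */
static ssize_t copy_step(copy_state *st) {
    char buf[COPY_CHUNK];
    ssize_t n, off = 0;

    assert(st != NULL);
    if (st->done) return 0;

    n = read(st->src_fd, buf, sizeof buf);
    if (n < 0) return -1;
    if (n == 0) {
        st->done = 1;
        return 0;
    }
    /* write() may be partial, so loop until this chunk is flushed. */
    while (off < n) {
        ssize_t w = write(st->dst_fd, buf + off, (size_t)(n - off));
        if (w < 0) return -1;
        off += w;
    }
    return n;
}
```

Each call does a bounded amount of work, so the stall per call is limited by the chunk size rather than by the file size.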

I have surmised that one of the problems with mod_proxy(core)/mod_cache is that they are used a lot on high-volume sites. I assume these sites, like ours, throttle outgoing data in an attempt to control bandwidth.

Because this throttling can make it take a lot longer to completely service a request, it also takes a lot longer before mod_cache gets the opportunity to save the file. Therefore, if 100 requests come in for the same file within this period, mod_proxy has to fetch 100 copies before the file is eventually cached.

I have altered mod_cache to rename the temporary file at the earliest possible time. I am hoping that this won't increase latency too much and may improve performance by increasing the cache hit rate. I am still concerned that copying the file in one go rather than in small chunks will delay the server, but I am hoping the trade-off will be worth it.
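
For readers following along, the "write to a temporary file, then rename" step can be sketched as below. This is an assumption about the general technique, not the actual change made to mod_cache. `publish_cache_file` is a hypothetical helper name, and it relies on the standard `rename()` call, which POSIX defines as atomic when both paths are on the same filesystem.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: make a temporary cache file visible under its
 * final name. Other requests see either no cache entry or the renamed
 * file, never a half-renamed state. Returns 0 on success, -1 on failure. */
static int publish_cache_file(const char *tmp_path, const char *final_path) {
    assert(tmp_path != NULL && final_path != NULL);
    return rename(tmp_path, final_path) == 0 ? 0 : -1;
}
```

The sooner this runs after the copy completes, the sooner later requests for the same file can become cache hits instead of fresh backend fetches.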


