memcached vs. file-based caching


SimonT

Jan 23, 2009, 8:52:58 AM
to memcached
Hi - we are looking at the possibility of moving to memcached for
a high-volume website. At the moment, there is an existing file-based
cache that is used to serialise data, page fragments etc. used by
the site. There is in the region of 3 GB of data currently cached.
There seems to be a great deal of support for memcached across the
whole dev community, but having done a bit of reading, it seems that
there are arguments for a file-based solution over memcached in terms of
speed when the caching is done per node (as we do currently). For
example:

http://www.mysqlperformanceblog.com/2006/08/09/cache-performance-comparison/
http://www.rooftopsolutions.nl/article/107

My understanding is that memcached (or similar distributed cache)
really comes into its own when several web nodes share the same cache
cluster.

Anybody care to comment? In a high concurrency situation, does
memcached perform comparatively better? Are there any other factors we
should be considering?

Perrin Harkins

Jan 23, 2009, 9:27:20 AM
to memc...@googlegroups.com
On Fri, Jan 23, 2009 at 8:52 AM, SimonT <simonth...@hotmail.com> wrote:
> My understanding is that memcached (or similar distributed cache)
> really comes into its own when several web nodes share the same cache
> cluster.

That's correct. There is the overhead of the TCP connections. On a
local machine, something like BerkeleyDB or mmap'ed files will beat
memcached.
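
To make the "local file beats local memcached" point concrete, here is a minimal Python sketch of the kind of in-process, file-backed cache Perrin describes. It uses the stdlib dbm.dumb module purely as a stand-in for BerkeleyDB (real BerkeleyDB would need an external binding); the FileCache class and the keys are illustrative, not anything from the thread:

```python
import os
import pickle
import tempfile
import dbm.dumb  # stdlib stand-in for a BerkeleyDB-style file store


class FileCache:
    """Minimal local key/value cache backed by a dbm file."""

    def __init__(self, path):
        # "c" opens the database, creating the file if it does not exist
        self.db = dbm.dumb.open(path, "c")

    def set(self, key, value):
        self.db[key.encode()] = pickle.dumps(value)

    def get(self, key, default=None):
        try:
            return pickle.loads(self.db[key.encode()])
        except KeyError:
            return default


path = os.path.join(tempfile.mkdtemp(), "cache")
cache = FileCache(path)
cache.set("fragment:home", "<div>welcome</div>")
print(cache.get("fragment:home"))   # <div>welcome</div>
print(cache.get("missing", "n/a"))  # n/a
```

Every get here is a plain in-process call plus a file read (usually served from the OS page cache), with no socket round-trip.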

- Perrin

Jeremy Dunck

Jan 23, 2009, 10:21:10 AM
to memc...@googlegroups.com
On Fri, Jan 23, 2009 at 7:52 AM, SimonT <simonth...@hotmail.com> wrote:
> it seems that
> there are arguments for a file-based solution over memcached in terms of
> speed when the caching is done per node (as we do currently). For
> example:

File-based makes sense per-node if the majority of per-node cache is
not redundant.

In general, file-based makes sense if:
* memory is at a premium
* latency to other nodes is high
* shared access to specific keys is easily partitioned to nodes
* disk bandwidth dwarfs cache bandwidth

...


> Anybody care to comment? In a high concurrency situation, does
> memcached perform comparatively better? Are there any other factors we
> should be considering?

If you have a lot of writes, disk is going to bottleneck before memory/network.

Xaxo

Jan 25, 2009, 10:51:50 AM
to memcached


On Jan 23, 4:21 pm, Jeremy Dunck <jdu...@gmail.com> wrote:
> In general, file-based makes sense if:
>   * memory is at a premium
>   * latency to other nodes is high
>   * shared access to specific keys is easily partitioned to nodes
>   * disk bandwidth dwarfs cache bandwidth
......
> If you have a lot of writes, disk is going to bottleneck before memory/network.

file-based caching != disk-based caching

you can have your filesystem in memory (tmpfs), or you can map files on
disk into memory (mmap) if you want to keep your data across reboots.
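
The mmap half of that point can be sketched in a few lines of Python: bytes written through the mapping go to memory, but they are backed by a real file, so the data is still there after the process (or machine) restarts. The file path and payload are made up for illustration:

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "cache.bin")

# Pre-size the backing file: an mmap cannot be zero-length.
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)  # length 0 = map the whole file
    mm[0:5] = b"hello"             # write through memory, not write()
    mm.flush()
    mm.close()

# The bytes written through the mapping are visible in the file,
# so the cached data would survive a restart.
with open(path, "rb") as f:
    print(f.read(5))               # b'hello'
```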

Jeremy Dunck

Jan 25, 2009, 10:55:49 AM
to memc...@googlegroups.com
On Sun, Jan 25, 2009 at 9:51 AM, Xaxo <idi...@gmail.com> wrote:
> On Jan 23, 4:21 pm, Jeremy Dunck <jdu...@gmail.com> wrote:
>> In general, file-based makes sense if:
>> * memory is at a premium
>> * latency to other nodes is high
>> * shared access to specific keys is easily partitioned to nodes
>> * disk bandwidth dwarfs cache bandwidth
> ......
>> If you have a lot of writes, disk is going to bottleneck before memory/network.
>
> file-based caching != disk based caching
>

Well, OK, if you want to go that far, you could also use mogilefs or
HDFS or many other not-really-files approaches. If you're using
tmpfs, that suggests memory is not short -- and you still have to
serialize bits, so why not run a single memcached node locally?

Brian Moon

Jan 25, 2009, 11:01:19 AM
to memc...@googlegroups.com
> Well, OK, if you want to go that far, you could also use mogilefs or
> HDFS or many other not-really-files approaches. If you're using
> tmpfs, I guess that shows memory is not short-- you still have to
> serialize bits, so why not run a single memcached node local?

Or use an in-process memory caching system; PHP has several available,
for example. Not sure what platform you are using.
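
In Python terms (as a stand-in for whatever the poster's platform offers), an in-process cache can be as small as the stdlib functools.lru_cache; render_fragment here is a made-up expensive call, not anything from the thread:

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def render_fragment(page_id):
    # Hypothetical expensive render; after the first call the result
    # lives in this process, with no serialisation and no socket hop.
    return f"<div>page {page_id}</div>"


render_fragment(1)
render_fragment(1)                        # served from the cache
print(render_fragment.cache_info().hits)  # 1
```

The trade-off is the same one the thread keeps circling: in-process is fastest but per-process, so each web process warms and holds its own copy.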

--

Brian Moon
Senior Web Engineer
------------------------------
When you care enough to spend the very least.
http://dealnews.com/

Perrin Harkins

Jan 25, 2009, 1:05:18 PM
to memc...@googlegroups.com

No need to get into tmpfs. The OS will already cache as much of the
files in RAM as it can, and things like BerkeleyDB manage their own
shared RAM cache.

Running a single memcached node locally is significantly slower than
BerkeleyDB or mmap'ed files. The communication overhead just kills it
compared to in-process calls with an efficient system. The advantage
of memcached is in sharing between machines.
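
The communication overhead is easy to demonstrate without memcached at all. This is not a memcached benchmark, just a sketch of the per-request cost: an in-process dict lookup versus one loopback TCP round-trip per "get" against a toy echo server (the server, key, and request count are all invented for the illustration):

```python
import socket
import threading
import time

# Tiny echo server standing in for a local cache daemon: every "get"
# costs one request/response round-trip over loopback TCP.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)


def serve():
    conn, _ = srv.accept()
    while True:
        data = conn.recv(1024)
        if not data:
            break
        conn.sendall(data)
    conn.close()


threading.Thread(target=serve, daemon=True).start()
cli = socket.create_connection(srv.getsockname())

N = 1000
local = {"key": b"value"}

t0 = time.perf_counter()
for _ in range(N):
    _ = local["key"]        # in-process lookup
t_dict = time.perf_counter() - t0

t0 = time.perf_counter()
for _ in range(N):
    cli.sendall(b"key")
    _ = cli.recv(1024)      # wait for the reply, as a client would
t_sock = time.perf_counter() - t0

cli.close()
print(f"in-process: {t_dict:.6f}s  loopback TCP: {t_sock:.6f}s")
```

Even on loopback, with no real network, the socket version loses by a wide margin per operation; that gap is what in-process calls avoid, and what memcached only earns back once the cache is shared across machines.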

- Perrin
