On Sat, Feb 23, 2008 at 7:57 PM, David Pollak <feeder.of...@gmail.com> wrote:
Folks,
Over the last week, I've been diving into the world of memcached and how it helps high traffic sites scale.
For those unfamiliar with memcached, it's a mechanism for caching "stuff". It's widely used at sites like Facebook to store sessions, partially rendered pages, etc. memcached is a process that listens on a port for particular requests (add, set, get, incr, remove) and performs these operations on name/value pairs. Names are up to 256 bytes. Values can be any arbitrary set of bytes. Memcached is easily distributed (the clients select the memcached server based on a hash of the key) and supports failover (if the primary memcached server for a given hash is not available, secondaries will be consulted.)
There are clients to memcached written for most languages (Perl, Python, Ruby, PHP, Java, .Net, etc.)
The C implementation of memcached is wicked fast and wicked stable. There's a persistent version that stores records in a Berkeley DB backend which is about 10 times slower than the C in-memory version (which is in and of itself wicked impressive.) There's a Java implementation of memcached that claims to be 50% as fast as the C version (pretty impressive).
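For concreteness, here is a sketch of the text protocol's framing as described above. The object and method names are my own, for illustration only, not from any real client library:

```scala
// Illustrative sketch of the memcached text protocol framing.
// Names (MemcachedWire, buildSet, ...) are made up for this example.
object MemcachedWire {
  // "set <key> <flags> <exptime> <bytes>\r\n<data block>\r\n"
  def buildSet(key: String, value: String, exptime: Int = 0): String =
    s"set $key 0 $exptime ${value.getBytes("UTF-8").length}\r\n$value\r\n"

  // "get <key>\r\n" -- the server answers with VALUE lines followed by "END"
  def buildGet(key: String): String = s"get $key\r\n"

  // Parse a "VALUE <key> <flags> <bytes>" response header line
  def parseValueHeader(line: String): Option[(String, Int)] =
    line.trim.split(" ") match {
      case Array("VALUE", key, _, bytes) => Some((key, bytes.toInt))
      case _                             => None
    }
}
```

This framing simplicity is a big part of why clients exist in so many languages.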
Steve Yen and I were chatting a while back about how memcached is stupid. It's a byte-store, but it's subject to a lot of problems... basically, knowing when to dirty a cache item is tough. Cache overwrites are a problem (2 or 3 processes generating the same results to cache, meaning 2 or 3 simultaneous database accesses for the same data.)
The semantics of memcached are similar to those of REST and if the key is meaningful and can be parsed into a question, putting smarts (as well as the persistence that memcachedb does) behind the memcached wire protocol might lead to a new way of looking at distributed, scalable applications.
Put another way, if the memcached wire protocol is the front for an Actor-like mesh of computing, then there's a powerful abstraction that is unfrightening to the world of web developers who use memcached on a daily basis. It would also mean that one can migrate a web application from PHP, Rails, etc. by moving the logic out of the web code and into this smart memcached thingy. Then, migrate the front end to lift.
I'm thinking that there might be a very exciting project to add this memcached stuff to lift. Would anyone out there be interested in spending time on such a project with me? Would anyone out there see a use of such a thing?
Hmmm, so memcached is like an equivalent to a DHT (Distributed Hash Table)?
I know that I've heard a lot of buzz about Prevayler ( http://en.wikipedia.org/wiki/Prevayler )
Is there a great enough need to go the "CORBA way" to have language/platform independence, or would something like Prevayler work? Or Terracotta?
What are the needs? What are the pros and cons of each of the listed offerings?
As memory storage gets larger and larger, RDBMSes (which are designed for systems with "little" main memory and large disks) become less and less optimal.
It's an interesting topic indeed.
Cheers,
-V
Thanks,
David
--
lift, the secure, simple, powerful web framework http://liftweb.net
Collaborative Task Management http://much4.us
--
_____________________________________
/ \
/lift/ committer (www.liftweb.net)
SGS member (Scala Group Sweden)
SEJUG member (Swedish Java User Group)
\_____________________________________/
I was hoping that Terracotta and Actors would provide a solution. This avenue has become less attractive for two reasons: the performance of Terracotta and Actors has not materialized (the current max is 3000 messages per second and I need to see a mesh with at least 1M messages per second) and matching a wire protocol with something that's very common and has excellent client support has significant advantages.
I wouldn't call it a CORBA way. There's no IDL file. The memcached
wire protocol is extremely simple, I believe. That's why it's easy to
implement a client in any language.
> Has the bottleneck been identified? (Is it Terracotta that is the
> bottleneck, or is it the Actors library?)
> I agree that using already present standards is a good idea.
> Do you happen to know how memcached behaves when it comes to versioning?
There is no versioning in memcached. It's just a hash table.
We're using it for our internal deployment also.
Blair
--
Blair Zajac, Ph.D.
CTO, OrcaWare Technologies
<bl...@orcaware.com>
Subversion training, consulting and support
http://www.orcaware.com/svn/
> There is no versioning in memcached. It's just a hash table.
Okay, so it's opaque-last-commit-wins.
> Cool! Got any tips, tricks or wisdom to share?
No, not yet. We're still developing our app.
The only trick I heard is that for boxes hosting application servers
which are CPU intensive you can put memcached on them also, since
memcached is memory intensive but not CPU intensive. So we have 10
application servers and each Java process gets 2 Gigs of RAM, leaving
another 2 free. So if we run a memcached process on each one and give
it a gig, then we have 10 gigs of distributed memory.
Blair
> That's neat :)
> How fail-safe is it? Is it replicated in some specific manner?
No, it's just an LRU cache in front of a much slower backend database that
has immutable data in it. So I can safely cache data in memcached. It
should be very fast.
Blair
> Nice. Is there a possibility to use LFU or any other algorithm?
I don't know for certain. But this page is worth checking out:
http://semanticvoid.com/pages/memcached.html
Regards,
Blair
Why not just plain REST? ETags and Last-Modified give you pretty good
cache semantics. I'll go ahead and say up front that I think memcached
is overused and overhyped.
Steve
Memcached is not a DHT, it's just a remote hashtable. Most people who
use it do the distributed part on top of the memcached client they
use. So it's their application that's the DHT, not memcached.
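Steve's point above — that the application layer does the "distributed" part by hashing keys onto servers — might be sketched like this (the server list and object name are made up for illustration):

```scala
// Sketch of client-side key distribution: the application, not memcached,
// picks which server holds a key by hashing the key. Server addresses
// here are illustrative.
object KeyRouter {
  val servers = Vector("cache1:11211", "cache2:11211", "cache3:11211")

  def serverFor(key: String): String = {
    val n = servers.length
    // Normalize hashCode into 0 until n (hashCode may be negative)
    val i = ((key.hashCode % n) + n) % n
    servers(i)
  }
}
```

Every client that uses the same hashing scheme routes a given key to the same server, which is all the "DHT-ness" most deployments have.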
Prevayler is built on the concept that "most applications will never
exceed the amount of RAM they can buy". I worked on one that did, and
I'm pretty much only interested in working on ones that will, since
those are the ones that are popular enough to make serious money.
Remember, even if you can buy a machine with enough memory, does it
have enough IO bandwidth and CPU cycles to compute all the useful
things you need with that data? If not, how many replicas of that do
you need to handle your peak traffic requirements?
You know what I would like to see? A simplified BigTable clone that
has the same scaling properties as BigTable itself (unlike many of the
current clones).
Steve
On Sat, Feb 23, 2008 at 4:50 PM, David Pollak <d...@athena.com> wrote:
> Because HTTP is heavier weight than the memcached protocol, HTTP keep-alive
> is harder to implement (especially in languages with crappy threading and no
> real concept of "global") than keeping the socket open to memcached,
I'll just throw this out there. Clearly your idea is predicated on
using memcached protocol but somebody else might find this useful:
There are other good protocols like BEEP that have multi-language
support and offer more semantics than get, put, get_multi, and aren't
nearly as heavy-weight as HTTP when it comes to persistent
connections. BEEP isn't so much a protocol as a toolkit for building
your own protocol. It has an RFC and clients in Java and C, and was
written by Marshall Rose.
> memcached has client-managed timeout semantics, and there are a ton of
> applications that use memcached already and being able to gently migrate
> them to a Scala/lift backend is easier if you say, "you're already using
> memcached, just use this as a server and it will not only cache, but in some
> cases, compute your answer."
Here's kind of what you're saying to them:
"Here's a memcached server that's way slower but if you write your
code in Scala, it has nicer caching."
Am I essentially right? That might seem harsh but I'm trying to think
of potential reactions to this experiment. The hip web 2.0 crowd is a
cynical one. ;-)
> More generally, it's hard to write fault tolerant applications in Rails
> (and most other web frameworks), but it's easy in Erlang because in Erlang,
> failures are assumed and the "alternative in case of timeout" is almost
> always coded in. memcached is a nice place to put the "buffer layer."
>
> Let's just for a minute assign a name to this thing... let's call it
> smartcached.
>
> Imagine for a moment that smartcached is based on Scala Actors. You get a
> request for a cached item... the item is either not in cache or marked
> dirty. You send a message to a computational unit to build a new cache
> entry.
Where is this computational unit?
> If the computation doesn't come back in a certain period of time,
> you either return the old value (if that's a legal semantic for the data
> type) or you return a default value. This means that the web front end
> always gets an answer. That's a huge win, especially in a usage spike
> because you want answers to go back so web requests don't get piled up. It
> also means that people don't have the perception that a given service is
> down (serving 2 or 3 minute old pages is better than serving 500s.)
What if the client didn't have an old value yet and a default value
isn't good enough? That's a 500.
If your cache is replicated across actors then you have a better
chance of serving a cached item even if stale which seems to be the
property you want.
> The next win with smartcached is that you can serialize item building.
> That means that if you have multiple requests for an item not in cache, you
> don't have multiple machines building the same cache item and contending for
> resources (one might argue that those resources are in memory on the
> database after the first request, but still, having the database do n times
> the work, especially in load situations is not a good thing.) So, you wind
> up with the ability to serialize the building of an item. This is a win.
So you have to build the item with Scala or smartcached calls back
into your application to have it build the item and cache the result?
I think this relates to my question of where the computational unit
is.
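The serialization of item building David describes is essentially request coalescing: concurrent requests for the same missing key share one computation. A sketch, with illustrative names:

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

// Sketch of "serialize the building of an item": while one computation for
// a key is in flight, later requests for that key join it instead of each
// hitting the database.
class SingleFlight[K, V] {
  private val inFlight = new ConcurrentHashMap[K, Future[V]]()

  def build(key: K)(compute: () => V): Future[V] = {
    // computeIfAbsent guarantees at most one Future is created per key at a time
    val f = inFlight.computeIfAbsent(key, (_: K) => Future(compute()))
    f.onComplete(_ => inFlight.remove(key, f))
    f
  }
}
```

Whether `compute` runs in smartcached itself or calls back into the application is exactly Steve's open question about where the computational unit lives.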
> But you also wind up with the ability to throttle back the computation of
> items. When you have the computation of cache items distributed across n
> processes on m machines, there's no real way to "take the temperature" of
> the whole system (is the average database response time > 100% of normal, is
> the queue length in the message queue more than n items or is the items
> processed per second < 50% of normal) and throttle back the number of
> simultaneous calculations so that the system has a chance to right itself.
> If the calculation is centralized *and* there are semantics built into
> work allocator of the centralized calculator for doing this throttling, it
> just becomes part of the infrastructure rather than something that someone
> has to actively think about for each query.
You can achieve this throttling with a master node that tracks the
health of the system. Clients (meaning cache actors in this case) send
messages to the master informing it of their status. You can certainly
"take the temperature" of a distributed system, it's just not as easy
as with a centralized system.
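The "take the temperature" decision itself is simple once the reports arrive at a master; a sketch (the 2x threshold is an illustrative choice, not from the thread):

```scala
// Sketch of the master-node health check: cache nodes report recent request
// latencies, and the master throttles new computations when the average
// drifts too far past normal. The 2x cutoff is an assumption for this example.
object Temperature {
  def shouldThrottle(reportedMillis: Seq[Double], normalMillis: Double): Boolean =
    reportedMillis.nonEmpty &&
      (reportedMillis.sum / reportedMillis.size) > 2 * normalMillis
}
```

The hard distributed-systems part is collecting those reports reliably, not the arithmetic.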
> So... coming full circle, what does all this stuff have to do with
> memcached and why not just do it on top of Jetty and REST?
>
> Doing internal infrastructure on REST means actively changing up all the
> places that you're calling memcached. That's lots of work. Doing stuff
> over HTTP and REST could mean the perception of SOA and the associated
> Technicolor Yawn from hip web 2.0 people that rage against the IBM mandated
> machine, dude. :-)
The lift motto seems to be: build great things for smart people.
That's why it's not simply Rails in Scala. Why move away from that?
Building things for sheep is Not Satisfying in my experience.
> Plus, if we get it right on the memcached ABI side, there's no problems
> generalizing the stuff to REST.
That's true.
I like that you're taking an incremental approach to this. Maybe this
is just me missing BigTable but I think that at some point, smarter
caches become more work than building better databases that can simply
be fed more machines as load increases. Then again, maybe your smarter
cache eventually just becomes the database.
Steve
Now let's add AMQP (https://www.amqp.org/) and, for example, RabbitMQ (http://www.rabbitmq.com/) to this equation. We get a nice setup for gracefully degrading data serving, using cached data as a high-load fallback.
Example:
A cache lookup hits memcached (or cacherl). We configure a specified
timeout before the data gets delivered from the cache. Using AMQP
messaging, we put a message requesting a recalculation of the cache
entry into an AMQP message queue. Now we can have different
clients (using Scala, Erlang, C++, ...) capable of calculating the data
listening on that queue. Some code in the cloud (for example a Scala
actor) calculates the data and puts the result back into the memcached
cache. If the result is written into the cache before the timeout, the
user gets up-to-the-second accurate pages. If the timeout is
missed, the user sees older data but no page failure.
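Ignoring the AMQP transport, the queue-plus-worker flow above can be simulated in a few lines (everything here is an illustrative stand-in: a blocking queue plays the message queue, a thread plays the calculating client, a map plays memcached):

```scala
import java.util.concurrent.{ConcurrentHashMap, LinkedBlockingQueue}

// Stand-in for the AMQP setup: serve() answers from the cache immediately
// and enqueues a refresh request; a worker thread recalculates entries and
// writes fresh values back into the cache.
object CacheRefresh {
  val cache = new ConcurrentHashMap[String, String]()
  val queue = new LinkedBlockingQueue[String]()

  // The worker listens on the queue, like the Scala/Erlang/C++ clients above.
  def startWorker(recalc: String => String): Thread = {
    val t = new Thread(() => while (true) {
      val key = queue.take()
      cache.put(key, recalc(key)) // fresh value lands in the cache
    })
    t.setDaemon(true)
    t.start()
    t
  }

  // Serve whatever is cached right now; never block on the recalculation.
  def serve(key: String): Option[String] = {
    queue.put(key)
    Option(cache.get(key))
  }
}
```

The caller sees stale (or missing) data at worst, and the page never fails outright, which is the degradation property Alex describes.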
Voila, there we have enterprise-grade infrastructure (a clustered and
persisted data cache and message store) out of already available (open
source) system components that can handle high-load situations.
For using such a setup in a lift application we need a timeout
extension of cacherl (or memcached) and a mechanism in /lift/ for easy
(maybe transparent?) usage of memcached data. (*)
+1 for memcached interface protocol integration into /lift/
(*) The hard core Scala boys then can sign up for the task of porting
rabbitmq and cacherl from Erlang to Scala ;)
> Erlang is going to be slower than a Scala implementation of the
> memcached ABI.
> ...
> But, all in all, I'd implement most of the system in Scala and skip
> having to have yet another piece of technology (Erlang) in the mix.
what numbers is your opinion to use Lift and a (yet hypothetical) Scala
version of memcached based on? I am a total Scala and Lift fanboy
and would also like to have such an infrastructure based on Scala/
Lift. But I do not see Erlang-like load tolerance for clustered
servers with Lift/Scala yet. Erlang server clusters have a load
tolerance of about 80k requests per second (on a mid-range Linux/PC
server system) per cluster node. I have seen numbers in this range for
Erlang-based web page serving (yaws) and Comet request serving (erlycomet).
Unfortunately I have no numbers for the Erlang version of memcached.
Does Scala/Lift play in that league already? Of course it would be
better to have only one technology involved in such a setup.
Regards,
Alex
It looks a lot like JavaSpaces (TupleSpace, GigaSpaces).
With it you could do:
* messaging, persistence (pure memory, or as a cache for FS, RDBMS, ...)
* master/slave command pattern (divide and conquer)
* a space for data to be processed by actors
Maybe the API of XxxSpaces could be a good source of inspiration ;), something like:
* take[T](template: T, tx: Option[Transaction], timeout: Duration) : T
* takeAll[T](template: T, tx: Option[Transaction], timeout: Duration) : Iterable[T]
* write[T](entry: T, tx: Option[Transaction], expiration: Option[Duration])
* read[T](template: T, tx: Option[Transaction], timeout: Duration) : T
* notify (I don't remember the exact API: you need to create a NotificationListener, ...)
IMO Actors + Spaces could provide a good environment (GigaSpaces provides a "pseudo" actor mechanism via workers).
My 2 cents. If you need info about JavaSpaces/GigaSpaces (what is cool and what is not), maybe I can help.
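The operations listed above, written out as a Scala trait plus a toy in-memory implementation. Caveats: `Transaction` is a placeholder type, the signatures return `Option` so the toy version can signal a miss without blocking, and matching is by plain equality rather than the real template matching JavaSpaces does:

```scala
import scala.collection.mutable
import scala.concurrent.duration.Duration

// Placeholder; real spaces carry Jini transactions here.
case class Transaction(id: Long)

// The space operations sketched above (write/read/take only).
trait Space {
  def write[T](entry: T, tx: Option[Transaction], expiration: Option[Duration]): Unit
  def read[T](template: T, tx: Option[Transaction], timeout: Duration): Option[T]
  def take[T](template: T, tx: Option[Transaction], timeout: Duration): Option[T]
}

// Toy single-process implementation: matches by equality, ignores tx,
// expiration, and timeout. Real implementations block up to the timeout
// and do template matching with wildcard fields.
class LocalSpace extends Space {
  private val entries = mutable.Buffer[Any]()

  def write[T](entry: T, tx: Option[Transaction], expiration: Option[Duration]): Unit =
    entries += entry

  def read[T](template: T, tx: Option[Transaction], timeout: Duration): Option[T] =
    entries.find(_ == template).map(_.asInstanceOf[T])

  def take[T](template: T, tx: Option[Transaction], timeout: Duration): Option[T] = {
    val found = read(template, tx, timeout)
    found.foreach(entries -= _) // take = read + remove
    found
  }
}
```

The take/read/write trio is what makes the master/slave command pattern work: workers `take` tasks, compute, and `write` results back.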
> I've been ranting on other threads about lift not just being about HTTP
> request/response and more than just CRUD. This discussion is bringing
> these ideas into clearer focus for me.
>
> Okay... enough of a cryptic post for the morning.
I'll read it ;)
/davidB
David Pollak wrote:
> Steve,
>
> I wish I had about 2 hours to write a worthy response to this note.
>
> A couple of things... I see a near-term need for a smartcached that uses
> the memcached protocol, but the larger project is not about wire
> protocols and should be pretty much independent of wire protocols.
>
> Persistence mechanisms (RDBMS, BigTable, etc.) are great for "demand"
> based web applications. However, I see a broader need for "proactive"
> web applications. This is a hybrid of persistence, messaging (with
> smart, scalable routing rules), and some "live agent/Actor" thingy that
> sticks around and keeps some form of state fresh.
Can you put 1B objects into JavaSpaces? 10B?
JavaSpaces is a spec.
About GigaSpaces (a commercial/pro implementation, +/- free for startups):
* 1B or 10B is possible; it depends on the size of the objects and the size of the cluster.
* For example, if you use the GUI admin you can bench (write/take/read), but the objects are very small (2 fields).
* A better solution is to start a server (embedded or not) and push data (I could retrieve code if you want).
* Fail-over/fault tolerance is done by:
  * a cluster of nodes (several on the same host is possible)
  * support for partitioning of data, replication, or a combination of both between nodes
  * support for several load-balancing strategies between nodes
  * the possibility to create mirrors/backups
  * the possibility to store data of (selected) nodes to a backend like FS, RDBMS, or custom
  * the possibility to use a backend to retrieve data not available in the space (like a cache miss)
Open source implementations are mono-server/process (correct me if I'm mistaken):
* Blitz uses Berkeley DB to persist state (for hard persistence, or for over-memory, a little like Ehcache does).
I didn't test with more than 1M objects (~10 fields each), simulating a classical Ask/Bid/Match (financial) workload.
What is interesting is the API and transaction management (taken from Jini), which is the same for messaging and data access (and simple).
/davidB
> I hope you don't think of my comments as stop energy. I think this is
> a very interesting experiment and my comments are meant to be
> constructive.