On Jun 21, 1:57 pm, "morten.kr...@amberbio.com" <morten.kr...@gmail.com> wrote:
> It would be so much simpler to make a Haskell binding to it. Scalaris
> does not serialize to
> disk. There is no persistent storage. They claim that it is not easy
> to consistently write to disk. The whole system is supposed to be
> 'always on'.
Happstack-state already supports this mode of operation. Simply use
the null-saver to avoid journaling any events.
> I guess one could make an occasional query for all key/
> values and dump them to disk. That would give some persistence, but
> ACID would not be totally guaranteed on disk.
That is exactly what happens when you call 'createCheckpoint'. I
believe you can already use the nullSaver to avoid saving individual
events to disk, and still use createCheckpoint to write an occasional
checkpoint to disk, S3, etc. However, I am not sure why you would
want to do that; saving the events seems better.
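To make that trade-off concrete, here is a rough, pure sketch, assuming the behaviour described above (all names are invented; this is not the happstack-state API): with a null-saver plus occasional checkpoints, recovery can only restore the last checkpoint, while a full event journal can replay everything.

```haskell
-- Toy model of the trade-off (invented names; not the happstack-state API).
-- State is an Int counter; events increment it.
type State = Int
newtype Event = Add Int

apply :: State -> Event -> State
apply s (Add n) = s + n

-- With a full event journal, recovery replays every event:
recoverFromJournal :: [Event] -> State
recoverFromJournal = foldl apply 0

-- With a null-saver plus occasional checkpoints, recovery can only
-- restore the last checkpoint; events after it are simply lost:
recoverFromCheckpoint :: State -> [Event] -> State
recoverFromCheckpoint checkpoint _eventsSinceCheckpoint = checkpoint

main :: IO ()
main = do
  let events             = [Add 1, Add 2, Add 3]
      checkpointAfterTwo = foldl apply 0 (take 2 events)
  print (recoverFromJournal events)                                -- 6
  print (recoverFromCheckpoint checkpointAfterTwo (drop 2 events)) -- 3
```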
> It makes sense to me
> that the serialization is done, if at all, by an outside script
> similar to monitoring.
How does this script get access to the RAM in the process where the
values are stored? With happstack, you could have your normal
multimasters, which handle incoming requests, run without saving any
events or checkpoints to disk. But you could also have an additional
master or two which did not handle any incoming requests, but did
save events and checkpoints to disk.
In this way, you could have a bunch of diskless machines with fast
processors for handling requests, and different machines with fast
RAID arrays for providing persistent storage to disk. Ideally one or
two RAID machines per shard?
> State is probably not serialized consistently right now in Happstack
> State either; suppose the multi masters lose their internal connection
> and each continue updating state. Then what do the disk files mean?
That would never happen in the current architecture. We use Spread,
which has the following guarantees:
1. a message is either delivered to all clients or no clients
2. all clients receive messages sent to the network in the same order
3. there is only one spread network, and it can never become
fragmented
In multimaster-mode, a client does not directly perform an update on
its local state. Instead it creates an update event which it sends to
the network. Later, it will receive this message back from the spread
network, just like all the other clients, and will perform the update
in the same way that all the other clients do. If a multimaster were
to get disconnected from the network, then it would not receive the
update event, and would therefore not commit the update. When a
multimaster rejoins the network, it will request the latest state.
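A toy, pure simulation of that flow (all names are invented; no Spread library is involved) shows why it keeps replicas consistent: a master never applies its own update directly; it broadcasts an update event, and every master, sender included, applies events in the single delivery order the network imposes.

```haskell
-- Toy simulation of the multimaster flow described above (invented names).
import Data.List (foldl')

type State = [String]
newtype Event = Append String

apply :: State -> Event -> State
apply s (Append x) = s ++ [x]

-- Guarantees (1) and (2): every connected client receives the same events
-- in the same order, so folding that one sequence yields identical states.
deliverToAll :: Int -> [Event] -> [State]
deliverToAll nClients ordered = replicate nClients (foldl' apply [] ordered)

-- A disconnected master misses events and therefore commits nothing for
-- them; on rejoin it simply adopts the latest state rather than diverging.
rejoin :: State -> State
rejoin latest = latest

main :: IO ()
main = do
  let states = deliverToAll 3 [Append "a", Append "b", Append "c"]
  print (all (== head states) states)  -- True: all replicas agree
  print (rejoin (head states))         -- ["a","b","c"]
```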
Note that the lost update is consistent with ACID principles, because
the event is lost before the transaction is committed:
http://www.nusphere.com/products/library/acid_transactions.htm
I believe there is a weakness right now that if all the nodes go down,
then the state is restored from whichever node joins first. If that
node happened to be disconnected from the network prior to everything
going down, then it could be missing events that other servers have.
This is not an unfixable flaw, however. Certainly, any solution that
would work for Scalaris should work here (assuming you have some way
to recover from all nodes going down in Scalaris at all).
Scalaris seems to be based around storing (key, value) pairs -- but
what if your state is not really based around (key, value) pairs?
Happstack-state allows you to use almost any Haskell data type, so you
can choose to use (key,value) pairs only if they are right for your
application (via IxSet, or whatever else you want). For example,
storing a tree (such as a threaded message board) as key/value pairs
requires a lot more work than just storing the tree. With happstack-
state, you just use your basic tree type, and normal tree manipulation
functions, and all is good. To use key value pairs, you either have to
convert between a Tree type and a key/value pair types (which means
writing and debugging more code), or you have to write a bunch of
functions for doing tree-like manipulations on data stored in key/
value pairs (more extra code, testing, and debugging). This is the
same reason why on a single server system, happstack-state is still
nicer than BerkeleyDB for many applications.
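Here is a plain-Haskell sketch of that contrast; the types and helpers (Thread, toKV, fromKV) are invented for illustration, not part of any API. With the native type, tree manipulation is ordinary Haskell; with a key/value encoding, you must also write and debug the flattening and rebuilding code.

```haskell
-- A threaded message board is naturally a tree:
data Thread = Thread { message :: String, replies :: [Thread] }

-- With native types, tree manipulation is ordinary Haskell:
countMessages :: Thread -> Int
countMessages (Thread _ rs) = 1 + sum (map countMessages rs)

-- With a key/value store you must invent an encoding -- here, each node
-- keyed by an Int with an explicit parent pointer -- plus conversion code:
type KV = [(Int, (Maybe Int, String))]  -- (key, (parent key, message))

toKV :: Thread -> KV
toKV root = snd (go Nothing 0 root)
  where
    -- returns (next free key, rows for this subtree)
    go parent k (Thread m rs) =
      let row = (k, (parent, m))
          step (k', acc) child =
            let (k'', rows) = go (Just k) k' child
            in (k'', acc ++ rows)
          (_next, childRows) = foldl step (k + 1, []) rs
      in (fst (foldl step (k + 1, []) rs), row : childRows)

fromKV :: KV -> Maybe Thread
fromKV rows = case [k | (k, (Nothing, _)) <- rows] of
  [rootK] -> Just (build rootK)
  _       -> Nothing
  where
    build k = Thread (msgOf k) [build c | (c, (Just p, _)) <- rows, p == k]
    msgOf k = head [m | (k', (_, m)) <- rows, k' == k]

main :: IO ()
main = do
  let sample = Thread "root" [Thread "a" [], Thread "b" []]
  print (countMessages sample)                      -- 3
  print (fmap countMessages (fromKV (toKV sample))) -- Just 3
```

The point is not that the encoding is hard, but that every operation you get for free on the native type needs extra code, testing, and debugging once the state is forced into key/value pairs.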
Also, Scalaris does not allow you to delete keys.
And, perhaps most importantly, Scalaris is based on Paxos -- and I
have not heard good things about Paxos scaling. Do you have some
reason to believe that Paxos scales better than Spread?
So, to answer your question more directly, there is more to happstack-
state than replication. In fact, most (perhaps all) users of happstack-
state today do not use replication. So, on a single-server setup, why
would people want to use happstack-state instead of MySQL, SQLite (a
relational database) or BerkeleyDB (a key/value store), which all
support running entirely in-memory rather than on-disk? One answer is
that happstack-state lets you use Haskell data types directly, without
having to write any marshaling code by hand, and that you can
write your 'queries' in Haskell instead of SQL. These features ought
to allow you to write shorter and simpler code in less time, with fewer
bugs, and less testing. So, for happstack-state on a single-server
system, it is not clear that there is a pre-existing library which
provides similar benefits.
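To illustrate that last point, here is a sketch (invented names; this is not the actual happstack-state query API) of what "queries in Haskell instead of SQL" means in practice: a query is just a function over your own data type, with no marshaling layer.

```haskell
import Data.List (sort)

-- An ordinary Haskell record; no schema or ORM mapping needed.
data User = User { name :: String, age :: Int }

type UserDB = [User]

-- Roughly: SELECT name FROM users WHERE age >= 18 ORDER BY name
adultNames :: UserDB -> [String]
adultNames db = sort [name u | u <- db, age u >= 18]

main :: IO ()
main = do
  let db = [User "carol" 30, User "bob" 17, User "alice" 20]
  print (adultNames db)  -- ["alice","carol"]
```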
When introducing replication and sharding into the mix, we, of course,
want to retain the benefits of happstack-state, and get more
scalability. We leverage spread, because spread *does* provide
functionality that we can use out of the box to extend happstack-
state. There is little benefit to writing something like spread, when
spread already provides the exact functionality that we need. The
functionality that spread provides is low-level, and can be used to
extend the benefits of happstack-state in a fairly transparent way. If
we pick something higher-level, then it would need to be something
that could provide the same benefits that happstack-state provides.
However, most of the higher-level libraries seem like a step
backwards, since they will create more work for people building on our
platform (in the form of having to write marshaling code, or only
being able to use a restricted set of types).