This may be a dangerous path. Or it may be a good one ;-)
Spaces, file systems.... when can a store be a communication topology?
When cannot it be?
a
> EX.
> connect (topo,"topo://marketdataservice/equity/options");
>
> send (topo, "GLD", 3, 0);
> Spaces, file systems.... when can a store be a communication topology?
> When cannot it be?
As I already said, I have no experience in the area; however, I would say
it depends on whether the data in the database/tuplespace/filesystem are
duplicated across the nodes or whether they are partitioned (shards).
In the former case the topology for 'store' has broadcast semantics
(store the data in every replica) while 'read' has request/reply
semantics (ask one of the replicas for the data).
In the latter case both 'store' and 'read' can be accomplished by a
topology that delivers the command to a particular instance, say based
on a hash of the key or some such.
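
To make the two cases above concrete, here is a minimal in-process sketch.
It is not taken from any existing library: the class names
(ReplicatedStore, PartitionedStore), the round-robin choice of replica and
the use of SHA-256 for key hashing are all assumptions made for
illustration, and plain dicts stand in for the nodes.

```python
import hashlib


class ReplicatedStore:
    """Hypothetical model: every node holds a full copy of the data."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]
        self._next = 0  # index of the replica that serves the next read

    def store(self, key, value):
        # Broadcast semantics: the write goes to every replica.
        for node in self.nodes:
            node[key] = value

    def read(self, key):
        # Request/reply semantics: ask exactly one replica (round-robin here).
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return node.get(key)


class PartitionedStore:
    """Hypothetical model: each key lives on exactly one node (shard)."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def _owner(self, key):
        # Route by a hash of the key; SHA-256 is an arbitrary choice.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def store(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def read(self, key):
        return self.nodes[self._owner(key)].get(key)
```

With three nodes, ReplicatedStore.store("GLD", 3) leaves a copy on all
three, while PartitionedStore.store("GLD", 3) leaves a copy on only the
node chosen by the hash.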
Anyway, I don't believe we have any experts in the area on the mailing
list, so there's not much point in going into detail.
What we have to do, though, is make sure that there's a way to
"allocate" a new messaging pattern that can then be handed to the
experts so that they can define the semantics. The crucial feature here
is a clear separation between different patterns, i.e. making sure that
any change made to one pattern has zero effect on all the other
patterns.
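
One possible shape for such an allocation mechanism, sketched under heavy
assumptions: PatternRegistry, allocate and dispatch are hypothetical names
invented for this example, and a pattern's "semantics" is reduced to a
single handler function. The only point is that each pattern gets its own
slot and cannot overwrite another's.

```python
class PatternRegistry:
    """Hypothetical registry: one isolated handler per messaging pattern."""

    def __init__(self):
        self._handlers = {}

    def allocate(self, pattern_id, handler):
        # Refuse to reuse an identifier, so that defining a new pattern
        # can never silently change an existing one.
        if pattern_id in self._handlers:
            raise ValueError(f"pattern {pattern_id!r} is already allocated")
        self._handlers[pattern_id] = handler

    def dispatch(self, pattern_id, message):
        # Only the handler registered for this pattern ever sees the message.
        return self._handlers[pattern_id](message)
```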
Martin
A "store" is based on probability models of durability for data given
a physical topology. In real systems durable storage may not imply
that the data even exists anywhere that is addressable, only that it
has a high probability of being addressable in the future somewhere in
the system. There is no simple, clean abstraction that generalizes
well that I am aware of.
This has very complex semantics in distributed data stores. Even at
the level of a single computer with a simple disk system, data can be
in one of three or four different states with respect to durability
probability and on a distributed system with replicas the policy may
be mixed to achieve the optimal result. As an example of a common
policy, a logical store "commits" as soon as the data is in-memory on
the primary node and secondary nodes have persistently logged the data
but not applied it to their working in-memory image -- it is not
online *and* durable at any particular node. This is not too tricky to
implement if you ignore failure conditions.
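
A minimal sketch of the commit policy described above, with failure
conditions deliberately ignored as noted. The Primary and Secondary
classes and the commit function are hypothetical, and a Python list
stands in for a persistent log.

```python
class Primary:
    """Hypothetical primary node: holds the working in-memory image."""

    def __init__(self):
        self.memory = {}


class Secondary:
    """Hypothetical secondary node: logs writes first, applies them later."""

    def __init__(self):
        self.log = []     # stand-in for a persistent, append-only log
        self.memory = {}  # working in-memory image

    def log_write(self, key, value):
        self.log.append((key, value))
        return True  # a real node could fail here; this sketch never does

    def apply_log(self):
        # Bring the in-memory image up to date with the logged writes.
        for key, value in self.log:
            self.memory[key] = value
        self.log.clear()


def commit(primary, secondaries, key, value):
    """Report the write as committed once the primary has it in memory and
    every secondary has logged it, even though no secondary has applied it."""
    primary.memory[key] = value
    return all(s.log_write(key, value) for s in secondaries)
```

After commit returns True, the value is online only on the primary and
durable only in the secondaries' logs, which is the "not online *and*
durable at any particular node" situation.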
From the standpoint of the network, the endpoints acting as stores are
often not equivalent which makes the policy logic a bit more
complicated than simple "M of N succeeded" type models because the
status of endpoints may be changed by events mid-write. For optical
buffering type durability models (a kind of "k-safety" persistence
model) "0 of N" is considered durable as long as the local interface
can guarantee that the data was in fact buffered, so it is not
"stored" in an accessible way at the time it is logically durable from
the sender's perspective.
If you look at systems with knobs in their protocols for tuning store
models, they usually severely restrict the system design assumptions
to keep it manageable. This is one of the reasons just about everybody
rolls their own model.
--
J. Andrew Rogers
Sounds like a pretty complex topic. I guess we should define the basics
of SP first, then move to defining specific messaging patterns, as used
in today's distributed applications, e.g. the "tuplespace" pattern above.
Martin