what is 'store'

Alexis Richardson

Jul 22, 2011, 2:12:28 PM
to sp-discu...@googlegroups.com, Kohei Honda, Tony Garnock-Jones
On Fri, Jul 22, 2011 at 2:23 PM, Gary Berger <gabe...@cisco.com> wrote:
>
> One example of an abstraction I like is the Tuple-Space concept: "Object Space
> can be thought of as a virtual repository, shared amongst providers and
> accessors of network services, which are themselves abstracted as objects.
> Processes communicate among each other using these shared objects -- by
> updating the state of the objects as and when needed."[1]

This may be a dangerous path. Or it may be a good one ;-)

Spaces, file systems.... when can a store be a communication topology?
When cannot it be?

a

> EX.
> connect (topo,"topo://marketdataservice/equity/options");
>
> send (topo, "GLD", 3, 0);

Martin Sustrik

Jul 23, 2011, 4:55:18 AM
to sp-discu...@googlegroups.com, Alexis Richardson, Kohei Honda, Tony Garnock-Jones
On 07/22/2011 08:12 PM, Alexis Richardson wrote:

> Spaces, file systems.... when can a store be a communication topology?
> When cannot it be?

As I already said, I have no experience in this area; however, I would say
it depends on whether the data in the database/tuplespace/filesystem are
duplicated across the nodes or partitioned (sharded).

In the former case the topology for 'store' has broadcast semantics
(store the data in every replica) while 'read' has request/reply
semantics (ask one of the replicas for the data).

In the latter case both 'store' and 'read' can be accomplished by a
topology that delivers the command to a particular instance, say based on
a hash of the key or some such.
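
To make the two cases concrete, here is a rough Python sketch. It is only an
illustration under assumptions: the names Replica, ReplicatedStore and
PartitionedStore, the round-robin choice of replica, and the SHA-256 based
routing are all invented for this example and are not part of SP or of any
existing product.

```python
import hashlib


class Replica:
    """Hypothetical node: just a name and an in-memory dictionary."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedStore:
    """Former case: every node holds a full copy of the data."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._next = 0  # round-robin cursor (an assumption of this sketch)

    def store(self, key, value):
        # Broadcast semantics: the command goes to every replica.
        for node in self.nodes:
            node.data[key] = value

    def read(self, key):
        # Request/reply semantics: ask exactly one replica.
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return node.data.get(key)


class PartitionedStore:
    """Latter case: each key lives on exactly one shard."""

    def __init__(self, nodes):
        self.nodes = nodes

    def _route(self, key):
        # Pick the shard from a hash of the key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def store(self, key, value):
        self._route(key).data[key] = value

    def read(self, key):
        return self._route(key).data.get(key)
```

Both classes expose the same store/read calls; only the delivery rule behind
them differs, which is the sense in which the topology defines the pattern.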

Anyway, I don't believe we have any experts in the area on the mailing
list, so there's not much point in going into detail.

What we have to do, though, is make sure that there's a way to
"allocate" a new messaging pattern that can then be handed to the
experts so that they can define its semantics. The crucial feature here
is a clear separation between different patterns, i.e. making sure that
any change made to one pattern has zero effect on all the other
patterns.

Martin

J. Andrew Rogers

Jul 24, 2011, 2:05:51 PM
to sp-discu...@googlegroups.com
On Sat, Jul 23, 2011 at 1:55 AM, Martin Sustrik <sus...@250bpm.com> wrote:
> On 07/22/2011 08:12 PM, Alexis Richardson wrote:
>
>> Spaces, file systems.... when can a store be a communication topology?
>>  When cannot it be?
>
> As I already said, I have no experience in this area; however, I would say it
> depends on whether the data in the database/tuplespace/filesystem are
> duplicated across the nodes or partitioned (sharded).
>
> In the former case the topology for 'store' has broadcast semantics (store
> the data in every replica) while 'read' has request/reply semantics (ask one
> of the replicas for the data).
>
> In the latter case both 'store' and 'read' can be accomplished by a topology
> that delivers the command to a particular instance, say based on a hash of the
> key or some such.


A "store" is based on probability models of durability for data given
a physical topology. In real systems durable storage may not imply
that the data even exists anywhere that is addressable, only that it
has a high probability of being addressable in the future somewhere in
the system. There is no simple, clean abstraction that generalizes
well that I am aware of.

This has very complex semantics in distributed data stores. Even at
the level of a single computer with a simple disk system, data can be
in one of three or four different states with respect to durability
probability, and on a distributed system with replicas the policies may
be mixed to achieve the optimal result. As an example of a common
policy, a logical store "commits" as soon as the data is in-memory on
the primary node and secondary nodes have persistently logged the data
but not applied it to their working in-memory image -- it is not
online *and* durable at any particular node. This is not too tricky to
implement if you ignore failure conditions.
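
A toy Python sketch of that commit policy follows. It is an illustration
only: Primary, Secondary and commit are names invented here, the "persistent
log" is just a list standing in for a disk, and failure conditions are
ignored exactly as noted above.

```python
class Primary:
    """Hypothetical primary node: holds the working in-memory image."""

    def __init__(self):
        self.memory = {}


class Secondary:
    """Hypothetical secondary node: logs first, applies later."""

    def __init__(self):
        self.log = []     # stands in for a persistent, on-disk log
        self.memory = {}  # working in-memory image

    def append_log(self, key, value):
        # A real system would flush to stable storage before acknowledging.
        self.log.append((key, value))
        return True

    def apply_log(self):
        # Happens some time after the logical commit.
        for key, value in self.log:
            self.memory[key] = value
        self.log.clear()


def commit(primary, secondaries, key, value):
    """Return True once the write counts as committed under this policy."""
    primary.memory[key] = value  # online on the primary, but not durable
    # Durable on the secondaries, but not yet online there.
    return all(s.append_log(key, value) for s in secondaries)
```

At the moment commit returns, no single node holds the value both in its
working image and in a persistent log, yet the logical store treats the
write as committed.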

From the standpoint of the network, the endpoints acting as stores are
often not equivalent which makes the policy logic a bit more
complicated than simple "M of N succeeded" type models because the
status of endpoints may be changed by events mid-write. For optical
buffering type durability models (a kind of "k-safety" persistence
model) "0 of N" is considered durable as long as the local interface
can guarantee that the data was in fact buffered, so it is not
"stored" in an accessible way at the time it is logically durable from
the sender's perspective.
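
The contrast between the two durability rules can be written down in a few
lines of Python. This is a deliberately naive sketch: WriteStatus and both
predicates are hypothetical names, and it does not model the hard part,
namely endpoints changing status while the write is in flight.

```python
from dataclasses import dataclass


@dataclass
class WriteStatus:
    """Hypothetical snapshot of one write, as seen by the sender."""
    acks: int               # endpoints that have confirmed the write
    total: int              # endpoints the write was sent to (N)
    locally_buffered: bool  # local interface guarantees the data was buffered


def durable_m_of_n(status, m):
    # Simple "M of N succeeded" rule.
    return status.acks >= m


def durable_buffered(status):
    # Buffering rule: "0 of N" is acceptable as long as the local
    # interface guarantees that the data was in fact buffered.
    return status.locally_buffered
```

For a write with zero acknowledgements but a guaranteed local buffer, the
first rule reports "not durable" and the second reports "durable", even
though the data is not yet readable from any store endpoint.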

If you look at systems with knobs in their protocols for tuning store
models, they usually severely restrict the system design assumptions
to keep it manageable. This is one of the reasons just about everybody
rolls their own model.

--
J. Andrew Rogers

Martin Sustrik

Aug 4, 2011, 5:15:33 AM
to sp-discu...@googlegroups.com, J. Andrew Rogers
Hi Andrew,

Sounds like a pretty complex topic. I guess we should define the basics
of SP first, then move on to defining specific messaging patterns as used
in today's distributed applications, e.g. the "tuplespace" pattern above.

Martin
