Erlang client side scaling (and client endpoint fail over)

adr...@gmail.com

Jan 8, 2013, 10:30:06 AM

to scal...@googlegroups.com

Hello,

I'm new to Scalaris and I'm wondering how Erlang client programs can connect to it without creating a bottleneck.

Are all cluster server nodes equal, meaning all clients can connect to any server node? In that case, how should clients select nodes?
Or is only one of the server nodes the endpoint for client requests (some dispatcher role), so that its throughput can become a bottleneck and a SPOF?
Or do some server nodes act as "sharded" endpoints for client requests (a kind of dispatcher cluster), bringing scaling and failover to the cluster's client side?

Are there any Erlang client code samples I've missed?
Is there some pooling proxy I should be aware of?

I haven't found high-level architecture diagrams in the doc. Maybe I've missed some explanatory paragraphs.

Pierre M.

Florian Schintke

Jan 8, 2013, 12:37:55 PM

to scal...@googlegroups.com

Hi,

All Scalaris nodes accept client requests, so clients can connect to
any of them.

> I'm new to Scalaris and I'm wondering how Erlang client programs can
> connect to it without creating a bottleneck.
>
> Are all cluster server nodes equal, meaning all clients can connect to any
> server node? In that case, how should clients select nodes?
> Or is only one of the server nodes the endpoint for client requests (some
> dispatcher role), so that its throughput can become a bottleneck and a SPOF?
> Or do some server nodes act as "sharded" endpoints for client requests
> (a kind of dispatcher cluster), bringing scaling and failover to the
> cluster's client side?

Load and request distribution, as well as client-side failover, could
easily be handled with dynamic or rotating (round-robin) DNS entries
for a set of Scalaris servers. The client can then look up a given
server name like scalaris.foo.bar.com and get the IP of a node that is
up and running.
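
As a rough illustration of the DNS idea above, a client could resolve the shared name, shuffle the returned addresses so that different clients start with different nodes, and try them one by one. This is only a sketch: the module name `node_picker`, the socket options, and the port used in the usage note are assumptions, not part of Scalaris.

```erlang
-module(node_picker).
-export([candidates/1, shuffle/1, connect_any/3]).

%% Resolve Host to all of its IPv4 addresses and return them in random
%% order, so that clients spread over the nodes behind the DNS name.
candidates(Host) ->
    case inet:getaddrs(Host, inet) of
        {ok, Addrs} -> {ok, shuffle(Addrs)};
        {error, Reason} -> {error, Reason}
    end.

%% Random permutation: tag each element with a random float and sort.
shuffle(List) ->
    [X || {_, X} <- lists:sort([{rand:uniform(), X} || X <- List])].

%% Try each address in turn and return the first socket that connects.
%% Timeout is the per-attempt connect timeout in milliseconds.
connect_any([], _Port, _Timeout) ->
    {error, no_node_reachable};
connect_any([Addr | Rest], Port, Timeout) ->
    case gen_tcp:connect(Addr, Port, [binary, {active, false}], Timeout) of
        {ok, Socket} -> {ok, Addr, Socket};
        {error, _Reason} -> connect_any(Rest, Port, Timeout)
    end.
```

Usage would look like `{ok, Addrs} = node_picker:candidates("scalaris.foo.bar.com")` followed by `node_picker:connect_any(Addrs, 8000, 2000)`, where port 8000 is only a placeholder. Shuffling on the client avoids relying on the resolver's answer order, which may be cached.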

> Are there any Erlang client code samples I've missed?
> Is there some pooling proxy I should be aware of?

The most scalable approach is for different clients to connect to
different Scalaris nodes. A pooling proxy could become a bottleneck,
which is why we do not offer one. Inside Scalaris we have the
functionality to retrieve a random node of the system. Currently, I do
not remember whether we already offer that via an API. But that way,
clients could learn about additional nodes they could use to submit
their requests.

Erlang nodes could either use the JSON-based API, which is already in
place, or one could have an additional Erlang process in each Erlang
node which directly receives requests in the form of Erlang terms via
TCP. We will discuss whether that makes sense in general and, if so,
implement such a process and document how to connect to Scalaris and
send requests to it.
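
For the JSON route, a client written in Erlang could build a JSON-RPC style request by hand and post it with `httpc` from the standard `inets` application. The sketch below is hedged on several points: the module name `scalaris_json_sketch` is made up, the request layout (`jsonrpc`, `method`, `params`, `id`) and the `"read"` method name are assumptions about the JSON API, and the endpoint URL must be taken from the Scalaris documentation for the installed version. Parameters are limited to plain strings and are not escaped.

```erlang
-module(scalaris_json_sketch).
-export([request_body/3, post/2]).

%% Build a JSON-RPC style request body. Method is a string, Params is a
%% list of plain strings (no escaping is done), Id is an integer.
request_body(Method, Params, Id) ->
    lists:flatten(io_lib:format(
        "{\"jsonrpc\":\"2.0\",\"method\":\"~s\",\"params\":[~s],\"id\":~B}",
        [Method, quote_all(Params), Id])).

%% Wrap every parameter in double quotes and join them with commas.
quote_all(Params) ->
    string:join(["\"" ++ P ++ "\"" || P <- Params], ",").

%% POST Body to Url and return the raw response body on HTTP 200.
post(Url, Body) ->
    {ok, _Started} = application:ensure_all_started(inets),
    case httpc:request(post, {Url, [], "application/json", Body},
                       [{timeout, 5000}], []) of
        {ok, {{_Vsn, 200, _Phrase}, _Headers, RespBody}} ->
            {ok, RespBody};
        {ok, {{_Vsn, Status, _Phrase}, _Headers, _RespBody}} ->
            {error, {http_status, Status}};
        {error, Reason} ->
            {error, Reason}
    end.
```

A call such as `scalaris_json_sketch:post(Url, scalaris_json_sketch:request_body("read", ["key1"], 1))` would then send one request to whichever node `Url` points at; decoding the JSON reply is left out here.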

Currently Scalaris nodes are also accessible via distributed Erlang,
but this may be dropped at some time in the future.
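
While access via distributed Erlang is available, client-side failover can be sketched with plain `rpc:call/5`: try a list of nodes in order and return the first answer that is not a `badrpc`. The module name `scalaris_rpc_sketch` and the node names in the usage note are hypothetical, and the usage note assumes that `api_tx:read/1` is the read call exported by the Scalaris node. Note that `{badrpc, _}` is also returned when the remote function crashes, so treating every `badrpc` as "node unavailable" is a simplification.

```erlang
-module(scalaris_rpc_sketch).
-export([call_any/4]).

%% Call Module:Function(Args) on the first node that answers.
%% Nodes are tried in the given order with a 5 second timeout each.
call_any([], _Module, _Function, _Args) ->
    {error, no_node_available};
call_any([Node | Rest], Module, Function, Args) ->
    case rpc:call(Node, Module, Function, Args, 5000) of
        {badrpc, _Reason} ->
            %% Node down, unreachable, timed out, or the call crashed:
            %% move on to the next candidate.
            call_any(Rest, Module, Function, Args);
        Result ->
            {ok, Node, Result}
    end.
```

For example, `scalaris_rpc_sketch:call_any(['node1@host1', 'node2@host2'], api_tx, read, ["key1"])` would try the two (made-up) nodes in turn. The client node must share the cluster's Erlang cookie for this to work.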

> I haven't found high-level architecture diagrams in the doc. Maybe
> I've missed some explanatory paragraphs.

Probably not. Unfortunately, the current documentation is not in the
best shape and can always be improved. I'll think about your
suggestion to add some introductory higher-level diagrams. If you have
more specific ideas for this, they are also welcome.

Florian

adr...@gmail.com

Jan 18, 2013, 5:48:48 AM

to scal...@googlegroups.com

Hello Florian,

Thanks for your answers. Here is some feedback and a few wild suggestions about the doc. I hope it helps.

The documentation currently has two parts, called "Users guide" and "Developers guide". The first has material for researchers, architects, packagers and application developers. The second is for Scalaris contributors (implementers and researchers).

As an outsider, I suggest having more parts, each dedicated to one "work/hack/use" case (research being one such case):

1. Introduction: Scalaris is a powerful DBMS, backed by tons of research, has transactions, Wikipedia example, replication and data locality, flexible storage backend, links to slides.
2. For researchers. Scientific and academic view. Topics, concepts, papers, professors, students, theses, results. Things like gossip, Chord, Vivaldi.
3. For Scalaris developers. How/where everything is implemented. Design choices. Contributor community (website, source code, issue tracker) and coding guidelines.
4. For packagers (say Linux/BSD distros). How to download, build and test the builds. How to report issues.
5. For datacenter sysops. Some nice use case with 3 data centers (from the wiki example?). Some information about network topology, bandwidth and latency, switching failover, routing (firewalling), node distribution across all 3 places, and installation of the Scalaris packages and their configuration files. How to monitor the distributed cluster's activity. How to back up and restore, if the storage backend and use case allow it.
6. For application architects. How to size the cluster, tune replication, data locality, use of transactions or simple read/write.
7. For application developers. ClientS-ServerS connections. The client API and its guidelines. Code snippets.

The idea is to make each profile efficient with Scalaris and not lost in others' issues (of course everybody can read everything; curiosity is great).
I think my questions about a simple high-level use case fall into the last two parts (which node(s)/IP/port to connect to? which client module?).

For example, here is a memo I wrote myself about playing with hanoidb:
{ok, Tree} = hanoidb:open("dadb.hanoidb").
ok = hanoidb:put(Tree, <<"key1">>, <<"val1">>).
ok = hanoidb:put(Tree, <<"key2">>, <<"val2">>).
{ok, <<"val1">>} = hanoidb:get(Tree, <<"key1">>).
{ok, <<"val2">>} = hanoidb:get(Tree, <<"key2">>).
% delete:
ok = hanoidb:delete(Tree, <<"key1">>).
not_found = hanoidb:get(Tree, <<"key1">>).
% key expiry:
ok = hanoidb:put(Tree, <<"foo">>, <<"bar">>, 2). % expires after 2 s
{ok, <<"bar">>} = hanoidb:get(Tree, <<"foo">>).
ok = timer:sleep(3000). % wait 3000 ms
not_found = hanoidb:get(Tree, <<"foo">>).
ok = hanoidb:close(Tree).

I think it is an easy way to get "hands on" with a neat software project (beginning with simple "how to" examples from the existing manual). Then (new?) contributors can write more sophisticated material.

I hope this makes sense.

Pierre M.

Florian Schintke

Jan 18, 2013, 6:54:28 AM

to scal...@googlegroups.com

Thanks for your suggestions and feedback. They give a reasonable
direction for how we can evolve the documentation.

Florian

> 1. Introduction: Scalaris is a powerful DBMS, backed by tons of
> research, has transactions, Wikipedia example, replication and data
> locality, flexible storage backend, links to slides.
> 2. For researchers. Scientific and academic view. Topics, concepts,
> papers, professors, students, theses, results. Things like gossip, Chord,
> Vivaldi.
> 3. For Scalaris developers. How/where everything is implemented. Design
> choices. Contributor community (website, source code, issue tracker) and
> coding guidelines.
> 4. For packagers (say Linux/BSD distros). How to download, build and
> test the builds. How to report issues.
> 5. For datacenter sysops. Some nice use case with 3 data centers (from
> the wiki example?). Some information about network topology, bandwidth and
> latency, switching failover, routing (firewalling), node distribution
> across all 3 places, and installation of the Scalaris packages and their
> configuration files. How to monitor the distributed cluster's activity. How
> to back up and restore, if the storage backend and use case allow it.
> 6. For application architects. How to size the cluster, tune
> replication, data locality, use of transactions or simple read/write.
> 7. For application developers. ClientS-ServerS connections. The client
