Baratine 0.8.6


Scott Ferguson

Nov 26, 2014, 10:58:21 AM
to barat...@googlegroups.com
Baratine 0.8.6 is on baratine.io (and in maven as baratine.io/m2).

Download: http://baratine.io/download/baratine-0.8.6.tar.gz
Release notes: http://doc.baratine.io/v0.8/manual/release-notes/0.8/0.8.6
Javadoc: http://baratine.io/javadoc-0.8/

The biggest visible change is the move to JDK 8, particularly for the
lambda expressions and the interface enhancements.

Services can now use a Streams-like API for queries. Calling clients can
send lambda-expressions as part of the query to execute in the service
itself.

Deployment has been updated, mostly internally. More work is expected
for 0.8.7, which will add flexibility and work better with dynamic pods.

JUnit support has been improved, making it easier to test a single
service in isolation.

There's a new option for clients, essentially starting an embedded
Baratine instance that can connect to the Baratine cluster as a server.
This means the client can send lambda expressions to the server, using
BFS as a jar transport.

-- Scott


thomasm

Dec 7, 2014, 5:24:50 AM
to barat...@googlegroups.com
Regarding:

Stream calls across a pod

Stream calls across a pod become map/reduce calls if the pod has multiple nodes. A “pair” pod will run the stream call on both node-0 and node-1 and then combine the results before returning to the caller. The calling code is identical to the non-pod call:

myService = manager.lookup("pod://my-pod/my-service");
myService.myStream("arg1")
         .filter(x->x.startsWith("my-prefix"))
         .reduce((x,y)->x + "::" + y,
                 x->System.out.println("Result: " + x));



I am wondering how to write such a service, one that is active on all nodes. Is it just

@Service("/my-service")

?

As far as I understood the available documentation (from before the stream API), my service gets initialized on all servers and a "primary" virtual node (the active one) handles the calls. If my primary node dies, the checkpoint and journal are used on the fallback of the virtual node to restore state.

And now for the stream API: how do I get started here? Is it still just a @Service annotation? Will my service be initialized and available on *all* virtual nodes (with a "primary" virtual node still there for "normal" calls, although those would not make much sense here)?

I would like to create a service that runs on all virtual nodes, where each node holds a subset of the data (partitioned by a key), so I can combine the data at the node level and then in total with such a stream call. For this to work, the service would need to be aware of its current location and the whole setup (for example, "you were initialized on node 1 of 10") so it knows which partition it is responsible for. How do I get this meta-information during initialization?

Scott Ferguson

Dec 8, 2014, 12:04:51 PM
to barat...@googlegroups.com
On 12/7/14, 2:24 AM, thomasm wrote:
> Regarding "Stream calls across a pod":
>
> I am wondering how to write such a service, one that is active on all nodes. Is it just @Service("/my-service")?

The service deployment is unchanged. The stream is a method pattern: it works with all services.

The multi-node fork/join (map/reduce) works when you access the service with the "pod:" scheme. The pod's ServiceRef instances intercept the stream calls and split them among the nodes.

If you deploy a service to a pod that's "pair", "triad" or "cluster", and call a stream method through the "pod:" scheme, the pod will split that method call among all the nodes.
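
To make the fork/join idea concrete, here is a minimal plain-Java sketch of what the split and combine amount to. It does not use any Baratine API: the class `PodStreamSketch`, the method `mapReduce`, and the list-of-lists standing in for a "pair" pod are all hypothetical names chosen for illustration. Each inner list plays the role of the data one node holds; the filter and reduce run per node first, and the partial results are then reduced again.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names are Baratine APIs.
class PodStreamSketch {
  // Runs filter + reduce on each node's data, then combines the partial results.
  static Optional<String> mapReduce(List<List<String>> nodes,
                                    Predicate<String> filter,
                                    BinaryOperator<String> reducer) {
    return nodes.stream()
        .map(node -> node.stream().filter(filter).reduce(reducer)) // per-node work
        .filter(Optional::isPresent)   // skip nodes with no matching items
        .map(Optional::get)
        .reduce(reducer);              // combine step across nodes
  }

  // A "pair" pod modelled as two lists: node-0 and node-1.
  static String demo() {
    List<List<String>> pairPod = Arrays.asList(
        Arrays.asList("my-prefix-a", "other"),
        Arrays.asList("my-prefix-b"));
    return mapReduce(pairPod,
                     x -> x.startsWith("my-prefix"),
                     (x, y) -> x + "::" + y).orElse("");
  }

  public static void main(String[] args) {
    System.out.println("Result: " + demo());
  }
}
```

The reducer is applied twice, once inside each node and once across nodes, so in this model it should be associative for the combined result to match a single-node run.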




> As far as I understood the available documentation (from before the stream API), my service gets initialized on all servers and a "primary" virtual node (the active one) handles the calls. If my primary node dies, the checkpoint and journal are used on the fallback of the virtual node to restore state.

Correct.



> And now for the stream API: how do I get started here? Is it still just a @Service annotation? Will my service be initialized and available on *all* virtual nodes (with a "primary" virtual node still there for "normal" calls, although those would not make much sense here)?
>
> I would like to create a service that runs on all virtual nodes, where each node holds a subset of the data (partitioned by a key), so I can combine the data at the node level and then in total with such a stream call. For this to work, the service would need to be aware of its current location and the whole setup (for example, "you were initialized on node 1 of 10") so it knows which partition it is responsible for. How do I get this meta-information during initialization?

I think we need to add an API to add more meta-data to the service.

The basic model is like REST, where each instance is owned by a node. So /my-service/1 belongs to node-3, /my-service/100 belongs to node-1, and so on. (It's hashed by the URL, so the distribution is random.)

Your service can create those instances with the @OnLookup method on the /my-service instance. I'll need to create an example to show that. They can be facades if you like; they don't need to hold state other than the key.
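
Until a real example exists, the shape of the idea can be sketched in plain Java. This is an analogy only, not Baratine code: `LookupFacadeSketch`, `Item`, and the `onLookup(String)` signature here are assumptions made for illustration, and the actual @OnLookup contract may differ. The parent maps a sub-path to a child facade that holds nothing but its key.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java analogy; the names and the onLookup signature are hypothetical.
class LookupFacadeSketch {
  // A child facade: it holds only its key, no other state.
  static final class Item {
    final String key;

    Item(String key) {
      this.key = key;
    }

    String address() {
      return "/my-service/" + key;
    }
  }

  private final Map<String, Item> children = new HashMap<>();

  // Plays the role described for the @OnLookup method: resolve a sub-path
  // such as "/1" to the child instance for that key, creating it on demand.
  Item onLookup(String path) {
    String key = path.startsWith("/") ? path.substring(1) : path;
    return children.computeIfAbsent(key, Item::new);
  }

  public static void main(String[] args) {
    LookupFacadeSketch parent = new LookupFacadeSketch();
    System.out.println(parent.onLookup("/1").address());
  }
}
```

Repeated lookups of the same path return the same facade in this sketch, which matches the idea that each child address has a single owner.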

/my-service itself is where you'd do the map-reduce. Presumably your /my-service knows how to search all the instances.

It looks like we'll probably need to update the configuration for /my-service itself. In the case of the @ResourceService and ResourceManager, it's treated as something of a special case, which it probably shouldn't be.

The two (3?) things we need to add are:
  1. Services.getCurrentNode() - to find out what node the service is on.
  2. Some configuration for /my-service to let Baratine know to deploy it on all nodes.
  3. Probably restore the ServiceRef.node(15) and maybe add ServiceManager.node("/foo")

The second one, node("/foo"), is needed because Baratine uses a consistent hash based on the URL to make sure the data is properly local. So a database key "/my-service/1" is on the same node as the "/my-service/1" instance itself.
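
For readers unfamiliar with consistent hashing, here is a generic hash-ring sketch in plain Java. It is not Baratine's implementation, whose hash function and node layout are not described in this thread; `ConsistentHashSketch`, the 16 points per node, and the mixing function are all assumptions. It shows the two properties that matter here: the same URL string always resolves to the same node, so a database key and a service address that share a URL are co-located, and removing a node only moves the keys that node owned.

```java
import java.util.Map;
import java.util.TreeMap;

// Generic consistent-hash ring; an illustration, not Baratine's algorithm.
class ConsistentHashSketch {
  private static final int POINTS_PER_NODE = 16; // assumed value
  private final TreeMap<Integer, String> ring = new TreeMap<>();

  // Mixes String.hashCode() so that similar URLs spread around the ring.
  static int hash(String s) {
    int h = s.hashCode();
    h ^= (h >>> 16);
    h *= 0x45d9f3b;
    h ^= (h >>> 16);
    return h;
  }

  void addNode(String node) {
    for (int i = 0; i < POINTS_PER_NODE; i++) {
      ring.put(hash(node + "#" + i), node);
    }
  }

  void removeNode(String node) {
    ring.values().removeIf(node::equals);
  }

  // The owner is the first ring point at or after the key's hash, wrapping around.
  String ownerOf(String url) {
    if (ring.isEmpty()) {
      throw new IllegalStateException("no nodes");
    }
    Map.Entry<Integer, String> e = ring.ceilingEntry(hash(url));
    return (e != null ? e : ring.firstEntry()).getValue();
  }

  // Checks that removing node-2 leaves every key owned by another node in place.
  static boolean demo() {
    ConsistentHashSketch pod = new ConsistentHashSketch();
    for (int n = 0; n < 3; n++) {
      pod.addNode("node-" + n);
    }
    String[] before = new String[100];
    for (int i = 0; i < before.length; i++) {
      before[i] = pod.ownerOf("/my-service/" + i);
    }
    pod.removeNode("node-2");
    for (int i = 0; i < before.length; i++) {
      String after = pod.ownerOf("/my-service/" + i);
      if (after.equals("node-2")) return false;
      if (!before[i].equals("node-2") && !before[i].equals(after)) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println("stable after removal: " + demo());
  }
}
```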

-- Scott

thomas....@memenga.net

Mar 19, 2015, 3:38:56 PM
to barat...@googlegroups.com
I cannot find this functionality in the current release, 0.8.7:

>   2. Some configuration for /my-service to let Baratine know to
> deploy it on all nodes.

Has this already been implemented in 0.8.7? I cannot see such a feature.

For now I can see that my stream call ends up on just one node (using a triad pod). I can see the "onInit()" calls happening multiple times, once for each node index, but the call only queries the primary node.

I am struggling with your example of using lookup to create multiple instances, because a lookup on _self results in something like "'local:///my-service/1' is an unknown service in AmpManager[podapp:pods/mypod.4]".

Can you give me any more advice on how to get this to work? Should I use the latest source code from git and build it?

The documentation says that stream calls issued on pods are automatically map/reduce calls, but I cannot get this to work.



Scott Ferguson

Mar 20, 2015, 6:20:46 PM
to barat...@googlegroups.com
It should happen automatically (we're releasing 0.8.8 in a few days, so
this might only apply to 0.8.8.)

All the nodes should have an instance of the service, although only the
"owner" node will be started. A map/reduce call will be sent to each
instance.

Note: this only applies to the StreamBuilder return type in the proxy,
and only for a proxy to a "pod://" URL. The StreamBuilder is responsible
for sending the multiple messages.

-- Scott