DistributedPubSubExtension and the Distributed Workers tutorial

Peter Wolf

Nov 4, 2013, 3:14:38 PM
to akka...@googlegroups.com
Hello, this is probably a question for Typesafe.

I am learning about Remote Actors via the Akka "Distributed Workers" tutorial, and I have a question about DistributedPubSubExtension.  Hopefully, the answer can be added to the tutorial, and will help others learning this wonderful tool.

How do the Mediators on each machine in a cluster find each other?  I can find no description in the documentation or the tutorial.  The tutorial simply says

    "The master actor is made available for both front end and workers by
    registering itself in the DistributedPubSubMediator."

However, there is no description of the mechanism or how it is configured.

The documentation

    http://doc.akka.io/docs/akka/2.2.0/contrib/distributed-pub-sub.html

also does not seem to contain a description of the configuration.

Finally, the application.conf file in the source code only contains this

    akka {
      actor.provider = "akka.cluster.ClusterActorRefProvider"
      remote.netty.tcp.port=0
      extensions = ["akka.contrib.pattern.ClusterReceptionistExtension"]
    }

So, how does it work?  How do the Mediators know where to look?  How do I use this framework to set up Actors that run on a bunch of machines and find each other automatically?

Many thanks
Peter

Björn Antonsson

Nov 5, 2013, 3:20:45 AM
to Akka User List
Hi Peter,

On Monday, 4 November 2013 at 21:14, Peter Wolf wrote:
> The documentation
>
>     http://doc.akka.io/docs/akka/2.2.0/contrib/distributed-pub-sub.html
>
> also does not seem to contain a description of the configuration.

The page you link to contains a description of how it works as well as samples and configuration. What do you think is missing?

> Finally, the application.conf file in the source code only contains this
> [...]
> So, how does it work?  How do the Mediators know where to look?  How do I use this framework to set up Actors that run on a bunch of machines and find each other automatically?


The mediator uses the Cluster to discover member nodes and then disseminates the information to the participating nodes. If you really want to know how it's implemented, then look at the source code.

That configuration in the sample project is common for all the cluster nodes and tells them to load the ClusterReceptionistExtension that provides the cluster client functionality.

What do you mean by finding each other automatically? The mediator is used for publish/subscribe to paths and topics. Actors have to subscribe (register themselves under a path or topic) and other actors have to publish messages to those paths/topics.
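
For readers following along, here is a minimal configuration sketch of the point above. It assumes Akka 2.2/2.3 with akka-contrib on the classpath. Nothing in it points the mediator at other machines, because the mediator relies on ordinary cluster membership. Listing DistributedPubSubExtension is an optional addition that is not in the sample project; it only makes the mediator start with the actor system instead of on first use.

```hocon
akka {
  actor.provider = "akka.cluster.ClusterActorRefProvider"
  remote.netty.tcp.port = 0

  # Assumption: akka-contrib 2.2/2.3 class names.
  # The first entry is optional and not part of the sample's application.conf.
  extensions = [
    "akka.contrib.pattern.DistributedPubSubExtension",
    "akka.contrib.pattern.ClusterReceptionistExtension"
  ]
}
```

Which nodes the mediators end up knowing about is decided by cluster membership, not by this file.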

B/

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://akka.io/faq/
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/groups/opt_out.

-- 
Björn Antonsson
Typesafe – Reactive Apps on the JVM
twitter: @bantonsson

Peter Wolf

Nov 5, 2013, 7:29:18 AM
to akka...@googlegroups.com
Hi Bjorn, thanks for the quick answer

Maybe I don't understand how Clusters work...

Let's say I have several machines distributed across the Internet.  How do they find each other?

I expect, somewhere, I have to give the IP address of at least one machine.

Am I misunderstanding something?

Many thanks
Peter

Björn Antonsson

Nov 5, 2013, 7:36:51 AM
to akka...@googlegroups.com
Hi Peter,

On Tuesday, 5 November 2013 at 13:29, Peter Wolf wrote:
> Let's say I have several machines distributed across the Internet.  How do they find each other?
>
> I expect, somewhere, I have to give the IP address of at least one machine.

An actor system that wants to participate in the cluster needs to join it. How that is accomplished is described here for Scala or here for Java.

In the Activator example the nodes explicitly join the cluster in the main or test classes.
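
As a sketch of the configuration-based alternative to joining explicitly, a node can list seed nodes, which is where "the IP address of at least one machine" goes. This assumes Akka 2.2-style `akka.tcp` addresses. The system name `ClusterSystem`, the host addresses, and the ports are made-up placeholders, and the system name in each address has to match the name given to the ActorSystem.

```hocon
akka {
  actor.provider = "akka.cluster.ClusterActorRefProvider"

  remote.netty.tcp {
    hostname = "10.0.0.1"   # placeholder: this node's reachable address
    port = 2551             # placeholder: fixed port so other nodes can contact it
  }

  # Placeholder addresses: nodes contact these on startup in order to join.
  cluster.seed-nodes = [
    "akka.tcp://ClusterSystem@10.0.0.1:2551",
    "akka.tcp://ClusterSystem@10.0.0.2:2551"
  ]
}
```

Only the seed nodes need well-known addresses; other members can keep `port = 0` as in the sample and still join through them.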

B/

Peter Wolf

Nov 5, 2013, 8:16:42 AM
to akka...@googlegroups.com
Excellent.  I get it.  Thanks Bjorn

You might want to add a paragraph explaining this to your Tutorial

Remote Actors is the coolest part of your technology.  However, there seem to be several different mechanisms for hooking them up (e.g. Microkernel, config file, DistributedPubSubExtension).  It is hard for a learner to separate them and get going.  Your Activator tutorials are a huge help.

Thanks again
P


√iktor Ҡlang

Nov 5, 2013, 8:28:45 AM
to Akka User List
The Microkernel is how you deploy an Akka application; this is unrelated to remoting.
The DistributedPubSubExtension does have documentation, and I guess if you haven't read the cluster documentation before you read the DPSE docs then you're starting in a suboptimal order.

If you have any concrete advice on how the documentation could be made easier to consume, we're all ears!

--
Cheers,

Viktor Klang

Director of Engineering

Twitter: @viktorklang

Peter Wolf

Nov 5, 2013, 9:21:51 AM
to akka...@googlegroups.com
Just my experience...

I have been trying to learn Remote Actors for a month.  Scala and Actors were clearly the right tool for my job, so I bought a bunch of books, read the tutorials, and hunted on StackOverflow (and here).

Writing an Actor application was very easy.  I had that working almost immediately.  However, there seemed to be many answers to "how to make Actors remote".

My book "Akka Concurrency" said "just change the config file" like this

    akka {
      actor {
        provider = "akka.remote.RemoteActorRefProvider"
      }
      remote {
        transport = "akka.remote.netty.NettyRemoteTransport"
        netty {
          hostname = "127.0.0.1"
        }
      }
    }
 
But that seemed completely different from the "Distributed Workers" tutorial.

Also, the documentation in the "Distributed Workers" tutorial spends a lot of time explaining the complicated sample code (e.g. FrontEnd, WorkExecutor etc.) but very little time explaining how the remote Actors are hooked up.

So the problem for learners is finding a thread to pull on: how to start to understand writing one's own Remote Actors.

My concrete advice is to simplify the example to only have two trivial Remote Actors, and spend time explaining how they find each other and communicate.

But thank you for the wonderful tools.  I absolutely plan on moving my Torque and MapReduce work to Scala/Actor.  It is clearly a better solution!

Peter

√iktor Ҡlang

Nov 5, 2013, 9:41:35 AM
to Akka User List
Did you read the remote actors documentation?

Cheers,


Peter Wolf

Nov 5, 2013, 11:37:22 AM
to akka...@googlegroups.com
Hi Viktor, I love your stuff!

Yes, I did read the documentation.  The problem is that, between the Internet and books, there is too much documentation, some of it obsolete, and it's hard to know where to start.

I am going to stick with your excellent Activator tutorials from now on, and am just offering my feedback as a new learner.  

In my case, the information I needed was in documentation not linked to the "Distributed Workers" tutorial


The tutorial is an excellent lesson in robust Actor programming.  I will definitely return to it when I am ready.  However, it does not explain how one sets up a cluster of machines to run Remote Actors.

So my feedback is to create a Remote Actor tutorial with much less code, and more explanation/links about the Cluster framework.  I think something like the old "Ping Pong" would be good for this.

Hope this is helpful.
P

Roland Kuhn

Nov 5, 2013, 11:59:33 AM
to akka...@googlegroups.com
Hi Peter,

It so happens that we recently filed a ticket to convert the old samples into Activator templates.

Regards,

Roland

Dr. Roland Kuhn
Akka Tech Lead
Typesafe – Reactive apps on the JVM.
twitter: @rolandkuhn


remigius...@descom-consulting.ch

Sep 11, 2014, 3:28:18 AM
to akka...@googlegroups.com
Hi Viktor,

Although this thread seems to have been inactive for a while, I'll allow myself to post a question/suggestion here.

My use case: I'm looking for a pub-sub mechanism that will allow distributed data updates on clients that belong to the same organization. There might be hundreds or more organizations (hopefully), each having one to about five people working simultaneously. A change made by one member should trigger an update for all members working for the same company. This would mean I have a large number of channels (or topics), each having only a few (say 1-50) subscribers. Still, there may be significant traffic, and the mediator might become a bottleneck. Typically, one starts out with a small-scale application, but would like to be prepared for its growth. Knowing how it can grow helps you avoid getting tied down.

In the last section of:


is written:

"...it can be good to know that it is possible to start the mediator actor as an ordinary actor and you can have several different mediators at the same time to be able to divide a large number of actors/topics to different mediators. For example you might want to use different cluster roles for different mediators."

Maybe it is obvious to someone familiar with Akka how this can be achieved, but for me, just evaluating its use, a sample for multiple mediators might be very helpful.

Cheers, R.

√iktor Ҡlang

Sep 11, 2014, 9:07:07 AM
to Akka User List
Hi R,

Good questions!
There are so many samples that we wish we'd have time to create and maintain, but alas, we also need to work on Akka itself.
What Typesafe can help you with is consulting services to get your PoC up and running, let me know if that sounds worth exploring.

Happy hAkking!



--
Cheers,

Remigius Stalder

Sep 12, 2014, 5:46:11 AM
to akka...@googlegroups.com
Hi √iktor,

I did not mean a full activator sample, but I think a few more words
in the documentation on *how* to achieve getting multiple mediators to
work for a large set of topics and/or subscribers might be helpful
(instead of just mentioning that it is possible). Is some sort of
sharding necessary? Can multiple concurrent mediators mediate
publishers/subscribers to the same topic or can a topic only be
subscribed via the same mediator it was published with? etc. As far as
I understand, multiple roles will create disjoint sets of cluster
members, which will break the symmetry when subscribing to one or more
topics having the same characteristics.

Cheers, R.

Patrik Nordwall

Sep 12, 2014, 8:41:34 AM
to akka...@googlegroups.com
On Fri, Sep 12, 2014 at 11:46 AM, Remigius Stalder <remigius...@descom-consulting.ch> wrote:
> I did not mean a full activator sample, but I think a few more words
> in the documentation on *how* to achieve getting multiple mediators to
> work for a large set of topics and/or subscribers might be helpful
> (instead of just mentioning that it is possible).

I think that is a fair point. I have added a note to a related issue.

 
> Is some sort of
> sharding necessary? Can multiple concurrent mediators mediate
> publishers/subscribers to the same topic or can a topic only be
> subscribed via the same mediator it was published with?

Yes, the mediators will be completely decoupled: a message published via one mediator only reaches subscribers registered with that same mediator.
 
> etc. As far as
> I understand, multiple roles will create disjoint sets of cluster
> members, which will break the symmetry when subscribing to one or more
> topics having the same characteristics.

I'm not sure I understand the concern, but a member may have several roles, i.e. not disjoint.
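
To illustrate the point about roles, here is a configuration sketch assuming the Akka 2.2/2.3 cluster and akka-contrib settings. The role names `orders` and `billing` are hypothetical. A member lists all of its roles, so role sets can overlap instead of partitioning the cluster.

```hocon
# This node carries two roles; another node might list only ["billing"].
akka.cluster.roles = ["orders", "billing"]

# Restricts the extension-started mediator to members tagged with this role.
# An empty string (the default) means all cluster members are used.
akka.contrib.cluster.pub-sub.role = "billing"
```

This setting covers only the mediator started by the extension. Additional mediators for other roles would be started as ordinary actors, as the quoted documentation passage describes.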

/Patrik



--

Patrik Nordwall
Typesafe Reactive apps on the JVM
Twitter: @patriknw
