Akka Cluster Pub/Sub performance for chat-room like applications


Alexander Lukyanchikov

Jul 13, 2017, 7:08:19 AM
to Akka User List
Hi, we are building a message processing system that basically looks like a classic chat room:

- ~1 million devices are connected via WebSockets to a dozen nodes
- each of them subscribes to a number of topics
- each of them publishes updates to the topics, and we should deliver these updates to the subscribed devices

The problem is that the number of topics is huge, roughly 100 times the number of devices.

Right now we use Redis to store the device-to-subscription and device-to-node relations.
On each topic update, we look in the cache to find all devices to notify and the addresses of the nodes where those devices are currently connected.
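
For illustration, the lookup described above could be sketched roughly as follows. Plain in-memory maps stand in for Redis, the topic-to-devices index is an assumed reverse view of the device-to-subscription relation, and every class and method name here is hypothetical rather than taken from the actual system.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for the Redis-backed lookup; names are illustrative only.
final class SubscriptionIndex {
    // topic -> devices subscribed to it (assumed reverse index of device-to-subscription)
    private final Map<String, Set<String>> devicesByTopic = new HashMap<>();
    // device -> address of the node where its websocket is currently connected
    private final Map<String, String> nodeByDevice = new HashMap<>();

    void connect(String device, String node) {
        nodeByDevice.put(device, node);
    }

    void subscribe(String device, String topic) {
        devicesByTopic.computeIfAbsent(topic, t -> new TreeSet<>()).add(device);
    }

    // For one topic update: the devices to notify, grouped by the node they are on.
    Map<String, Set<String>> targetsFor(String topic) {
        Map<String, Set<String>> byNode = new TreeMap<>();
        for (String device : devicesByTopic.getOrDefault(topic, Set.of())) {
            String node = nodeByDevice.get(device);
            if (node != null) { // skip devices that are not connected right now
                byNode.computeIfAbsent(node, n -> new TreeSet<>()).add(device);
            }
        }
        return byNode;
    }

    public static void main(String[] args) {
        SubscriptionIndex index = new SubscriptionIndex();
        index.connect("device-1", "node-a");
        index.connect("device-2", "node-b");
        index.connect("device-3", "node-a");
        index.subscribe("device-1", "topic-42");
        index.subscribe("device-2", "topic-42");
        index.subscribe("device-3", "topic-42");
        System.out.println(index.targetsFor("topic-42"));
    }
}
```

Grouping the result by node means each node only needs one notification per topic update, which it can then fan out to its own connections.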

We are not using Akka Cluster yet, but it looks like its Pub/Sub functionality matches our case perfectly, and we could get rid of our own logic and cache.

The only question: is it capable of managing tens of millions of topics? Would it perform better than our current solution?

MJ

Jul 13, 2017, 7:15:28 AM
to Akka User List
Can you let us know the tech stack?

Alexander Lukyanchikov

Jul 13, 2017, 8:54:22 AM
to Akka User List
Sure. It is Java + Play + Akka, running in an AWS environment on c4.xlarge machines (4 CPUs, 8 GB RAM, 750 Mbps bandwidth).

Justin du coeur

Jul 13, 2017, 8:55:46 AM
to akka...@googlegroups.com
Question: how many subscribers does a "topic" typically have?  Also, what are your reliability requirements?


Justin du coeur

Jul 13, 2017, 8:56:52 AM
to akka...@googlegroups.com
(I should note: I don't use Akka Pub/Sub myself, but I'm wondering whether Cluster Sharding actually fits your use case well.  Depending on the details, it might.)

On Thu, Jul 13, 2017 at 8:55 AM, Justin du coeur <jduc...@gmail.com> wrote:
Question: how many subscribers does a "topic" typically have?  Also, what are your reliability requirements?

johannes...@lightbend.com

Jul 13, 2017, 9:02:24 AM
to Akka User List
On Thursday, July 13, 2017 at 1:08:19 PM UTC+2, Alexander Lukyanchikov wrote:
The only question: is it capable of managing tens of millions of topics? Would it perform better than our current solution?

No, most likely it currently won't scale up to 1 million active topics. In Akka's pubsub, each node keeps a topic actor that manages subscriptions of local actors to this topic. Then the information about which node is interested in which topics is replicated across the whole cluster.
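
To make the bookkeeping described above concrete, here is a toy model that only counts state entries. It is not Akka code and makes no claim about how Akka stores this data internally; the class, its methods, and the assumption that every node holds a full copy of the node-to-topics registry are simplifications for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the bookkeeping described above; this is not Akka code.
final class PubSubStateModel {
    // node -> topics with at least one local subscriber (one "topic actor" each)
    private final Map<String, Set<String>> localTopicsByNode = new HashMap<>();

    void subscribe(String node, String topic) {
        localTopicsByNode.computeIfAbsent(node, n -> new HashSet<>()).add(topic);
    }

    // Number of per-node topic entries, summed over all nodes.
    long topicActors() {
        long total = 0;
        for (Set<String> topics : localTopicsByNode.values()) {
            total += topics.size();
        }
        return total;
    }

    // Assumption: every node keeps a full copy of the node -> topics registry,
    // so the cluster as a whole stores (entries) x (cluster size) registry entries.
    long replicatedRegistryEntries(int clusterSize) {
        return topicActors() * clusterSize;
    }

    public static void main(String[] args) {
        PubSubStateModel model = new PubSubStateModel();
        model.subscribe("node-a", "topic-1");
        model.subscribe("node-a", "topic-2");
        model.subscribe("node-b", "topic-1");
        System.out.println("topic actors: " + model.topicActors());
        System.out.println("registry entries (12 nodes): " + model.replicatedRegistryEntries(12));
    }
}
```

Under this toy model, 10 million topics that each have subscribers on just one node of a 12-node cluster would already mean 10 million topic entries and 120 million replicated registry entries. The real per-entry memory cost is not estimated here.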

We plan to test the actual memory consumption under realistic conditions, but we haven't gotten to that so far.

I created a ticket to estimate the memory usage and to collect ideas about how to optimize for bigger workloads:


Johannes

johannes...@lightbend.com

Jul 13, 2017, 9:15:20 AM
to Akka User List
On Thursday, July 13, 2017 at 2:56:52 PM UTC+2, Justin du coeur wrote:
(I should note: I don't use Akka Pub/Sub myself, but I'm wondering whether Cluster Sharding actually fits your use case well.  Depending on the details, it might.)

Yep, I guess that's true. With cluster sharding, each topic would be managed on a single node. If that node goes down, you either lose your subscriptions or, if you have them persisted, another node will pick them up after a while. Each message travels from the node where it is ingested to the node with the topic actor, and from there to all the nodes that manage the external connections (like WS). Unless you do extra work to deduplicate that traffic, you will internally send each message once per external connection, so the same message can reach a node multiple times. If a topic is busy, the single topic actor might become a bottleneck.

With PubSub, when a node goes down, all subscriptions that were managed at that node are gone. In the worst case, each message is broadcast to the topic actor on every other node and from there delivered locally to the subscribers.

Johannes
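
The fan-out trade-off in the comparison above can be sketched with a small counting model. This is not Akka code: the class and method names are hypothetical, and it only counts the copies of one message that are sent onward, ignoring the initial hop from the ingesting node to the topic actor.

```java
import java.util.HashSet;
import java.util.List;

// Toy copy counting for the two designs compared above; not Akka code.
final class FanOutModel {
    // Sharding-style without deduplication: the topic actor sends one copy
    // per subscribed external connection. Each list element is the node
    // that holds one subscribed connection.
    static int sendsPerConnection(List<String> subscriberNodes) {
        return subscriberNodes.size();
    }

    // Sharding-style with deduplication: one copy per distinct node,
    // which then fans out locally to its own connections.
    static int sendsPerNode(List<String> subscriberNodes) {
        return new HashSet<>(subscriberNodes).size();
    }

    // Pub/sub worst case as described in the thread: one copy to every other node.
    static int pubSubWorstCase(int clusterSize) {
        return clusterSize - 1;
    }

    public static void main(String[] args) {
        // Six subscribed connections spread over three nodes.
        List<String> nodes = List.of("n1", "n1", "n2", "n3", "n3", "n3");
        System.out.println("per connection: " + sendsPerConnection(nodes));
        System.out.println("per node:       " + sendsPerNode(nodes));
        System.out.println("pub/sub worst:  " + pubSubWorstCase(12));
    }
}
```

The gap between the first two numbers grows with the number of connections per node, which is why the answer to the earlier question about subscribers per topic matters for choosing between the designs.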

Alexander Lukyanchikov

Jul 14, 2017, 3:17:13 AM
to Akka User List
Thank you! I've subscribed there to follow further discussions.