Hi,
We are investigating Akka Distributed Data for storing a (Long -> Long) mapping in an LWWMap.
We plan for up to ~1 million entries and expect a write load of a few hundred to a few thousand entries per day; in contrast, we expect a read load of a few hundred requests per second. Therefore, we plan to use readLocal together with writeMajority.
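To make this concrete, here is a rough sketch of how we would talk to the replicator (assuming a recent Akka 2.5/2.6 with the classic Replicator API; the key name, the AddMapping/Lookup messages and the timeout are just placeholders):

```scala
import akka.actor.{Actor, ActorRef}
import akka.cluster.ddata.{DistributedData, LWWMap, LWWMapKey, SelfUniqueAddress}
import akka.cluster.ddata.Replicator._
import scala.concurrent.duration._

// placeholder messages for this sketch
final case class AddMapping(engineId: Long, carId: Long)
final case class Lookup(engineId: Long)

class MappingCache extends Actor {
  private implicit val node: SelfUniqueAddress =
    DistributedData(context.system).selfUniqueAddress
  private val replicator = DistributedData(context.system).replicator

  // one top-level key for the engineId -> carId map (placeholder name)
  private val EngineKey = LWWMapKey[Long, Long]("engine-to-car")
  private val writeMajority = WriteMajority(timeout = 5.seconds)

  def receive = {
    case AddMapping(engineId, carId) =>
      // rare writes go through WriteMajority
      replicator ! Update(EngineKey, LWWMap.empty[Long, Long], writeMajority)(
        existing => existing :+ (engineId -> carId))

    case Lookup(engineId) =>
      // frequent reads are answered from the local replica
      replicator ! Get(EngineKey, ReadLocal, request = Some((engineId, sender())))

    case g @ GetSuccess(EngineKey, Some((engineId: Long, replyTo: ActorRef))) =>
      replyTo ! g.get(EngineKey).get(engineId)

    case NotFound(EngineKey, Some((engineId: Long, replyTo: ActorRef))) =>
      replyTo ! None
  }
}
```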
Roughly calculating the size of the map, we should end up with some 20-30 MB, which feels okay-ish as long as delta-CRDT replication does its job.
While testing with only a few thousand entries, we already see huge propagation delays from one node to another, in the order of tens of seconds (both nodes running on the same PC). This raises the question of whether Akka Distributed Data is really a valid option for our use case (I will explain it in a bit more detail below). What concerns us most is what happens when we need to do a full cluster restart and this map fills up again at a rate of roughly 100 entries per second. We currently do not feel confident that this can work.
I already asked in the Akka-User chat and got the feedback that Artery could help a bit with the head-of-line blocking (we see the phi accrual failure detector reporting higher delays). We tried that and, IIRC, got even slower updates. The other usual suspect is to manually shard the data into multiple top-level maps (see the sketch below), which we could go for, but we would still end up with something like 100 maps of 10k entries each.
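For reference, the manual split into multiple top-level keys would look roughly like this (shard count and key naming are placeholders):

```scala
import akka.cluster.ddata.LWWMapKey

object EngineKeys {
  // split the single engineId -> carId map into N smaller top-level maps,
  // so that each replicated (and gossiped) value stays small
  val NumberOfTopLevelMaps = 100

  def engineKeyFor(engineId: Long): LWWMapKey[Long, Long] =
    LWWMapKey[Long, Long]("engine-to-car-" + math.abs(engineId % NumberOfTopLevelMaps))
}

// every Update/Get above would then use EngineKeys.engineKeyFor(engineId)
// instead of the single EngineKey
```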
Our use case is as follows; I will describe it using the example of a CarActor, which is addressed by its ID via Cluster Sharding (so that we have exactly one actor per imaginary car in the whole cluster). However, we also need to send messages to that actor via 3 other identifiers, let's say the IDs of the engine, the gear, and the exhaust pipe (that these parts are occasionally replaced and even transferred to a different car reflects our scenario pretty well; reads would happen e.g. every time a car passes a toll bridge... damn, I think we could earn way more money if we actually built that very system :-)). So these three mappings (engine ID -> car ID, gear ID -> car ID, exhaust ID -> car ID) are what additionally needs to be disseminated among the nodes participating in cluster sharding. The characteristics of CRDTs sound very appealing here: an engine wouldn't end up in another car within the very next second, which leaves enough time for convergence via gossip. Keeping the whole data structure in memory feels like nothing compared to how we run our system currently (all nodes store all data and mostly successfully invalidate the other nodes' state, backed by a heap of >100 GB).
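Just to illustrate the intended read path at a toll bridge (again a sketch; the TollPassage/CarCommand messages and the shard region wiring are placeholders):

```scala
import akka.actor.{Actor, ActorRef}
import akka.cluster.ddata.{DistributedData, LWWMapKey}
import akka.cluster.ddata.Replicator._

// placeholder messages
final case class TollPassage(engineId: Long, timestamp: Long)
final case class CarCommand(carId: Long, passage: TollPassage)

// resolves engineId -> carId via the replicated map and forwards to the
// sharded CarActor region (carRegion is the ClusterSharding shard region ref)
class TollGate(carRegion: ActorRef) extends Actor {
  private val replicator = DistributedData(context.system).replicator
  private val EngineKey = LWWMapKey[Long, Long]("engine-to-car")

  def receive = {
    case p: TollPassage =>
      // ReadLocal: the lookup is answered from the local replica
      replicator ! Get(EngineKey, ReadLocal, request = Some(p))

    case g @ GetSuccess(EngineKey, Some(p: TollPassage)) =>
      g.get(EngineKey).get(p.engineId) match {
        case Some(carId) => carRegion ! CarCommand(carId, p)
        case None        => // map exists but has no entry for this engine id: drop or log
      }

    case NotFound(EngineKey, Some(p: TollPassage)) =>
      // no mapping replicated yet: drop or log
  }
}
```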
It would be nice to get your feedback here on both points: whether CRDTs will kill us one day, and whether you have other ideas for how to disseminate the mapping. The alternatives we have in mind are 1) storing the mappings in Redis, accessible by all cluster sharding nodes, and 2) having dedicated mapping actors that hold the mappings in some local data structure, either as 2a) one mapping actor per node knowing the whole state, or 2b) one actor per engine ID, addressed via cluster sharding (see the sketch below), knowing only its car ID by looking it up once in the database, which would leave us with ~3 million tiny mapping actors alive ¯\_(ツ)_/¯.
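For completeness, a rough sketch of what we mean by 2b (classic Cluster Sharding; the message type, entity Props, and shard count are placeholders):

```scala
import akka.actor.{ActorSystem, Props}
import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings, ShardRegion}

// placeholder message: anything addressed to the per-engine mapping actor
final case class EngineLookup(engineId: Long)

object EngineMappingSharding {
  val extractEntityId: ShardRegion.ExtractEntityId = {
    case msg @ EngineLookup(engineId) => (engineId.toString, msg)
  }

  val extractShardId: ShardRegion.ExtractShardId = {
    case EngineLookup(engineId) => (math.abs(engineId % 100)).toString // placeholder shard count
  }

  // one tiny entity per engine id; it would look up its carId in the database
  // once on first access and then just answer/forward
  def start(system: ActorSystem, engineEntityProps: Props) =
    ClusterSharding(system).start(
      typeName = "engine-mapping",
      entityProps = engineEntityProps,
      settings = ClusterShardingSettings(system),
      extractEntityId = extractEntityId,
      extractShardId = extractShardId)
}
```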
Thanks
Steffen