Event Stores for Akka Persistence for CQRS?

Ashley Aitken

Aug 18, 2014, 10:52:03 PM

to akka...@googlegroups.com

I'm keen to hear other people's thoughts on the choice of an event store for Akka Persistence for doing CQRS.

As mentioned in my other post, I believe that Akka Persistence only provides part of the story for CQRS (but a very important part) and that other stores will most likely be needed for query models (both SQL and NOSQL stores).

Since they are project specific I would like to focus here on what store is "best" for Akka Persistence for CQRS.

Right now my leading contenders are Kafka and Event Store (but I haven't thought too much about Cassandra or MongoDB etc). My knowledge of all of these is limited so please excuse and correct me if any of my statements are wrong.

KAFKA: Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.

Persistent topics for publishing and subscribing

Highly scalable and distributed

Need to manually create topics for projections

Each topic has own copy of events

Writing to multiple topics is not atomic

Allows logs to be kept for different amounts of time

Battle tested technology from LinkedIn

Not generally used as a lifetime store for events

http://kafka.apache.org

https://github.com/krasserm/akka-persistence-kafka/
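To make the "each topic has own copy" and "not atomic" points concrete, here is a minimal sketch in plain Python. It does not use any Kafka client; the dictionary of lists, the `publish_to_all` helper and the `failing` parameter are all made up for illustration. It only shows how a failure between two appends leaves the copies out of step.

```python
# Illustrative model only: plain Python lists stand in for Kafka topics.
class TopicWriteError(Exception):
    """Raised by the hypothetical publish helper when a topic is unavailable."""


def publish_to_all(topics, names, event, failing=()):
    """Append `event` to each named topic in turn; there is no rollback."""
    for name in names:
        if name in failing:
            raise TopicWriteError(name)
        topics[name].append(event)


topics = {"orders": [], "orders-by-customer": []}

# First event reaches both topics.
publish_to_all(topics, ["orders", "orders-by-customer"], "OrderPlaced(1)")

# Second event: the projection topic fails after the primary write succeeded.
try:
    publish_to_all(
        topics,
        ["orders", "orders-by-customer"],
        "OrderPlaced(2)",
        failing={"orders-by-customer"},
    )
except TopicWriteError:
    pass

# The two copies have now diverged and would need out-of-band repair.
```

After the second call, "orders" holds both events while "orders-by-customer" holds only the first, which is the kind of inconsistency I am worried about.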

EVENT STORE: The open-source, functional database with Complex Event Processing in JavaScript.

Built specifically for storing and projecting events

Store event once and create virtual projection streams

Journal plugin at early stage in development

Projections are still beta but finalising soon

JSON serialisation (which has +ve and -ve points)

JavaScript for projection stream specification

Atom interface helps with debugging

Not as distributed or scalable?

Includes temporal criteria for streams

http://geteventstore.com

https://github.com/EventStore/EventStore.Akka.Persistence
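As a rough model of "store event once and create virtual projection streams", the following plain-Python sketch keeps a single log and derives streams on read. It is not Event Store's API (real projections there are specified in JavaScript); `SingleLog`, `Event` and `project` are hypothetical names used only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    stream: str   # stream the event was written to, e.g. "customer-7"
    kind: str     # event type, e.g. "OrderPlaced"
    data: dict


class SingleLog:
    """Every event is stored exactly once, in arrival order."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def project(self, predicate):
        """A 'virtual' stream: computed from the log on read, not a second copy."""
        return [e for e in self._events if predicate(e)]


log = SingleLog()
log.append(Event("customer-7", "OrderPlaced", {"order": 1}))
log.append(Event("customer-9", "OrderPlaced", {"order": 2}))
log.append(Event("customer-7", "OrderShipped", {"order": 1}))

# Two different projections over the same stored events.
placed = log.project(lambda e: e.kind == "OrderPlaced")
customer_7 = log.project(lambda e: e.stream == "customer-7")
```

Because both projections read the same log, there is no second write that could fail independently.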

Personally, I like the potential Kafka has to be the event store / log for CQRS but also the store for logs in big data processing and analytics. However, the fact that events need to be manually replicated to different topics and problems that would be caused if this wasn't consistent is a worry.

On the other hand, Event Store has been specifically designed and built for event storage and projection processing by a leader in the field of CQRS. However, it uses a unique set of technologies and I am not sure whether it has been battle tested by many, or about its long-term viability.

What do others think? What are your thoughts on, and what has your experience been with, other stores?

MONGODB: ?

CASSANDRA: ?

As mentioned, I can definitely see the use of the last two for query models in addition to an event persistence and projection stream store, but have not really considered them for the latter myself.

Of course, enormous kudos and no disrespect to any of these fantastic free and open-source projects.

Thanks in advance for sharing any thoughts / experiences.

Cheers,

Ashley.

Martin Krasser

Aug 19, 2014, 2:51:42 AM

to akka...@googlegroups.com

Hi Ashley,

thanks for bringing up these questions. Here are some general comments:

As you already mentioned (in different words), akka-persistence is currently optimized around write models rather than read models (= Q in CQRS), i.e. it is optimized for fast, scalable persistence and recovery of stateful actors (= PersistentActor).

For full CQRS support, the discussions so far (in several other threads) make the assumption that both write and read models are backed by the same backend store (assuming read models are maintained by a PersistentView actor, receiving a stream of events from synthetic or physical "topics"). This is a severe limitation, IMO. As Greg already mentioned elsewhere, some read models may be best backed by a graph database, for example. Although a graph database may be good for backing certain read models, it may have limitations for fast logging of events (something Kafka or Cassandra are very good at). Consequently, it definitely makes sense to have different backend stores for write and read models.

If akka-persistence should have support for CQRS in the future, its design/API should be extended to allow different backend stores for write and read models (of course, a provider may choose to use the same backend store to serve both which may be a reasonable default). This way PersistentActors log events to one backend store and PersistentViews (or whatever consumer) generate read models from the other backend store. Data transfer between these backend stores can be implementation-specific for optimization purposes. For example

- Cassandra (for logging events) => Spark (to batch-process logged events) => Graph database XY (to store events processed with Spark), or
- Kafka (for logging events) => Spark Streaming (to stream-process logged events) => Database XY (to store events processed with Spark Streaming)
- ...

These are just two examples of how read model backend stores can be populated in a highly scalable way (both in batch and streaming mode). Assuming akka-persistence provides an additional plugin API for storage backends on the read model side (XY in the examples above), a wide range of CQRS applications could be covered with whatever scalability and/or ordering requirements the respective applications have. In case you want to read more about it, take a look at akka-analytics (it is very much work in progress as I'm waiting for Spark to upgrade to Akka 2.3 and Kafka to Scala 2.11).
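The shape of the two pipelines can be sketched without Spark, Cassandra or Kafka. The plain-Python stand-ins below are hypothetical (a list for the write-side log, an adjacency map for the graph-like read store) and only illustrate that batch and streaming population can share the same per-event fold.

```python
from collections import defaultdict

# Stand-in for the write-side log (Cassandra or Kafka in the examples above).
journal = [
    {"seq": 1, "user": "ann", "follows": "bob"},
    {"seq": 2, "user": "bob", "follows": "cat"},
    {"seq": 3, "user": "ann", "follows": "cat"},
]


def apply_event(read_store, event):
    """Fold one event into the read model (here: an adjacency map)."""
    read_store[event["user"]].add(event["follows"])


def batch_populate(events):
    """Batch mode: rebuild the whole read model from all logged events."""
    store = defaultdict(set)
    for event in events:
        apply_event(store, event)
    return store


def stream_populate(store, event_iter):
    """Streaming mode: update an existing read model as events arrive."""
    for event in event_iter:
        apply_event(store, event)
    return store


batch_store = batch_populate(journal)
stream_store = stream_populate(defaultdict(set), iter(journal))
```

Both modes end with the same read model; what differs in a real deployment is how the transfer is scheduled and scaled, which is the implementation-specific part.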

WDYT?

Cheers,
Martin

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.

-- 
Martin Krasser

blog:    http://krasserm.blogspot.com
code:    http://github.com/krasserm
twitter: http://twitter.com/mrt1nz

Patrik Nordwall

Aug 19, 2014, 4:00:29 AM

to akka...@googlegroups.com

On Tue, Aug 19, 2014 at 8:51 AM, Martin Krasser <kras...@googlemail.com> wrote:

For full CQRS support, the discussions so far (in several other threads) make the assumption that both write and read models are backed by the same backend store (assuming read models are maintained by PersistentView actor, receiving a stream of events from synthetic or physical "topics").

That is not my view of it, at all. PersistentView is a way to replicate the events to the read side, which typically will store a denormalized representation optimized for queries. That query store is typically not the same as the event store, because the requirements are very different.

Some simple read models may keep this representation in-memory but that is not what I see as the most common case.
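A minimal sketch of what "a denormalized representation optimized for queries" can look like, in plain Python rather than Akka. `OrderSummaryView` and the event tuples are hypothetical; the dictionary stands in for whatever query store is actually used (it is the in-memory case here only to keep the sketch short).

```python
class OrderSummaryView:
    """Hypothetical read-side consumer keeping one row per customer."""

    def __init__(self):
        self.rows = {}  # customer -> {"orders": int, "total": float}

    def on_event(self, event):
        """Apply one replicated event to the denormalized rows."""
        kind, customer, amount = event
        row = self.rows.setdefault(customer, {"orders": 0, "total": 0.0})
        if kind == "OrderPlaced":
            row["orders"] += 1
            row["total"] += amount
        elif kind == "OrderCancelled":
            row["orders"] -= 1
            row["total"] -= amount

    def query(self, customer):
        """Answer a query directly from the precomputed row."""
        return self.rows.get(customer)


view = OrderSummaryView()
for e in [
    ("OrderPlaced", "ann", 20.0),
    ("OrderPlaced", "ann", 5.0),
    ("OrderCancelled", "ann", 5.0),
    ("OrderPlaced", "bob", 7.5),
]:
    view.on_event(e)
```

A query such as `view.query("ann")` reads one precomputed row instead of replaying events, which is why the requirements on this store differ from those on the event store.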

/Patrik

--

Patrik Nordwall
Typesafe - Reactive apps on the JVM
Twitter: @patriknw

Martin Krasser

Aug 19, 2014, 4:53:46 AM

to akka...@googlegroups.com

On 19.08.14 10:00, Patrik Nordwall wrote:

That is not my view of it, at all. PersistentView is a way to replicate the events to the read side, which typically will store a denormalized representation optimized for queries. That query store is typically not the same as the event store, because the requirements are very different.

I agree, but recent discussions were about how to join events from several topics/streams that a PersistentView receives (e.g. all events of an aggregate type or based on a user-defined join/query). Stores (journal) that are optimized for high write-throughput are not necessarily the best choice for serving these joins/queries in an efficient way. Furthermore, why should I maintain a read model datastore via

journal -> akka actor(s) -> read model datastore

when I can do this much more efficiently via

journal -> spark -> read model datastore

directly, for example?

ahjohannessen

Aug 19, 2014, 5:07:58 AM

to akka...@googlegroups.com

On Tuesday, August 19, 2014 9:53:46 AM UTC+1, Martin Krasser wrote:

I agree, but recent discussions were about how to join events from several topics/streams that a PersistentView receives (e.g. all events of an aggregate type or based on a user-defined join/query)...

I think the most realistic approach is to limit join of events to persistent actors with same "topic" (e.g. type) and not arbitrary combinations of several topics/streams, because that can be done much better with a read model datastore.

Ashley Aitken

Aug 19, 2014, 5:16:00 AM

to akka...@googlegroups.com

On Tuesday, 19 August 2014 14:51:42 UTC+8, Martin Krasser wrote:

For full CQRS support, the discussions so far (in several other threads) make the assumption that both write and read models are backed by the same backend store (assuming read models are maintained by PersistentView actor, receiving a stream of events from synthetic or physical "topics"). This is a severe limitation, IMO. As Greg already mentioned elsewhere, some read models may be best backed by a graph database, for example. Although a graph database may be good for backing certain read models, it may have limitations for fast logging of events (something where Kafka or Cassandra are very good at). Consequently, it definitely makes sense to have different backend stores for write and read models.

Yes, I agree. This is mentioned in my (long and poorly formatted) post: https://groups.google.com/d/msg/akka-user/SL5vEVW7aTo/ybeJKoayd_8J

If akka-persistence should have support for CQRS in the future, its design/API should be extended to allow different backend stores for write and read models (of course, a provider may choose to use the same backend store to serve both which may be a reasonable default). This way PersistentActors log events to one backend store and PersistentViews (or whatever consumer) generate read models from the other backend store. Data transfer between these backend stores can be implementation-specific for optimization purposes.

I personally cannot see why Akka Persistence has to extend that far. I think it may be able to stop at reliable (at-least-once) delivery to another actor connecting to a query store on the read side. I think it may only need to cover [1] and [2] in this diagram:

<https://www.dropbox.com/s/z2iu0xi4ki42sl7/annotated_cqrs_architecture.jpg>

without forgetting sagas ;-)

BTW, can a PersistentView do AtLeastOnceDelivery? I don't think so as ALOD seems to need a PersistentActor to maintain its delivery state. But then how can a PersistentView reliably deliver events to an actor representing a query store?

It seems one needs a PersistentView that can read from a real (or synthetic) persistent event stream but also have its own persistence journal to maintain its delivery state. Is this possible with some mixture of extend/mixins?
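To illustrate why I think the delivery state needs its own persistence, here is a small plain-Python model. It is not Akka's AtLeastOnceDelivery API; `DeliveryLog`, `Deliverer` and their methods are hypothetical names. The point is only that unconfirmed deliveries can be re-sent after a restart because "sent" and "confirmed" were themselves logged.

```python
class DeliveryLog:
    """Stand-in for the deliverer's own journal (survives restarts)."""

    def __init__(self):
        self.entries = []  # ("sent", id, payload) or ("confirmed", id)


class Deliverer:
    def __init__(self, log, send):
        self.log = log
        self.send = send          # callable(delivery_id, payload)
        self.unconfirmed = {}
        self.next_id = 1
        # Recovery: rebuild the delivery state from the log.
        for entry in log.entries:
            if entry[0] == "sent":
                self.unconfirmed[entry[1]] = entry[2]
                self.next_id = max(self.next_id, entry[1] + 1)
            else:
                self.unconfirmed.pop(entry[1], None)

    def deliver(self, payload):
        """Log the attempt first, then send."""
        delivery_id = self.next_id
        self.next_id += 1
        self.log.entries.append(("sent", delivery_id, payload))
        self.unconfirmed[delivery_id] = payload
        self.send(delivery_id, payload)
        return delivery_id

    def confirm(self, delivery_id):
        """Log the confirmation so it is not re-sent after a restart."""
        self.log.entries.append(("confirmed", delivery_id))
        self.unconfirmed.pop(delivery_id, None)

    def redeliver(self):
        """Re-send everything that has not been confirmed yet."""
        for delivery_id, payload in sorted(self.unconfirmed.items()):
            self.send(delivery_id, payload)


received = []
log = DeliveryLog()
first = Deliverer(log, lambda i, p: received.append((i, p)))
a = first.deliver("E1")
b = first.deliver("E2")
first.confirm(a)

# Simulated crash and restart: a new instance recovers from the same log.
second = Deliverer(log, lambda i, p: received.append((i, p)))
second.redeliver()
```

After the restart only "E2" is sent again, so the receiver sees it twice (at least once, not exactly once) and has to tolerate duplicates.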

For example

- Cassandra (for logging events) => Spark (to batch-process logged events) => Graph database XY (to store events processed with Spark), or
- Kafka (for logging events) => Spark Streaming (to stream-process logged events) => Database XY (to store events processed with Spark Streaming)
- ...

These are just two examples how read model backend stores can be populated in a highly scalable way (both in batch and streaming mode). Assuming akka-persistence provides an additional plugin API for storage backends on the read model side (XY in the examples above) a wide range of CQRS applications could be covered with whatever scalability and/or ordering requirements needed by the respective applications. In case you want to read more about it, take a look at akka-analytics (it is very much work in progress as I'm waiting for Spark to upgrade to Akka 2.3 and Kafka to Scala 2.11)

WDYT?

That sounds very interesting, thank you for explaining. I will read up on Akka-Analytics.

I guess though for simpler systems the read side application could use Akka Persistence to write to various query stores (as I mentioned above) and also handle queries from the clients (send query to query store, process response, repackage for client).

Finally, how do you see Cassandra comparing to Event Store in providing synthetic streams for the read side (i.e. can it)?

Cheers,

Ashley.

Ashley Aitken

Aug 19, 2014, 5:28:23 AM

to akka...@googlegroups.com

On Tuesday, 19 August 2014 16:53:46 UTC+8, Martin Krasser wrote:

journal -> akka actor(s) -> read model datastore

when I can do this much more efficiently via

journal -> spark -> read model datastore

directly, for example

I am confused, are you suggesting that spark is talking to the journal data store directly, without any involvement of Akka / Akka Persistence? If so, it sounds like a great solution but why would that require an extension to the Akka Persistence design/API?

Roland Kuhn

Aug 19, 2014, 7:00:49 AM

to akka-user

Well, another comment is that spark uses Akka actors in its implementation, so I don’t see why it would magically be “much more efficient”. I think we are mixing up two concerns here, will reply later when I can type properly again.

Regards,

Roland

Dr. Roland Kuhn
Akka Tech Lead
Typesafe – Reactive apps on the JVM.
twitter: @rolandkuhn

Martin Krasser

Aug 19, 2014, 7:33:55 AM

to akka...@googlegroups.com

On 19.08.14 11:28, Ashley Aitken wrote:

I am confused, are you suggesting that spark is talking to the journal data store directly, without any involvement of Akka / Akka Persistence?

Yes. Why loop it through journal actors if something like the spark-cassandra-connector, for example, is able to do this in a highly scalable way? If one would like to have the same read scalability with an akka-persistence plugin, one would partly need to re-implement what is already done by the spark-cassandra-connector.

If so, it sounds like a great solution but why would that require an extension to the Akka Persistence design/API?

Because transformed/joined/... event streams in the backend store on the read side must be consumable by PersistentViews (for creating read models). I still see this backend store as maintaining changes (= transformed/joined/... events) instead of current state.

Martin Krasser

Aug 19, 2014, 7:36:00 AM

to akka...@googlegroups.com

On 19.08.14 13:00, Roland Kuhn wrote:

Well, another comment is that spark uses Akka actors in its implementation, so I don’t see why it would magically be “much more efficient”. I think we are mixing up two concerns here, will reply later when I can type properly again.

This is a misunderstanding. As mentioned in my previous message, scaling reads through a single journal actor doesn't work; it's not that I see a general performance issue with Akka actors.

Patrik Nordwall

Aug 19, 2014, 7:40:25 AM

to akka...@googlegroups.com

On Tue, Aug 19, 2014 at 1:35 PM, Martin Krasser <kras...@googlemail.com> wrote:

This is a misunderstanding. As mentioned in my previous message, scaling reads through a single journal actor doesn't work, it's not about that I see a general performance issue with Akka actors.

I think the integration "akka persistence -> kafka -> spark -> whatever" looks great, but not everybody has that infrastructure, and therefore we provide PersistentView as a simple way to replicate events to the read side, and then a de-normalized representation can be stored in whatever makes sense for the queries.

Martin, what do you suggest? Removing PersistentView altogether?

/Patrik

Martin Krasser

Aug 19, 2014, 7:47:12 AM

to akka...@googlegroups.com

On 19.08.14 13:40, Patrik Nordwall wrote:

I think the integration "akka persistence -> kafka -> spark -> whatever" looks great, but not everybody has that infrastructure, and therefore we provide PersistentView as a simple way to replicate events to the read side, and then a de-normalized representation can be stored in whatever makes sense for the queries.

Of course, that should be possible too, and as I already said, backend store providers can choose to use the very same backend for both plugins. There is absolutely no need for applications to use Spark as part of their infrastructure. But if it is needed in large-scale applications, a second plugin API on the read side would make things much more flexible. Users who just want a single backend store only have to configure one additional line (plugin) in their application conf.

Martin, what do you suggest? Removing PersistentView altogether?

No, not at all, with an additional plugin, PersistentViews should have the option to read transformed/joined/... streams from a backend store that is optimized for that.

Patrik Nordwall

Aug 19, 2014, 7:49:50 AM

to akka...@googlegroups.com

On Tue, Aug 19, 2014 at 1:46 PM, Martin Krasser <kras...@googlemail.com> wrote:

Martin, what do you suggest? Removing PersistentView altogether?

No, not at all, with an additional plugin, PersistentViews should have the option to read transformed/joined/... streams from a backend store that is optimized for that.

ah, now I understand what you mean. That makes sense.

/Patrik

delasoul

Aug 19, 2014, 8:33:11 AM

to akka...@googlegroups.com

"and therefore we provide PersistentView as a simple way to replicate events to the read side, and then a de-normalized representation can be stored..."

If I understand this right, this means:
PersistentActor persists event
PersistentView queries EventStore, e.g. every second, and forwards new events to the read side, e.g. an EventListener which then updates the ReadModel

What is the advantage of using the PersistentView here, instead of just emitting the event to the read side from the PersistentActor directly?
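To be explicit about the flow I have in mind, here is a plain-Python sketch. It is not Akka code; `Journal`, `PollingView` and `poll` are hypothetical names, and the timer is replaced by calling `poll()` by hand. The view remembers the last sequence number it forwarded and asks the journal only for newer events.

```python
class Journal:
    """Stand-in for the event store written to by the PersistentActor."""

    def __init__(self):
        self._events = []

    def append(self, payload):
        self._events.append(payload)

    def read_from(self, seq_nr):
        """Return (seq_nr, payload) pairs with sequence number > seq_nr."""
        return [(i + 1, p) for i, p in enumerate(self._events) if i + 1 > seq_nr]


class PollingView:
    def __init__(self, journal, forward):
        self.journal = journal
        self.forward = forward    # callable(payload), e.g. the EventListener
        self.last_seq_nr = 0

    def poll(self):
        """One polling tick; a real view would run this on a timer."""
        for seq_nr, payload in self.journal.read_from(self.last_seq_nr):
            self.forward(payload)
            self.last_seq_nr = seq_nr


journal = Journal()
forwarded = []
view = PollingView(journal, forwarded.append)

journal.append("A")
journal.append("B")
view.poll()
journal.append("C")
view.poll()
view.poll()  # nothing new, nothing forwarded

# A view started later can still replay everything from the journal.
rebuilt = []
PollingView(journal, rebuilt.append).poll()
```

In this model a late-started view catches up from the journal, which direct emission from the writer would not give for free; whether that is the intended advantage is what I am asking.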

thanks,

michael

Roland Kuhn

Aug 19, 2014, 8:37:47 AM

to akka-user

19 aug 2014 kl. 13:49 skrev Patrik Nordwall <patrik....@gmail.com>:

Martin, what do you suggest? Removing PersistentView altogether?

No, not at all, with an additional plugin, PersistentViews should have the option to read transformed/joined/... streams from a backend store that is optimized for that.

ah, now I understand what you mean. That makes sense.

I’m not completely there yet: in which way does this require changes to Akka Persistence? The only thing we need is to support multiple Journals in the same ActorSystem, and a way for PersistentView and PersistentActor to select between them, is this what you mean? Or do you mean that the read-side would be a new kind of plugin?

OTOH this would not solve the read-side concerns raised by Greg: building a View on top of an incoming event stream is precisely not what he wants, unless I got that wrong. The idea behind CQRS/ES is that the events from the write-side drive updates of the read-side which is then queried (i.e. actively asked instead of passively updating) in whatever way is appropriate (e.g. graph searches).

Regards,

Roland

Martin Krasser

Aug 19, 2014, 8:57:43 AM

to akka...@googlegroups.com

On 19.08.14 14:37, Roland Kuhn wrote:

I’m not completely there yet: in which way does this require changes to Akka Persistence? The only thing we need is to support multiple Journals in the same ActorSystem, and a way for PersistentView and PersistentActor to select between them, is this what you mean?

This would go in the right direction, except that I wouldn't call the plugin that serves PersistentViews a "journal" because it only provides an interface for reading. Furthermore, this plugin could additionally offer an API for passing backend-specific query statements for joining/transforming/... streams on the fly (if supported/wanted).
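A rough sketch of what such a read-only plugin interface could look like, written in plain Python rather than as an actual akka-persistence API. `ReadPlugin`, `read`, `query` and the toy comma-separated "query language" are all hypothetical and only illustrate the split between a mandatory read interface and an optional backend-specific query hook.

```python
from abc import ABC, abstractmethod


class ReadPlugin(ABC):
    """Hypothetical read-side plugin: no write methods at all."""

    @abstractmethod
    def read(self, stream_id, from_seq_nr=0):
        """Return events of one stream with sequence number > from_seq_nr."""

    def query(self, statement):
        """Optional backend-specific query; backends may leave this unsupported."""
        raise NotImplementedError("backend-specific queries are not supported")


class InMemoryReadPlugin(ReadPlugin):
    def __init__(self, streams):
        self._streams = streams  # stream_id -> list of payloads, seq numbers from 1

    def read(self, stream_id, from_seq_nr=0):
        return self._streams.get(stream_id, [])[from_seq_nr:]

    def query(self, statement):
        # Toy "query language": a comma-separated list of stream ids to concatenate.
        ids = [s.strip() for s in statement.split(",")]
        return [e for stream_id in ids for e in self.read(stream_id)]


plugin = InMemoryReadPlugin({"a": ["a1", "a2"], "b": ["b1"]})
tail_of_a = plugin.read("a", from_seq_nr=1)
joined = plugin.query("a, b")
```

A provider that uses the same backend for both sides could implement this on top of its journal, while others could back it with a store optimized for joined or transformed streams.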

Or do you mean that the read-side would be a new kind of plugin?

Yes, see above

OTOH this would not solve the read-side concerns raised by Greg: building a View on top of an incoming event stream is precisely not what he wants, unless I got that wrong. The idea behind CQRS/ES is that the events from the write-side drive updates of the read-side which is then queried (i.e. actively asked instead of passively updating) in whatever way is appropriate (e.g. graph searches).

I cannot see how my proposal is in contradiction with that. Can you please explain?

Juan José Vázquez Delgado

Aug 19, 2014, 9:39:09 AM

to akka...@googlegroups.com

Hi guys, really interesting thread. However, it follows from this discussion that Akka Persistence is not currently 100% ready for a full CQRS/ES implementation. A little bit frustrating but, to be honest, it's true that it's still an experimental feature. As users, we're assuming this.

Anyway, and thinking about how to solve the Query part, what do you think about using some distributed in-memory data grid solution such as Hazelcast or GridGain?

Regards,

Juanjo

Roland Kuhn

Aug 19, 2014, 9:44:10 AM8/19/14

to akka-user

19 aug 2014 kl. 15:39 skrev Juan José Vázquez Delgado <jvaz...@tecsisa.com>:

Hi guys, really interesting thread. However, it follows from this discussion that Akka Persistence is not currently 100% ready for a full CQRS/ES implementation. A little bit frustrating but, to be honest, it's true that it's still an experimental feature. As users, we're assuming this.

Akka Persistence is about persistent actors, using Event Sourcing to achieve this goal. This makes it a perfect fit for the C in CQRS. The Q on the other hand does not actually need to have anything to do with Akka or actors at all, per se. If we can provide nice things then we will, of course :-)

Anyway, and thinking about how to solve the Query part, what do you think about using some distributed in-memory data grid solution such as Hazelcast or GridGain?

As I see it you should be able to use whatever fits your use-case for the Query side, in particular since the requirements for its structure are domain specific. Beware that none of the solutions are built on magic, though, and that things which sound too good to be true usually are.

Regards,

Roland

Regards,

Juanjo

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.

Patrik Nordwall

Aug 19, 2014, 10:11:33 AM8/19/14

to akka...@googlegroups.com

On Tue, Aug 19, 2014 at 2:33 PM, delasoul <michael...@gmx.at> wrote:

"and therefore we provide PersistentView as a simple way to replicate events to the read side, and then a de-normalized representation can be stored..."

If I understand this right, this means:
PersistentActor persists an event
PersistentView queries the event store, e.g. every second, and forwards new events to the read side, e.g. an EventListener, which then updates the ReadModel

What is the advantage of using the PersistentView here, instead of just emitting the event to the read side from the PersistentActor directly?

The PersistentView (read side) can process the events at its own pace; it is decoupled from the write side. It can be down without affecting the write side, and it can be started later and catch up.

Also, you can have multiple PersistentView instances consuming the same stream of events, maybe doing different things with them.
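
To make the decoupling concrete, here is a minimal sketch in plain Python rather than Akka. The `Journal` and `View` classes are hypothetical stand-ins for illustration, not the Akka Persistence API. Each view keeps its own read offset into an append-only log, so it can lag behind or be offline while the write side keeps appending, and then catch up later. Several views can consume the same events independently.

```python
from typing import Callable, List


class Journal:
    """Append-only event log written by the write side (hypothetical stand-in)."""

    def __init__(self) -> None:
        self._events: List[str] = []

    def append(self, event: str) -> None:
        self._events.append(event)

    def read_from(self, offset: int) -> List[str]:
        # Return every event at or after the given offset.
        return self._events[offset:]


class View:
    """Read-side consumer that tracks its own position in the journal."""

    def __init__(self, journal: Journal, handler: Callable[[str], None]) -> None:
        self._journal = journal
        self._handler = handler
        self._offset = 0  # number of events this view has already processed

    def update(self) -> int:
        # Pull whatever is new since the last update; return how many were handled.
        new_events = self._journal.read_from(self._offset)
        for event in new_events:
            self._handler(event)
        self._offset += len(new_events)
        return len(new_events)


journal = Journal()
seen_by_audit: List[str] = []
deposit_count = [0]


def count_deposits(event: str) -> None:
    if event.startswith("deposited"):
        deposit_count[0] += 1


audit_view = View(journal, seen_by_audit.append)
stats_view = View(journal, count_deposits)

journal.append("deposited:50")
audit_view.update()   # audit view is up to date; stats view has not run yet
journal.append("withdrawn:20")
journal.append("deposited:10")
stats_view.update()   # starts late and catches up from offset 0
audit_view.update()   # picks up only the two new events
```

Neither view's progress affects the journal or the other view; that independence is the property described above.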

/Patrik

thanks,

michael

Greg Young

Aug 19, 2014, 10:49:06 AM8/19/14

to akka...@googlegroups.com

On Tuesday, August 19, 2014 9:44:10 AM UTC-4, rkuhn wrote:

19 aug 2014 kl. 15:39 skrev Juan José Vázquez Delgado <jvaz...@tecsisa.com>:

Hi guys, really interesting thread. However, it follows from this discussion that Akka Persistence is not currently 100% ready for a full CQRS/ES implementation. A little bit frustrating but, to be honest, it's true that it's still an experimental feature. As users, we're assuming this.

Akka Persistence is about persistent actors, using Event Sourcing to achieve this goal. This makes it a perfect fit for the C in CQRS. The Q on the other hand does not actually need to have anything to do with Akka or actors at all, per se. If we can provide nice things then we will, of course :-)

Yes, it is very good at that now. It just needs some way of supporting the Q side efficiently (without bringing in massive infrastructure) and it's probably good :)

Greg Young

Aug 19, 2014, 11:04:23 AM8/19/14

to akka...@googlegroups.com

So I will mention some of the advantages of Event Store.

1) You can actually repartition etc. your events for read models (functionality exists today)

2) As far as I know we are actually older than Kafka

3) We support many things Kafka does not (such as competing consumers on streams as well as client-originated subscriptions)

4) We support things like max count/age, but on a per-stream basis, not for the database as a whole

5) We run on Mac/Linux/Windows

6) We use standard protocols such as Atom feeds over HTTP

7) We have per-stream ACLs (not sure if Kafka does this) with support for most major authentication systems

8) We have an entire query language for ad hoc querying of your data (e.g. temporal correlation queries)

9) Not sure on this one in comparison, but we have commercial support with low SLAs

10) We can easily support 100m streams. This may have changed, but last time I checked Kafka was not designed to support vast numbers of streams (say, one stream per aggregate)

In general though things are quite similar between them...

Some downsides:

1) As of now we don't handle your sharding for you (this is on the list). We are focused on the bottom 95% right now, though it's been designed in from the beginning

2) Our Akka-based client is not super mature, though there is also a Java client

3) Our running environment may not be familiar to you (as of now everything is statically linked, however, and delivered as a native binary)

4) Our akka.persistence adapters are probably less mature

5) Our akka.persistence adapters are not supported by Typesafe but by us (not sure on Martin's work but I would guess Typesafe supported)

Cheers,

Greg

delasoul

Aug 19, 2014, 11:07:26 AM8/19/14

to akka...@googlegroups.com

Then the PersistentView is not used as a "middle-man" to replicate events to the read side, but it is the read side (meaning if a client sends a query, a PersistentView creates the response)?
That's how I understood PersistentViews until now, but maybe that was wrong, so I am asking...

Thanks for your answer

Patrik Nordwall

Aug 19, 2014, 11:15:57 AM8/19/14

to akka...@googlegroups.com

On Tue, Aug 19, 2014 at 5:07 PM, delasoul <michael...@gmx.at> wrote:

Then the PersistentView is not used as a "middle-man" to replicate events to the read side, but it is the read side (meaning if a client sends a query, a PersistentView creates the response)?
That's how I understood PersistentViews until now, but maybe that was wrong, so I am asking...

That is one possible way of using a PersistentView, but not how I would use it in a large system. I would use it to consume the events and save a representation that is optimized for the queries in a separate database (or other type of product). Queries go directly (or via some other actor) to the database.
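
A minimal sketch of that arrangement, in plain Python with the standard-library `sqlite3` module standing in for the separate query database. The event shape, the `account_balance` and `progress` tables, and the `project` and `accounts_below` functions are all assumptions made for illustration; none of them come from Akka Persistence. The consumer folds events into a denormalized table and records how far it has got, so replaying events after a restart does not apply them twice. Queries go to the table and never touch the journal.

```python
import sqlite3
from typing import Iterable, List, Tuple

# (sequence number, event type, account id, amount): hypothetical event shape.
Event = Tuple[int, str, str, int]


def open_read_store() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE account_balance ("
        "account_id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    # Single-row table remembering the last event applied to the read model.
    db.execute(
        "CREATE TABLE progress ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), last_seq INTEGER NOT NULL)"
    )
    db.execute("INSERT INTO progress (id, last_seq) VALUES (1, 0)")
    db.commit()
    return db


def project(db: sqlite3.Connection, events: Iterable[Event]) -> None:
    """Fold events into the read model, skipping any already applied."""
    (last_seq,) = db.execute("SELECT last_seq FROM progress WHERE id = 1").fetchone()
    for seq, kind, account_id, amount in events:
        if seq <= last_seq:
            continue  # already applied; replaying is harmless
        # Only two event types exist in this sketch: Deposited and Withdrawn.
        delta = amount if kind == "Deposited" else -amount
        with db:  # one transaction: balance and progress marker move together
            db.execute(
                "INSERT OR IGNORE INTO account_balance (account_id, balance) VALUES (?, 0)",
                (account_id,),
            )
            db.execute(
                "UPDATE account_balance SET balance = balance + ? WHERE account_id = ?",
                (delta, account_id),
            )
            db.execute("UPDATE progress SET last_seq = ? WHERE id = 1", (seq,))
        last_seq = seq


def accounts_below(db: sqlite3.Connection, limit: int) -> List[str]:
    """The query side: an ordinary SQL query against the read model."""
    rows = db.execute(
        "SELECT account_id FROM account_balance WHERE balance < ? ORDER BY account_id",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


db = open_read_store()
events: List[Event] = [
    (1, "Deposited", "acc-1", 150),
    (2, "Deposited", "acc-2", 40),
    (3, "Withdrawn", "acc-1", 80),
]
project(db, events)
project(db, events)  # simulated replay after a restart; nothing is applied twice
```

Storing the progress marker in the same transaction as the balance update is one simple way to keep the read model consistent when the consumer stops and restarts.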

/Patrik

Thanks for your answer

On Tuesday, 19 August 2014 16:11:33 UTC+2, Patrik Nordwall wrote:

On Tue, Aug 19, 2014 at 2:33 PM, delasoul <michael...@gmx.at> wrote:

"and therefore we provide PersistentView as a simple way to replicate events to the read side, and then a de-normalized representation can be stored..."

If I understand this right, this means:
PersistentActor persists an event
PersistentView queries the event store, e.g. every second, and forwards new events to the read side, e.g. an EventListener, which then updates the ReadModel

What is the advantage of using the PersistentView here, instead of just emitting the event to the read side from the PersistentActor directly?

The PersistentView (read side) can process the events at its own pace; it is decoupled from the write side. It can be down without affecting the write side, and it can be started later and catch up.

Also, you can have multiple PersistentView instances consuming the same stream of events, maybe doing different things with them.
/Patrik

thanks,

michael

Ashley Aitken

Aug 19, 2014, 11:41:21 AM8/19/14

to akka...@googlegroups.com

On Tuesday, 19 August 2014 19:33:55 UTC+8, Martin Krasser wrote:

If so, it sounds like a great solution but why would that require an extension to the Akka Persistence design/API?

Because transformed/joined/... event streams in the backend store on the read side must be consumable by PersistentViews (for creating read models). I still see this backend store as maintaining changes (= transformed/joined/... events) rather than current state.

I am sorry I still don't see this.

This suggests to me that Spark is talking directly to the read model datastore (e.g. graph database, MongoDB, SQL database).

So are you suggesting:

1. journal -> Spark -> Akka actors (like PersistentView) -> read model data store

or

2. journal -> Spark -> read model data store (like graph database, MongoDB, SQL database) -> Akka actors <- Queries

I see PersistentView (for generalised topics) as the glue between the Akka journal (write store) and read stores (1.).

Thanks for your patience.

Cheers,

Ashley.

Martin Krasser

Aug 19, 2014, 11:48:16 AM8/19/14

to akka...@googlegroups.com

On 19.08.14 17:41, Ashley Aitken wrote:

On Tuesday, 19 August 2014 19:33:55 UTC+8, Martin Krasser wrote:

If so, it sounds like a great solution but why would that require an extension to the Akka Persistence design/API?

Because transformed/joined/... event streams in the backend store on the read side must be consumable by PersistentViews (for creating read models). I still see this backend store as maintaining changes (= transformed/joined/... events) rather than current state.

I am sorry I still don't see this.

This suggests to me that Spark is talking directly to the read model datastore (e.g. graph database, MongoDB, SQL database).

So are you suggesting:

1. journal -> Spark -> Akka actors (like PersistentView) -> read model data store

or

2. journal -> Spark -> read model data store (like graph database, MongoDB, SQL database) -> Akka actors <- Queries

I was suggesting 2.

I see PersistentView (for generalised topics) as the glue between the Akka journal (write store) and read stores (1.).

Thanks for your patience.

Cheers,

Ashley.

delasoul

Aug 19, 2014, 12:19:41 PM8/19/14

to akka...@googlegroups.com

OK, thanks. What confused me was "a simple way to replicate events to the read side", which I misunderstood as sending events, but you meant something else.
If a PersistentView is only involved in writing the ReadModel, is it not harder to achieve a consistent read model (you have to make sure that the PersistentView is alive to update the ReadModel)?

delasoul

Aug 19, 2014, 12:48:02 PM8/19/14

to akka...@googlegroups.com

As I am no Spark expert: will it be used only as a kind of messaging (streaming) middleware to sync the write and read stores, or also to somehow change/merge/filter the events it gets/pulls from the write store, or is this all done via the plugin for PersistentViews?
(I guess it has to be like this, otherwise using only one backend store cannot be supported?)

thanks,

michael

Martin Krasser

Aug 19, 2014, 1:19:14 PM8/19/14

to akka...@googlegroups.com

On 19.08.14 18:48, delasoul wrote:

As I am no Spark expert: will it be used only as a kind of messaging (streaming) middleware to sync the write and read stores, or also to somehow change/merge/filter the events it gets/pulls from the write store

usually, to process (transform/aggregate/filter/...) these events.

Prakhyat Mallikarjun

Aug 20, 2014, 3:33:12 AM8/20/14

to akka...@googlegroups.com

Team,

Consider I have this domain model:

Bank
Customers
Accounts

The Bank object will have many customers.

Every customer will have multiple accounts.

Consider I implement the above model using Akka Persistence. For the sake of discussion, consider that I make each a separate Aggregate Root using PersistentActors.

I want to implement a query: give me all customers whose balance is less than $100.

Do you mean,

1. Create one PersistentView XYZ.

2. This XYZ will listen to all 3 ARs.

3. PersistentView XYZ will replicate these events as state to some DB, say Cassandra.

4. The client will query Cassandra directly to find "all customers whose balance is less than $100".

Is my understanding correct?

If not, can you let me know how to achieve this using PersistentView?

-Prakhyat M M

Patrik Nordwall

Aug 21, 2014, 12:44:03 PM8/21/14

to akka...@googlegroups.com

On Wed, Aug 20, 2014 at 9:33 AM, Prakhyat Mallikarjun <prakh...@gmail.com> wrote:

Team,

Consider I have this domain model:
Bank
Customers
Accounts

The Bank object will have many customers.
Every customer will have multiple accounts.

Consider I implement the above model using Akka Persistence. For the sake of discussion, consider that I make each a separate Aggregate Root using PersistentActors.

I want to implement a query: give me all customers whose balance is less than $100.

Do you mean,
1. Create one PersistentView XYZ.
2. This XYZ will listen to all 3 ARs.

3. PersistentView XYZ will replicate these events as state to some DB, say Cassandra.
4. The client will query Cassandra directly to find "all customers whose balance is less than $100".

Is my understanding correct?

Yes, maybe.

If you model each customer as a PersistentActor you have to set up one PersistentView for each customer, since currently a PersistentView instance can only consume events from one PersistentActor instance. Same thing with accounts.

We are currently discussing (in several threads) how we can improve PersistentView. Perhaps it should be able to consume events from several PersistentActors. Perhaps we should add another thing called topic, which PersistentActor can publish events to, and a PersistentView can consume events from. We don't know yet. We are collecting input.
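
As a rough illustration of the topic idea, here is a sketch in plain Python. `TopicJournal`, its methods, and the event fields are hypothetical names invented for this example; no such API exists in Akka Persistence. Each persistent actor still has its own stream keyed by persistence id, but it can also publish an event to a named topic, and a single consumer can then replay the topic instead of needing one view per actor.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


class TopicJournal:
    """Hypothetical journal holding per-actor streams plus named topic streams."""

    def __init__(self) -> None:
        self._by_actor: Dict[str, List[dict]] = defaultdict(list)
        self._by_topic: Dict[str, List[Tuple[str, dict]]] = defaultdict(list)

    def persist(self, persistence_id: str, event: dict, topics: Iterable[str] = ()) -> None:
        # The actor's own stream is what it would recover from.
        self._by_actor[persistence_id].append(event)
        # Topic streams interleave events from many actors in publish order.
        for topic in topics:
            self._by_topic[topic].append((persistence_id, event))

    def replay_actor(self, persistence_id: str) -> List[dict]:
        return list(self._by_actor.get(persistence_id, []))

    def replay_topic(self, topic: str, from_offset: int = 0) -> List[Tuple[str, dict]]:
        return list(self._by_topic.get(topic, [])[from_offset:])


def balances_by_customer(journal: TopicJournal, topic: str) -> Dict[str, int]:
    """A single consumer folding events from every account actor."""
    totals: Dict[str, int] = defaultdict(int)
    for _persistence_id, event in journal.replay_topic(topic):
        totals[event["customer"]] += event["delta"]
    return dict(totals)


journal = TopicJournal()
journal.persist("account-1", {"customer": "alice", "delta": 150}, topics=["accounts"])
journal.persist("account-2", {"customer": "bob", "delta": 40}, topics=["accounts"])
journal.persist("account-1", {"customer": "alice", "delta": -80}, topics=["accounts"])
journal.persist("account-3", {"customer": "alice", "delta": 200}, topics=["accounts"])

totals = balances_by_customer(journal, "accounts")
low_balance_customers = sorted(name for name, total in totals.items() if total < 100)
```

Note that this sketch writes each event twice, once per actor and once per topic. A real store would have to make those writes atomic or derive the topic stream from the per-actor streams, which is the same concern raised earlier in the thread about writing to multiple Kafka topics.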

Martin has also suggested that this type of scenario is better solved in the backend data store. In this case a program would transform the events in Cassandra by using Cassandra APIs and save another representation in Cassandra (or somewhere else). Clients can then query directly. No PersistentView involved.

/Patrik

Prakhyat Mallikarjun

Aug 21, 2014, 2:28:32 PM8/21/14

to akka...@googlegroups.com

Patrik,

Thanks.

You have given good directions to me.

Most apps will do queries or searches specific to their domain or business. If business-specific complex queries/searches are better solved in the backend data store, why is Akka Persistence's "PersistentView" required, and should it still be part of Akka Persistence? I am not doubting "PersistentView", but I want to understand why one would choose "PersistentView" in a design over a simpler data store.

Akka Persistence makes the write side of my application very simple to configure and implement. But when it comes to complex querying/searching I find myself lost. I find no simple solution. Thanks to Greg I have one solution, "Projection", but I am still evaluating whether it will fit our app. I have questions to answer for myself on whether to use the Cassandra event store or Greg's Event Store. If I use the Cassandra event store I won't be able to use Projections... it's getting complex for me...

I am looking for the simplest solution for the 'Q' in CQRS from Akka Persistence with a DDD approach.

If the team is working on solutions, when can we expect them?

-Prakhyat M M

Björn Antonsson

Aug 22, 2014, 2:40:44 AM8/22/14

to akka...@googlegroups.com

Hi,

On 21 August 2014 at 20:28:35, Prakhyat Mallikarjun (prakh...@gmail.com) wrote:

Patrik,

Thanks.

You have given good directions to me.

Most apps will do queries or searches specific to their domain or business. If business-specific complex queries/searches are better solved in the backend data store, why is Akka Persistence's "PersistentView" required, and should it still be part of Akka Persistence? I am not doubting "PersistentView", but I want to understand why one would choose "PersistentView" in a design over a simpler data store.

The PersistentView is not a data store in any sense of the word. It is a way to replay events from the journal in a simple way. Even if it is enhanced to support streaming of multiple ids or topics, it still cannot be a specialized query model. PersistentView can be used to feed your data into a query model (maybe a normal SQL database, or whatever fits your queries best).

B/

--

Björn Antonsson

Typesafe – Reactive Apps on the JVM

twitter: @bantonsson

Greg Young

Aug 22, 2014, 2:02:49 PM8/22/14

to akka...@googlegroups.com

PersistentView is what enables a query. Maybe some higher-level guidance might help you? http://www.ustream.tv/recorded/46673907 (a talk I did in Budapest this spring) talks a lot about querying in event-sourced systems and projections.

Cheers,

Greg

Prakhyat

Aug 23, 2014, 1:28:04 AM8/23/14

to akka...@googlegroups.com, akka...@googlegroups.com

Greg,

Great, thanks a lot. I will look at it.

-prakhyat mm

Ashley Aitken

Aug 27, 2014, 2:04:06 PM8/27/14

to akka...@googlegroups.com

Can anyone please comment further on Cassandra as the event store?

I haven't got my head fully around column stores as yet, but the Web site mentions "the performance of log-structured updates", which sounds good for the write side, but also "strong support for denormalization and materialised views."

Would the latter two be useful in any way for creating the synthetic streams (and possibly projections) we have been talking about (e.g. a stream combining journals for a set of PAs or all PAs of a particular type)? I'm not exactly sure what they mean for a column store.