Akka persistence, event sourcing, and SOA


JUL

Nov 7, 2013, 2:24:46 PM
to akka...@googlegroups.com
Dear community,

When using DDD / event sourcing for its nice integration properties between SOA services, a service might need to 'subscribe' to other services' events to be able to process commands. In Akka persistence, this would translate into having one read-write journal (the actual service event store) and multiple read-only journals (for the other services being listened to).

From what I can see, the configuration currently supports only one journal.

How would you implement such a use case? A custom "aggregating" journal routing events from the processors to the right journal? Or is this something that would likely be provided out of the box in the near future?

Thank you,

Julien

Andrew Easter

Nov 7, 2013, 2:54:13 PM
to akka...@googlegroups.com
Julien,

I'm not completely sure what your idea of a "service" is, but I assume you are essentially referring to an EventsourcedProcessor (a persistent actor that uses event sourcing rather than command sourcing). 

The concept of some second journal doesn't make a huge amount of sense. An EventsourcedProcessor will only deal with, as you allude to, persisting events to a single logical journal. From where I'm sitting, akka-persistence is not aiming for a higher level of abstraction than this - and nor should it. Implementing a DDD/CQRS abstraction layer on top of akka-persistence is very much down to the implementor. It's this abstraction layer that should deal with how events are published more widely, how they are listened to, and how they are acted upon.

There is some example code in the akka-persistence documentation showing a simple EventsourcedProcessor implementation:


You'll see that the example code takes responsibility for publishing persisted events to the system event stream:

context.system.eventStream.publish(event)

It's then completely up to your application to subscribe to events and decide how to act upon them when they occur. I'm not really sure where the concept of a second journal would fit in? It seems you are simply referring to a persistent read model? I'm not sure it's helpful to refer to that as a journal - whilst it could be implemented that way, it could also just as easily be implemented in a number of other ways too (e.g. an RDBMS). Regardless, it's definitely outside the scope of akka-persistence to make decisions on how people might implement read models in a CQRS based system.
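To make the persist-then-publish pattern concrete, here is a minimal sketch using toy in-memory stand-ins (the `Journal`, `EventBus`, and `Processor` classes here are purely illustrative, not the akka-persistence API):

```scala
// Toy stand-ins for illustration only -- not the akka-persistence API.
// A processor appends each event to its journal first, then publishes it
// so that any interested subscriber (e.g. a read model) can react.
import scala.collection.mutable

final case class Event(seqNr: Long, payload: String)

class Journal {
  private val entries = mutable.ArrayBuffer.empty[Event]
  def append(e: Event): Unit = entries += e
  def replay: Vector[Event] = entries.toVector
}

class EventBus {
  private var subscribers = List.empty[Event => Unit]
  def subscribe(f: Event => Unit): Unit = subscribers ::= f
  def publish(e: Event): Unit = subscribers.foreach(_(e))
}

class Processor(journal: Journal, bus: EventBus) {
  private var seqNr = 0L
  def handleCommand(payload: String): Unit = {
    seqNr += 1
    val event = Event(seqNr, payload)
    journal.append(event)   // persist first ...
    bus.publish(event)      // ... then let the wider application know
  }
}

val journal = new Journal
val bus = new EventBus
val seen = mutable.ArrayBuffer.empty[String]
bus.subscribe(e => seen += e.payload)

val p = new Processor(journal, bus)
p.handleCommand("created")
p.handleCommand("updated")
```

The important ordering is persist-then-publish: an event only reaches subscribers after it has been made durable.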

I recently began working on a DDD/CQRS framework that builds on top of akka-persistence, but it's _very_ early days. You might want to check out the work that Vaughn Vernon has been doing which is more developed than anything I'm working on:


I'm not sure he is using akka-persistence, though.

Andrew


Andrew Easter

Nov 7, 2013, 2:58:24 PM
to akka...@googlegroups.com
Another thing I'd add, in the context of DDD, services are stateless beings and would not be persistent. It's far better to visualise actors as aggregates which, naturally, do have state. I have written a blog post about this:


An "application service" in this context would probably resemble some kind of facade sitting in front of your actor driven domain model.

Andrew

Martin Krasser

Nov 8, 2013, 12:55:48 AM
to akka...@googlegroups.com
Hi Julien,

in akka-persistence, each processor has its own logical journal. A single processor can only write to and read from its own journal (where reading is only done during recovery). If a processor's state depends on the events emitted by another processor, it can write these events (or something derived from them) to its own journal so that both can recover independently of each other. Independent recovery is especially important in a distributed setup, where you don't want to make a processor's ability to recover dependent on the availability of (multiple) other services.
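As an illustration of that pattern (toy code with hypothetical names, not the actual akka-persistence API), a processor can persist a derived event in its own journal so that recovery never has to contact the other service:

```scala
// Illustration only: a processor whose state depends on another service's
// events persists a *derived* event in its own journal, so it can later
// recover without the other service being available.
import scala.collection.mutable

final case class ExternalEvent(itemId: String, delta: Int)   // from the other service
final case class StockChanged(itemId: String, delta: Int)    // derived, journaled locally

class StockProcessor {
  private val ownJournal = mutable.ArrayBuffer.empty[StockChanged]
  private var stock = Map.empty[String, Int]

  // Live path: derive, persist locally, then update in-memory state.
  def onExternal(e: ExternalEvent): Unit = {
    val derived = StockChanged(e.itemId, e.delta)
    ownJournal += derived
    applyEvent(derived)
  }

  // Recovery path: replay only the local journal -- independent of the
  // availability of the service that originally emitted the events.
  def recover(journal: Seq[StockChanged]): Unit = {
    stock = Map.empty
    journal.foreach(applyEvent)
  }

  private def applyEvent(e: StockChanged): Unit =
    stock += (e.itemId -> (stock.getOrElse(e.itemId, 0) + e.delta))

  def journalContents: Vector[StockChanged] = ownJournal.toVector
  def stockOf(itemId: String): Int = stock.getOrElse(itemId, 0)
}
```

A fresh instance recovered from the local journal ends up in the same state as the live one, without replaying anything from the upstream service.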

In the current implementation, all logical journals of processors from
the same ActorSystem are mapped to a single physical journal (backed by
LevelDB). With n ActorSystems (on the same or on different nodes) you'll
have n physical journals. This however is an implementation detail and
may change. Further optimizations may even recognize that
messages/events are redundantly journaled and only write pointers
instead of actual message/event data. From an application's perspective,
however, only the concept of one logical journal per processor is
important. Also, an application never interacts with journals directly,
only with processors.

Although you can already develop a distributed application with the
current LevelDB-backed journal (having 1 LevelDB instance per node, for
example), you can't yet migrate a processor from one node to another
(e.g. during failover) because LevelDB only writes to the local disk. To
support processor migration, journal replication is needed which will be
provided by distributed journal(s) in the near future.

Hope that helps.

Cheers,
Martin
> --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ: http://akka.io/faq/
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google
> Groups "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to akka-user+...@googlegroups.com.
> To post to this group, send email to akka...@googlegroups.com.
> Visit this group at http://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/groups/opt_out.

--
Martin Krasser

blog: http://krasserm.blogspot.com
code: http://github.com/krasserm
twitter: http://twitter.com/mrt1nz

Patrik Nordwall

Nov 8, 2013, 2:19:05 AM
to akka...@googlegroups.com
Martin, only one active writer per processor-id makes a lot of sense, but is it possible to use one writer and several readers? I think that can sometimes be useful. Given a distributed journal, an event sourced processor can "publish" the persisted domain events to CQRS read views via the journal, by having other processors with the same processor-id replay events from the latest known sequence number periodically (or triggered by something). These readers only read, and possibly store to another read view representation.

/Patrik
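Patrik's one-writer/many-readers scheme could be sketched with a toy in-memory journal (illustrative only; `SharedJournal` and its methods are hypothetical, not part of akka-persistence):

```scala
// Toy sketch: a single writer appends to the stream for a processor-id;
// any number of readers replay that same stream into their own read views.
import scala.collection.mutable

final case class Event(seqNr: Long, payload: Int)

class SharedJournal {
  private val streams = mutable.Map.empty[String, Vector[Event]]
  def append(processorId: String, payload: Int): Unit = {
    val stream = streams.getOrElse(processorId, Vector.empty)
    streams(processorId) = stream :+ Event(stream.size + 1L, payload)
  }
  def replayFrom(processorId: String, fromSeqNr: Long): Vector[Event] =
    streams.getOrElse(processorId, Vector.empty).filter(_.seqNr >= fromSeqNr)
}

val journal = new SharedJournal
List(10, 20, 30).foreach(journal.append("proc-1", _))

// Two independent read views built from the same event stream.
val total = journal.replayFrom("proc-1", 1L).map(_.payload).sum
val count = journal.replayFrom("proc-1", 1L).size
```

A reader that already applied events up to a sequence number only replays the tail, e.g. `journal.replayFrom("proc-1", 3L)`.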


--

Patrik Nordwall
Typesafe Reactive apps on the JVM
Twitter: @patriknw

Martin Krasser

Nov 8, 2013, 4:02:43 AM
to akka...@googlegroups.com

On 08.11.13 08:19, Patrik Nordwall wrote:
Martin, only one active writer per processor-id makes a lot of sense, but is it possible to use one writer and several readers?

Yes, you can have several processors with the same processorId, even if their logic for processing replayed messages differs (for projecting onto different read models, for example). BTW, the automated tests make heavy use of that. With the availability of a distributed journal, they can also be distributed. However, it is currently not enforced that only one instance of a processor with a given id can write. This is an addition that we'll need to implement (e.g. by using journals that limit writes to a single processor incarnation - just an idea; this write lock could be removed when the processor dies).

Updating read-only processors (replicas) can be done either pull-based (where a processor triggers its own update by requesting a replay starting from a certain sequence number) or push-based (where a processor automatically receives new journaled messages whenever they are available). This also still needs to be implemented (the former is trivial to add; the latter needs a pub/sub mechanism supported by journals).
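The pull-based variant could be sketched like this (toy code; `Journal` and `Replica` are hypothetical stand-ins, not the akka-persistence API):

```scala
// Toy sketch of pull-based replica updates: a read-only replica remembers
// the highest sequence number it has applied and, when triggered,
// requests a replay of everything newer.
import scala.collection.mutable

final case class Event(seqNr: Long, payload: String)

class Journal {
  private val events = mutable.ArrayBuffer.empty[Event]
  def append(payload: String): Unit = events += Event(events.size + 1L, payload)
  def eventsAfter(seqNr: Long): Vector[Event] =
    events.filter(_.seqNr > seqNr).toVector
}

class Replica(journal: Journal) {
  private var lastSeqNr = 0L
  val applied = mutable.ArrayBuffer.empty[String]

  // Called periodically (or on some trigger): catch up on new events only.
  def poll(): Int = {
    val fresh = journal.eventsAfter(lastSeqNr)
    fresh.foreach { e => applied += e.payload; lastSeqNr = e.seqNr }
    fresh.size
  }
}
```

Push-based delivery would instead need the journal itself to notify replicas, i.e. the pub/sub mechanism mentioned above.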

I'm pretty confident that we can have that ready for the Akka 2.3 release.


Patrik Nordwall

Nov 8, 2013, 6:10:53 AM
to akka...@googlegroups.com
Thanks for the description. That matches my expectations.
/Patrik

JUL

Nov 8, 2013, 9:31:51 AM
to akka...@googlegroups.com
Thanks for your quick answer, Andrew! I use "service" in an SOA context, not DDD. It would map to a bounded context in DDD land. The external journal I am talking about would map to an anti-corruption layer in Evans' DDD. Does that clarify the question?

JUL

Nov 8, 2013, 10:03:31 AM
to akka...@googlegroups.com, kras...@googlemail.com
I understand that, but I was wondering what happens when you want to integrate two applications. Events are a great way to integrate two applications: application A can subscribe to application B's event stream and build its own view model out of application B's events. From what I understand, this would be possible with two actor systems, but then application A doesn't have access to the view model recovered from application B's journal, right?

Andrew Easter

Nov 8, 2013, 10:35:57 AM
to akka...@googlegroups.com
Julien,

It feels like it's the responsibility of Application B to expose its read view(s) to Application A by whatever mechanism it deems fit. In the case that A and B are completely separate applications, is this not where SOA comes in? E.g. read views exposed via RESTful endpoints.

Additionally, there is no reason why a subscriber to journal events could not just forward them over some distributed message queue, allowing any separate applications to listen in.

Please do let me know if I'm missing your point completely!

I do like the idea of an out of the box mechanism for listening into journals rather than forcing processor implementations to explicitly publish events more widely.

Andrew

Martin Krasser

Nov 8, 2013, 10:40:33 AM
to akka...@googlegroups.com

On 08.11.13 16:03, JUL wrote:
I understand that, but I was wondering what happens when you want to integrate two applications. Events are a great way to integrate two applications. Application A can subscribe to application B's event stream

Application A can subscribe to whatever actors/processors from application B publish. This pub/sub communication of events is nothing specific to akka-persistence - use whatever works for you (point-to-point remoting, distributed pub/sub, ...)

and build its own view model out of application B's events. From what I understand, this would be possible with two actor systems, but then application A doesn't have access to the view model recovered from application B's journal, right?

Application A can access any actor of application B (via remoting), hence application A can access any read model maintained by processors of application B (and vice versa).

It's best to design your application as if it consisted of plain actors, and to use akka-persistence only for those actors whose state should be durable (via event and/or command sourcing).

Martin Krasser

Nov 8, 2013, 11:00:33 AM
to akka...@googlegroups.com
Listening to journaled messages/events on a per-processor basis is anyway something we need to add for maintaining processor replicas (for a push-based approach, see previous posts). Making this API public could be a useful feature, but I'm not sure yet how this API should relate to other parts of Akka, such as an ActorSystem's event stream, for example.


JUL

Nov 8, 2013, 11:40:47 AM
to akka...@googlegroups.com, kras...@googlemail.com
I think there is a misunderstanding. 

Traditional integration is done through REST or some other RPC service exposing a view model of B. But event sourcing allows you to use events instead: A could consume B's events to build, locally in A, a view model of B tailored to A's needs.

The first benefit is performance, of course: access to B's view model is local. But more importantly, the view model of B in A is probably very different from the view model of B in B; it is tuned to the specific needs of A. Of course, live events will be channelled through external pub/sub, but at startup, the view model of B in A must be recovered first. Akka persistence has 95% of what is needed for that scenario, except access to different journal stores for two different local processors (a processor for A, and a processor for the view model of B in A).

An alternative would be pub/sub where A stores B's events locally and replays from there. The problem is that you are introducing a runtime dependency. What happens when you need to restart A? You need persistent pub/sub to buffer B's incoming events while A is down. I would find it simpler to use pub/sub at runtime, and plug directly into B's journal at recovery time. It is also much more resilient to pub/sub issues: just rebuild the view model of B in A in case of corruption or inconsistency.

Does it make sense?
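The recover-then-subscribe sequence described above could be sketched as (toy code with hypothetical names; sequence numbers are used to drop events already applied during recovery):

```scala
// Toy sketch: rebuild the local view of B by replaying B's journal first,
// then switch to live events from pub/sub. The sequence number guards
// against applying the same event twice across the handover.
import scala.collection.mutable

final case class Event(seqNr: Long, payload: String)

class ViewOfBInA {
  private var lastSeqNr = 0L
  val state = mutable.ArrayBuffer.empty[String]

  private def applyEvent(e: Event): Unit =
    if (e.seqNr > lastSeqNr) { state += e.payload; lastSeqNr = e.seqNr }

  def recover(journalOfB: Seq[Event]): Unit = journalOfB.foreach(applyEvent)
  def onLiveEvent(e: Event): Unit = applyEvent(e)  // duplicates are ignored
}

val view = new ViewOfBInA
view.recover(Seq(Event(1, "customer-created"), Event(2, "deal-negotiated")))
view.onLiveEvent(Event(2, "deal-negotiated"))  // overlaps with recovery: dropped
view.onLiveEvent(Event(3, "deal-upgraded"))
```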

Martin Krasser

Nov 8, 2013, 12:21:54 PM
to akka...@googlegroups.com


This is exactly what was described earlier in this thread. Two processors with the same processor id reside on different nodes (A and B), where the processor on B reads from and writes to its journal, and the processor on A only reads from the same journal. These processors may have different implementations, with the result that the processor on A creates a view model of B that is specific to A.

Is that what you need?

Andrew Easter

Nov 8, 2013, 1:24:43 PM
to akka...@googlegroups.com, kras...@googlemail.com

This is exactly what was described in this post. Two processors with the same processor id reside on different nodes (A and B) where the processor on B read/writes to its journal and the processor on A reads-only from the same journal. These processors may have different implementation, with the result that the processor on A creates a view model of B that is specific to A.

How would one deal with view models that are a projection of events from multiple processors? Is it a case of having multiple read-only processors, each reading from a different journal, aggregating data to form a separate projection stored elsewhere?

One other factor, is it really a given that view models have to be recovered on startup? This is assuming view model state can't be persisted and survive restarts? I suppose this is the point you make, Julien, with regard to avoiding losing messages published by B when A is offline? It does feel like I'm missing something here, though. Surely it's not always going to be practical to be forced to rebuild views on every restart of an application?

Andrew 

JUL

Nov 8, 2013, 1:35:29 PM
to akka...@googlegroups.com, kras...@googlemail.com
Yes, except that on A, read/write processor A would recover from journal store A, and read-only processor B would recover from the unrelated journal store B. Obviously, A and B are two different applications that have two completely different journal stores in different locations.

Martin Krasser

Nov 8, 2013, 1:49:17 PM
to akka...@googlegroups.com

On 08.11.13 19:35, JUL wrote:
Yes, except on A, read/write processor A would recover from journal store A, and read processor B will recover from unrelated journal store B. Obviously, A and B are 2 different application, that have 2 completely different journal stores in different locations.

What I described is only possible with a distributed journal (i.e. a journal backed by a distributed store, coming soon). Then a processor on node A can read what a processor (with the same processor id) on node B has written. With the current LevelDB journal, this is not possible, of course.

Martin Krasser

Nov 8, 2013, 2:05:23 PM
to akka...@googlegroups.com

On 08.11.13 19:24, Andrew Easter wrote:


How would one deal with view models that are a projection of events from multiple processors? Is it a case of having multiple read-only processors, each reading from a different journal, aggregating data to form a separate projection stored elsewhere?

This may cause problems if one of the processors fails and replays messages but the others don't.

From the discussion thread so far, it looks to me like akka-persistence should introduce a new persistent actor: ReadModel (or whatever name is best for a "read-only processor") that is able to read from 1-n logical journals but isn't able to write. Then we can leave it to applications to ensure that a Processor (read/write) is a cluster-wide singleton, with as many ReadModel instances as needed. This would also avoid dealing with distributed write locks.


JUL

Nov 8, 2013, 2:12:30 PM
to akka...@googlegroups.com, kras...@googlemail.com
> How would one deal with view models that are a projection of events from multiple processors?

A separate view model would be built out of each event stream. A view model would not be projected from more than one stream.

> This is assuming view model state can't be persisted and survive restarts?

This is correct. I am assuming I want to use a similar architecture for the view model as well. As Martin mentioned, this can be achieved with multiple processors with the same id recovering from the same journal. It seems attractive to me to have an in-memory view model - of course, with its own snapshots to speed up view model recovery.

The rationale is that, in practice, you often have cache layers even on top of view models, so why not go all the way and only have the in-memory cache?
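An in-memory view model with snapshot-accelerated recovery could be sketched as (toy code; the `recover` helper and its types are hypothetical):

```scala
// Toy sketch: snapshots pair the materialized state with the sequence
// number they cover, so recovery loads the latest snapshot and replays
// only the events after it.
final case class Event(seqNr: Long, delta: Int)
final case class Snapshot(state: Int, seqNr: Long)

def recover(snapshot: Option[Snapshot], journal: Seq[Event]): (Int, Long) = {
  val (initial, from) = snapshot.map(s => (s.state, s.seqNr)).getOrElse((0, 0L))
  journal.filter(_.seqNr > from).foldLeft((initial, from)) {
    case ((state, _), e) => (state + e.delta, e.seqNr)
  }
}

val journal = Seq(Event(1, 1), Event(2, 2), Event(3, 3), Event(4, 4))
val full = recover(None, journal)                  // replay everything
val fast = recover(Some(Snapshot(3, 2)), journal)  // snapshot + tail only
```

Both paths end in the same state; the snapshot just shortens the replay.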

Here is what I have in mind:


An example would be system B = CRM and system A = order fulfilment. To complete an order, you might need to know whether the supplied customer ID actually exists, and what kind of deals the customer negotiated (all information that is part of the CRM system). So system A would build a custom view model of B consisting of, for example, a dictionary from customer id to type of deal.
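That CRM example as a toy projection (illustrative Scala only; the event names are made up):

```scala
// Toy sketch: order fulfilment (A) folds the CRM's (B's) events into a
// local dictionary from customer id to negotiated deal -- exactly the
// slice of B's data that A needs to validate orders.
sealed trait CrmEvent
final case class CustomerCreated(customerId: String, deal: String) extends CrmEvent
final case class DealNegotiated(customerId: String, deal: String) extends CrmEvent

def project(events: Seq[CrmEvent]): Map[String, String] =
  events.foldLeft(Map.empty[String, String]) {
    case (m, CustomerCreated(id, deal)) => m + (id -> deal)
    case (m, DealNegotiated(id, deal))  => m + (id -> deal)
  }

val viewOfBInA = project(Seq(
  CustomerCreated("c-42", "standard"),
  DealNegotiated("c-42", "gold")
))
```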

Andrew Easter

Nov 8, 2013, 2:30:06 PM
to akka...@googlegroups.com, kras...@googlemail.com

A separate view model would be built out of each event stream. A view model would not be projected from more than one stream.

I'm not sure this can possibly hold true in every use case.
 

This is correct. I am assuming I want to use a similar architecture for the view model as well. As Martin mentioned, this can be achieved with multiple processors with same id recovering from same journal. It seems attractive to me having an in-memory view model. Of course, with its own snapshots to speed up view model recovery.

I think there is still some confusion going on here. Julien, it seems you are suggesting that view models are themselves event sourced - i.e. they are not read-only and do write to their own journal. Martin, on the other hand, seems to be suggesting a new persistent actor (ReadModel) that does not do any writes at all; it's strictly read-only.

I can see the value in in-memory view models, although this may not be practical in every use case. It really depends on the view model and what its storage requirements are.

Another thing: Julien, you previously stated "A and B are 2 different application, that have 2 completely different journal stores in different locations." This is at odds with what Martin is describing, as his proposal seems to imply that the applications, whilst decoupled, would be deployed within the same Akka cluster.

Andrew

Martin Krasser

Nov 8, 2013, 2:30:28 PM
to akka...@googlegroups.com

On 08.11.13 20:12, JUL wrote:

This is correct. I am assuming I want to use a similar architecture for the view model as well. As Martin mentioned, this can be achieved with multiple processors with same id recovering from same journal. It seems attractive to me having an in-memory view model. Of course, with its own snapshots to speed up view model recovery.

Right, snapshots would need to be stored for each view model instance.


The rationale is that in practice, you often have cache layers on top of even view models, so why not go all the way and only have the in-memory cache?

Here is what I have in mind:



Live event propagation could also be done via the journal (either push- or pull-based). This would further simplify the application architecture.


An example would be system B = CRM and system A = order fulfilment. To complete an order, you might need to know if the supplied customer ID do actually exist, and what kind of deals the customer negotiated (all information part of the CRM system). So system A would build a custom view model of A consisting of a dictionary from customer id to type of deal for example.

Le vendredi 8 novembre 2013 13:24:43 UTC-5, Andrew Easter a écrit :

This is exactly what was described in this post. Two processors with the same processor id reside on different nodes (A and B) where the processor on B read/writes to its journal and the processor on A reads-only from the same journal. These processors may have different implementation, with the result that the processor on A creates a view model of B that is specific to A.

How would one deal with view models that are a projection of events from multiple processors? Is it a case of having multiple read-only processors, each reading from a different journal, aggregating data to form a separate projection stored elsewhere?

One other factor: is it really a given that view models have to be recovered on startup? This is assuming view model state can't be persisted and survive restarts? I suppose this is the point you make, Julien, with regard to avoiding losing messages published by B when A is offline? It does feel like I'm missing something here, though. Surely it's not always going to be practical to be forced to rebuild views on every restart of an application?

Andrew 

Martin Krasser

unread,
Nov 8, 2013, 2:33:32 PM11/8/13
to akka...@googlegroups.com
Not necessarily to the same Akka cluster but have access to the same distributed journal.


Andrew

JUL

unread,
Nov 8, 2013, 2:42:21 PM11/8/13
to akka...@googlegroups.com, kras...@googlemail.com
> Julien, it seems you are suggesting that view models are, themselves, event sourced - i.e. they are not read only and do write to their own journal

You are correct in the sense that the view models would be event sourced as well. But they would use the main event store in a read-only way, to project a view model. The view models would only write their snapshots, not events. Events would only be written by aggregates.

> what Martin is referring to as his proposal seems to imply that the applications, whilst decoupled, would be deployed within the same Akka cluster.

You are correct, I am diverging. In this case, those would be 2 separate applications / cluster. The goal here would be to integrate 2 distinct applications within the enterprise (or 2 SOA services). I would use Akka persistence as a kind of service bus.

Andrew Easter

unread,
Nov 8, 2013, 2:42:30 PM11/8/13
to akka...@googlegroups.com, kras...@googlemail.com

Not necessarily to the same Akka cluster but have access to the same distributed journal.

Okay, I get it now :-)

Andrew

JUL

unread,
Nov 8, 2013, 2:47:26 PM11/8/13
to akka...@googlegroups.com, kras...@googlemail.com
The live events propagation could also be done via the journal (either push or pull based).

I am curious, how would you push across applications? 

Martin Krasser

unread,
Nov 8, 2013, 2:59:07 PM11/8/13
to akka...@googlegroups.com

On 08.11.13 20:47, JUL wrote:
The live events propagation could also be done via the journal (either push or pull based).

I am curious, how would you push across applications?

The distributed journal maintains view model actor references (a view model actor registers itself with the distributed journal at startup). When a processor writes a message/event, it can then be pushed to view models with a matching processor id.
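A simplified model of that registration-and-push flow, in plain Scala rather than actor code (`Journal`, `register`, and `write` are illustrative names, not a real akka-persistence API):

```scala
import scala.collection.mutable

case class Persistent(processorId: String, seqNr: Long, payload: Any)

// Hypothetical distributed journal: views register for a processor id and
// get live events pushed to them on every write.
class Journal {
  private val log = mutable.Buffer.empty[Persistent]
  private val views = mutable.Map.empty[String, mutable.Buffer[Persistent => Unit]]

  def register(processorId: String)(view: Persistent => Unit): Unit =
    views.getOrElseUpdate(processorId, mutable.Buffer.empty) += view

  def write(p: Persistent): Unit = {
    log += p                                          // durable write first
    views.getOrElse(p.processorId, Nil).foreach(_(p)) // then push to matching views
  }
}

val journal = new Journal
val received = mutable.Buffer.empty[Any]
journal.register("orders")(p => received += p.payload)
journal.write(Persistent("orders", 1, "OrderPlaced"))
journal.write(Persistent("crm", 1, "CustomerRegistered")) // no subscriber, only logged
```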

Andrew Easter

unread,
Nov 8, 2013, 3:02:38 PM11/8/13
to akka...@googlegroups.com, kras...@googlemail.com
On a side-note, I've been thinking a lot over the past couple of days about how to implement the DDD/CQRS concept of a Process Manager in the framework I'm working on. This "new persistent actor" would be a very good fit, in theory. A Process Manager consumes events, possibly from multiple aggregates, and issues commands based on those events. And the ability to snapshot state naturally makes sense.

Just an interesting thought.

Andrew Easter

unread,
Nov 8, 2013, 3:18:09 PM11/8/13
to akka...@googlegroups.com, kras...@googlemail.com


On Friday, 8 November 2013 12:02:38 UTC-8, Andrew Easter wrote:
On a side-note, I've been thinking a lot over the past couple of days about how to implement the DDD/CQRS concept of a Process Manager in the framework I'm working on. This "new persistent actor" would be a very good fit, in theory. A Process Manager consumes events, possibly from multiple aggregates, and issues commands based on those events. And the ability to snapshot state naturally makes sense.

Just an interesting thought.

Worth bearing in mind, though, that a process manager would differ from a read model in that it would need to remain unique within the application and thus have its own unique identifier (like a processor representing an aggregate). And a process manager would only fire out commands in response to live events, not replayed events - event-sourced processors already have this distinction (i.e. live event handling logic is not applied during replay).
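A minimal sketch of that live-vs-replay distinction in plain Scala (event shapes and command text are invented for illustration): state is rebuilt from both replayed and live events, but commands are only issued for live ones.

```scala
case class Command(text: String)

// Hypothetical process manager: consumes events, tracks state, issues commands.
class ProcessManager {
  private var ordersAwaitingPayment = Set.empty[String]
  val issued = scala.collection.mutable.Buffer.empty[Command]

  def onEvent(event: (String, String), replaying: Boolean): Unit = event match {
    case ("OrderPlaced", id) =>
      ordersAwaitingPayment += id // state is rebuilt during replay too...
      if (!replaying) issued += Command(s"RequestPayment($id)") // ...but commands fire only live
    case ("PaymentReceived", id) =>
      ordersAwaitingPayment -= id
    case _ =>
  }
}

val pm = new ProcessManager
pm.onEvent(("OrderPlaced", "o-1"), replaying = true)  // replay: state only
pm.onEvent(("OrderPlaced", "o-2"), replaying = false) // live: state + command
```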

 

JUL

unread,
Nov 8, 2013, 4:03:14 PM11/8/13
to akka...@googlegroups.com, kras...@googlemail.com
When a processor writes a message/event it can then be pushed to view models with a matching processor id.

Making a distributed journal a full blown pub / sub middleware? That would be cool. 

Richard Rodseth

unread,
Nov 8, 2013, 7:09:39 PM11/8/13
to akka...@googlegroups.com, kras...@googlemail.com
You mean like Apache Kafka? I've been reading a lot about Apache Kafka and it seems like the Kafka/Akka combination is a potent one (even without Storm). I don't see Akka persistence taking on the Kafka role, do you?


Andrew Easter

unread,
Nov 8, 2013, 7:50:45 PM11/8/13
to akka...@googlegroups.com, kras...@googlemail.com
I'm afraid I only have a limited knowledge of Kafka but had reasoned in the past, albeit with limited understanding, that it might be a very good fit for an event store. However, I do recall reading somewhere concerns about never removing data from Kafka - I believe it's used mostly in situations where old data can be cleared out in reasonable time. However, as I say, I'm no authority on this.

Andrew

Richard Rodseth

unread,
Nov 8, 2013, 7:56:05 PM11/8/13
to akka...@googlegroups.com, Martin Krasser
No, Kafka messages have a retention period. It defaults to a week, the value used by LinkedIn. Not a solution for event sourcing. 

Andrew Easter

unread,
Nov 8, 2013, 8:04:20 PM11/8/13
to akka...@googlegroups.com, Martin Krasser
In which case, what role would Kafka play here? Is your suggestion to have the distributed journal use Kafka as its way of publishing persisted events out to multiple subscribers?

Richard Rodseth

unread,
Nov 8, 2013, 11:42:19 PM11/8/13
to akka...@googlegroups.com, Martin Krasser
Yes. It can also be used like a queue with a single consumer. I'm no Kafka or DDD expert, but I would think Pub/Sub messaging plays a role in sagas or communicating between bounded contexts. But if the Akka folks think Pub/Sub is redundant I'd like to know about it.

Martin Krasser

unread,
Nov 9, 2013, 12:26:05 AM11/9/13
to akka...@googlegroups.com

On 08.11.13 22:03, JUL wrote:
When a processor writes a message/event it can then be pushed to view models with a matching processor id.

Making a distributed journal a full blown pub / sub middleware? That would be cool.

In the same way as journal and snapshot storage backends are pluggable, pub/sub providers should be pluggable as well. So, pub/sub may be provided by the same storage backend that is also used for journaling but may also be something completely different.

In addition to just abstracting over pub/sub providers, akka-persistence makes sure that read-only processors properly hook into live streams, which is not trivial in the presence of failures. This is the main reason why I don't want to put this burden on the application developer (nevertheless, the same pub/sub abstraction can also be used by applications directly, if needed).
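One way such a pluggable provider could look, sketched in plain Scala (the `PubSubProvider` trait is hypothetical, not a proposed akka-persistence interface): a Kafka- or cluster-based provider would implement the same trait as the trivial in-process one below.

```scala
// Hypothetical plugin SPI: pub/sub providers are pluggable in the same way
// journal and snapshot storage backends are.
trait PubSubProvider {
  def subscribe(topic: String)(handler: Any => Unit): Unit
  def publish(topic: String, msg: Any): Unit
}

// Trivial in-process implementation; a Kafka-backed one would share the trait.
class LocalPubSub extends PubSubProvider {
  private val subs = scala.collection.mutable.Map
    .empty[String, List[Any => Unit]].withDefaultValue(Nil)
  def subscribe(topic: String)(handler: Any => Unit): Unit =
    subs(topic) = handler :: subs(topic)
  def publish(topic: String, msg: Any): Unit =
    subs(topic).foreach(_(msg))
}

val bus: PubSubProvider = new LocalPubSub
val seen = scala.collection.mutable.Buffer.empty[Any]
bus.subscribe("processor-1")(seen += _)
bus.publish("processor-1", "evt-1")
bus.publish("processor-2", "evt-2") // different topic, not delivered
```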

This architecture is (to some extent) comparable to the architecture of Datomic:

- A single writer (transactor) makes updates to the storage service (compare to a single processor writing to a journal)
- A peer reads from the storage service during startup/recovery (compare to recovery of a read-only processor reading from the journal)
- A transactor makes live updates to peers (compare to a processor live-updating read-only processors via pub/sub)

I must admit that my knowledge of Datomic is not very deep, so please let me know if you think this comparison is not valid.

Martin Krasser

unread,
Nov 9, 2013, 12:37:19 AM11/9/13
to akka...@googlegroups.com

On 09.11.13 01:09, Richard Rodseth wrote:
You mean like Apache Kafka? I've been reading a lot about Apache Kafka and it seems like the Kafka/Akka combination is a potent one (even without Storm). I don't see Akka persistence taking on the Kafka role, do you?

Kafka could be used as pub/sub plugin (based on what I proposed in this post), and yes, akka-persistence shouldn't be yet another pub/sub solution but rather an abstraction over existing ones.

Martin Krasser

unread,
Nov 9, 2013, 12:53:34 AM11/9/13
to akka...@googlegroups.com

On 09.11.13 05:42, Richard Rodseth wrote:
Yes. It can also be used like a queue with a single consumer. I'm no Kafka or DDD expert, but I would think Pub/Sub messaging plays a role in sagas or communicating between bounded contexts. But if the Akka folks think Pub/Sub is redundant I'd like to know about it.

A pub/sub plugin for akka-persistence could be built on top of what Akka already provides (e.g. distributed pub/sub) but may also be based on 3rd-party pub/sub solutions.

Andrew Easter

unread,
Nov 9, 2013, 11:04:06 AM11/9/13
to akka...@googlegroups.com
It's definitely important to consider that some people may not wish to implement their read models using akka-persistence processors - in some use cases, it wouldn't be practical.

So, I think the pub/sub plugin approach works nicely. For example, if publishing to a Kafka topic, any application (not just an akka based one) could subscribe to live event feeds and project view(s) in whatever way makes sense.

This gives the implementor the choice as to whether to use persistent pub/sub or not (obviously Kafka has this by design). In the case of views that are only projected from a live event stream, this is going to be important so that events aren't missed when the read model consumer goes offline temporarily.

Andrew Easter

unread,
Nov 11, 2013, 10:59:06 AM11/11/13
to akka...@googlegroups.com
Martin,

This discussion has referred to creating a read only processor with the same processor id as a read/write counterpart.

This is nice but, at least in the use case I'm looking at, my read views are interested in receiving updates from multiple processors (each processor representing an aggregate root of a particular type).

I wonder whether there would have to be some flexibility in the way messages are routed from the journal to read-only views. For example, rather than a straight match on processor id, a regular expression could be used. That way, one could add a common prefix (i.e. aggregate root type) to processors of the same type and have a view that defines a regular expression matching on that prefix (ignoring the unique part of the id).
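The prefix/regex idea can be sketched in a few lines of Scala (view names, patterns, and processor ids below are illustrative):

```scala
case class Routed(processorId: String, payload: String)

// Route an event to every view whose pattern matches the originating processor id.
def route(views: Map[String, String], e: Routed): Set[String] =
  views.collect { case (name, pattern) if e.processorId.matches(pattern) => name }.toSet

val views = Map(
  "order-view"    -> "order-.*",    // all processors of aggregate type "order"
  "customer-view" -> "customer-.*") // all processors of aggregate type "customer"

val hit = route(views, Routed("order-42", "OrderPlaced"))
```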

Andrew

Martin Krasser

unread,
Nov 11, 2013, 12:00:06 PM11/11/13
to akka...@googlegroups.com
Hi Andrew,

there's one additional thing to consider when using read models that
receive events from multiple processors: causal dependency of events. If
events from different streams causally depend on each other, one has to
make sure that this causal dependency is preserved when replaying
messages. With akka-persistence this can only be achieved (at the
moment) when using a (read/write) processor (instead of the read model)
that journals the live streams it receives. Upon replay, the events are
then received in the very same order. If akka-persistence were to replay
streams from the logical journals of the source processors, it could not
detect causal dependencies.

From this perspective I'm a bit hesitant to allow a read model to
recover from multiple event streams. Of course, some applications may
define read models for which it doesn't matter if the ordering in the
live stream differs from that in the replayed event stream but this is a
special case.

In order to support the general case properly, akka-persistence would
need to replace sequence numbers with vector clocks (where vector size ==
number of processors), and I doubt that this should be in the scope of
akka-persistence (a framework on top of it could still add that, if
needed). So I recommend using a plain processor in your case.
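For illustration, here is a minimal vector-clock comparison in Scala (a sketch, not an akka-persistence proposal) showing what plain sequence numbers cannot express: two events are causally ordered only if one clock is entry-wise less than or equal to the other; otherwise they are concurrent and either replay order is valid.

```scala
// One clock entry per processor.
type VClock = Map[String, Long]

// a happened-before b iff a <= b entry-wise and a != b.
def happenedBefore(a: VClock, b: VClock): Boolean =
  (a.keySet ++ b.keySet).forall(k => a.getOrElse(k, 0L) <= b.getOrElse(k, 0L)) && a != b

val e1: VClock = Map("p1" -> 1L)               // written by p1
val e2: VClock = Map("p1" -> 1L, "p2" -> 1L)   // p2 had seen e1: causally after e1
val e3: VClock = Map("p3" -> 1L)               // independent of both

val ordered    = happenedBefore(e1, e2)  // must be replayed in this order
val concurrent = !happenedBefore(e1, e3) && !happenedBefore(e3, e1) // any order is fine
```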

Cheers,
Martin

Andrew Easter

unread,
Nov 11, 2013, 12:09:41 PM11/11/13
to akka...@googlegroups.com, kras...@googlemail.com
Hmmm, that's a very good point :-)

Thanks for the insight.

Andrew

ahjohannessen

unread,
Nov 12, 2013, 6:18:44 AM11/12/13
to akka...@googlegroups.com, kras...@googlemail.com
Would it make sense to have read model functionality that made it possible
to aggregate from 1-n processors of the *same* type with different processor ids, e.g.:

pid = claim-earnings-1
pid = claim-earnings-..
pid = claim-earnings-n

In this case there would be no causal dependencies between those streams. It would
be useful for a read model to be able to get the total event stream of all
processors of the same type.


Note: I mention this because I am in a situation where we build up actor
subtree hierarchies that represent unemployment sessions. Each unemployment
session hierarchy might stick around for some months and constitutes a model for
computation and data that is used for calculating benefits.
A read model that could aggregate the streams of processors of the same type would
probably make some things less complicated in this scenario.

Patrik Nordwall

unread,
Nov 12, 2013, 7:08:04 AM11/12/13
to akka...@googlegroups.com, Martin Krasser
On Tue, Nov 12, 2013 at 12:18 PM, ahjohannessen <ahjoha...@gmail.com> wrote:
Would it make sense to have read model functionality that made it possible 
to aggregate from 1-n processors of *same* type with different processor ids, e.g:

pid = claim-earnings-1
pid = claim-earnings-..
pid = claim-earnings-n

In this case there would be no causal dependencies between those streams.

In what order do you expect the events to be replayed? Is it all claim-earnings-1 events followed by all claim-earnings-2...?

What is the advantage over having separate processors for each, that delegate replayed events to an aggregator actor?

/Patrik
 

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://akka.io/faq/
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/groups/opt_out.



--

Patrik Nordwall
Typesafe Reactive apps on the JVM
Twitter: @patriknw

ahjohannessen

unread,
Nov 12, 2013, 12:01:56 PM11/12/13
to akka...@googlegroups.com, Martin Krasser
Hi Patrik,

The order could, in our case, as well be claim-earnings-2 events followed by claim-earnings-1 events.

I am not sure if there is any real advantage other than perhaps less wiring around the producers of events. 

One can probably accomplish the same thing by using channels + proxies that point to 
the aggregator actor or similar.

I am just curious about possible approaches for similar scenarios, e.g. multiple processors of same type
and one global stream for those :)

Patrik Nordwall

unread,
Nov 12, 2013, 2:19:08 PM11/12/13
to akka...@googlegroups.com


12 nov 2013 kl. 18:01 skrev ahjohannessen <ahjoha...@gmail.com>:

Hi Patrik,

The order could, in our case, as well be claim-earnings-2 events followed by claim-earnings-1 events.

Ok, then it can be implemented on top of existing processor building block.
/Patrik

JUL

unread,
Dec 19, 2013, 10:55:23 AM12/19/13
to akka...@googlegroups.com
To complement the discussion, Jay Kreps (the author of Kafka) recently posted a very nice paper on precisely the subject of logs as a means to integrate data between systems:

http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying

It covers the subject of partition, ordering, state recovery of processors, and more.

Martin Krasser

unread,
Dec 29, 2013, 5:22:46 AM12/29/13
to akka...@googlegroups.com
Hi Julien,

this is now covered by this pull request. See also these posts. Should you have any comments please add them directly to the pull request.

Thanks,
Martin

On Thursday, November 7, 2013 at 8:24:46 PM UTC+1, JUL wrote:
Dear community,

When using DDD / Event Sourcing for the nice integration properties between SOA services, a service might need to 'subscribe' to other services' events to be able to process commands. In Akka persistence, this would translate into having one read-write journal (actual service event store), and multiple read-only journals (other services being listened to).

I can see right now that the configuration is supporting only one journal.

How would you implement such a use case? A custom "aggregating" journal routing events from the processors to the right journal? Or is this something that would likely be provided out of the box in the near future?

Thank you,

Julien

Ashley Aitken

unread,
Jun 18, 2014, 11:20:57 AM6/18/14
to akka...@googlegroups.com

Hi Patrik,


On Wednesday, 13 November 2013 03:19:08 UTC+8, Patrik Nordwall wrote:

The order could, in our case, as well be claim-earnings-2 events followed by claim-earnings-1 events.

Ok, then it can be implemented on top of existing processor building block.

Can you please give me a quick idea of how this could be implemented effectively now (i.e. a View with a known set of Processors to follow)?

I am thinking of how to create a denormalised view for, say, a Department that contains Staff. If the DepartmentView follows the DepartmentESP then it has a list of updated Staff processor ids.

Are you suggesting that the DepartmentView will message each Staff member's ESP or View to get their state? Will it do this every n seconds as the DepartmentView is updated? Or something else?

Sorry to keep on going about this - I know Akka Streams may be the answer for this in the near future but I'm just keen to know what the alternative approaches are (e.g. using an event bus, publish-subscribe).

Thanks,
Ashley.

Akka Team

unread,
Jun 20, 2014, 4:39:46 AM6/20/14
to Akka User List
Hi Ashley


On Wed, Jun 18, 2014 at 5:20 PM, Ashley Aitken <amai...@gmail.com> wrote:

Hi Patrik,


On Wednesday, 13 November 2013 03:19:08 UTC+8, Patrik Nordwall wrote:

The order could, in our case, as well be claim-earnings-2 events followed by claim-earnings-1 events.

Ok, then it can be implemented on top of existing processor building block.

Can you please give me a quick idea of how this could be implemented effectively now (i.e. a View with a know set of Processors to follow)?

A View corresponds to one processorId, but you can, for example, have an aggregate actor that has two Views as children, receiving updates from both (and supervising them) and doing aggregation. Of course, you should be able to collate messages coming from the two views to avoid inconsistencies.

-Endre
 

I am thinking of a how to create a denormalised view for say a Department that contains Staff.  If the DepartmentView follows the DepartmentESP then it has a list of updated Staff processor ids.  

Are you suggesting that the DepartmentView will message each Staff member's ESP or View to get their state? Will it do this every n seconds as the DepartmentView is updated. Or something else? 

Sorry to keep on going about this - I know Akka Streams may be the answer for this in the near future but I'm just keen to know what the alternative approaches are (e.g. using an event bus, publish-subscribe).

Thanks,
Ashley.




--
Akka Team
Typesafe - The software stack for applications that scale
Blog: letitcrash.com
Twitter: @akkateam

Ashley Aitken

unread,
Jun 22, 2014, 12:41:04 PM6/22/14
to akka...@googlegroups.com

Thank you for your post Endre.

On Friday, 20 June 2014 16:39:46 UTC+8, Akka Team wrote:
Can you please give me a quick idea of how this could be implemented effectively now (i.e. a View with a known set of Processors to follow)?

A View corresponds to one processorId but you can have for example an aggregate actor that has two Views as children, receiving updates from both (and supervising them) and doing aggregation. Of course you should be able to collate messages coming from the two view to avoid inconsistencies.

This sounds like it may be ok for one DenormalisedView with a few Views as children actors.  However, there are some further constraints that I feel don't fit so well:

1. What happens when there are a very large number of DVs, each based on hundreds of Vs? And what if different DVs share the same Vs? This seems like a lot of actor infrastructure just to create denormalised views, especially since an actor can only have one parent (AFAIK), so there would be many duplicate Vs.

2. You suggest the V actors would message their parent DV actor, so this suggests to me that all the Vs (and DVs) have to be active all the time, polling for changes to their processor. With a very large number of DVs with hundreds of Vs each, this seems an inefficient way to maintain denormalised views.

Please correct me if I am misunderstanding something (which could be very likely).

Of course, the read model is all about queries so perhaps we could drive it from the queries rather than the other way around.  When a query comes for a DV (or V) we could construct the actor hierarchy (DV and Vs) to update the denormalised view but this would need to be done for every query.

It seems to me the best (and most common) way to implement this effectively is to make it event-based, so that a change in a processor signals an update to a view (cached somewhere), which then signals an update to one or more denormalised views (cached somewhere).

Queries can just read simply from an up-to-date view or denormalised view which could be stored in V or DV snapshots or even a (NOSQL) database.  I don't see how this can be done with the current Akka Persistence.

The idea of using Akka Streams and Martin's StreamExample look interesting. However, it again seems like infrastructure that needs to be set up for each DV (of which there may be a very large number, each with hundreds of Vs).

We also need to be able to recreate the denormalised views (of which there could be a large number) at any time (e.g. due to corruption of a DV or to implement a new DV). I am not sure if Akka Streams will enable this, i.e. to replay all events from all the producers of a stream?

I am sorry but I am really struggling to see how the current Akka Persistence will effectively and efficiently work on the read side of CQRS (except in the case of Views based on a single Processor).  Remembering that some read models are meant to be a sort of query cache.

Again, I hope this makes senses and apologise if it doesn't due to my misunderstanding CQRS or Akka / Persistence.

Cheers,
Ashley.


Patrik Nordwall

unread,
Jun 23, 2014, 2:22:46 AM6/23/14
to akka...@googlegroups.com
Hi Ashley,

We need something in the journal to make this efficient. Not sure if it is pulling from multiple PersistentActor (new name of Processor and EventsourcedProcessor) or if it is pushing events from the journal (or a combination).

However, the way I would implement it right now is as follows. Let's use your domain Department and Staff.

Staff extends PersistentActor. When a Staff actor is updated it publishes a notification event to the "department" DistributedPubSub topic.
This is also done after recovery and periodically with a low frequency (e.g. once per minute). The reason for the periodic notification is that those messages may be lost.

On the read side we have Department extends PersistentActor, which subscribes to the "department" topic. When it receives a notification it creates a child actor StaffView extends View associated with the Staff persistentActorId, if it has not already been created. The StaffView has autoUpdate turned off. Update of the StaffView is triggered by the Department actor via the "department" notifications. The Department actor can adjust the update rate. StaffView receives the events of the Staff and forwards interesting (transformed) events to the parent Department actor, which holds an aggregated read model of all Staff actors. Department also stores all persistentActorIds, so that it can re-create the child StaffView actors during its recovery.
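The flow described above can be modelled in a few lines of plain Scala (no Akka API; the journal, notification callback, and cursor bookkeeping are all illustrative). The per-staff cursor plays the role of the StaffView, and calling `onNotification` periodically is safe because it only consumes events it has not yet seen:

```scala
import scala.collection.mutable

// Hypothetical shared journal: persistentActorId -> events.
val journal = mutable.Map.empty[String, mutable.Buffer[String]]

// Staff side: persist the event first, then publish an id notification.
def staffUpdate(id: String, event: String, notify: String => Unit): Unit = {
  journal.getOrElseUpdate(id, mutable.Buffer.empty) += event
  notify(id)
}

// Department side: one cursor ("StaffView") per staff id, pulling new events.
class Department {
  private val cursors = mutable.Map.empty[String, Int]
  val readModel = mutable.Map.empty[String, List[String]]

  // Called for each notification on the "department" topic; also safe to call
  // periodically, since notifications may be lost.
  def onNotification(id: String): Unit = {
    val from = cursors.getOrElse(id, 0)
    val events = journal.getOrElse(id, mutable.Buffer.empty).drop(from)
    cursors(id) = from + events.size
    readModel(id) = readModel.getOrElse(id, Nil) ++ events
  }
}

val dept = new Department
staffUpdate("staff-1", "NameChanged", dept.onNotification)
staffUpdate("staff-1", "SalaryRaised", dept.onNotification)
staffUpdate("staff-2", "Hired", dept.onNotification)
```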

This is not efficient or scalable, but it should work for small systems. For bigger systems I would use Kafka, or implement ticket https://github.com/akka/akka/issues/15004.

Cheers,
Patrik



