-- Martin Krasser blog: http://krasserm.blogspot.com code: http://github.com/krasserm twitter: http://twitter.com/mrt1nz
In your summary, the only query command type pre-defined in akka-persistence is QueryByPersistenceId. I'd find it useful to further pre-define other query command types in akka-persistence to cover the most common use cases, such as:
- QueryByStreamDeterministic(name, from, to) (as a generalization of QueryKafkaTopic, ... and maybe also QueryByPersistenceId)
- QueryByTypeDeterministic(type, from, to)
- QueryByStream(name, fromTime)
- QueryByType(type, fromTime)
Supporting these commands would still be optional but it would give better guidance for plugin developers which queries to support and, more importantly, make it easier for applications to switch from one plugin to another. Other, more specialized queries would still remain plugin-specific such as QueryByProperty, QueryDynamic(queryString), etc ...
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
Hi Martin and Alex,
the point you raise is a good one, I did not include it in my first email because defining these common (and thereby de-facto required) queries is not a simple task: we should not include something that most journals will opt out of, and what we pick must be of general interest because it presents a burden to every journal implementor—at least morally.
My reasoning for keeping persistenceIds and arbitrary named streams separate is that persistenceIds must be supported in a fully linearizable fashion whereas named streams do not necessarily have this requirement; on the write-side it might make sense to specify that persistenceIds are only ever written to from one actor at a time, which potentially simplifies the Journal implementation. This also does not hold for named streams.
Concerning types I assume that there is an implicit expectation that they work like types in Java: when I persist a ShoppingCartCreated event I want to see it in the stream for all ShoppingCartEvents. This means that the Journal needs to understand subtype relationships that in turn have to be lifted from the programming language used. It might be that this is possible, but at least it is not trivial. Is it reasonable to expect most Journal implementations to support this?
One query that we might want to include generically is the ability to ask for the merged streams of multiple persistenceIds—be that deterministic or not.
Another thought: there should probably be a trait JournalQuery from which the discussed case classes inherit, and we should specify that the Journal is obliged to reply to all such requests, explicitly denying those it does not implement.
Concerning types I assume that there is an implicit expectation that they work like types in Java: when I persist a ShoppingCartCreated event I want to see it in the stream for all ShoppingCartEvents. This means that the Journal needs to understand subtype relationships that in turn have to be lifted from the programming language used. It might be that this is possible, but at least it is not trivial. Is it reasonable to expect most Journal implementations to support this?
Hi Roland,
It's a good summary. As far as I can see it looks very complete and allows for a lot of flexibility between scalability and ease of use.
I will add that I am interested to see Martin's causal consistency with zero application-specific hints or code (unless its just obvious previous sequence as back pointer.)
Best,
Vaughn
Hi Ashley,
Personally I'm having a little trouble understanding how your three points differ much. Couldn't you extend PersistentActor as something like TopicEnricher and make it's enrich method call persist? I see your need, but I think it's enough out of the scope of ES that it is probably more application specific.
I do think that persisting metadata with events could benefit from an explicit parameter, so I think that's worth considering as part of the API.
Best,
Vaughn
Personally I'm having a little trouble understanding how your three points differ much.
Couldn't you extend PersistentActor as something like TopicEnricher and make it's enrich method call persist? I see your need, but I think it's enough out of the scope of ES that it is probably more application specific.
Most systems that store state say ORM and publish events are just really broken (in a subtle way). They have the problem of two sources of truth (what if they disagree?)
Hi Roland,
sounds great that you are pushing for the whole CQRS story.
I'm just experimenting with akka and CQRS and have no production experience, but I'm thinking about the concepts since some time. So please take my comments with a big grain of salt and forgive me for making it sound somewhat like a wish list. But I think if there is a time for wishes, it might be now.
I feel the need to distinguish a little more between commands, querys, reads and writes.
In a CQRS setup, I (conceptually) see three parts:
(1) the Command side: mainly eventsourced PersistentActors that build little islands of consistency for changing the application state
(2) the link between the Command and Query side: the possibility to make use of all or a subset of all events/messages written in (1) for building an optimized query side (3) or even for updating other islands of consistency on the Command side (other aggregates or bounded contexts) in (1)
(3) the Query side: keeping an eventually consistent view of the application state in any form that is suitable for fast application queries
From the 10000-foot view, we write to (1) and read from (3) but each of (1),(2) and (3) has its own reads and writes:
(1) writes: -messages produced by the application to be persisted
reads: -consistent read of the persisted messages for actor replay
-stream of all messages (I think, that's what you mean by SuperscalableTopic)
(2) writes: -at least conceptually: all messages of the all-messages stream of (1)
reads: -different subsets of the all-messages stream that make sense to different parts of our application
(3) writes: -any query-optimized form of our data that was read of some sub-stream of (2)
reads: -whatever query the query-side datastore allows (SQL, fulltext searches, graph walks etc.)
While (1) is the current akka persistence implementation plus a way to get all messages, (2) is more like the Event Bus (though I would name it differently) in this picture from the axonframework documentation (http://www.axonframework.org/docs/2.0/images/detailed-architecture-overview.png). (1) and (2) could be done by one product like what eventstore does with its projections or it could be different products like Cassandra for (1) and Kafka for (2). (3) could be anything that holds data.
Some more detail:
On (1):
As said before, the command side is fine as it is today to put messages into the datastore and get them out again for persistent actors. I definitely would consider replaying of messages for persistent actors part of the command side, since the command side in itself has stronger consistency requirements (consistentency within an aggregate) than the query side.
Additionally, as Ashley wrote, any actor should be able to push messages to the all-messages stream. In contrast to persistent actors I dont' see any need for replay here. Therefore and for other reasons (like message deduplication in (2)) I would like to propose adding an extra unique ID (UUID?) for any message handled by the command side, independent of an actors' persistenceId (which would be needed for replays nevertheless).
Also, I see the need to provide some guarantees for the all-messages stream. I would consider an ordering guarantee for messages from the same actor and an otherwise (at least roughly) timestamp based sorting a good compromise. This would also be comparable to the guarantees that akka provides for message sending. Ideally, the order stays the same for repeated reads of the all-messages stream. With the guarantees mentioned before, if the datastore keeps all messages of an actor on the same node, the all-messages stream could even be created per datastore node.
On (2):
As mentioned above, I see the QueryByPersistenceId as part of (1) as it requires stronger consistency guarantees. All other QueryByWhatever are all about the question, how to retrieve the right subset of messages from the all-messages stream for the application and its domain. This of course differs by application and domain. Therefore I like Martin's QueryByStream(name, ...), where a stream is any subset of messages the application cares about.
I also think it should not be up to the datastore to decide what streams to offer. I also can't really imagine how this should work in most datastores. While there might be some named index on top of JSON messages in MongoDB that can be served as as stream, I don't see how to create a stream/view/index in Key-Value stores or RDBMS where a message is probably persisted as byte array whithout any knowledge of the application.
To tell the datastore what streams to offer, I would consider something like projections in eventstore or user-defined topics in Martin's kafka-persistence that are driven by the application. In the simplest form it could be a projection function like
Message => Seq[String]
that is applied to each message of the all-messages stream, which taskes a message and gives back Strings of sub-stream-IDs. This could be anything from the type-name to the persistenceId of an actor to a property of the message. So it feels a little like "tagging on the read side" or more precisely "tagging on the way from the command to the query side". New messages should be added to the sub-streams as they arrive in the all-messages stream. This could probably be done in any datastore that can keep another index based on message IDs (if IDs are added on the command side) and is adjustable with application needs.
The interface for (2) could allow new projections that could be run against the all-messages stream or any sub-stream at any later time. A PersistentView could just track one of the projected sub-streams.
For datastore implementations this would bring the minimum requirement to construct sub-streams as told by the application and serve streams by name.
On (3):
On the query side of the application I see anything, that holds a view of the data written on the command side and allows querying/consuming by some criteria. As you wrote in your post, the variety of datastores is huge and I'm not sure if the set of common queries is very big. I like the idea of ad-hoc querying you proposed as it is a much more messaging like way to query datastores but I see disperate sets of standardized queries per datastore type (graph dbs, rdbms, key-value etc.) that focus on the datatsore conceps (tables and rows in rdbms, nodes and edges in graph dbs - something like QueryById(id, table) for RDBMS etc.) and not on messages. I would very much like to see this kind of datastore querying but I don't think it's neccessary for CQRS setups with akka.
As a user of a full akka-cqrs-solution the points to provide application logic could be
- which messages to persist in (1)
- which projections to make for the application in (2)
- which project streams to consume from (2) in order to trigger anything from query-side updates to creation of new commands for other aggregates to updating a view in someone's browser
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
Attempting a second round-up of what shall go into tickets, in addition to my first summary we need to:
- predefine trait JournalQuery with minimal semantics (to make the Journal support discoverable at runtime)
- predefine queries for named streams since that is universally useful; these are separate from PersistenceID queries due to different consistency requirements
- add support for write-side tags (see below)
- add a comprehensive PersistenceTestKit which supports the fabrication of arbitrary event streams for both PersistentActor and read-side verification
Ashley, your challenge about considering non-ES write-sides is one that I think we might not take up: the scope of Akka Persistence is to support persistent Actors and their interactions, therefore I believe we should be opinionated about how we achieve that. If you want to use CQRS without ES then Akka might just not be for you (for some values of “you”, not necessarily you ;-) ).
Now why tags? My previous conclusion was that burdening the write-side with generating them goes counter to the spirit of ES in that this tagging should well be possible after the fact. The problem is that that can be extremely costly, so spawning a particular query on the read-side should not implicitly replay all events of all time, that has the potential of bringing down the whole system due to overload. I still think that stores might want to offer this feature under the covers—i.e. not accessible via Akka Persistence standard APIs—but for those that cannot we need to provide something else. The most prominent use of tags will probably be that each kind of PersistentActor has its own tag, solving the type issue as well (as brought up by Alex and Olger). In summary, write-side tags are just an optimization.
Concerning the ability to publish to arbitrary topics from any Actor I am on the fence: this is a powerful feature that can be quite a burden to implement. What we are defining here is—again—Akka Persistence, meaning that all things that are journaled are intended to stay there eternally. Using this to realize a (usually ephemeral) event bus is probably going to suffer from impedance mismatches, as witnessed by previous questions concerning the efficiency of deleting log entries—something that should really not be done in the intended use-cases. So, if an Actor wants to persist an Event to make it part of the journaled event stream, then I’d argue that that Actor is at least conceptually a PersistentActor. What is wrong with requiring it to be one also in practice? The only thing that we might want to add is that recovery (i.e. the write-side reading of events) can be opted out of. Thoughts?
On 05.09.14 09:09, Roland Kuhn wrote:
Attempting a second round-up of what shall go into tickets, in addition to my first summary we need to:
- predefine trait JournalQuery with minimal semantics (to make the Journal support discoverable at runtime)
- predefine queries for named streams since that is universally useful; these are separate from PersistenceID queries due to different consistency requirements
- add support for write-side tags (see below)
- add a comprehensive PersistenceTestKit which supports the fabrication of arbitrary event streams for both PersistentActor and read-side verification
Ashley, your challenge about considering non-ES write-sides is one that I think we might not take up: the scope of Akka Persistence is to support persistent Actors and their interactions, therefore I believe we should be opinionated about how we achieve that. If you want to use CQRS without ES then Akka might just not be for you (for some values of “you”, not necessarily you ;-) ).
Now why tags? My previous conclusion was that burdening the write-side with generating them goes counter to the spirit of ES in that this tagging should well be possible after the fact. The problem is that that can be extremely costly, so spawning a particular query on the read-side should not implicitly replay all events of all time, that has the potential of bringing down the whole system due to overload. I still think that stores might want to offer this feature under the covers—i.e. not accessible via Akka Persistence standard APIs—but for those that cannot we need to provide something else. The most prominent use of tags will probably be that each kind of PersistentActor has its own tag, solving the type issue as well (as brought up by Alex and Olger). In summary, write-side tags are just an optimization.
Concerning the ability to publish to arbitrary topics from any Actor I am on the fence: this is a powerful feature that can be quite a burden to implement. What we are defining here is—again—Akka Persistence, meaning that all things that are journaled are intended to stay there eternally. Using this to realize a (usually ephemeral) event bus is probably going to suffer from impedance mismatches, as witnessed by previous questions concerning the efficiency of deleting log entries—something that should really not be done in the intended use-cases. So, if an Actor wants to persist an Event to make it part of the journaled event stream, then I’d argue that that Actor is at least conceptually a PersistentActor. What is wrong with requiring it to be one also in practice? The only thing that we might want to add is that recovery (i.e. the write-side reading of events) can be opted out of. Thoughts?
Already doable with:
override def preStart(): Unit = {
self ! Recover(fromSnapshot = SnapshotSelectionCriteria.None, replayMax = 0L)
}
but maybe you were more thinking about different traits for writing and recovery (?)
Usually it comes down to the realization that the computer is not the book of record. One of my favourites was being asked to build a fully consistent inventory system. I generally like to approach things with questions, the one i had was 'sure but how do we get people who are stealing stuff to appropriately check it out?'
You received this message because you are subscribed to a topic in the Google Groups "Akka User List" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/akka-user/MNDc9cVG1To/unsubscribe.
To unsubscribe from this group and all its topics, send an email to akka-user+...@googlegroups.com.
Usually it comes down to the realization that the computer is not the book of record. One of my favourites was being asked to build a fully consistent inventory system. I generally like to approach things with questions, the one i had was 'sure but how do we get people who are stealing stuff to appropriately check it out?'
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
Hi Murali,
I started development of an application based on Akka Persistence to implement CQRS concepts about a year ago. A lot of ideas came from different topics in this group. Recently I started to extract a small library from this application. The approach I took is to redundantly store all events in a global persistent actor in order to recreate views in a journal agnostic way. This also allows for easy in-memory testing.
There is a minor risk, in that storing the event twice breaks consistency. The globally stored events I only use for (re)constructing views. Once there is a better solution, the global persistent actor can be deleted and the views can be reconstructed in a different manner.
Even though this solution is not perfect, it might help your use case. I will add more tests and documentation over time as well, since currently, all tests remain in the application code.
https://github.com/Product-Foundry/akka-cqrs
Hope this helps,
Andre
case class QueryByPersistenceId(id: String, fromSeqNr: Long, toSeqNr: Long)
case class EventStreamOffer(metadata: Metadata, stream: Flow[PersistentMsg])
/Magnus
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
You received this message because you are subscribed to a topic in the Google Groups "Akka User List" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/akka-user/MNDc9cVG1To/unsubscribe.
To unsubscribe from this group and all its topics, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
Greg, I agree with you there. I do not disagree with convenience. :)
But where are different kinds of convenience:
As a devops person I want to minimize shared state in order to have pieces of software that can fail and start independently. I want to sleep at night and have free weekends.
So the questions I hid in my text was:
What is the motivation to solve your problem in akka-persistence instead of in your application code?
Magnus
On Fri, May 8, 2015 at 1:07 PM, Alejandro López <ale6...@gmail.com> wrote:
Right now, what would be the best alternative in order to create projections, allowing views to subscribe to arbitrary events as described by Roland above? (or at least a coarse approximation)
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to a topic in the Google Groups "Akka User List" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/akka-user/MNDc9cVG1To/unsubscribe.
To unsubscribe from this group and all its topics, send an email to akka-user+...@googlegroups.com.
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
My upcoming book "Reactive Messaging Patterns with the Actor Model" discusses the use of PersistentView to support a Durable Subsriber.