Event Enrichment: Write Model + Integration Model + Read Model ?


Forat Latif

Mar 22, 2014, 3:54:40 PM
to ddd...@googlegroups.com
Hi everyone, 

I am building a system with a read model and a write model, each in a different BC, but in some cases the read BC doesn't get all the data it needs from the domain event sent by the write BC.

This data is not present in the aggregate that fires the event; it is spread across multiple aggregates.

AFAIK I have two options:

1) Enrich the event on the write side before sending it to the read side. To do that there are two ways (AFAIK):
  1.1)  Create a data structure in the write BC that is eventually consistent with the aggregates (updated by a custom event handler) and contains the information I need; the "event enricher" would use it to enrich the events before sending them to the read side.
  1.2)  Add getter methods to some of the aggregates; the "event enricher" would load the other aggregates it needs and use the getters to enrich the event before sending it to the read side.

2) Don't enrich the event on the write side. Instead, create a data structure on the read side that contains all the extra information needed to build its views, and use it together with the incoming event during denormalization. This is basically the same as 1.1, but with the data structure on the read side.
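Option 2 could be sketched roughly like this (a hypothetical illustration; the class and event names are made up for this example, not from any real framework):

```python
# Sketch of option 2: the read side keeps its own lookup (here a plain
# dict) fed by other events, and consults it while denormalizing.

class CourseDetailsProjection:
    def __init__(self):
        self.student_names = {}   # extra data structure on the read side
        self.views = {}           # course_id -> denormalized view

    def on_student_registered(self, event):
        # keep the lookup eventually consistent with the write side
        self.student_names[event["student_id"]] = event["name"]

    def on_student_enrolled(self, event):
        view = self.views.setdefault(event["course_id"], {"participants": []})
        # enrich during denormalization using the local lookup
        name = self.student_names.get(event["student_id"], "<unknown>")
        view["participants"].append(name)
```

Note the fallback for a student the lookup hasn't seen yet: event ordering across streams isn't guaranteed, so the projection has to tolerate a missing entry.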


I like option 1.1 best, but it involves creating yet another model to maintain (which can lead to more pain and suffering). On the other hand it is easy to generate and disposable, just like the read model; it is a kind of "integration model" that lets the write BC send events with the information their consumers need.

Has anyone solved this problem in a better way?

Thanks

PS: I can post a simple (real-world) example of this, but I'll do that later if necessary.


@yreynhout

Mar 23, 2014, 7:55:07 PM
to ddd...@googlegroups.com
Whatever solution you choose, accept the inherent staleness of the information in the other aggregates (one option is just more or less likely to be stale than another). If the book of record is "out there in the real world", this is even more likely.
Often I enrich for a different purpose: to make events more readable and descriptive. I don't know about you, but memorizing GUIDs is not my cup of tea. Now, when I do that, I know there is a chance the information might be stale. I accept that. Sometimes it might not be acceptable, but most of the time it is (we've had this discussion before on this forum). There are pros and cons to doing such a thing; advice like "always do this" and "never do that" does not apply here IMO.
There is an uneasy feeling you'll get when your read model starts dictating what should be in your events. Feedback like that might indicate you're not capturing enough and that more/better analysis is required, not "let's just slap it onto this event". "Why?" is a good question here (when is it ever not a good question?).
You should be able to reason about your write model pretty much in isolation (otherwise, what would be the point of investing in it in the first place?).

Regards,
Yves.

peter....@gmail.com

Mar 24, 2014, 2:58:13 AM
to ddd...@googlegroups.com
I would advise against 2, as it makes replay very difficult (you'd need to either rebuild one read model at a time in the correct order, or use a single writer).

In these cases I've usually split the data into two read models, and joined them when they're used.
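Peter's split-then-join approach might look something like this sketch (the two "read models" are just dicts here, and all names are hypothetical):

```python
# Two separate read models, each fed by its own event stream; the join
# happens only at query time, when the view is actually requested.

enrollments = {}    # read model 1: course_id -> list of student_ids
student_names = {}  # read model 2: student_id -> name

def on_student_enrolled(event):
    enrollments.setdefault(event["course_id"], []).append(event["student_id"])

def on_student_registered(event):
    student_names[event["student_id"]] = event["name"]

def course_participants(course_id):
    # the "join when they're used" step
    return [student_names.get(sid, "<unknown>")
            for sid in enrollments.get(course_id, [])]
```

Because each model consumes only its own events, either one can be replayed independently, which is exactly the replay problem Peter raises with option 2.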

/Peter

--
You received this message because you are subscribed to the Google Groups "DDD/CQRS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dddcqrs+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mirko Bonadei

Mar 24, 2014, 4:04:30 AM
to ddd...@googlegroups.com
Hi Forat.

I am in the middle of a solution for a similar problem. I have decided that it is not good to pollute the events of a bounded context with information needed by a read model, because since I can potentially have N read models this could become a mess very quickly.

So I am creating a log with all the events, and building indexes on that log so the read models can retrieve events easily. At that point I unroll the events into the read models, which have their own complexity: they can merge two events into one denormalized document and do other complex things that are not the responsibility of any bounded context.

I don't know if this is the final solution, but so far it seems clean.

Yes, post your example if you have time.

Cheers,

Mirko

Nicolò Pignatelli

Mar 25, 2014, 5:06:07 AM
to ddd...@googlegroups.com
Hi Forat,

it would probably be better to discuss this against an example. If your aggregate can't provide all the information the read model needs, maybe there is something wrong with one of your models (read, write, or both).

Let us know :)

Forat Latif

Mar 26, 2014, 5:41:06 PM
to ddd...@googlegroups.com
Thanks guys, very interesting replies.

I think a good solution, if you have to merge data from many aggregates, is to have a small separate bounded context (an "integration context") that enriches the events for consumers. It is in essence what Mirko (the event-merging logic) and Peter (one of the read models) do, although their strategies differ.

Whether you consider this a separate BC or part of the read context (if you have a single read model) is irrelevant to the discussion, because the idea is basically the same. But I prefer to call it what it is: a model, in a separate BC, whose purpose is integration between BCs. If you have multiple consumers, they don't dictate what you put in your write model, only how you enrich your events in this separate integration context.

This can also help you to deal with versioning of your events.

I think it's better to discuss this with a very simplified example and list the different solutions:

I have the aggregates Course and Student (a classic example). If I want to enroll a student in a course, I have the following command method on the Course aggregate:

enrollStudent(Guid studentId), which publishes the following event: StudentEnrolled(Guid courseId, Guid studentId)

In the details view of the course I need the student's name in the list of participants.
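For concreteness, the aggregate in this example could be sketched as follows (a minimal Python illustration of the signatures above, using plain strings in place of Guids):

```python
# The command method only knows the studentId, so the event it publishes
# cannot carry the student's name; that is the whole enrichment problem.

class Course:
    def __init__(self, course_id):
        self.course_id = course_id
        self.enrolled = set()
        self.pending_events = []

    def enroll_student(self, student_id):
        if student_id in self.enrolled:
            return  # already enrolled; publish nothing
        self.enrolled.add(student_id)
        self.pending_events.append(
            {"type": "StudentEnrolled",
             "course_id": self.course_id,
             "student_id": student_id})
```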

From your replies we can discuss two solutions:

1) The "integration model" I was talking about:

EnrollStudentCommand -> Course -> StudentEnrolled -> EventEnricher -> StudentEnrolled (with student name)  -> Read Model

The event enricher is the component that uses the integration model, which in this case is a simple table with studentId and studentName.


The EnrichUserDisabled class is the event enricher that uses the internal map (which I presume is populated by listening to userAdded and usernameChanged events, for example).
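Since the EnrichUserDisabled code itself isn't included above, here is a hypothetical enricher of the same shape, adapted to the StudentEnrolled event of this example (all names and event shapes are assumptions):

```python
# An event enricher sitting between the write side and the read side.
# Its internal map is the "integration model": a studentId -> name
# lookup kept eventually consistent by listening to student events.

class StudentEnrolledEnricher:
    def __init__(self, publish):
        self.publish = publish   # callback that forwards to the read side
        self.names = {}          # the integration model

    def on_student_registered(self, event):
        self.names[event["student_id"]] = event["name"]

    def on_student_name_changed(self, event):
        self.names[event["student_id"]] = event["name"]

    def on_student_enrolled(self, event):
        # fatten the event before it leaves the write BC
        enriched = dict(event, student_name=self.names.get(
            event["student_id"], "<unknown>"))
        self.publish(enriched)
```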

2) Mirko's solution, if I am not wrong, is:

EnrollStudentCommand -> Course -> StudentEnrolled -> Read Model -> Reads other events and builds its view model with them.

I totally agree with Yves: there is probably no rule of thumb here (I am sure there are more solutions, and you can put the in-between model anywhere), and staleness is unavoidable but acceptable in most cases (it certainly is in mine). I think all we can say is that polluting the write model is not the way to go.

In some cases it could be that the aggregate boundaries are wrong, as Nicolò points out, but that's certainly not the case in this example.

In this simple example, solution 1 is probably the simplest (I am not familiar with solution 2; maybe Mirko can tell us which one he would use in this case).

Thanks again, cheers




Mirko Bonadei

Mar 27, 2014, 3:38:49 PM
to ddd...@googlegroups.com
Hi Forat, yes, the key point here is "I think all we can say is that polluting the write model is not the way to go".

On the example: yes, the flow is roughly this:
EnrollStudentCommand -> Course -> StudentEnrolled -> Read Model -> Reads other events and builds its view model with them.

But "read model" is not a good term here. There are lots of read models which can react to the StudentEnrolled event, and each of them will update itself according to its own logic. Maybe this flow is better for the explanation:

EnrollStudentCommand -> Course -> StudentEnrolled -> Event Store -> Read Model 1
                                                                 -> Read Model 2
                                                                 -> ...
                                                                 -> Read Model N

Maybe read model 1 is for the advertising people, with their own ubiquitous language and their own meaning for the events, while read model N shares your ubiquitous language and simply aggregates data in a smart way.

The events emitted by my aggregates are "fat", so reading an event in a human way I get: "At time YYYY-MM-DD HH:II:SS the student <GUID> was enrolled in the course <GUID>". So this example is quite straightforward.

This is why my solution is probably more complex but more flexible (and that was a requirement for me). Give me some time tomorrow and on Saturday morning, and I will write up a valuable example to share in this interesting thread.

Cheers,

Mirko

Mirko Bonadei

Mar 29, 2014, 4:40:03 PM
to ddd...@googlegroups.com
Here is the example.

Suppose we are dealing with the previous situation, but we are not a university: we work for a company that sells courses online. We have an advertising team that tries to sell through different channels (say, with A/B tests) and needs specific reports to know which selling channel is the best.

We have different bounded contexts, one of which manages the specific domain of the courses. In this BC we have the situation where the EnrollStudent command hits the Course aggregate, and when the enrollment is completed it triggers the StudentEnrolled event.

The event is in this form:

StudentEnrolled: {
    "conversation_id": "ee6bdf26-e524-4220-8c46-556f71274dc0",
    "course_id": "91b3077a-77ee-4193-a7b9-05bccf5f458b",
    "course_title": "Structure and Interpretation of Computer Programs",
    "student_id": "008ff4c4-5bfd-43a3-810f-52f0e75b5a4e",
    "student_name": "John Doe",
    "at": "2014-03-29T21:59:29Z"
}


But how can we track the acquisition channel?

For this we have a different BC, where we tackle the complexity of presenting our products, for example with different websites, etc.

In this BC we can have a WebSite aggregate that can be hit by a command such as RequestStudentEnrollment and emits a StudentEnrollmentRequested event containing all the information sent by the command. For example:

StudentEnrollmentRequested: {
    "conversation_id": "ee6bdf26-e524-4220-8c46-556f71274dc0",
    "course_id": "91b3077a-77ee-4193-a7b9-05bccf5f458b",
    "student_id": "008ff4c4-5bfd-43a3-810f-52f0e75b5a4e",
    "student_name": "John Doe",
    "channel": "Web Site with green background",
    "at": "2014-03-29T21:59:10Z"
}


I know the channel here is really poor, but I have similar real examples. :-)

So I decided to put this information not in the BC where I implement all the complexity that regards courses, but in the BC where I implement the complexity of the presentation and the acquisition of students.

How can we generate a report with the following information?

time: 2014-03-29T21:59:29Z
student_id: 008ff4c4-5bfd-43a3-810f-52f0e75b5a4e
student_name: John Doe
course_id: 91b3077a-77ee-4193-a7b9-05bccf5f458b
course_title: Structure and Interpretation of Computer Programs
channel: Web Site with green background


It is clear that we have pieces of information here and there, and we need something able to generate the requested read model from the event log of the system.
So suppose we have a really long event log and a way to manage it optimally (for example with indexes, caching for immutable data, etc.).

We can unroll the log and feed each event into the logic of our new read model. In our case that logic should correlate events using the "conversation_id" and generate our report by merging information from two events: StudentEnrollmentRequested and StudentEnrolled.
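That correlation step could be sketched roughly like this (a simplified, hypothetical illustration that scans the log in one pass; a real implementation would stream and index it as described above):

```python
# Unroll the event log and join the two event types on conversation_id
# to build the advertising report, one row per completed enrollment.

def build_report(event_log):
    requested = {}  # conversation_id -> StudentEnrollmentRequested event
    report = []
    for event in event_log:
        if event["type"] == "StudentEnrollmentRequested":
            requested[event["conversation_id"]] = event
        elif event["type"] == "StudentEnrolled":
            req = requested.get(event["conversation_id"])
            if req is None:
                continue  # enrollment didn't come through a tracked channel
            report.append({
                "time": event["at"],
                "student_id": event["student_id"],
                "student_name": event["student_name"],
                "course_id": event["course_id"],
                "course_title": event["course_title"],
                "channel": req["channel"],
            })
    return report
```

This works because both BCs stamp the same conversation_id on their events, so the report logic needs no knowledge of either BC's internals, only of the two event shapes.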

This gives us the flexibility needed to help the advertising team. Reports are to them what the red/green light in TDD is to us, so being able to merge, rename, and manipulate our data in an efficient way is a win, and with this kind of data flow it seems possible.

What do you think?

Mirko
