Category mappings - Event Sourcing or CRUD?


Mark Bernman

Aug 16, 2017, 3:45:20 AM
to DDD/CQRS
I'm trying to model the following in an event sourced way but running into some issues:

A Book can be assigned a number of categories (Many-to-Many between Book and Category)
A Vendor can be assigned a number of categories (Many-to-Many between Vendor and Category)
A Book can be assigned to only one Vendor

Expected Data Volume
Book - Millions
Vendor - Tens of thousands
Category - Thousands

When I want to assign a Book to a Vendor, I need to check that the intersection of the sets of categories of the Book and the Vendor is non-empty, i.e. the Book and Vendor share at least one Category.
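As a minimal sketch of that rule, assuming both category sets are already loaded in memory (the function name `can_assign` and the string category IDs are illustrative, not part of my actual code):

```python
from __future__ import annotations


def can_assign(book_categories: set[str], vendor_categories: set[str]) -> bool:
    """Return True when the Book and the Vendor share at least one Category."""
    # isdisjoint() is True when the sets have no element in common,
    # so the rule "intersection is non-empty" is its negation.
    return not book_categories.isdisjoint(vendor_categories)
```

The check itself is cheap; the hard part is deciding where the two sets come from and how fresh they are.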

The options I've been able to come up with aren't appealing to me:

1. A Category aggregate which contains every Book and Vendor that is of that Category.
Cons:
* Large aggregate, would end up with tens of thousands of events on the stream, which would be a pain to load. I've already run across problems with long running streams grinding my system to a halt. I've ended up snapshotting certain aggregates in an external data store, but it feels hacky to me.

2. Put the list of Categories on the Book and Vendor aggregates.
Cons:
* Both Books and Vendors need to worry now about Categories. They both need to contain logic to ensure they're not assigned to a Category more than once. What happens when the logic for Category assignment changes? I need to change the Category application logic in both Book and Vendor. This is a tighter coupling than I'd like.
* A large number of Category assignment/unassignment events on the Book and Vendor streams could make those streams thousands of events long in a short time.


Perhaps this is really a CRUD domain that I'm over-engineering by trying to fit it into my already existing event sourced BC. I could just store this in a SQL association table of sorts and query it when assigning a Book to a Vendor. I think eventual consistency is something I can't get around here with the data volumes I'm encountering.
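A rough sketch of what that association-table approach might look like, using an in-memory SQLite database. The table names, column names and the `shares_category` helper are all hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE book_category (
        book_id TEXT, category_id TEXT,
        PRIMARY KEY (book_id, category_id)    -- blocks duplicate assignments
    );
    CREATE TABLE vendor_category (
        vendor_id TEXT, category_id TEXT,
        PRIMARY KEY (vendor_id, category_id)
    );
    """
)
conn.executemany(
    "INSERT INTO book_category VALUES (?, ?)",
    [("book-1", "fiction"), ("book-1", "history")],
)
conn.executemany(
    "INSERT INTO vendor_category VALUES (?, ?)",
    [("vendor-1", "history"), ("vendor-2", "cooking")],
)


def shares_category(conn: sqlite3.Connection, book_id: str, vendor_id: str) -> bool:
    """True when the book and the vendor have at least one category in common."""
    row = conn.execute(
        """
        SELECT EXISTS (
            SELECT 1
            FROM book_category b
            JOIN vendor_category v ON v.category_id = b.category_id
            WHERE b.book_id = ? AND v.vendor_id = ?
        )
        """,
        (book_id, vendor_id),
    ).fetchone()
    return bool(row[0])
```

The composite primary keys would also cover the "not assigned to a Category more than once" rule without either aggregate having to know about it.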

Any suggestions on something I'm missing?

Kasey Speakman

Aug 16, 2017, 7:00:25 PM
to DDD/CQRS
The conclusion I drew when faced with a similar situation is that it is CRUD data. These are really only usable as whole sets, because inclusion and exclusion must be checked.

However, one thing you could do to still event-source the set changes (e.g. auditing, rebuilding) is to treat the (assuming eventually consistent) relational read-model as a snapshot. Read it to get (probably-stale) state along with last processed version for the set, then read events from set's event stream since relational model's version and apply them to bring state up to "current". E.g. Book-Category is one stream and Vendor-Category another. Events like BookAssociatedWithCategory.
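Roughly, the catch-up step could look like the sketch below. It assumes each event carries its stream version, and the `CategoryEvent` type and the "associated"/"dissociated" kinds are made-up names for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryEvent:
    version: int   # position in the Book-Category (or Vendor-Category) stream
    kind: str      # "associated" or "dissociated" (hypothetical names)
    category: str


def current_categories(snapshot, snapshot_version, stream):
    """Start from the read model's (possibly stale) set and replay newer events."""
    categories = set(snapshot)
    for event in stream:
        if event.version <= snapshot_version:
            continue  # already reflected in the relational read model
        if event.kind == "associated":
            categories.add(event.category)
        elif event.kind == "dissociated":
            categories.discard(event.category)
    return categories
```

In practice you would ask the event store only for events after `snapshot_version` rather than filtering the whole stream in memory.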

Just spit-balling.

Kasey Speakman

Aug 16, 2017, 7:43:36 PM
to ddd...@googlegroups.com
Though loading all categories for a book and vendor and doing the set comparison in memory will be a lot less efficient than:

1) asking a fully consistent CRUD model or
2) asking an eventually consistent read model and accepting that you will block some associations which should be valid (false negatives) and allow some which should be denied (false positives). The latter you can detect and compensate for in a separate process.


Andriy Drozdyuk

Aug 16, 2017, 10:22:09 PM
to DDD/CQRS
I would start with modeling the events and not entities.
If you have events such as:

- BookAssigned(category)
- VendorAssigned(category)
- BookAssigned(vendor) 

what should they apply to? Perhaps it makes sense to have an event such as "CategoryAssigned(book_id, category)" instead? Working along these lines will lead you to discover the aggregates in your domain. Aggregates should define consistency boundaries, not "collections" of things. What is the thing that should be "consistent" - e.g. cannot be modified by two things at once?

How you ensure that commands are correct and that no wrong assignment is made can be left for later. In general, I find that this is not as big a problem as it initially seems. A command can be rejected at the command handler (before it is even passed on to the aggregate). Sometimes this will not ensure complete consistency - e.g. you may have two books that are assigned to a single vendor. But then this may be an ok tradeoff, and this error can be detected later, by running periodic queries on your projections. After such an error is found, it can be flagged in the UI. After all, in reality, these things do happen and they are corrected, rather than prevented.
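To illustrate the "detect later" part, here is a sketch of a periodic check over projections. Plain dicts stand in for the projections, and the function and argument names are invented for the example:

```python
def find_invalid_assignments(assignments, book_categories, vendor_categories):
    """Return (book_id, vendor_id) pairs that do not share any category.

    assignments:       book_id -> vendor_id
    book_categories:   book_id -> set of category ids
    vendor_categories: vendor_id -> set of category ids
    """
    invalid = []
    for book_id, vendor_id in assignments.items():
        book_cats = book_categories.get(book_id, set())
        vendor_cats = vendor_categories.get(vendor_id, set())
        if book_cats.isdisjoint(vendor_cats):
            invalid.append((book_id, vendor_id))
    return invalid
```

Whatever this returns is what you would flag in the UI for someone to correct.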

Alexandre Potvin Latreille

Aug 28, 2017, 9:36:32 AM
to DDD/CQRS
If you want uniqueness to be strongly consistent I wouldn't see a problem in having categories on books and vendors, especially if the sets would be relatively small. However, you certainly cannot bring all books & vendors under the same consistency boundary. I think that your best bet would be to listen to events such as VendorUncategorized and BookUncategorized and issue compensating actions when you detect the rule was broken (most likely just flag that the book has an invalid vendor and have it fixed manually). That shouldn't happen very often anyway given that to break the rule you'd need a race condition.
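A sketch of such a listener, assuming some projection can tell you which books a vendor currently holds. The event shape and the lookup dicts are hypothetical, and the dicts are assumed to still reflect the state just before the event is applied:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorUncategorized:
    vendor_id: str
    category: str


def books_to_flag(event, vendor_categories, books_by_vendor, book_categories):
    """Return the books whose vendor assignment became invalid after `event`."""
    # Categories the vendor still has once the event is applied.
    remaining = vendor_categories.get(event.vendor_id, set()) - {event.category}
    return [
        book_id
        for book_id in books_by_vendor.get(event.vendor_id, [])
        if remaining.isdisjoint(book_categories.get(book_id, set()))
    ]
```

A handler for BookUncategorized would be the mirror image and only needs to check that one book against its vendor.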

Rickard Öberg

Aug 28, 2017, 10:51:03 AM
to ddd...@googlegroups.com
Hi,

This sounds really straightforward. Do a read model check of the
conditions before sending the command on to the aggregate. Optimistic
locking can handle any errors in the read model query (unlikely, but
possible). Done.
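Something along these lines, as a sketch only. The in-memory event store, the dict-based read model and all names are stand-ins, not a real library API:

```python
class ConcurrencyError(Exception):
    """Raised when a stream changed since its version was read."""


class InMemoryEventStore:
    def __init__(self):
        self._streams = {}

    def version(self, stream_id):
        return len(self._streams.get(stream_id, []))

    def append(self, stream_id, event, expected_version):
        stream = self._streams.setdefault(stream_id, [])
        if len(stream) != expected_version:
            # Optimistic locking: someone else wrote to the stream first.
            raise ConcurrencyError(stream_id)
        stream.append(event)


def handle_assign_book_to_vendor(store, read_model, book_id, vendor_id):
    """Check the read model first, then append with an expected version."""
    book_cats = read_model["book_categories"].get(book_id, set())
    vendor_cats = read_model["vendor_categories"].get(vendor_id, set())
    if book_cats.isdisjoint(vendor_cats):
        return False  # rejected before the aggregate is ever touched
    stream_id = f"book-{book_id}"
    expected = store.version(stream_id)
    store.append(stream_id, ("BookAssignedToVendor", book_id, vendor_id), expected)
    return True
```

In this sketch the version check covers only the book's own stream; a real handler would take the expected version from the aggregate it loaded.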

If your conceptual model is that the aggregate needs to hold all the
information needed to make the decision, then yes, it is difficult. If
not, then it's trivial.

/Rickard

Herman Peeren

Aug 30, 2017, 5:38:40 AM
to DDD/CQRS
Questions about the domain
  • what is this system all about? Is it a catalogue of an online bookshop?  What is the goal of it? What books are in it? What are those vendors: are they the ones that actually sell a book that is in your system? Is there some payment model per sold book? Or also per category that is assigned to them?
  • Once the information about the books is entered into the system, does it change? What changes do you expect? Only new books that are entered? Or can the information about the books be changed? Will there be new categories? How often? Can vendors change? How often? Can a book be in the system without a vendor (yet)? Or do you always add a vendor when adding a book?
  • those categories: are they book categories? Could that be hierarchical (for instance fiction vs non-fiction and then finer grained under that) or are all categories totally independent of each other? What is the process of coming to the categories or is some standard used? Who decides about the categories that are assigned to a book?
  • what do you mean by "A Vendor can be assigned a number of categories"? Do you mean: what categories a vendor actually sells? Or: what categories they can (are allowed to) sell? In the first case the categories are just an indirect relation between the vendors and the books, not a direct attribute of the vendors (and then not a part of your write-model). But if the book categories assigned to vendors indicate that they can only sell those categories of books, could you then tell a bit more about the handling of vendors and categories?
CRUD data? CRUD domain?
What do you mean by "CRUD data" or "CRUD domain"? Event Sourcing is a (lossless) way to model change; nothing more, nothing less. Anything you can model with Event Sourcing could be handled with CRUD, but that is not lossless and so less flexible. The core difference is that with ES you store the events that caused the change instead of just overwriting the value that changed. There is no intrinsic property to data or a domain that makes something apt for CRUD or ES. Or do you have some criteria to decide if something is "CRUD data" or a "CRUD domain"?

Samuel Francisco

Aug 30, 2017, 12:03:04 PM
to DDD/CQRS
Hi @Mark Bernman, 

I believe this is the typical case where you need a Domain Service, to answer if the Book can be assigned to the Vendor or not. This Domain Service would query the events of "Book assigned a Category" and "Vendor assigned a Category" to verify the match. 

Herman Peeren

Aug 31, 2017, 2:57:17 AM
to DDD/CQRS
A technical solution, like adding a Domain Service, is often an indication that the model can be improved.

To say it bluntly: your model is wrong. The mistake is probably somewhere in assigning book categories to Vendors, because that gives inconsistencies. When applying book categories to Vendors you use those categories in a different way, with a different meaning and use. My solution would be to use Domain Driven Design, that is: to better look at the domain again and refactor your model. Just adding a Domain Service is putting the problems in your model under the carpet, not solving them.

Many (maybe: most?) of the issues on this mailing list can be solved by refactoring the model. But because most people in the DDD-community have a technical background as developers, we are inclined to look for technical fixes. The core of DDD however is to study the domain and to continuously improve our models of it: models (design) that are driven by the domain.

@yreynhout

Sep 10, 2017, 7:20:55 AM
to DDD/CQRS
> I need to check that the intersection of the sets of categories of the Book and the Vendor is non-empty

When does this apply and how does that hold over the course of time? Is there maybe an event as of which it no longer holds? Why does this apply? Why can't the vendor sell a book that's not in his/her categories? Besides the obvious observation that he'll be making money that way, what would be a reason to prevent it? When you say "book", do you mean a copy of the book (either physical or digital) or do you mean the concept of a book of which there exist many copies?