Aggregates/Repositories as Parameters and Bidirectional Aggregate Relationships

Michael Ainsworth

Jan 11, 2016, 8:19:10 PM
to DDD/CQRS
I'm aware of the rules in aggregate relationships by Eric Evans. E.g., one aggregate root cannot hold references to another aggregate root's local entities (or can only do so transiently), a local entity within one aggregate root can hold a reference to another aggregate root (etc, etc).

I've come up with some questions regarding relationships/references between aggregates (and aggregates and repositories), which I may not have understood correctly. I'd love some (predictably illuminating) answers.

Questions:

1. Is it OK to pass repositories (for transient use only) to a function of an aggregate root?

For example, if a Visitor::registerAsNewUser() function accepts a UserRepository, it can check for duplicate usernames rather than having the calling code (command handler/domain service) do so.

2. Is it OK to pass one aggregate root (again, for transient use only) to a function of another aggregate root?

For example, if an Order::addLineItem() function accepts a Product aggregate, it can add the appropriate values rather than relying on the calling code to pull them out of the Product and pass them.

I would assume the answers to 1 and 2 are "yes" because:

* Aggregate root/repository parameters are only used transiently.
* Aggregate roots/repositories are domain objects (the repository acting as a collection and abstracting the persistence mechanism).
* Using aggregate roots and repositories as parameters enriches the domain model language.
* Using aggregate roots and repositories as parameters allows you to move logic previously located in "domain services" into the domain models themselves (e.g., in the MVC approach, having "fat controllers" is considered bad because it's taking logic out of the model and putting it into the calling code).

However, according to Eric Evans's rules on aggregate design, an aggregate should not be able to "reach outside of itself". Am I missing something here? Or does "reaching outside of itself" mean holding a direct reference (rather than an identity reference)?

3. Is it OK to have bidirectional relationships between aggregate roots? What are the typical use-cases of such relationships?

4. If it is OK to have bidirectional relationships, then in creating the relationship you are actually modifying two aggregates at the same time (unless the relationship itself is an aggregate). Do bidirectional relationships violate the "commands should target one aggregate only" rule?
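
To make questions 1 and 2 concrete, here is a minimal Python sketch of both shapes. All names (`InMemoryUserRepository`, `register_as_new_user`, `add_line_item`, `DomainError`) are hypothetical, modelled on the examples in the questions rather than taken from any framework. The repository and the Product are used only for the duration of the call and are never stored on the aggregate.

```python
from dataclasses import dataclass, field


class DomainError(Exception):
    """Raised when a domain rule is violated."""


@dataclass
class Product:
    product_id: str
    name: str
    price_cents: int


class InMemoryUserRepository:
    """Hypothetical repository: a collection abstraction over usernames."""

    def __init__(self):
        self._usernames = set()

    def username_exists(self, username):
        return username in self._usernames

    def add(self, username):
        self._usernames.add(username)


class Visitor:
    def register_as_new_user(self, username, users):
        # Question 1: the repository is a parameter, used transiently.
        if users.username_exists(username):
            raise DomainError(f"username {username!r} is already taken")
        users.add(username)
        return username


@dataclass
class Order:
    lines: list = field(default_factory=list)

    def add_line_item(self, product, quantity):
        # Question 2: copy the values needed; keep only the ID, not the object.
        self.lines.append((product.product_id, product.price_cents, quantity))

    def total_cents(self):
        return sum(price * qty for _, price, qty in self.lines)
```

Note that the duplicate check in this sketch is only as strong as the repository's own guarantees; two concurrent registrations could both pass it.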

Ben Kloosterman

Jan 11, 2016, 8:31:05 PM
to ddd...@googlegroups.com
On Tue, Jan 12, 2016 at 12:19 PM, Michael Ainsworth <michae...@gmail.com> wrote:
I'm aware of the rules in aggregate relationships by Eric Evans. E.g., one aggregate root cannot hold references to another aggregate root's local entities (or can only do so transiently), a local entity within one aggregate root can hold reference to another aggregate root (etc, etc).

I've come up with some questions regarding relationships/references between aggregates (and aggregates and repositories), which I may not have understood correctly. I'd love some (predictably illuminating) answers.

Questions:

1. Is it OK to pass repositories (for transient use only) to a function of an aggregate root?

For example, if a Visitor::registerAsNewUser() function accepts a UserRepository, it can check for duplicate usernames rather than having the calling code (command handler/domain service) do so.

BK> Infrastructure is intruding into the domain so you need a damn good reason.
 

2. Is it OK to pass one aggregate root (again, for transient use only) to a function of another aggregate root?

For example, if a Order::addLineItem() function accepts a Product aggregate, it can add the appropriate values rather than relying on the calling code to pull them out of the Product and pass them.

I would assume the answers to 1 and 2 are "yes" because:

* Aggregate root/repository parameters are only used transiently.
* Aggregate roots/repositories are domain objects (the repository acting as a collection and abstracting the persistence mechanism).
* Using aggregate roots and repositories as parameters enriches the domain model language.
* Using aggregate roots and repositories as parameters allows you to move logic previously located in "domain services" into the domain models themselves (e.g., in the MVC approach, having "fat controllers" is considered bad because it's taking logic out of the model and putting it into the calling code).

However, according to Eric Evan's rules on aggregate design, an aggregate should not be able to "reach outside of itself". Am I missing something here? Or does "reaching outside of itself" meaning holding a direct reference (rather than an identity reference)?

In DDD I see little objection to this; you are coupling them, but that's about it. I have also done this in CQRS to get a cheap transaction.
 

3. Is it OK to have bidirectional relationships between aggregate roots? What are the typical use-cases of such relationships?

Why not? Treating an aggregate root as a consistent actor can work well.

4. If it is OK to have bidirectional relationships, then in creating the relationship you are actually modifying two aggregates at the same time (unless the relationship itself is an aggregate). Do bidirectional relationships violate the "commands should target one aggregate only" rule?

If there is a good reason, yes. However, this can be tricky. The rule of no (or minimal) infrastructure in the domain is far more important than avoiding a command that modifies two aggregates, or than other guidelines.

Ben

Ben Kloosterman

Jan 11, 2016, 9:35:23 PM
to ddd...@googlegroups.com
1. Is it OK to pass repositories (for transient use only) to a function of an aggregate root?

For example, if a Visitor::registerAsNewUser() function accepts a UserRepository, it can check for duplicate usernames rather than having the calling code (command handler/domain service) do so.

To be clear on this: getting DDD to do this is tricky, but you mentioned commands, so assuming CQRS, check for duplicates before creating the command or before transferring the command to the domain. Remember, commands should rarely fail.

In CQRS the most common options are:
1) Check in the facade / client / anti-corruption layer before passing to the domain.
2) For CRUD things use a CRUD domain, e.g. user management. Do not try to make all your BCs CRUD. That said, if it's only a few aggregates/entities you should add them to the BC.
3) Include all historical data in a separate aggregate.
4) Create and revert, using a process that checks the read model.
5) Include an external service that checks it, e.g. behind an interface IUserService and certainly not a repository. However, in nearly all cases option 1 is better.

E.g. if your domain is in a web service and gets a request, call the read model before calling the actual domain and fail immediately back to the client. Note that a request received by the web service in this case is often not really a command yet; it becomes a command once it's passed to the domain.

There has been discussion that the web API can be used as a "command" / command handler, which I have used (since the wire message is a command), but in this case the validation needs to run beforehand. It's also easier for CRUD devs.

E.g. code in no specific language:

[Validate]
[WebMethod]
NewUserActivateAccount(accountId: guid)
{
    var rep = GetRepository();
    var account = rep.GetAccountAgg(accountId);

    account.Activate();

    rep.SaveAllEvents(account); // commit changes
}

or

[WebMethod]
CreateNewUser(name: string)
{
    // validate, anti-corruption, security etc.
    ValidateFromReadModelIfUserExists(name); // throws if the user already exists

    // infrastructure
    var rep = GetRepository();
    var user = rep.CreateUser();

    // domain
    // user lives in a separate project: that project is your real domain.
    user.Create(name);
    // end of domain, back to infrastructure

    rep.CommitChanges(user);
}

Note this does not cover threading, but that won't be an issue with some languages (though you need a cache on the repository).
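
Option 5 from the list above (a check behind a narrow interface rather than a repository) might look roughly like the following Python sketch. `UsernameChecker` is a hypothetical stand-in for the IUserService interface mentioned above, and `ReadModelUsernameChecker`, `User.create` and `DuplicateUsername` are likewise invented for illustration; the read model is simulated with a plain set.

```python
from typing import Protocol


class UsernameChecker(Protocol):
    """Hypothetical stand-in for the IUserService interface."""

    def is_taken(self, username: str) -> bool: ...


class ReadModelUsernameChecker:
    """Answers from a (possibly stale) read model, here just a set."""

    def __init__(self, known_usernames):
        self._known = set(known_usernames)

    def is_taken(self, username: str) -> bool:
        return username in self._known


class DuplicateUsername(Exception):
    pass


class User:
    def __init__(self):
        self.name = None
        self.pending_events = []

    def create(self, name: str, checker: UsernameChecker) -> None:
        # The aggregate sees only the narrow interface, never a repository.
        if checker.is_taken(name):
            raise DuplicateUsername(name)
        self.name = name
        self.pending_events.append(("UserCreated", name))
```

Because the aggregate depends only on `is_taken`, nothing in the domain can grow extra query methods on a repository as a shortcut.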


Regards,

Ben


Michael Ainsworth

Jan 11, 2016, 9:56:20 PM
to DDD/CQRS


On Tuesday, 12 January 2016 12:31:05 UTC+11, Bennie Kloosteman wrote:
BK> Infrastructure is intruding into the domain so you need a damn good reason.
 

I was under the impression that a Repository was part of the domain layer? E.g., you should be able to write your entire application so that it runs in memory (RAM) and that it is ignorant of the persistence mechanism (whether using event sourcing or ORM). At its simplest level, a Repository can encapsulate a basic array.

 
In DDD i see little objection to this , you are coupling them but thats about it . I have also done this in CQRS to get a cheap transaction. 

I guess if you can live with the level of coupling, it's OK.
 
If there is a good reason yes ..  however this can be tricky the rule no/ minimal  infrastructure in the domain is far more important than a command modifying 2 aggregate and other guidelines.   

I guess you could use a saga to make them "mutually aware of each other", but I'm not particularly fond of that approach. It's possible that most relationships can be unidirectional, as you are typically traversing in one direction in the write model (Order -> Customer::isPurchasingDisabled()) and either direction in the read model (reports of # orders per customer, for example).

Ben Kloosterman

Jan 11, 2016, 10:33:28 PM
to ddd...@googlegroups.com
On Tue, Jan 12, 2016 at 1:56 PM, Michael Ainsworth <michae...@gmail.com> wrote:


On Tuesday, 12 January 2016 12:31:05 UTC+11, Bennie Kloosteman wrote:
BK> Infrastructure is intruding into the domain so you need a damn good reason.
 

I was under the impression that a Repository was part of the domain layer? E.g., you should be able to write your entire application so that it runs in memory (RAM) and that it is ignorant of the persistence mechanism (whether using event sourcing or ORM). At its simplest level, a Repository can encapsulate a basic array.

This is all true, except it's not part of the domain; it's infrastructure. Repositories are normally generic code, e.g. Repository<T : IAggregate>.
If the repository were part of the domain, the domain would not be ignorant of the persistence mechanism, since the repository does the loading/persistence. That is, the domain is not just saying it doesn't care about the type of database; it's saying it doesn't care about loading and saving aggregates at all. Golden rule: does it have business logic/language? If not, it's not in the domain. A business person will not say "load the aggregate", so loading/saving aggregates is not part of the domain. Note that building aggregates from events is part of the domain, hence the repository will instantiate the aggregate and give it a list of events with which it builds itself, though this call is normally hidden in the materialization in the repository.

Yes, the domain is ignorant of the loading/persistence mechanism; the command handlers do the wiring up.
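
A rough Python sketch of that split, assuming an event-sourced setup: the generic repository only loads and saves event streams, the aggregate rebuilds itself from the events it is handed, and the command handler wires the two together. `Account`, `Repository`, `load_from_history` and `handle_activate_account` are hypothetical names, and events are simplified to plain strings.

```python
class Account:
    """Aggregate: rebuilds itself from events; knows nothing about storage."""

    def __init__(self, aggregate_id):
        self.id = aggregate_id
        self.active = False
        self.uncommitted = []

    def load_from_history(self, events):
        for event in events:
            self._apply(event)

    def activate(self):
        if self.active:
            raise ValueError("account is already active")
        self._record("AccountActivated")

    def _record(self, event):
        self._apply(event)
        self.uncommitted.append(event)

    def _apply(self, event):
        if event == "AccountActivated":
            self.active = True


class Repository:
    """Generic infrastructure: loads and saves event streams, no business rules."""

    def __init__(self, factory):
        self._factory = factory
        self._streams = {}

    def get(self, aggregate_id):
        aggregate = self._factory(aggregate_id)
        aggregate.load_from_history(self._streams.get(aggregate_id, []))
        return aggregate

    def save(self, aggregate):
        stream = self._streams.setdefault(aggregate.id, [])
        stream.extend(aggregate.uncommitted)
        aggregate.uncommitted = []


def handle_activate_account(repo, account_id):
    """Command handler: does the wiring between repository and domain."""
    account = repo.get(account_id)
    account.activate()
    repo.save(account)
```

The only domain language here lives in `Account`; the repository could be reused unchanged for any aggregate exposing `id`, `uncommitted` and `load_from_history`.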
  

I guess you could use a saga to make them "mutually aware of each other", but I'm not particularly fond of that approach. It's possible that most relationships can be unidirectional, as you are typically traversing in one direction in the write model (Order -> Customer::isPurchasingDisabled()) and either direction in the read model (reports of # orders per customer, for example).

Correct. Sagas ARE better; they are just expensive to implement (sagas with state are often aggregate roots themselves). That said, where you need one in a complicated process, CQRS truly shines, and projects that tried to do this with CRUD often fail or deteriorate.

I used two aggregates being changed by the same handler and committed together, as it forms a nice transaction (since the events for both were saved as a single transaction). This is not, and should not be, a common thing, but it can be useful.
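
As a sketch of what "one handler, two aggregates, one transaction" could mean in an event store, the Python below commits events for two streams in a single all-or-nothing step. `EventStore.commit`, `handle_link` and the `LinkedTo` event are hypothetical; a real store would need its own atomic multi-stream append for this to hold.

```python
class ConcurrencyError(Exception):
    pass


class EventStore:
    """Hypothetical in-memory store whose commit is all-or-nothing."""

    def __init__(self):
        self.streams = {}

    def version(self, stream_id):
        return len(self.streams.get(stream_id, []))

    def commit(self, changes):
        """changes: list of (stream_id, expected_version, new_events)."""
        # Validate every stream first so nothing is written on a conflict.
        for stream_id, expected, _ in changes:
            if self.version(stream_id) != expected:
                raise ConcurrencyError(stream_id)
        for stream_id, _, events in changes:
            self.streams.setdefault(stream_id, []).extend(events)


def handle_link(store, left_id, right_id):
    """Records a bidirectional relationship on both aggregates in one commit."""
    store.commit([
        (left_id, store.version(left_id), [("LinkedTo", right_id)]),
        (right_id, store.version(right_id), [("LinkedTo", left_id)]),
    ])
```

If either stream has moved on since it was read, neither set of events is written, which is the "cheap transaction" property; the cost is that both aggregates must live in the same store.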

Regards,

Ben

Danil Suits

Jan 11, 2016, 11:41:59 PM
to DDD/CQRS
Insert disclaimer here:


Is it OK to pass repositories (for transient use only) to a function of an aggregate root?

I would think "no".

The one case where I'm still fuzzy is how one aggregate creates an instance of another.  That factory has to come from somewhere, but I don't think it's the repository.  I don't know that it isn't either.

 
Is it OK to pass one aggregate root (again, for transient use only) to a function of another aggregate root?

Again, I would guess no -- you can't query it, I don't *think* you should be sending it commands, so what are you going to do with it that you couldn't achieve with an id, and perhaps a version?  I don't see how the aggregate changing its own state is going to be able to use stale data from another store to validate its own invariant.

On the other hand, queries seem like a great fit for a domain service, why wouldn't you use it that way?  Pointless, perhaps, if the state passed to the domain service just comes from the command (since the command handler *could* validate the command before invoking the aggregate), but encapsulation says that the application shouldn't know whether the aggregate is going to pass command state or aggregate state to the service.

As noted above, an aggregate can create another one (it's not going to mutate itself, but it will pass some of its state to the factory to create a single instance of something else, and persist that in the transaction). Given that pattern, I suppose you could have a command that tells one aggregate to send a command to another, but it seems really weird.


Is it OK to have bidirectional relationships between aggregate roots?

I haven't seen any (again: novice disclaimer).  It strikes me as weird -- how would you use such a thing to protect an invariant in either object?  Certainly the projections of the objects can "reference each other", to the extent that projections reference anything at all, but that's not at all the same thing.


 If it is OK to have bidirectional relationships, then in creating the relationship you are actually modifying two aggregates at the same time

Another reason not to.  Again, creation patterns look to me like a special case, but Greg has said "don't do it" enough times that admitting I discovered why not after putting it into production is going to be embarrassing.  So, you first?


Michael Ainsworth

Jan 12, 2016, 12:43:57 AM
to DDD/CQRS


On Tuesday, 12 January 2016 15:41:59 UTC+11, Danil Suits wrote:
Is it OK to pass repositories (for transient use only) to a function of an aggregate root?

I would think "no".

The one case where I'm still fuzzy is how one aggregate creates an instance of another.  That factory has to come from somewhere, but I don't think it's the repository.  I don't know that it isn't either.

You can actually record the event of creating the aggregate without actually creating it, and leave the "replay" to do the creation.

E.g.:

    $eventStore->store($orderRepository->placeOrder(...));

Or in the case of Udi Dahan's "objects don't just appear out of thin air", the Cart::checkout() method can return an OrderPlaced event.

    $eventStore->store($cart->checkout(...));

To reiterate, if you don't need any data from the newly created aggregate (which is likely, given the ID and other properties are sent by the client), then you can just record the "aggregate was created" event without actually instantiating the aggregate.
 
 Is it OK to pass one aggregate root (again, for transient use only) to a function of another aggregate root?

Again, I would guess no -- you can't query it, I don't *think* you should be sending it commands, so what are you going to do with it that you couldn't achieve with an id, and perhaps a version?  I don't see how the aggregate changing its own state is going to be able to use stale data from another store to validate its own invariant.

Why can't you query it for state?

class Order {
    public function addLineItem(Product $product) {
        if ($product->isDiscontinued()) {
            throw new DomainError("You can't purchase a discontinued item.");
        }
    }
}

I think the above is much nicer than the below:

class AddProductToOrderServiceCommandHandler {
    public function execute(AddProductToOrder $command) {
        $product = $this->productRepository->getProduct($command->productId);
        $order = $this->orderRepository->getOrder($command->orderId);

        if ($product->isDiscontinued()) {
            throw new DomainError("You can't purchase a discontinued item.");
        }

        $order->addLineItem($product);
    }
}

Danil Suits

Jan 12, 2016, 2:05:44 AM
to DDD/CQRS
You can actually record the event of creating the aggregate without actually creating it, and leave the "replay" to do the creation.

Of course you can't (he said with unjustified bravado); it's the aggregate that you are creating that guards that state.  Why should aggregate X even know which events are generated when aggregate Y is created?  Why should that logic be shared across the different aggregates that might create Y?


Why can't you query it for state?

Because I don't have access to it.  I might do this:

class Order {
    public function addLineItem(productId, availabilityService) {
        if (availabilityService.isDiscontinued(productId)) {
            throw new DomainError("You can't purchase a discontinued item");
        }
    }
}

But I'm not willing to pretend that Order.addLineItem() can make any assertions about the state of the Product aggregate in a transactionally meaningful way.  Or put another way, product.isDiscontinued isn't within the Order aggregate boundary, and therefore can not be part of the invariant that the Order is supposed to enforce.

Because what this code fragment really says is "if product was discontinued at some arbitrary point in the past, then throw", and you don't need the Product's consistency rules to make that check, you just need a representation of the Product's history.

@yreynhout

Jan 12, 2016, 3:51:09 AM
to DDD/CQRS
1. It's a way of doing these things. Not my preferred way since it tends to suck logic into places it doesn't really belong, but then again that's debatable.
2. As long as it's clear which aggregate is going to be affected, and there's only one of them, I see no reason not to have objects collaborate. If I were doing this using the actor model, I would opt for messaging between aggregates instead of using query methods of another aggregate (duh).
3. Sometimes, yes. E.g. when the life cycles of two things are tied together in some, but not all use cases. Makes messaging easier by embedding both aggregate identifiers in the messages.
4. Solutions for this one depend on whether the relationship creates one of the aggregates (e.g. accepting a membership request (aggregate) causes a membership (aggregate) to come into being) or merely establishes it between two or more aggregates (e.g. adding a photo (aggregate) to a portfolio (aggregate)). Language might give a hint as to where the relationship best fits (in which aggregate, so to speak); there are many types of relationships. The relationship could have behavior of its own, of which dissolving it (if ever) is the most notable. There are no hard rules around this, although there's a lot of prior art (here's looking at streamlined object modeling) to borrow from. "Real" invariants also give away the contours of your aggregates, meaning it might be okay to message one of the aggregates that the relationship has been established/dissolved instead of it being part of the transaction that establishes/dissolves the relationship. Getting the boundaries right on these types of things is not an easy task. Sometimes affecting only one aggregate per command is just an additional guideline; other times it's a real hard constraint (e.g. due to the underlying storage or as part of designing a scalable system).
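
The membership case in point 4 could be sketched in Python as follows: accepting the request is one step on one aggregate, and the Membership comes into being in a separate step driven by the resulting message. The class and function names, and the idea of keying memberships by `(member_id, group_id)`, are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MembershipRequestAccepted:
    request_id: str
    member_id: str
    group_id: str


class MembershipRequest:
    def __init__(self, request_id, member_id, group_id):
        self.request_id = request_id
        self.member_id = member_id
        self.group_id = group_id
        self.accepted = False

    def accept(self):
        if self.accepted:
            raise ValueError("request was already accepted")
        self.accepted = True
        # Carry both identifiers so downstream handlers need no lookup.
        return MembershipRequestAccepted(
            self.request_id, self.member_id, self.group_id
        )


class Membership:
    """The relationship as its own aggregate, with behavior of its own."""

    def __init__(self, member_id, group_id):
        self.member_id = member_id
        self.group_id = group_id
        self.dissolved = False

    def dissolve(self):
        self.dissolved = True


def on_request_accepted(event, memberships):
    """Runs as a separate step; the only aggregate touched is the new one."""
    membership = Membership(event.member_id, event.group_id)
    memberships[(event.member_id, event.group_id)] = membership
    return membership
```

Each step changes exactly one aggregate, so the one-aggregate-per-command guideline holds even though a relationship between two things is being established.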

Juan Martin

Jan 12, 2016, 5:01:43 AM
to DDD/CQRS
I was under the impression that a Repository was part of the domain layer? E.g., you should be able to write your entire application so that it runs in memory (RAM) and that it is ignorant of the persistence mechanism (whether using event sourcing or ORM). At its simplest level, a Repository can encapsulate a basic array.

For me, you should implement the repository pattern so that it belongs to both domain and infrastructure layers, something like the "ports and adapters" architecture described here: http://alistair.cockburn.us/Hexagonal+architecture.
Inside of domain you should have an abstract class OrderRepositoryBase or, better, an IOrderRepository interface that you can use as a full domain member; it's just like a collection of objects.
Inside of infrastructure you should have the class OrderRepositoryImpl which implements the interface and would be injected into the domain using IoC.
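
A minimal Python sketch of that arrangement, using an abstract base class as the domain-side port and an in-memory class as one possible infrastructure adapter. `OrderRepository`, `InMemoryOrderRepository` and `PlaceOrderHandler` are hypothetical names standing in for IOrderRepository, OrderRepositoryImpl and whatever receives the injected dependency.

```python
from abc import ABC, abstractmethod


class Order:
    def __init__(self, order_id):
        self.order_id = order_id


# --- domain layer: the port ---
class OrderRepository(ABC):
    @abstractmethod
    def get(self, order_id): ...

    @abstractmethod
    def add(self, order): ...


# --- infrastructure layer: one adapter ---
class InMemoryOrderRepository(OrderRepository):
    def __init__(self):
        self._orders = {}

    def get(self, order_id):
        return self._orders[order_id]

    def add(self, order):
        self._orders[order.order_id] = order


class PlaceOrderHandler:
    """Receives the port via constructor injection; never names the adapter."""

    def __init__(self, orders: OrderRepository):
        self._orders = orders

    def handle(self, order_id):
        self._orders.add(Order(order_id))
```

Swapping the in-memory adapter for a database-backed one would not change `PlaceOrderHandler` or anything else that depends only on the port.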
Regards,
Juan



Kijana Woodard

Jan 12, 2016, 9:29:32 AM
to ddd...@googlegroups.com
I don't want "loading" or "saving" as part of an aggregate root.
I don't want repositories or aggregates being passed into aggregate root methods.

My preference is to rely on commands and events as the abstractions.

Take the rule "user name must be unique".
- Why?
- Then use something unique like email address.
- Ok fine, it's still not a constraint that can be maintained by *a* user. 
- In fact, by definition, the user in question doesn't exist yet.

Take rules about which products can be ordered or which users can take certain actions.

Those rules are arbitrary and change relatively rapidly as opposed to Order. Making them part of Order gives Order too many reasons to change. Further, different contexts have different rules:
- For channel partners allow orders > $xxxx
- Customer Care can order Discontinued products on behalf of any Customer
- During Bulk Import of data, x, y, z conditions are acceptable

Putting all that variety in Order, even tucked behind IMagic, gives Order too many reasons to change.

In short, all of that is making Order into a factory. It's constructing Commands for itself.
Given the variety of rules that all lead to stock standard Orders, I prefer Command creation to be outside of the aggregate.
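
One way to picture command creation living outside the aggregate is the Python sketch below, where each context supplies its own policy and Order only handles the finished command. The policy functions, the `AddLineItem` command and the dict-shaped product are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddLineItem:
    order_id: str
    product_id: str
    quantity: int


class Rejected(Exception):
    pass


def storefront_policy(product, quantity):
    if product["discontinued"]:
        raise Rejected("discontinued products cannot be ordered")


def customer_care_policy(product, quantity):
    # Customer Care may order discontinued products on a customer's behalf.
    return None


def build_add_line_item(policy, order_id, product, quantity):
    """Runs the context's rules, then emits a plain command."""
    policy(product, quantity)
    return AddLineItem(order_id, product["id"], quantity)


class Order:
    def __init__(self, order_id):
        self.order_id = order_id
        self.lines = []

    def handle(self, command):
        # Order only records the line; the varying rules live elsewhere.
        self.lines.append((command.product_id, command.quantity))
```

Adding a new channel means adding a new policy function, and Order does not change.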

For me, a single class is "too small" to contain all the bits. Trying to make one class know all the things leads to disaster [unreadable code]. While there is namespace, there's no good construct that I've found to "keep all the bits together", so I lean on a mixture of convention and folder structure to signal that "these classes are related and go about the business of dealing with Foo".





Greg Young

Jan 12, 2016, 9:37:07 AM
to ddd...@googlegroups.com
"I don't want repositories or aggregates being passed into aggregate
root methods."

It is quite common to pass an aggregate as an argument to another
aggregate, or am I missing something here? As an example,
"EnrollStudentInClass", where we would pass the Class to the Student.

Cheers,

Greg
--
Studying for the Turing test
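
The EnrollStudentInClass example above might be sketched in Python like this. The capacity rule, the attribute names and the dict-based lookups in the handler are assumptions added only to give the Student something to ask the Class; the thread does not specify them.

```python
class Class:
    def __init__(self, class_id, capacity, enrolled_count=0):
        self.class_id = class_id
        self.capacity = capacity
        self.enrolled_count = enrolled_count

    def has_free_seat(self):
        return self.enrolled_count < self.capacity


class Student:
    def __init__(self, student_id):
        self.student_id = student_id
        self.class_ids = []

    def enroll_in(self, a_class):
        # The Class is only read; Student is the single aggregate modified.
        if a_class.class_id in self.class_ids:
            raise ValueError("already enrolled")
        if not a_class.has_free_seat():
            raise ValueError("class is full")
        # Keep the identity, not the object reference.
        self.class_ids.append(a_class.class_id)


def handle_enroll_student_in_class(students, classes, student_id, class_id):
    """Command handler: loads both aggregates, passes the Class to the Student."""
    student = students[student_id]
    student.enroll_in(classes[class_id])
    return student
```

The Class state the Student sees may already be stale by the time the Student is saved, which is the concern raised elsewhere in the thread about cross-aggregate checks.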

Kijana Woodard

Jan 12, 2016, 9:42:06 AM
to ddd...@googlegroups.com
That was more a statement of my preferences [with reasons] rather than an exposition of "right or wrong".
I spend more time with Commands and Events than with TypeA.Do(TypeB).

I'm not sure what label to put on that. ;-p

Michael Ainsworth

Jan 12, 2016, 3:13:34 PM
to DDD/CQRS
Greg, what do you think about passing repositories as arguments?

Greg Young

Jan 12, 2016, 3:19:09 PM
to ddd...@googlegroups.com
It can be done, but I have done it very rarely, and more so in systems
not following CQRS.


Michael Ainsworth

Jan 12, 2016, 3:35:18 PM
to DDD/CQRS
So in your example of EnrollStudentInClass, you have the command handler load the Class from the ClassRepository and the Student from the StudentRepository then pass the Class to the Student's enroll method?

That is the way I'd do it, but passing the ClassRepository and class ID to the Student object is exactly the same code, just located in a different place. If the rules of an aggregate need to coordinate with other aggregates, why not put that logic in the aggregate itself?

Ben Kloosterman

Jan 12, 2016, 7:22:13 PM
to ddd...@googlegroups.com
It has the same flow but allows other things. It's about designing it so the system stays correct after years of modification. Is passing a concrete type the same as passing the interface when you only make the interface call? That is tight versus loose coupling: loose has the same "flow" as tight.

It's also about restricting options, because when it comes to maintenance time the fastest hack will often be used. "Oh, I can query the read model here by adding a method on the repository"; then other devs see that and copy it, and now you no longer have a CQRS system.

I think I'm more like Kijana. On my last project, with about 40 ARs, I had the handlers in a different project and the domains didn't reference much at all.

E.g. projects:
      Handler/Facade (references all others)
      DataInfrastructure (Repository<T>, references domains; you can do it without that, e.g. with a factory function Func<IAggregate>, so only the handler has access to the domains)
            SqlData
      DomainInfrastructure (basically IEvent, IAggregate and some saga helpers; does not reference other projects)
      DomainA: just aggregates and entities (references only DomainInfrastructure and events)
      DomainB: just aggregates and entities (references only DomainInfrastructure and events)
      SharedEvents
      CrudDomain (uses DataInfrastructure)
      ReadDomainA (uses DataInfrastructure)
      ReadDomainB

I was pretty firm on no data access code in the domain, as I have seen what a lot of devs will do to make things work. This is where many projects, IMHO, go downhill.

Ben


jarchin

Jan 12, 2016, 8:09:13 PM
to DDD/CQRS

"For me, a single class is "too small" to contain all the bits."

What about using partial class?
All files can have their proper name (I'd certainly have an Apply.cs), all put in a folder with the aggregate name.

Kijana Woodard

Jan 12, 2016, 8:36:44 PM
to ddd...@googlegroups.com
Try it out. Report back.

The designs I've tried wouldn't work with partial. Also, partial doesn't work across assemblies.


Ben Kloosterman

Jan 12, 2016, 9:02:48 PM
to ddd...@googlegroups.com
I also prefer small aggregates covering different viewpoints. I think this is important because those different viewpoints have different consistency boundaries and terminology. That said, I did have a very large one once, but I injected the extra behavior classes (there were several); I prefer this to base classes (e.g. the fragile base class problem vs. extension by composition) and partials.

Regards,

Ben 

Michael Ainsworth

Jan 13, 2016, 6:38:02 AM
to DDD/CQRS


On Tuesday, 12 January 2016 18:05:44 UTC+11, Danil Suits wrote:
Of course you can't (he said with unjustified bravado); it's the aggregate that you are creating that guards that state.  Why should aggregate X even know which events are generated when aggregate Y is created?  Why should that logic be shared across the different aggregates that might create Y?

Yes aggregates guard their own state. In C++ we have the concept of "friend" classes and functions. I'm using a private constructor for the Aggregate with the Repository (the friend) instantiating it. This allows the Repository to perform cross-aggregate coordination before construction.

Think of it this way: in a multiple-entity aggregate, each entity enforces its own invariants and the aggregate root enforces the invariants between entities. In this sense, we have two facades: the entity protects the invariants of its data and the aggregate protects the invariants of its entities. I'm taking it a step further by having the repository enforce invariants between aggregates (in an eventually consistent rather than transactionally consistent manner).

When you write single-threaded applications that operate solely in memory (not even writing anything to disk), then you realise that invariants are nested within one another, and the entire program can be transactionally consistent. You can have direct memory pointers between your aggregates. As soon as you want to persist your data to disk, you have a "distributed" system because you can't guarantee that an aggregate deserialised from disk is going to have the same memory address, and so you have to allocate unique IDs to each of your aggregates, so that an "in-memory" aggregate can reference an "on-disk" aggregate and vice-versa.

Because I don't have access to it.  I might do this:

class Order {
    public function addLineItem(productId, availabilityService) {
        if (availabilityService.isDiscontinued(productId)) {
            throw new DomainError("You can't purchase a discontinued item");
        }
    }
}

But I'm not willing to pretend that Order.addLineItem() can make any assertions about the state of the Product aggregate in a transactionally meaningful way.  Or put another way, product.isDiscontinued isn't within the Order aggregate boundary, and therefore can not be part of the invariant that the Order is supposed to enforce.

How is your example any different to mine? Perhaps I don't understand what this "availability service" is, especially given that a discontinued/availability status is on the product anyway? If a Product is an aggregate that contains an isDiscontinued property and Order is an aggregate that contains LineItems (which reference the product), and the rule is that discontinued products cannot be ordered, having this check on the Product parameter in the Order method is useful. Sure, the product could be discontinued by another user in the time between when the original thread loads the Product and when it adds it to the order, but this is an "eventually consistent" issue, and the same thing can happen in ACID databases.

Danil Suits

Jan 13, 2016, 4:01:15 PM
to DDD/CQRS

How is your example any different to mine? Perhaps I don't understand what this "availability service" is, especially given that a discontinued/availability status is on the product anyway? If a Product is an aggregate that contains an isDiscontinued property and Order is an aggregate that contains LineItems (which reference the product), and the rule is that discontinued products cannot be ordered, having this check on the Product parameter in the Order method is useful. Sure, the product could be discontinued by another user in the time between when the original thread loads the Product and when it adds it to the order, but this is an "eventually consistent" issue, and the same thing can happen in ACID databases.

My answer: the differences are small, but useful.  At least, I'm finding that the differences are useful to me, because they constrain the way I'm allowed to think about what is going on.  Yves might say I'm overthinking it; I'm comfortable with that.

interface Product extends Aggregate {
    void discontinue();
    boolean isDiscontinued();
}

vs

interface Product extends Aggregate {
    void discontinue();
}

interface ProductProjection extends Projection {
    boolean isDiscontinued();
}


The point being that this forces me to specify which aggregate is in the present, and which is in the past.

Introducing the AvailabilityService just clarifies for me that the Order aggregate doesn't care about all of the Product.


interface AvailabilityService extends Query {
    boolean isDiscontinued(ProductId productId);
}

or even


interface AvailabilityService extends Query {
    boolean isDiscontinued(ProductId productId, Time asOf);
}

or

interface AvailabilityService extends Query {
    boolean isDiscontinued(ProductId productId, Version version);
}

when the Order is pinned to a particular version of the Product.  I'm providing, for the context of the current command, a way for the Order to interrogate some past state of the Product, without constraining the implementation.  Sure, the AvailabilityService can load the entire Product history if that's what makes sense; or load up to some specific event in the history of the product, or just hit the read model for a projection that is tuned for this use case, or....
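
A minimal Java sketch of how the simplest variant could be wired into the Order, under stated assumptions: ProductId is reduced to a plain String, DomainError is a hypothetical unchecked exception, and SnapshotAvailabilityService is a made-up in-memory stand-in for whatever history or read model actually answers the query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical domain exception; the thread names it but never defines it.
class DomainError extends RuntimeException {
    DomainError(String message) { super(message); }
}

// A query over some past state of Product; the implementation is left open.
interface AvailabilityService {
    boolean isDiscontinued(String productId);
}

// Assumed stand-in: answers from a fixed snapshot taken at some earlier time.
class SnapshotAvailabilityService implements AvailabilityService {
    private final Set<String> discontinued;

    SnapshotAvailabilityService(Set<String> discontinued) {
        this.discontinued = Set.copyOf(discontinued);
    }

    @Override
    public boolean isDiscontinued(String productId) {
        return discontinued.contains(productId);
    }
}

class Order {
    private final List<String> lineItems = new ArrayList<>();

    // The service is used transiently; the Order never holds on to it.
    void addLineItem(String productId, AvailabilityService availability) {
        if (availability.isDiscontinued(productId)) {
            throw new DomainError("You can't purchase a discontinued item");
        }
        lineItems.add(productId);
    }

    List<String> lineItems() { return List.copyOf(lineItems); }
}
```

The Order only ever sees a yes/no answer about a past state of the Product, so swapping the snapshot for an as-of-time or as-of-version implementation would not change the aggregate.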

Put another way, separating objects in the present from objects in the past helps clarify which data races matter.  If I'm running a command on the order aggregate, real time changes to the Product don't affect the correctness of my actions at all.  Real time changes to this Order, on the other hand, do invalidate the changes I'm currently making to the Order; if I lose a data race, then this command needs to start over (either a local retry, or throw it back to the client, depending on horses and courses).

This is another reason why I say that aggregate creation is weird.  To use your example, if we issue a checkout command to the Cart aggregate, which races should prevent our command from executing after it has successfully loaded the Cart?  The command is going to be persisting Order events, not Cart events, so the concurrently running emptyCart command doesn't affect our work at all -- in its role as a factory, the Cart "aggregate" has the properties of a projection, rather than the role of an aggregate.

Sending the checkout command to the Cart is consistent with the ubiquitous language; sending it to the Order is consistent with the persistence component.  The implementation is the same in either case.

Maybe I need to push on that harder.  There really isn't any difference between Cart.add(Product) and Product.placeInCart(cart).  Or even User.placeInCart(product).  You pass in a command, you get a list of events back.  The rest is just bookkeeping.
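
A small Java sketch of that equivalence, with every name assumed for illustration (ProductAddedToCart, the String ids, and the method shapes are not from any real code base): whichever object the language says "receives" the call, the same request goes in and the same events come out.

```java
import java.util.List;

// Hypothetical event type, named only for this illustration.
final class ProductAddedToCart {
    final String cartId;
    final String productId;

    ProductAddedToCart(String cartId, String productId) {
        this.cartId = cartId;
        this.productId = productId;
    }
}

final class Product {
    final String id;

    Product(String id) { this.id = id; }

    // Reads as "the product is placed in the cart"...
    List<ProductAddedToCart> placeInCart(Cart cart) {
        return cart.add(this);
    }
}

final class Cart {
    final String id;

    Cart(String id) { this.id = id; }

    // ...and this reads as "the cart receives the product".
    // Either way: a request comes in, a list of events comes out.
    List<ProductAddedToCart> add(Product product) {
        return List.of(new ProductAddedToCart(id, product.id));
    }
}
```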


I'm taking it a step further by having the repository enforce invariants between aggregates (in an eventually consistent rather than transactionally consistent manner).

When you write single-threaded applications that operate solely in memory (not even writing anything to disk), then you realise that invariants are nested within one another, and the entire program can be transactionally consistent. You can have direct memory pointers between your aggregates. As soon as you want to persist your data to disk, you have a "distributed" system because you can't guarantee that an aggregate deserialised from disk is going to have the same memory address, and so you have to allocate unique IDs to each of your aggregates, so that an "in-memory" aggregate can reference an "on-disk" aggregate and vice-versa.

Whoa whoa whoa -- both of those paragraphs ring alarms.

First, I can't align your comment about enforcing invariants with any of the definitions of "eventually consistent" that I've run into.

Second, I can't reconcile your disk metaphor with the message log metaphors that I use.  I got into event sourcing by reading about the Disruptor library.  So I'm used to the idea of multiple producers throwing messages into a queue (which imposes an artificial ordering upon them), and having a single consumer applying those messages in turn to a single in-memory representation of truth, and emitting events, which can in turn be loaded into another queue shared with other producers, and turtles all the way down.

If the consumer is allowed to reject the messages it sees because they don't align with its current view of state, then the consumer is an aggregate, the messages are commands, and the state being used to justify rejecting the command belongs within the aggregate boundary.

When you add the context of a second aggregate to the mix (ie using the state of the Product in the command acting on the Order), what you are doing is analogous to adding (some) ProductEvents to the OrderCommand queue.

Here's the bit that bothers me about that: if the Product.isDiscontinued state really does belong in a separate aggregate, why would it be important to use a more recent stale copy of that state than the one available when the command was created?  The caller is trying to add the product to the cart; presumably the caller was also looking at some stale state of Product.isDiscontinued.  Why is the stale product state loaded by the Order command handler more valid than the stale state that was loaded by the client when creating the command?

Especially since the Order command handler has no way of knowing that it has the more recent Product history.
  1. Client dispatches Product.reissue command
  2. Product accepts the reissue command
  3. Client dispatches the Order.addItem command
  4. Order rejects addItem because Product is discontinued ??
  5. Product_Reissued event becomes visible to Order command handler.

Are you sure that makes sense?  And does it still make sense if most commands succeed -- meaning that 2. and 3. could be re-ordered?


Exercise: work through the same flow when Client and Order are not aggregates, but just Entities within some common aggregate boundary.



For what it is worth, I think Order having an "eventually consistent" view of Product can make sense, but to me it means a different thing; we can send the Order.addItem(Product(Version(200))) command, and the Order can add the Product even though it can't see that version yet.  Commands that need to see "the" state of another aggregate fail when the targeted version is still in the future.  So Order.addItem would work, but Order.validate() or Order.accept() would not because the necessary data hasn't arrived yet.
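
A rough Java sketch of that behaviour, under assumptions: ProductAtVersion is a hypothetical value object, versions are plain long counters, and the map passed to accept() stands in for whatever mechanism reports the latest Product version visible to the Order side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical value object: a product reference pinned to one version.
final class ProductAtVersion {
    final String productId;
    final long version;

    ProductAtVersion(String productId, long version) {
        this.productId = productId;
        this.version = version;
    }
}

class Order {
    private final List<ProductAtVersion> items = new ArrayList<>();
    private boolean accepted = false;

    // Adding only records the pinned reference; no Product state is read.
    void addItem(ProductAtVersion item) {
        items.add(item);
    }

    // Accepting needs the pinned state, so it fails while any pinned
    // version is newer than what is visible on this side.
    void accept(Map<String, Long> latestVisibleVersions) {
        for (ProductAtVersion item : items) {
            long visible = latestVisibleVersions.getOrDefault(item.productId, 0L);
            if (visible < item.version) {
                throw new IllegalStateException(
                    "Product " + item.productId + " at version " + item.version + " has not arrived yet");
            }
        }
        accepted = true;
    }

    boolean isAccepted() { return accepted; }
}
```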




Michael Ainsworth

Jan 13, 2016, 5:47:46 PM
to DDD/CQRS
Thanks for the input. Don't know that I agree with it, but it is helping me think about the problems from different angles.


On Thursday, 14 January 2016 08:01:15 UTC+11, Danil Suits wrote:
Introducing the AvailabilityService just clarifies for me that the Order aggregate doesn't care about all of the Product.

If decoupling like this is required for your use case, that's fine. However, I personally would be hesitant to decouple so much because at the extreme end you could end up having a "service" for each property of your aggregate.

interface Product extends Aggregate {
    void discontinue();
}

interface ProductProjection extends Projection {
    boolean isDiscontinued();
}


Accessing the projection (read model) from within the write model is generally considered a bad idea, and I'd agree with that.
 
or even


interface AvailabilityService extends Query {
    boolean isDiscontinued(ProductId productId, Time asOf);
}
when the Order is pinned to a particular version of the Product.  I'm providing, for the context of the current command, a way for the Order to interrogate some past state of the Product, without constraining the implementation.  Sure, the AvailabilityService can load the entire Product history if that's what makes sense; or load up to some specific event in the history of the product, or just hit the read model for a projection that is tuned for this use case, or....

Why wouldn't you just add such a property to the Product aggregate itself? E.g., you rebuild the aggregate by replaying the events, and when you replay the ProductDiscontinued event, you store the date on which the Product was discontinued (the event timestamp most likely)? This is a simpler solution than decoupling with a service, you don't have to perform an arbitrary query on the event store, and you also don't have to query any "read models" from your write side. The only advantage I can see to using an "availability service" is that you can swap it out to switch between the Product's "current" availability, the "as-of-date" availability and the "as-of-version" availability that you specify, but in general this seems way too complex - the typical use case would be to prevent a user from ordering a product if it is currently discontinued.
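
A minimal Java sketch of that alternative, with assumptions flagged: the ProductDiscontinued event is given a single hypothetical occurredAt field, the history contains only that event type, and isDiscontinuedAsOf is an illustrative name rather than an established API.

```java
import java.time.Instant;
import java.util.List;

// Hypothetical event: the thread names ProductDiscontinued but not its fields.
final class ProductDiscontinued {
    final Instant occurredAt;

    ProductDiscontinued(Instant occurredAt) { this.occurredAt = occurredAt; }
}

class Product {
    private Instant discontinuedAt; // null while the product is still available

    // Rebuild state by replaying the product's history in order.
    static Product replay(List<ProductDiscontinued> history) {
        Product product = new Product();
        for (ProductDiscontinued event : history) {
            product.apply(event);
        }
        return product;
    }

    private void apply(ProductDiscontinued event) {
        // Keep the first discontinuation timestamp seen in the history.
        if (discontinuedAt == null) {
            discontinuedAt = event.occurredAt;
        }
    }

    boolean isDiscontinued() { return discontinuedAt != null; }

    // "As-of-date" availability, answered from the aggregate's own state.
    boolean isDiscontinuedAsOf(Instant asOf) {
        return discontinuedAt != null && !discontinuedAt.isAfter(asOf);
    }
}
```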

Put another way, separating objects in the present from objects in the past helps clarify which data races matter.  If I'm running a command on the order aggregate, real time changes to the Product don't affect the correctness of my actions at all.

Only in your scenario of "product-discontinued-as-of-date" and "product-discontinued-as-of-version", and even then I don't know. I imagine you could still have a customer order a product that is discontinued. I think you'd be better off detecting this out-of-band by querying the read model.
 
Whoa whoa whoa -- both of those paragraphs ring alarms.

First, I can't align your comment about enforcing invariants with any of the definitions of "eventually consistent" that I've run into.

Excuse the rant, but maybe I didn't articulate the concept correctly.

Application A is a single-threaded application that stores everything in memory and writes the entire object graph to disk at once (think a Microsoft Word style application). Application A is transactionally consistent - that is, individual aggregate invariants as well as cross-aggregate invariants can be easily enforced. For example, the ParagraphStyle aggregate enforces that a text foreground colour is supplied, while the ParagraphStyleContainer enforces that all ParagraphStyle aggregates have a unique name.

Application B is a single-process but multi-threaded application accessed by concurrent users (think Google Docs). It can't read/write the entire object graph from/to disk for a whole swag of reasons: performance (loading the entire object graph at once is costly), reliability (one user's changes will either be rejected or will cause another's changes to be rejected), etc, etc. And so Application B must compromise on consistency in order to increase availability. It does this by being more fine-grained, with individual aggregates being read-from/written-to disk to decrease the likelihood of concurrency issues.

When you add the context of a second aggregate to the mix (ie using the state of the Product in the command acting on the Order), what you are doing is analogous to adding (some) ProductEvents to the OrderCommand queue.

My example probably wasn't thought through. I didn't mean to suggest modifying multiple aggregates at once.
 
Especially since the Order command handler has no way of knowing that it has the more recent Product history.
  1. Client dispatches Product.reissue command
  2. Product accepts the reissue command
  3. Client dispatches the Order.addItem command
  4. Order rejects addItem because Product is discontinued ??
  5. Product_Reissued event becomes visible to Order command handler.

For what it is worth, I think Order having an "eventually consistent" view of Product can make sense, but to me it means a different thing; we can send the Order.addItem(Product(Version(200))) command, and the Order can add the Product even though it can't see that version yet.  Commands that need to see "the" state of another aggregate fail when the targeted version is still in the future.  So Order.addItem would work, but Order.validate() or Order.accept() would not because the necessary data hasn't arrived yet.


I think the main issue that you're trying to highlight is that of event ordering. I can kind of understand what you're getting at. In general I'd say the rule is that whenever an aggregate needs to modify itself in relation to the "current" state of another aggregate, there is a potential race condition. This you have to figure out how to handle on a case-by-case basis (automatically cancel the order in a saga/process manager, send an email to the user, notify a staff member to intervene, etc, etc). Whenever an aggregate needs to modify itself against a "particular" state of another aggregate, there is no race condition. The best way to do this is using some kind of service rather than the aggregate itself.

Does this reflect what you're getting at?
 

Michael Ainsworth

Jan 13, 2016, 6:33:46 PM
to DDD/CQRS
On Thursday, 14 January 2016 09:47:46 UTC+11, Michael Ainsworth wrote:
... In general I'd say the rule is that whenever an aggregate needs to modify itself in relation to the "current" state of another aggregate, there is a potential race condition. This you have to figure out how to handle on a case-by-case basis (automatically cancel the order in a saga/process manager, send an email to the user, notify a staff member to intervene, etc, etc). Whenever an aggregate needs to modify itself against a "particular" state of another aggregate, there is no race condition. The best way to do this is using some kind of service rather than the aggregate itself.

I guess the term "current" is ambiguous, because different nodes could have different views on which version of the aggregate is "current". Having said that, if you tag a particular version of a Product (as in the below example), there still could be a later version of that product on another node (e.g., version 201, at which point the product was discontinued), right? If so, does tagging version numbers really solve the problem?

order.addProduct(ProductAtVersion(200));

Danil Suits

Jan 13, 2016, 7:54:38 PM
to DDD/CQRS
No worries - writing it helped me think about the problem from different angles.


However, I personally would be hesitant to decouple so much because at the extreme end you could end up having a "service" for each property of your aggregate.

Primary thought: that's a pain I want to have.  If every property of an aggregate needs to be exposed for consumption by the other write models, then the model has gone badly wrong, and it's supposed to hurt.


you rebuild the aggregate by replaying the events, and when you replay the ProductDiscontinued event, you store the date on which the Product was discontinued (the event timestamp most likely)

Sure, you can implement the service that way if you like....

 the typical use case would be to prevent a user from ordering a product if it is currently discontinued.... I guess the term "current" is ambiguous,

I wouldn't say ambiguous so much as loaded.  That word is doing a huge amount of work, and the scaffolding is weak.

Your revised A and B examples are better for me, but I don't think the eventual consistency is there yet.  In particular, Application B is mostly partitioning.  Note that this is a good fit for the case where Udi writes that we don't need so much complexity in our solutions -- the domain doesn't have a lot of inherent contention to manage.

For eventual consistency to really kick in, we need to start reading each other's documents.  And even there, the consistency guarantee isn't that my read will eventually catch up with your write.  As far as I can tell, eventual consistency only promises:
  1. anything I can see will be internally consistent
  2. the latency to see a change is finite.

The best way to do this is using some kind of service rather than the aggregate itself.


Yes, but not for that reason.  Accessing the state via a Service/Query rather than via the Aggregate forces me to acknowledge that I am reflecting back nostalgically upon the past.

Once you notice that, you can recognize that there are a number of different places you can look back upon the past that are equally valid
  • The aggregate can do it (passing along the most recent version of the productId to the service, and the service compares this with a stale copy of the product history)
  • The command handler can do it (loading up a stale copy of the product history from wherever, and passing the state of the product to the command)
  • The application can do it (as part of the anti corruption layer, packing the data into the command)
  • The client can do it (getting the state of everybody from the read model, and packing it into the message it dispatches to the application)

Of course, if the aggregate is going to be getting the product state from the command, it needs to pitch a fit if the data in the command isn't consistent with its own state -- ie if the productId doesn't match.


In this trivial example, you might note that the application, or the command handler, could notice that the product has been discontinued and refuse to validate the command.  And you'd be right... but you can't do that without the business logic leaking out of the aggregate into the application layer.  If we were dealing with more complicated business rules, that combined information about the product with information about the current state of the order, where should


If so, does tagging version numbers really solve the problem?

 Solve, no.  But it surfaces the problem, and begins to introduce some language around the problem that makes sense.  For instance, you could move the discontinued product validation to Order.accept(), which is verifying that all of the product descriptions are at least as recent as some time.

I proposed a similar idea for the Specification example we talked about a while back; if I'm signing off on a document, I want to be sure the one that I am signing matches the one that I just read.  So my accept command references a specific version of the aggregate, which in turn references a specific version of other aggregates.  In short, it's a hack to ensure that an aggregate that references other aggregates still has immutable state (provided the referenced aggregates provide the same turtles all the way down).
 

Ben Kloosterman

Jan 13, 2016, 9:59:03 PM
to ddd...@googlegroups.com


When you write single-threaded applications that operate solely in memory (not even writing anything to disk), then you realise that invariants are nested within one another, and the entire program can be transactionally consistent. You can have direct memory pointers between your aggregates. As soon as you want to persist your data to disk, you have a "distributed" system because you can't guarantee that an aggregate deserialised from disk is going to have the same memory address, and so you have to allocate unique IDs to each of your aggregates, so that an "in-memory" aggregate can reference an "on-disk" aggregate and vice-versa.

There is nothing special about this in C++. I did a lot of coding in it, but now use C# for general-purpose work and Rust for lower-level work, and the same issues exist in all languages. This is why we don't do what you have been describing (linking aggregates and providing repositories in the domain): it places data access (i.e. plumbing) in the domain, which is one of the main things we want to avoid. I will repeat: this is not about avoiding SQL or Oracle but about avoiding data access. It is this data access / querying that is responsible for most of the pollution of the domain in typical OO systems, which DDD and CQRS try so hard to avoid. So the command handler requests the aggregate from the repository; the repository normally uses an identity map and ensures the request always points to the same aggregate.
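
A minimal Java sketch of a repository backed by an identity map, under assumptions: Order is reduced to an id, and the loader function is a hypothetical stand-in for whatever actually rehydrates the aggregate from storage. The command handler is the only caller; the aggregate never sees the repository.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Aggregate reduced to its identity for this illustration.
class Order {
    final String id;

    Order(String id) { this.id = id; }
}

class OrderRepository {
    // Identity map: one in-memory instance per aggregate id.
    private final Map<String, Order> identityMap = new HashMap<>();
    // Assumed stand-in for loading the aggregate from storage.
    private final Function<String, Order> loader;

    OrderRepository(Function<String, Order> loader) {
        this.loader = loader;
    }

    // Loads on first request, then keeps returning the same instance.
    Order getById(String id) {
        return identityMap.computeIfAbsent(id, loader);
    }
}
```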


Because I don't have access to it.  I might do this:

class Order {
    public function addLineItem(productId, availabilityService) {
        if (availabilityService.isDiscontinued(productId)) throw new DomainError("You can't purchase a discontinued item");
    }
}

This is not DDD or CQRS but more typical OO. Why not check before calling the function / entering the domain and instantiating the aggregate?

 
But I'm not willing to pretend that Order.addLineItem() can make any assertions about the state of the Product aggregate in a transactionally meaningful way.  Or put another way, product.isDiscontinued isn't within the Order aggregate boundary, and therefore can not be part of the invariant that the Order is supposed to enforce.

How is your example any different to mine? Perhaps I don't understand what this "availability service" is, especially given that a discontinued/availability status is on the product anyway? 


"Sure, the product could be discontinued by another user in the time between when the original thread loads the Product and when the adds it to the order, but this is an "eventually consistent" issue, and the same thing can happen in ACID databases." 

Absolutely correct, so why do a read check in the write domain? It can be done beforehand; the extra ms will not greatly increase the chance of an issue. Which is why I suggested doing the check in the facade / web service.

As I have said before, this is something like the 4th choice. It violates at least "commands should rarely fail", and possibly "no queries in the write domain". Sometimes you have to do it, e.g. the AvailabilityService may be a 3rd party web service and may affect behavior rather than just deny. Also, you have made an IO call (very slow) here, so you stall the domain if it is single threaded. If it is multi threaded, the state can still change after the check and before (or while) the rest of your code runs. For a single threaded domain it is better to check this before sending the command to the domain, e.g. in a multi threaded web call handler, and then pass the result to the domain. Note that single threaded domains work really well because the read is often cached, and because the state is small (e.g. a few strings) in most systems you can just run it out of memory.
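
A rough Java sketch of checking before the command is sent, with all names assumed for illustration: OrderFacade plays the web call handler, the Set of discontinued product ids stands in for a cached read-model query, and the dispatched list stands in for a command bus.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical command; carries only what the write side needs to mutate.
final class AddLineItem {
    final String orderId;
    final String productId;

    AddLineItem(String orderId, String productId) {
        this.orderId = orderId;
        this.productId = productId;
    }
}

// Assumed facade in front of the write domain (e.g. a web call handler).
class OrderFacade {
    private final Set<String> discontinuedProducts; // stand-in for a read-model query
    private final List<AddLineItem> dispatched = new ArrayList<>(); // stand-in for a command bus

    OrderFacade(Set<String> discontinuedProducts) {
        this.discontinuedProducts = discontinuedProducts;
    }

    // The read-side check happens here, before any command exists,
    // so the command that reaches the write domain should rarely fail.
    boolean addLineItem(String orderId, String productId) {
        if (discontinuedProducts.contains(productId)) {
            return false;
        }
        dispatched.add(new AddLineItem(orderId, productId));
        return true;
    }

    List<AddLineItem> dispatched() { return List.copyOf(dispatched); }
}
```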

I think you may be running into the concept of what a CQRS domain is. It is confusing; I called it "split brain" for a long time. A CQRS write domain is not the whole domain; ideally it has just the mutation logic (not validation!) and is by itself anemic. That gives it a special place: it has few lines (things like a name with no logic or validation do not even need to exist in the write domain, which keeps it very small), it is very simple, and it controls mutation and consistency. These things, and their relationship to the business, are so important that they deserve to be on a pedestal. There is a lot of domain logic in the facade / anti corruption layer, in the read model, and in pre-generating / validating commands, but that is not the write domain and would typically live in a different assembly / dll. I like to think of the write domain here as a simple, higher level domain. Also think of it this way: a write domain would often have a 20+ year lifespan and would evolve, while the rest of the domains and GUIs get replaced far more often.


Look at Greg's SimpleCQRS Domain.cs: https://github.com/gregoryyoung/m-r/blob/master/SimpleCQRS/Domain.cs . Note the aggregates (the set of these is the write domain), note what they touch, and note how much plumbing there is. In a larger project I would pull the aggregate-root / repository classes out, as they are used by many domains (but not if it's this small). Note that the aggregates do not touch or call the repository.

I may be a militant, but I have learned a few lessons the hard way. CQRS allows very simple systems, and I like it that way; pollution has a much heavier price here than in already polluted domains :-)

Regards, 

Ben