CQRS with huge collections in the domain


Matt

unread,
May 17, 2011, 8:07:16 PM5/17/11
to DDD/CQRS
Domain is Stock Market analysis.

Domain currently setup like so:

Stock (entity)

Price (entity)

Stock contains many Prices.

There can be thousands (possibly millions) of prices per Stock, as it
needs to keep historical prices for calculations of things like
Technical Indicators.

I have a command like "CalculateThirtyDaySimpleMovingAverage". This
command will need to retrieve the past 30 days of prices (for a stock)
to calculate.

I have another command like
"CalculateTwoHundredDaySimpleMovingAverage". This command requires
200 days of prices.

There may be a lot more commands like these, which require different
amounts of prices. Some of the commands may require thousands or
millions of prices (possibly all prices back to the beginning of
time).

How would you handle this type of setup in CQRS? When the
'CalculateTechnicalIndicator' command arrives and requires 'n' prices
for a stock, if I have Stock as an aggregate root with a collection
of prices, then this will result in a performance issue, because all
prices will be loaded into memory when in fact I might only need 30 of
them.

I could go down the path of using a 'PriceRepository' which had a
method like 'GetPricesByDateRange(int stockId, DateTime start,
DateTime end)', so that only the prices needed would be loaded,
but this goes against the CQRS principle of only loading up
Aggregate Roots.

How would people handle this while sticking to CQRS principles (and
ensuring performance doesn't become an issue)?

Daniel Yokomizo

unread,
May 17, 2011, 8:37:00 PM5/17/11
to ddd...@googlegroups.com
Why use CQRS in this domain? What benefits do you want to achieve?

IME domains geared towards data analysis and such are better served by
a more traditional architecture.

Matt

unread,
May 17, 2011, 8:48:32 PM5/17/11
to DDD/CQRS
The main reason I went down the CQRS path with this domain was to
increase my query performance for reporting. E.g. When a user views
say a Price Chart in a Web Page, having the queries separated from the
domain was extremely beneficial. This solved performance issues in
the Read-Model / Web UI side of things. However I'm now stuck with
similar issues of dealing with large collections of data in the domain
to perform calculations.

I understand CQRS may not be the ideal model for data analysis,
however I would still like to know how to deal with situations like
this (Aggregates with extremely large collections of data) in the
CQRS / DDD world.


Daniel Yokomizo

unread,
May 17, 2011, 9:03:04 PM5/17/11
to ddd...@googlegroups.com
Aggregate roots are mostly invariant and consistency boundaries.
There's usually no reason to have large aggregate roots (because
usually there are few entities per AR).

Anyway in your scenario I would suggest a few things:

You can have CQRS without event sourcing. Just external events and
command handlers doing (for example) transaction scripts.

If you do have a rich analysis domain, I would try to work only with an
expression domain and leave the actual calculation on the query side.
For example, if the user wants to make an analysis to compare some
trends or create a what-if scenario, you can just build a bunch of
expression value objects and interpret them on the query side to
generate some smart queries in SQL, or map/reduce functions in a NoSQL
solution.
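The expression-value-object idea could be sketched roughly like this: the write side only builds small immutable expressions, and a query-side interpreter turns them into SQL. All names here (`Sma`, `Spread`, `to_sql`, the `price` table) are illustrative, not from the thread:

```python
from dataclasses import dataclass

# Expression value objects: they capture *what* the user wants computed,
# not *how*. (All names here are hypothetical.)
@dataclass(frozen=True)
class Sma:
    """Simple moving average of a stock over the last `days` trading days."""
    stock_id: int
    days: int

@dataclass(frozen=True)
class Spread:
    """A what-if comparison: the difference between two averages."""
    left: Sma
    right: Sma

def to_sql(expr) -> str:
    """Interpret an expression tree on the query side as a SQL string."""
    if isinstance(expr, Sma):
        return ("SELECT AVG(close) FROM (SELECT close FROM price "
                f"WHERE stock_id = {expr.stock_id} "
                f"ORDER BY day DESC LIMIT {expr.days}) AS w")
    if isinstance(expr, Spread):
        return f"SELECT ({to_sql(expr.left)}) - ({to_sql(expr.right)})"
    raise TypeError(f"unknown expression: {expr!r}")
```

The same expression objects could equally be interpreted into map/reduce jobs; only the interpreter changes.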

Matt

unread,
May 17, 2011, 9:15:12 PM5/17/11
to DDD/CQRS
Daniel, I appreciate your quick responses.

I am currently not using Event Sourcing. The underlying data store is
a normalized relational db. The domain (write model) side of things
is using NHibernate. The query/read side of things is just using raw
sql mapping to DTOs (using automapper).

The suggestion to move the 'calculation logic' into the
query side is an issue for my current situation. Some of these
'calculations' simply cannot be done purely in a query/SQL. Some
require the use of third-party libraries/components to perform the
actual calculation; others are complex operations that are best
modeled as some sort of 'domain service' that performs the
calculation. This is something I really feel belongs in the domain/
write-model side of things. Once the calculation is done, an entity
is created ('TechnicalIndicator') and stored in the database. This
means I never have to perform that calculation again (it can simply be
retrieved from the db as an entity (write model) or a DTO (read
model)).

So given this situation I'm still stuck with the domain needing to
handle these large collections of prices to perform the calculations.


Nuno Lopes

unread,
May 18, 2011, 4:41:20 AM5/18/11
to ddd...@googlegroups.com
Hi Matt,

> How would you handle this type of setup in CQRS. When the
> 'CalculateTechnicalIndicator' command arrives, and requires 'n' prices
> for a stock. If I have Stock as an aggregate root, with a collection
> of prices, then this will result in a performance issue because all
> prices will be loaded into memory when in fact I might only need 30 of
> them.


You should consider CQRS useful only when you need multiple representations of the same fact space. One of the reasons for this need might be algorithmic. So if your problem is as you describe, you need a more efficient representation to perform calculations.

A Traditional CQRS configuration is the following:

Transaction Model (a domain model responsible only for executing transactions: AGGREGATES), OLTP
Presentation Model (a model built towards presentation needs, VIEWS).

You can add other models.

Analysis Model (OLAP and so on)
Reporting Model
Archive Model

and so on.

The Transaction Model is the core. This model feeds all other models.

On another note, you may consider that each model uses different storage schemes for performance reasons.

> I have a command like "CalculateThirtyDaySimpleMovingAverage". This
> command will need to retrieve the past 30 days of prices (for a stock)
> to calculate.


The term command in CQRS is usually reserved for the Transaction Model. Unless the calculation needs to be done within a transaction and changes state it should not be called a Command but a Query.

> Once the calculation is done, an entity
> is created ('TechnicalIndicator') and stored in the database.

On another note, you should probably consider that you have more than one bounded context.

Usually Analysis supports Decisions, yet each is done in a distinct bounded context, and they aren't necessarily consistent to the split second.

Cheers,

Nuno

Matt

unread,
May 18, 2011, 5:35:51 AM5/18/11
to DDD/CQRS
This line confuses me a bit:

> The term command in CQRS is usually reserved for the Transaction Model. Unless the calculation needs to be done within a transaction and changes state it should not be called a Command but a Query.

The "calculation" will create a new piece of data (whether that is an
entity or a value object is up for debate), that needs to be persisted
somewhere. I don't see how this is not a 'Command'. The result of
the calculation can be retrieved in a Query later on when used for
analysis/ to make trading decisions.

The question still remains: when the calculation is performed in the
domain, what is the best way of dealing with a huge collection of
prices? This doesn't have to be specific to the Stock Market domain
used as an example here, but any domain where an Aggregate Root
contains an extremely large collection of child objects that need to
be used to perform business logic.

Nuno Lopes

unread,
May 18, 2011, 6:38:04 AM5/18/11
to ddd...@googlegroups.com
Hi,

> This line confuses me a bit:
>
>> The term command in CQRS is usually reserved for the Transaction Model. Unless the calculation needs to be done within a transaction and changes state it should not be called a Command but a Query.
>
> The "calculation" will create a new piece of data (whether that is an
> entity or a value object is up for debate), that needs to be persisted
> somewhere. I don't see how this is not a 'Command'. The result of
> the calculation can be retrieved in a Query later on when used for
> analysis/ to make trading decisions.

Just because some value needs to be persisted in some database, it does not mean that it needs to be part of the Domain Model, or that it is necessarily transactional. It seems to me that you aren't describing any business rules that need to be enforced at all times. Yes, it is business logic, but these aren't really business rules (when this, do that; if this, then that; and so on).

From a data point of view, a Domain Model is mainly an effective way to fetch very specific facts about the domain: what happened, when, and why on a specific Aggregate at a specific time T. What was the total of an Order made 3 months ago (T)?

The scenario you are describing seems more like an overview of what happened across time, involving multiple dimensions over a single type of fact (PriceChanged).

To this a different model might be more suited: http://www.dwreview.com/OLAP/Introduction_OLAP.html

http://msdn.microsoft.com/en-us/library/aa902683(v=sql.80).aspx

You can build yourself a generic analytical engine (and for that matter use domain modeling), or you can use an OLAP tool from some vendor. Either way, I would probably tackle it as a separate Bounded Context from, say, Client Portfolio Management.

Thanks.

Laurynas Pečiūra

unread,
May 18, 2011, 6:53:58 AM5/18/11
to ddd...@googlegroups.com
+1 to Daniel's post.

From a purist perspective: does the calculation (however complex) change any external state? If not, it is a query. "AttachTechnicalIndicator" (after the calculation is complete) has all the reasoning to be a command.

Matt

unread,
May 18, 2011, 7:17:14 AM5/18/11
to DDD/CQRS
Nuno and Laurynas, thank you for taking the time to respond.

I'm starting to see now how this can be viewed as a query, and that
the prices & technical indicators could be treated as a different
model (bounded context). I guess I was stuck in the mindset that all
queries had to hit a database. If I go down the path of treating the
calculation as a Query in a separate bounded context, what is
considered best practice for sharing data between the two models?
E.g. the Portfolio Context would need to know prices & indicators to
make trading decisions and determine Order sizes. If that is the case,
how can I pass, say, a list of prices + indicators into the Portfolio
Context? Would you pass a list of Price DTOs and Technical
Indicator DTOs into the Portfolio Domain?


Laurynas Pečiūra

unread,
May 18, 2011, 8:08:22 AM5/18/11
to ddd...@googlegroups.com
Is there any place where a "TechnicalIndicator" would make sense outside "Portfolio" context? If not, then send the "AddTechnicalIndicator" command to the relevant aggregate root in "Portfolio" context. Passing them around as Value Objects is an option as well.

Off topic. Discovering the Context Map (a representation of the Bounded Contexts involved in a project and the actual relationships between them and their models) is by far the hardest and most crucial step in modeling for me. My suggestion would be to take your time with that. There is a nice paper that might help you a bit.

Greg Young

unread,
May 18, 2011, 8:23:03 AM5/18/11
to ddd...@googlegroups.com
I have domain experience here and will be writing a response. However
it may be better to take this offline due to the domain specific
stuff.

These types of systems are generally not "CQRS" but are pure event based.

Also a "Technical Indicator" generally is associated with a "Group of
Instruments"


--
Grammar and syntax errors have been included to make sure
I have your attention

Ales Vojacek

unread,
May 18, 2011, 8:46:23 AM5/18/11
to ddd...@googlegroups.com
+1 for sharing the solution here, I'm interested too.
If it is possible, of course.

Rinat Abdullin

unread,
May 18, 2011, 8:56:28 AM5/18/11
to ddd...@googlegroups.com
+1 for sharing, if possible :)

Rinat

Laurynas Pečiūra

unread,
May 18, 2011, 9:02:26 AM5/18/11
to ddd...@googlegroups.com
+1 for above.

Nuno Lopes

unread,
May 18, 2011, 9:07:34 AM5/18/11
to ddd...@googlegroups.com
+1 for sharing.

Nuno

James H

unread,
May 18, 2011, 10:05:29 AM5/18/11
to ddd...@googlegroups.com
Hi Matt, admittedly I don't have much CQRS experience, but your long-running commands should be handled autonomously and asynchronously by some sort of messaging component (e.g. NSB, MT, pick your poison here). I would think technical indicator calculation and persistence only need to be performed as prices arrive (or change, if your model supports that) rather than by user request. A 30 day SMA is the same for everyone.

Long-running, user-initiated, analysis-style commands (e.g. calculate value at risk for a particular set of instruments and curves) should be async as well. A user would submit the command and check back (or be notified) once the process is complete. I believe that can still fit in a CQRS model as I've seen it described by Udi.

Greg Young

unread,
May 18, 2011, 10:13:44 AM5/18/11
to ddd...@googlegroups.com
Are all your algorithms known? Or are they dynamic too like "calculate
moving average(60)"?

--

Matt

unread,
May 18, 2011, 6:14:22 PM5/18/11
to DDD/CQRS
Yes, they will all be "known". I made the decision to restrict the
algorithms to a known set so that they could all be calculated as soon
as a price changes.

Matt Callahan

unread,
May 19, 2011, 12:20:12 AM5/19/11
to DDD/CQRS
I just realised there is a very similar discussion to what I was originally asking here:

Although my question was more about handling very, very large sets of data in the domain, that discussion was talking more generally about how set-based calculations should be performed.

Greg, I'd still be very interested to hear your point of view on how to handle these situations. It would be great if you could elaborate on the "pure event based" type of system you were talking about.  How does this differ from a traditional CQRS/DDD setup?

Tom Janssens

unread,
May 19, 2011, 3:31:04 AM5/19/11
to ddd...@googlegroups.com
I am no domain expert in this matter, but this would be how I would solve it (KISS):

a NewStockPricePublished event handler generates a view which is composed like this:

class CalculatedValue
{
  string StockId;
  string Name;        // e.g. EMA[10], SMA[20], BOL.HI[20], BOL.LO[20], RSI[10]
  DateTime Timestamp; // or some similar uid
  decimal Value;
}



Lorenzo

unread,
May 19, 2011, 4:20:28 AM5/19/11
to DDD/CQRS
Hi,
I'm not an expert at all here, but, in my understanding, I have the
impression that the "CalculateThirtyDaySimpleMovingAverage" COMMAND
is not really a command, but rather a QUERY, i.e.
"ShowThirtyDaySimpleMovingAverage".
I would agree with Tom's solution (if I understand it correctly) and
have an event handler that updates a
"ThirtyDaySimpleMovingAverage" view.
Does it make any sense?

Lorenzo

Matt Callahan

unread,
May 19, 2011, 4:44:34 AM5/19/11
to ddd...@googlegroups.com
If we treat the calculation as a "Query", the result of that query still somehow has to get into the Domain. When new "technical indicators" are calculated, I would want something like a "TechnicalIndicatorCalculated" domain event so that systems (or other ARs in the domain) could react to that event and do some other processing.

In this scenario, I would have to do a Query (which would do the calculation), and then pass the results of that query into the Domain as a Command (this was mentioned in one of the previous responses).

So something like this:
1. queryService.CalculateTechnicalIndicator(...) [Query]
2. commandService.RegisterNewTechnicalIndicator(...) [Command]

The bit where I feel uncomfortable with this approach is: what is actually coordinating these steps above? How can I trust that what is being passed into the RegisterNewTechnicalIndicator command is correct data?

This is getting me thinking about the Query side of things from a new angle. Nearly all examples of CQRS on the internet that talk about the read model / query side talk in terms of DTOs being displayed on a UI/screen. In this case the query DTO is really just going to be passed into the domain. This feels a bit strange to me. My gut is telling me that it's really the domain's responsibility to be doing these calculations - that way all logic of how it is calculated is encapsulated inside the domain model. Any thoughts?

On a side note, thank you to everyone that is responding to this discussion - I am finding it really helpful in refining my thinking - even if I'm still not comfortable with any of the solutions just yet :)

Lorenzo

unread,
May 19, 2011, 5:24:53 AM5/19/11
to DDD/CQRS
Matt,
I guess we are getting there...
If the "knowledge" of how to do the calculation belongs to the domain,
possibly that is where the logic needs to go.
So commandService.RegisterNewTechnicalIndicator(...) will have to
do the calculation and publish an event, something like
"TechnicalIndicatorsCalculated".
Your "read" side should pick up that event and update the data for the
views.
You should also consider the fact that the
calculation might be resource-consuming and might impact the
responsiveness of the command (unless done async, with all the
connected problems).
My feeling is that you might need to split the work into two different
commands:

commandService.RegisterPriceChanged() {
  applyEvent(PriceChangedRegistered)
  sendCommand(UpdateTechnicalIndicator)
}
commandService.UpdateTechnicalIndicator() {
  applyEvent(NewTechnicalIndicatorUpdated)
}

queryService.subscribe(PriceChangedRegistered) {
  // update read model for the PRICES
}
queryService.subscribe(NewTechnicalIndicatorUpdated) {
  // update read model for the TECHNICAL INDICATORS
}

Lorenzo

Matt Callahan

unread,
May 19, 2011, 5:36:39 AM5/19/11
to ddd...@googlegroups.com
Lorenzo,

Yes, that is very very close to how I am currently thinking. I agree entirely with the different commands and events you mention.

However the issue I still face is:

In the 'UpdateTechnicalIndicator' command: when this arrives, to perform the actual calculation it (the command handler, or a domain service) will need to load up a huge amount of historical prices. What is the best way to do this? Remember that the actual number of prices required will be different for each type of 'Technical Indicator'.

I can think of a few options:

Option 1: The command handler calls something like _priceRepository.GetPricesByStockAndDateRange(..), and passes these prices into a method on an AR (or Domain service) which performs the calculation. This seems to go against all CQRS wisdom, because the repositories should only load up a single AR (if I understand correctly).

Option 2: All the prices are passed into the domain with the Command. So the command would have a collection of 'PriceDTO' inside it. This option, however, was where I was feeling very uncomfortable, because I would much prefer the prices to come from inside the domain, i.e. a trustworthy source.

Option 3: Maybe I'm missing an Aggregate Root which would change my whole mindset of how this is to be solved?

Jimit

unread,
May 19, 2011, 6:02:48 AM5/19/11
to ddd...@googlegroups.com
I agree with Nuno above. This sounds like a scenario best suited for an analytical model. Queries should be posted to that model.

Nuno Lopes

unread,
May 19, 2011, 6:03:27 AM5/19/11
to ddd...@googlegroups.com
Hi Matt,

You need to understand that the proper approach is highly dependent on your algorithms.

Typically in CQRS, you don't query your Transactional Model, you issue commands.

> 1. queryService.CalculateTechnicalIndicator(... ) [Query]


As I said before, I don't know if CalculateTechnicalIndicator is a command or a query. But assuming you want indicators to be consistent on a time line with each other, I would assume that you need an Aggregate somewhere.

I have never been on a project related to stocks, so sorry if this example looks naive.

For instance, say you want to calculate the average of prices since the beginning of time for some stock. Would you need to process millions of StockPriceChanged events every time there is a change? NOOOOOOO.

Consider an Aggregate, AverageStockPrice, for the stock of some company. When a new price change comes in, it updates the total and easily computes the average. It can keep track of averages across time. It does not need to analyze millions of price changes; it is done incrementally.

AverageStockPrice.Handle(stockPriceChanged) -> AverageStockPriceChanged(); // NOT CQRS.

Or if you want to go for the CQRS way:

StockPriceChanged Handler:

AverageStockPrice.Calculate(stockPriceChanged.newPrice) -> AverageStockPriceCalculated(....);

I prefer the first.

You can have a massive number of indicators being computed in parallel like this.
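The incremental AverageStockPrice aggregate described above could look roughly like this (a Python sketch with assumed names; the thread's own snippets are C#-flavoured pseudocode):

```python
class AverageStockPrice:
    """Aggregate sketch: maintains a running average incrementally,
    so it never has to re-read millions of StockPriceChanged events."""

    def __init__(self, stock_id):
        self.stock_id = stock_id
        self.total = 0.0
        self.count = 0

    def handle_price_changed(self, new_price):
        # O(1) per event: update the total and count, recompute the average.
        self.total += new_price
        self.count += 1
        # A real aggregate would now publish AverageStockPriceChanged.
        return self.average

    @property
    def average(self):
        return self.total / self.count
```

One such instance per stock (and per indicator) can run in parallel, each keeping only the tiny state it needs.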

My advice would be to focus on those algorithms and find ways to optimize their computations fed with StockPriceChanged events. In other words, make a computation model for those. The difficulty is when parameters are arbitrary. Say, in the above, the time line over which the average is to be computed is fully arbitrary; then you are left with calculating the average over all the events every time.

Does it really matter if CalculateTechnicalIndicator is a Query or Command?

You can then use CQRS on top to push results to Views or whatever other Bounded Context.

Cheers,

Nuno


Richard Dingwall

unread,
May 19, 2011, 6:45:56 AM5/19/11
to ddd...@googlegroups.com
In quant terms, it depends on whether you want your technical
indicators to be a 'live' or an 'offline' model.

Live models will update a running total as every tick arrives (i.e.
triggered by events), in the fastest way possible (caching etc). This
is suited to things like HFT and algo trading and might trigger
further events - e.g. you might have entry/exit strategies as DDD
sagas waiting for particular technical indicator events indicating
favourable conditions to execute buy/sell orders.

Offline models, on the other hand, are simply queries over the read
model (select avg(price) from tick...). This is what you would use for
simpler ad hoc reporting/analysis.

Both styles are a good example of where it is acceptable to use domain
logic on the read side.

--
Richard Dingwall
http://richarddingwall.name

Michael.

unread,
May 19, 2011, 6:57:12 AM5/19/11
to DDD/CQRS
Matt,

The things you describe sound more like a chain of Event Handlers
rather than Commands on the domain.

I also have no domain knowledge but here is how I could see this
working:

1. Something in the actual domain publishes a StockPriceUpdatedEvent
2. The CalculateThirtyDaySimpleMovingAverage EventHandler picks up the
event. It then queries a table on the Read side to get the historical
prices it needs to do its calculation.
3. Once the calculation is done, the EventHandler publishes a
TechnicalIndicatorCalculatedEvent (which includes the stock, type of
indicator, new value, etc.)
4. An event handler picks up the TechnicalIndicatorCalculatedEvent and
writes it to the appropriate table on Read side.
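The four steps above could be sketched with a tiny in-memory event bus (all names are illustrative; a real system would use a messaging component and real read-model tables):

```python
from collections import defaultdict

subscribers = defaultdict(list)                # event name -> list of handlers
read_side = {"prices": [], "indicators": []}   # stand-in for read-model tables

def publish(event_name, payload):
    for handler in subscribers[event_name]:
        handler(payload)

def calculate_thirty_day_sma(payload):
    # Step 2: on StockPriceUpdated, consult the read side for the history
    # needed for the calculation. (Here the same handler also records the
    # price, to keep the sketch self-contained.)
    read_side["prices"].append(payload["price"])
    window = read_side["prices"][-30:]
    sma = sum(window) / len(window)
    # Step 3: publish the result as a new event.
    publish("TechnicalIndicatorCalculated",
            {"stock": payload["stock"], "type": "SMA30", "value": sma})

def store_indicator(payload):
    # Step 4: write the indicator to the appropriate read-side table.
    read_side["indicators"].append(payload)

subscribers["StockPriceUpdated"].append(calculate_thirty_day_sma)
subscribers["TechnicalIndicatorCalculated"].append(store_indicator)

# Step 1: something in the domain publishes the price change.
publish("StockPriceUpdated", {"stock": "ACME", "price": 100.0})
publish("StockPriceUpdated", {"stock": "ACME", "price": 102.0})
```

The point is the chain: a domain event triggers a handler, which publishes a further event, which another handler persists.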

Michael.

Matt Callahan

unread,
May 19, 2011, 7:12:35 AM5/19/11
to ddd...@googlegroups.com
Michael,

A few questions:

For step 2: "The CalculateThirtyDaySimpleMovingAverage EventHandler picks up the

event. It then queries a table on the Read side to get the historical
prices it needs to do its calculation."

Would you consider this event handler part of the domain? Or would it sit outside the domain?  (I currently consider the object that does the calculation to be part of the domain)

For step 3: "Once the calculation is done, the EventHandler publishes a
TechnicalIndicatorCalculatedEvent"

Is it considered normal / best practice to publish another event from an Event Handler?  

Nuno Lopes

unread,
May 19, 2011, 10:08:17 AM5/19/11
to ddd...@googlegroups.com

A note

> For step 2: "The CalculateThirtyDaySimpleMovingAverage EventHandler picks up the
> event. It then queries a table on the Read side to get the historical
> prices it needs to do its calculation."


I think you probably mean the CalculateThirtyDaySimpleMovingAverage CommandHandler.

CalculateThirtyDaySimpleMovingAverage is definitely not an event (what fact or occurrence does it represent?)

Nuno


Richard Dingwall

unread,
May 19, 2011, 1:08:06 PM5/19/11
to ddd...@googlegroups.com
On 19 May 2011 12:12, Matt Callahan <matt...@gmail.com> wrote:
> Michael,
> A few questions:
> For step 2: "The CalculateThirtyDaySimpleMovingAverage EventHandler picks up
> the
> event. It then queries a table on the Read side to get the historical
> prices it needs to do its calculation."
> Would you consider this event handler part of the domain? Or would it sit
> outside the domain?  (I currently consider the object that does the
> calculation to be part of the domain)
> For step 3: "Once the calculation is done, the EventHandler publishes a
> TecnichalIndicatorCalculatedEvent"
> Is it considered normal / best practice to publish another event from an
> Event Handler?

Domain logic doesn't have to only be on the write side. Calculating
technical indicators is definitely part of your domain model.

And it is no problem publishing events from event handlers.

James

unread,
May 19, 2011, 3:53:24 PM5/19/11
to ddd...@googlegroups.com
Assuming you want a 30 day moving average for the previous 30 days,
every day: set up a saga that is created at midnight each day and
starts consuming messages that are relevant to calculating the technical
indicator. At the end of the 30 days (saga timeout), the saga sends the
TechnicalIndicatorCalculated command to the aggregate, or it can send a
CalculateTechnicalIndicator command and pass in the data the saga has
been capturing for the previous 30 days.

Again, no experience in this industry.

James

Udi Dahan

unread,
May 19, 2011, 6:12:09 PM5/19/11
to DDD/CQRS
Averages are things that don't require historical data to be
calculated - you can do it with event streams. All you need to store
is the previous average and number of data points:

Average(n+1) = Average(n) * n / (n+1) + New_Value / (n+1)

Thus, a 30 day moving average is a simple saga that handles the price
changed event whose data is Previous_Average and N. It could then
publish an event indicating the new average which is then stored in
your view model. You have one instance of this saga for each stock.

Your domain model is a saga.
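As a sketch (Python, hypothetical names): the formula above gives a cumulative average from just the previous average and count; a fixed 30-day *moving* average additionally has to drop the value leaving the window, which a small per-stock saga can do by keeping only the window itself:

```python
from collections import deque

class MovingAverageSaga:
    """One instance per stock: handles price-changed events and keeps
    a fixed-window moving average without re-reading price history."""

    def __init__(self, window=30):
        self.prices = deque(maxlen=window)  # oldest value is evicted when full
        self.total = 0.0

    def handle_price_changed(self, price):
        if len(self.prices) == self.prices.maxlen:
            self.total -= self.prices[0]    # remove the value leaving the window
        self.prices.append(price)
        self.total += price
        # A real saga would publish a moving-average-changed event here,
        # for the view model to store.
        return self.total / len(self.prices)
```

The state carried per stock is just the window and a running total, regardless of how many years of prices exist.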

Cheers,

-- Udi Dahan

Matt Callahan

unread,
May 19, 2011, 6:42:49 PM5/19/11
to ddd...@googlegroups.com
Thank you again for all the responses.

Unfortunately the Simple Moving Average is the simplest possible example of a Technical Indicator.

Some of the more complex ones I don't even calculate myself, I pass a series of prices into a 3rd party component which does the calculation for me.  

I will have to think more about "domain logic" living on the read-model side of things. In an "eventually consistent" world (which I'm currently not in - both models currently use the one data source) I feel this may be problematic, because the calculations may run off stale data - which is not acceptable in such a critical system dealing with large sums of money in the stock market.

Matt Callahan

unread,
May 19, 2011, 6:45:48 PM5/19/11
to ddd...@googlegroups.com
James, thanks for this idea. I will have to look more into Sagas. I don't quite understand them properly yet.

Greg Young

unread,
May 19, 2011, 7:00:45 PM5/19/11
to ddd...@googlegroups.com
This is a "saga" only if you equate anything that can be implemented
as a state machine with a saga and don't think about the intent of
the pattern. (There are some who would make the argument, correctly,
that any imperative code can be implemented as a state machine and
thus all code would be sagas.) There is no concept of a long-running
transaction here.

In reality it's just a small FSM to do the work.

--

Greg Young

unread,
May 19, 2011, 7:02:55 PM5/19/11
to ddd...@googlegroups.com
Pretty much all of the technical indicators can be done this way
(we did them).

As I mentioned previously, this sounds like a pure eventing system. The
best way to think about it is in terms of cascading. Events come into
your system (PriceChanged); you have a series of FSMs (Finite State
Machines) that listen to these events and on their own produce a
series of new events (let's say 30 minute moving average changed).
Then these events might go to other handlers who in turn produce their
own events (BuySignalFound). Etc etc etc. When we model the system we
look at it in terms of cascading ... Does this make sense?

Greg

--

Matt Callahan

unread,
May 19, 2011, 7:50:24 PM5/19/11
to ddd...@googlegroups.com
Yes, Greg, it does make sense. This is something I have been moving towards recently. Originally all these calculations were done as an end-of-day batch process, but I was never really comfortable with that approach. I like the cascading events style.

Question: can the event handlers & command handlers interact with BOTH the Query Model and the Write Model? E.g. when an event for 'PriceChanged' is handled by one of the technical indicator handlers, can this handler go to the Query model and say "give me all prices for this stock for the past 1000 days", and then pass those prices into the 'write side' (via a command?) to perform the calculation? If not, where would the prices come from, if not from the query model?

Greg Young

unread,
May 19, 2011, 7:51:51 PM5/19/11
to ddd...@googlegroups.com
There is no "query side" and "write side"... They are all just cascading events.

Matt Callahan

unread,
May 19, 2011, 7:52:46 PM5/19/11
to ddd...@googlegroups.com
So where is the data stored? Is it just one underlying data source, and do we pretty much throw the CQRS idea out the window in this scenario?

Greg Young

unread,
May 19, 2011, 7:59:12 PM5/19/11
to ddd...@googlegroups.com
For most technical indicators you don't need the previous data, just a
current state on the event stream. Udi gave a good example of this
with averages ... This is more of an event-centric system. You may
have a read model that consumes and saves the
"TechnicalIndicatorUpdatedEvent" and, say, stores it for querying, but
the thing generating it generally isn't coming off a stored model.

Matt Callahan

unread,
May 19, 2011, 8:03:53 PM5/19/11
to ddd...@googlegroups.com
Hmmm you are getting me thinking (a good thing).  I will have to ponder this for a bit.

One thing that doesn't sit quite right is that you say "for MOST technical indicators you don't need previous data".  MOST, not all.  How do you handle the ones that do need previous data?  What happens if that previous data is thousands of prices?  Are all these prices "stored" with each FSM?

Greg Young

unread,
May 19, 2011, 8:05:25 PM5/19/11
to ddd...@googlegroups.com
We were able to make incremental everything we had for technical
indicators. Some are hard.

Matt Callahan

unread,
May 19, 2011, 8:07:10 PM5/19/11
to ddd...@googlegroups.com
Greg, thank you for your insights.  I appreciate that you can't reveal too much in this domain.

I will go away and try some things out and see what I come up with.

Cheers,
Matt.

Greg Young

unread,
May 19, 2011, 8:08:51 PM5/19/11
to ddd...@googlegroups.com
The thing is, if you do it in the cascading style, things become very
performant (think a 2 min average at 1000s of events per second).

Aleš Vojáček

unread,
May 19, 2011, 8:10:03 PM5/19/11
to ddd...@googlegroups.com
One thing is hard for me to imagine.
I think that events may not be time-ordered, so for some
periods it can be hard to use events to compute time-based data.
Or am I out of context?
A.

Aleš Vojáček

unread,
May 19, 2011, 8:11:49 PM5/19/11
to ddd...@googlegroups.com
oh sry for bad spelling, tired and my english is not good.

Aleš Vojáček

unread,
May 19, 2011, 8:16:39 PM5/19/11
to ddd...@googlegroups.com
But if I'm computing, say, a 30 minute average, some events can come in
after that window has closed. I know that it is almost impossible, but
in my experience these types of bugs are the worst. I mean, it will
happen only on the 13th of every odd month at 13:14 :-)
A.

GonZa

unread,
May 19, 2011, 10:08:38 PM5/19/11
to ddd...@googlegroups.com
Greg,

    Where can I follow the offline stuff? I'm interested in that specific domain.

Nuno Lopes

unread,
May 20, 2011, 5:06:13 AM5/20/11
to ddd...@googlegroups.com
Hi Matt

In light of Udi and Greg's recent insights, maybe my post turns out to be more useful than naive: http://groups.google.com/group/dddcqrs/msg/4768e4eda12f5dac

I would pay attention to Greg's comments, since he has a strong background in that domain, which can give you a better mindset to handle this.

Having said this, a few general comments of mine that others might disagree with.

1) Aggregates can handle Events. CQRS is not prescriptive in that regard either. It just tells you to separate Commands from Queries, and you should design towards a 1-1 relationship between a Command and an Aggregate method. Yet it is silent about Events. Some people "convert" Events to Commands; they find it more natural in their domain.

2) CQRS should be used when you need multiple structural representations of a set of facts in the Domain, for several reasons (there is a thread where this is explored). This may or may not be your case, but it does not seem to be the case at hand.

3) Sagas are Aggregates.

4) An FSM can be encapsulated in an Aggregate.

If you are doing computations based on cascading Events (to use this terminology), here are a few hurdles you might face.

The case of calculating a moving average, or an average in general, is simple to solve without going through millions of events, as several people said.

Eventually you will probably face this scenario:

IndicatorTFunction(Indicator1, Indicator2) ...

When using cascading events, Indicator1 and Indicator2 will arrive at the function in the form of Events. This means they may or may not arrive out of order, and not everything will arrive at the same time.

The easiest thing to do is to avoid it by computing Indicator1 and Indicator2 in the same Aggregate that computes IndicatorTFunction. You will avoid the ordering problem.

If that is not the best approach and each Indicator is computed in a separate Aggregate, you will need a way to correlate these values in order to compute the function.

Since the basic fact that fires this computation is PriceChanged, just declare it along with the indicator.

Example

StockPriceChange(changedStockId, newStockPrice) comes into the machine.

Indicator1.Handle(aPriceChange) -> Indicator1Changed(newValue, changedStockId, newStockPrice);
Indicator2.Handle(aPriceChange) -> Indicator2Changed(newValue, changedStockId, newStockPrice);

IndicatorT.Handle(aIndicator1Changed) -> (nothing yet; waiting for the correlated value).
IndicatorT.Handle(aIndicator2Changed) -> IndicatorTChanged(newValue, changedStockId, newStockPrice);

This is similar to a long running transaction, but it's a long running computation :)
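Nuno's correlation step could be sketched like this. The `IndicatorT` class, its method names, and the combining function (here just a sum) are my own illustration; following his events, correlation is keyed on (changedStockId, newStockPrice), though a real system would more likely use a sequence number or event id, since the same price can recur:

```python
class IndicatorT:
    """Waits until both indicator events fired by the same PriceChanged
    have arrived, then computes and publishes the combined value."""

    def __init__(self, publish):
        self.publish = publish
        self.pending = {}   # correlation key -> partially filled pair

    def _try_complete(self, key):
        pair = self.pending.get(key)
        if pair is not None and "i1" in pair and "i2" in pair:
            stock_id, price = key
            # hypothetical combining function: here simply the sum
            self.publish(("IndicatorTChanged",
                          pair["i1"] + pair["i2"], stock_id, price))
            del self.pending[key]    # long running computation finished

    def handle_indicator1_changed(self, value, stock_id, price):
        self.pending.setdefault((stock_id, price), {})["i1"] = value
        self._try_complete((stock_id, price))

    def handle_indicator2_changed(self, value, stock_id, price):
        self.pending.setdefault((stock_id, price), {})["i2"] = value
        self._try_complete((stock_id, price))

out = []
t = IndicatorT(out.append)
t.handle_indicator2_changed(2.0, "AAPL", 150.0)  # arrives first, out of order
t.handle_indicator1_changed(1.0, "AAPL", 150.0)
# out == [("IndicatorTChanged", 3.0, "AAPL", 150.0)]
```

Note the first event produces nothing; only the arrival of its correlated partner completes the computation, which is exactly the long-running-computation shape described above.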

Your domain model is your computational model for indicators.

Now, you have told us that you use a third party module to calculate indicators. If that is the case, to achieve your goal there should be a way to feed it with a stream of price changes. If that is not possible, you may accumulate a time series of price changes and feed it in batches as per requirements.

Hope it helps,

Nuno
PS:  "Do not try to bend the spoon, that's impossible. Instead only try to realize the truth... there is no spoon."

Matt

unread,
May 22, 2011, 7:57:40 PM5/22/11
to DDD/CQRS
Hi Nuno,

Thanks for this summary. I would be interested to hear whether others
agree or disagree with your comments. I am currently trying some of
these ideas out in code. In a few weeks' time hopefully it will all
make sense, or I will have some more questions. (Probably a bit of
both :)

cheers,
Matt.
> > >> > On Fri, May 20, 2011 at 9:51 AM, Greg Young <gregoryyou...@gmail.com>
> > >> > wrote:
>
> > >> >> There is no "query side" and "write side"... They are all just
> > >> >> cascading
> > >> >> events.
>
> > >> >> On Thu, May 19, 2011 at 7:50 PM, Matt Callahan <mattca...@gmail.com>
> > >> >> > On Fri, May 20, 2011 at 9:02 AM, Greg Young <gregoryyou...@gmail.com>
> > >> >> > wrote:
>
> > >> >> >> Pretty much all of the technical indicators can be done in this way
> > >> >> >> (we did them).
>
> > >> >> >> As I mentioned previously this sounds like a pure eventing system.
> > >> >> >> The
> > >> >> >> best way to think about it is in terms of cascading. Events come
> > >> >> >> into
> > >> >> >> your system (PriceChanged) you have a series of FSM (Finite State
> > >> >> >> Machines) that listen to these events and on their own produce a
> > >> >> >> series of new events (let's say 30 minute moving average changed).
> > >> >> >> Then these events might go to other handlers who in turn produce
> > >> >> >> their
> > >> >> >> own events (BuySignalFound). Etc etc etc. When we model the system
> > >> >> >> we
> > >> >> >> look at it in terms of cascading ... Does this make sense?
>
> > >> >> >> Greg
>
> > >> >> >> On Thu, May 19, 2011 at 6:45 PM, Matt Callahan <mattca...@gmail.com>
> > >> >> >> wrote:
> > >> >> >> > James, thanks for this idea.  I will have to look more into
> > >> >> >> > Saga's.
> > >> >> >> >  I
> > >> >> >> > don't
> > >> >> >> > quite understand them properly yet.
>
> > >> >> >> > On Fri, May 20, 2011 at 5:53 AM, James <jhicks0...@gmail.com>
> > >> >> >> > wrote:
>
> > >> >> >> >> Assuming you want a 30 day moving average for the previous 30
> > >> >> >> >> days,
> > >> >> >> >> every day.  Set up a saga that is created at midnight each day
> > >> >> >> >> and
> > >> >> >> >> start consuming messages that are relevant to calculate the
> > >> >> >> >> technical
> > >> >> >> >> indicator.  At the end of the 30 days (saga timeout), saga sends
> > >> >> >> >> the
> > >> >> >> >> TechnicalIndicatorCalculated command to the aggregate or it can
> > >> >> >> >> send
> > >> >> >> >> a
> > >> >> >> >> CalculateTechnicalIndicator command and pass in the data the saga
> > >> >> >> >> has
> > >> >> >> >> been capturing for the previous 30 days.
>
> > >> >> >> >> Again, no experience in this industry.
>
> > >> >> >> >> James
>
> > >> >> >> >> On Thu, May 19, 2011 at 12:08 PM, Richard Dingwall
> > >> >> >> >> <rdingw...@gmail.com>
> > >> >> >> >> wrote:
> > >> >> >> >> > On 19 May 2011 12:12, Matt Callahan <mattca...@gmail.com>