Cheers,
-- Udi Dahan
http://groups.google.com/group/dddcqrs/browse_thread/thread/69496d381d9afb76/f55eb0c2723cc17d
Greg comes to the rescue somewhere in the middle, asking the question:
"what are the business implications of a duplicate? What are the
chances of it happening?"
If neither of them is too big, simply let it go through and handle
it asynchronously with some sort of compensating command or whatever.
He's stating that consistency is overrated.
In the thread Adam also gives a solution for the very special occasion
where consistency IS indeed needed, by introducing a SHA1 lookup of
email addresses from the command handler / domain; I can't remember the
details.
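
Roughly what that SHA1 lookup might look like - a minimal sketch only, in Python with SQLite for brevity (the table and function names are made up, not Adam's actual code). The command handler tries to claim a hash of the email up front and lets the database's unique constraint do the real work:

import hashlib
import sqlite3

conn = sqlite3.connect("uniqueness.db")
# The PRIMARY KEY on the hash is the unique constraint doing the work.
conn.execute(
    "CREATE TABLE IF NOT EXISTS email_hashes ("
    "sha1 TEXT PRIMARY KEY, aggregate_id TEXT NOT NULL)"
)

def reserve_email(email: str, aggregate_id: str) -> bool:
    """Try to claim an email; False means someone already has it."""
    digest = hashlib.sha1(email.strip().lower().encode("utf-8")).hexdigest()
    try:
        with conn:  # one transaction around the insert
            conn.execute(
                "INSERT INTO email_hashes (sha1, aggregate_id) VALUES (?, ?)",
                (digest, aggregate_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate: reject the command before touching the domain

print(reserve_email("bob@example.com", "user-1"))  # True
print(reserve_email("BOB@example.com", "user-2"))  # False, already claimed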
--
Stefan Holmberg
Systementor AB
Blåklockans väg 8
132 45 Saltsjö-Boo
Sweden
Cellphone : +46 709 221 694
Web: http://www.systementor.se
I'd recommend not reinventing the unique-constraint wheel.
Kind regards,
-- Udi Dahan
-----Original Message-----
From: ddd...@googlegroups.com [mailto:ddd...@googlegroups.com] On Behalf
Of Jeff Doolittle
Sent: Thursday, September 02, 2010 9:02 AM
To: DDD/CQRS
As for no 2, it won't do you much good. It's another hint (like no 1), but
since the read model is (preferably) asynchronously updated it might not
be correct yet.
So some sort of #4 is always needed - or introduce a SHA1 table on
email like Adam suggests in the long thread I pointed to.
No 3
1) check the read model in the client before submitting the command
(seems like a no-brainer, why wouldn't you at least do this?)
2) check the read model in the command handler and throw if it already
exists (questionable - should I really be checking the read model from
a command handler? ... keep in mind Event Sourcing: I can't query the
domain objects by specific properties)
3) 1 & 2
4) catch the duplicate in the CreateNewUserEventHandler and then
perform a compensating command to roll back the duplicate, then send
some sort of notification regarding the issue (see the sketch after
this list)
5) 1 & 4
6) 1 & 2 & 4
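
A minimal sketch of option 4 (Python again; the command names and the send_command dispatcher are made up, stand-ins for whatever bus the system actually uses). The event handler that maintains the email table catches the constraint violation and compensates instead of blocking the write side:

import sqlite3
from dataclasses import dataclass

conn = sqlite3.connect("readmodel.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS users ("
    "email TEXT PRIMARY KEY, aggregate_id TEXT NOT NULL)"
)

@dataclass
class UserCreated:
    aggregate_id: str
    email: str

def send_command(name: str, **payload) -> None:
    # Stand-in for the real command bus.
    print(f"dispatching {name}: {payload}")

def handle_user_created(event: UserCreated) -> None:
    try:
        with conn:
            conn.execute(
                "INSERT INTO users (email, aggregate_id) VALUES (?, ?)",
                (event.email, event.aggregate_id),
            )
    except sqlite3.IntegrityError:
        # A duplicate slipped past the UI/handler checks: compensate.
        send_command("DeactivateUser", aggregate_id=event.aggregate_id,
                     reason="duplicate email")
        send_command("NotifyAdmin", email=event.email)

handle_user_created(UserCreated("user-1", "bob@example.com"))
handle_user_created(UserCreated("user-2", "bob@example.com"))  # compensates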
--
Nuno
Sent from my iPhone
As for introducing the extra table: if you worry about that, the whole
"solution" is the fact that the constraint check is outside of the
domain. The domain instead handles the duplicate cases.
My event handlers are separate processes chasing the event log based on
SLA, and so far I have had no reason to make them multithreaded, if
that's what you mean by "what if another event looking the same pops
in" - so I am simply not sure about those types of scenarios.
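
For illustration, such a chaser can be as simple as this single-threaded loop (a sketch only; the events table schema is assumed, and a real chaser would persist its checkpoint per handler rather than hold it in memory):

import json
import sqlite3
import time

store = sqlite3.connect("eventstore.db")
store.execute(
    "CREATE TABLE IF NOT EXISTS events ("
    "sequence INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
)

def chase_event_log(handler, poll_interval: float = 1.0) -> None:
    checkpoint = 0  # a real chaser would persist this per handler/SLA
    while True:
        rows = store.execute(
            "SELECT sequence, payload FROM events WHERE sequence > ? "
            "ORDER BY sequence",
            (checkpoint,),
        ).fetchall()
        for sequence, payload in rows:
            handler(json.loads(payload))  # one event at a time, no threads
            checkpoint = sequence
        time.sleep(poll_interval)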
--
> Storing, say, aggregateid along with email. So the duplicate-check
> handler does not parse/read the event store but simply queries its very
> own table.
Where is that table described, using pure Event Stores as defined by Greg? How consistent are the tables you advised in relation to the Event Store?
Are we implementing it ourselves or relying on some third-party product's data facilities?
If we are implementing it ourselves, how smart is our implementation?
Are we doing a full table scan and locking the entire table, or are we being smart about it? If so, we are serializing every single write to the Event Store buckets (key, value).
Or are we organizing indexes in a B-tree and locking and freeing the necessary subtrees as we scan?
What about inserts into those indexes? Are we locking the whole thing again, thus serializing writes in our event store too?
Or are we being smart again and locking only what is needed when we rebalance the B-tree?
Then we have availability. What happens if the system containing the table goes down? Do all systems go down? Do we use a secondary system for backup? Are we distributing our table and using some form of quorum technique?
Listen to what Udi said:
"I'd recommend not reinventing the unique-constraint wheel."
I think what Udi is advising is that we reuse the facilities provided by current products in the market for that. There is a lot of knowledge in them. Databases providing these facilities are not BullshitDatabase systems, unless you know very little about them.
Basically, look for a real database providing "unique constraints", unless you are planning to build a product out of your solution.
They don't necessarily need to be commercial. For instance, you can reuse MySQL's unique-constraint facilities in conjunction with Cassandra, but then you need 2PC across both systems as far as I understand, which can bring a penalty. You can shard your index tables, etc.
Or you can just use a commercial RDBMS (MSSQL, Oracle, etc.), storing the AR in a "blob" and putting the fields requiring unique constraints outside, with an index defined over them.
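
A minimal sketch of that layout (SQLite standing in for the RDBMS; the schema is made up). The serialized aggregate goes in a blob, the email lives in its own uniquely indexed column, and the constraint fires inside the same write that persists the aggregate:

import json
import sqlite3

db = sqlite3.connect("aggregates.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS aggregates ("
    "aggregate_id TEXT PRIMARY KEY, "
    "email TEXT NOT NULL UNIQUE, "  # the constraint lives next to the blob
    "body BLOB NOT NULL)"
)

def save_user(aggregate_id: str, email: str, state: dict) -> None:
    try:
        with db:  # one transaction: blob write and constraint check together
            db.execute(
                "INSERT INTO aggregates (aggregate_id, email, body) "
                "VALUES (?, ?, ?)",
                (aggregate_id, email, json.dumps(state).encode("utf-8")),
            )
    except sqlite3.IntegrityError as e:
        # Surfaces right in the command handler; no compensation needed.
        raise ValueError(f"email already taken: {email}") from e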
If you have a FREE product that does all this in a fully distributed system I would for sure like to know.
This is pretty much how I understand Udi's observation.
Finally, implementing unique constraints has nothing to do with consistency but with keeping your system sound (http://en.wikipedia.org/wiki/Argument#Soundness). Consistency is another beast (http://en.wikipedia.org/wiki/Consistency).
Consistency may be overrated but soundness is an imperative. Lack of soundness basically means that your assets are flowing through tiny, tiny holes in your wallet and no one knows why, or worse, you don't even notice that it is happening.
I'm assuming that we are discussing very large datasets, otherwise why bother. On another note, unique constraints can be used not only to establish uniqueness over all Aggregates of a kind (say the email of a Customer) but also within an Aggregate (say that the same product cannot be inserted twice into an order - not a very good example, but it illustrates the usage).
Hope it helps.
Nuno
Of course my "special table" is in SQL Server or whatever, a real
database. I apologize if I somehow gave you the impression I have
implemented my own database???
"Using pure Event Stores"? It handles events by reading from the event
store and inserts into the special table, which has constraints. So I am
using the database for constraint handling.
as for "Or you can use just use a commercial RDBMS (MSSQL, Oracle etc
etc) storing the AR in a "Blob" and put fields requiring unique
constraints outside and define an index over them." you do have a
valid point, as it would break case the break already in
commandhandler.
Cause what I mean by saying I am indeed reinventing the
unique-contraint wheel, its exactly what I do right now with handling
it later and "rollback" with a compensating command. But basically its
good enough for me atm.
--
I thought you were implementing your own unique-constraint mechanism. Table can mean many things :)
> as for "Or you can use just use a commercial RDBMS (MSSQL, Oracle etc
> etc) storing the AR in a "Blob" and put fields requiring unique
> constraints outside and define an index over them." you do have a
> valid point, as it would break case the break already in
> commandhandler.
Yes. Much simpler.
> But basically it's
> good enough for me atm.
Ok. In cases where you don't have high concurrency requirements on writes over the same Aggregate it might just work.
In the scope of persistence, my impression is that one solution is creating a problem whose solution requires the first solution, so one feeds the other. The event store requires an external index table to maintain unique constraints; to avoid 2PC between both we sometimes need to compensate, which in turn leans on having an Event Store in the first place. To me it is silly.
But if both the Event Store and the index table are modeled in the same datastore supporting transactions, you don't need to compensate at all.
All in all, if we decided on the event store for some reason other than persistence and compensation, and an Aggregate instance does not change often, I see its benefits, as you may not need to roll back that much.
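
A small sketch of that single-transaction variant (same made-up SQLite schema style as above): appending the event and claiming the unique value happen atomically, so a violation simply rolls the whole write back - no 2PC, no compensating command:

import json
import sqlite3

db = sqlite3.connect("store.db")
db.executescript(
    "CREATE TABLE IF NOT EXISTS events ("
    "sequence INTEGER PRIMARY KEY AUTOINCREMENT, "
    "aggregate_id TEXT NOT NULL, payload TEXT NOT NULL);"
    "CREATE TABLE IF NOT EXISTS unique_emails ("
    "email TEXT PRIMARY KEY, aggregate_id TEXT NOT NULL);"
)

def append_user_created(aggregate_id: str, email: str) -> None:
    try:
        with db:  # both inserts commit or roll back together
            db.execute(
                "INSERT INTO unique_emails (email, aggregate_id) "
                "VALUES (?, ?)",
                (email, aggregate_id),
            )
            db.execute(
                "INSERT INTO events (aggregate_id, payload) VALUES (?, ?)",
                (aggregate_id,
                 json.dumps({"type": "UserCreated", "email": email})),
            )
    except sqlite3.IntegrityError as e:
        raise ValueError(f"email already taken: {email}") from e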
Cheers,
Nuno
Ok, we're understanding each other :) Actually I did code some low-level
database storage, along with a simple SQL parser, back in 1999 or so,
but it was not that fun so I stay away from that nowadays :)
As I said, I fully understand your points, and the simplicity of it
should make it the obvious solution.
So why am I not doing it your way? In practice I do use a
full-blown RDBMS (SQL Server) for my event storage, but my goal is to get
rid of it. I therefore try to avoid introducing transactions/UoW into
my command handlers. When I notice I need a UoW I know I probably
haven't modelled my AR right, or it is a special case like the one we're
talking about. And I work my way around it if possible.
More, for maximum throughput (while it might not really be needed in
my modest systems yet, but still): is checking for duplicates
THAT important for the business? Will duplicates happen often?
Because we will get better throughput by not doing the checking from the
command handler and just allowing it and going on with the next one.
Who's right and who's wrong?? It's a personal opinion I guess, and what
I think is the correct way now might change later on when I get
more experience.
--
> Ok, we're understanding each other :)
Good. I guess the difference between one and the other is:
1) We assume that a unique constraint will be violated, so we first check for it before writing (in the same transaction: one transaction).
2) We assume that a unique constraint will not be violated, so we let the write happen and then check and roll back if necessary (3 transactions).
Since we have made checks in the UI (leaking business rules) and in the handler (leaking business rules), the probability of it happening in the Repository is greatly reduced.
The second increases performance but makes the contract more complex; it is only simplified if both clients and server work together.
My fear (probably emotional) is that if we "blindly" build solutions based on these premises, I leave my system open to very, very nasty attacks: an attack leading the system to perform a huge amount of compensating actions, putting it into a crawl. Not to mention filling the pipes with lots of messages.
Either solution has little to do with UoW. UoW is relevant when we execute transactions across multiple entities, or in DDD terms, ARs.
Cheers,
Nuno
PS: It is not about being right or wrong, but about looking at several options and ending up with the best option for the job.
Do what this guy says.... hi/lo.
I use this to generate ids for the domain objects (when the business object requires a friendly id). I do it this way because I have fewer db hits to generate a sequential number. It's not ideal hitting the database every time, and the previous code will fail: two reads of the same table at the same time will produce duplicates. Hi/lo is a good trade-off for performance.
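
The hi/lo idea in a nutshell (a sketch with a made-up table, single-process only): one database round trip reserves a whole block of BLOCK_SIZE ids, and everything after that comes from memory until the block runs out:

import sqlite3

BLOCK_SIZE = 100

db = sqlite3.connect("hilo.db")
db.execute("CREATE TABLE IF NOT EXISTS hilo (next_hi INTEGER NOT NULL)")
if db.execute("SELECT COUNT(*) FROM hilo").fetchone()[0] == 0:
    db.execute("INSERT INTO hilo (next_hi) VALUES (0)")
    db.commit()

class HiLoGenerator:
    def __init__(self) -> None:
        self._hi = 0
        self._lo = BLOCK_SIZE  # force a block fetch on first use

    def _fetch_block(self) -> None:
        with db:  # claim the next block in one round trip
            self._hi = db.execute("SELECT next_hi FROM hilo").fetchone()[0]
            db.execute("UPDATE hilo SET next_hi = next_hi + 1")
        self._lo = 0

    def next_id(self) -> int:
        if self._lo >= BLOCK_SIZE:
            self._fetch_block()  # the only db hit per BLOCK_SIZE ids
        value = self._hi * BLOCK_SIZE + self._lo
        self._lo += 1
        return value

gen = HiLoGenerator()
print([gen.next_id() for _ in range(3)])  # e.g. [0, 1, 2]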
Personally I use guid ids and wouldn't use friendly ids unless I had to.... which is almost never.
Actually guids are still a good choice and should still be used for your id. The long value is just another field.