Hi Allard
Sadly it's a little bit complicated. Actually I'm not running MySQL: one of the requirements our DevOps team gave us was to avoid relational databases... and also to avoid MongoDB :-D
So, the Event Store is a custom implementation (written by me) based on Cassandra. I implemented the duplicate-key check on the event entry table using the Cassandra mechanism called "Lightweight Transactions" (INSERT INTO ... IF NOT EXISTS), which lets me check whether the insertion of the event was actually performed or not.
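For illustration, here's a minimal in-memory sketch of the IF NOT EXISTS semantics I rely on (the real check is a CQL lightweight transaction against Cassandra; the class, method, and key format here are made-up names, not the actual implementation):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrates the semantics of Cassandra's
//   INSERT INTO event_entry (...) VALUES (...) IF NOT EXISTS
// with an in-memory map: the insert is applied only if no row with
// the same (aggregateId, sequenceNumber) key already exists.
public class EventEntryStore {
    private final ConcurrentMap<String, String> entries = new ConcurrentHashMap<>();

    /** Returns true if the event was stored, false on a duplicate key
     *  (Cassandra reports the same outcome via the [applied] column). */
    public boolean appendEvent(String aggregateId, long sequenceNumber, String payload) {
        String key = aggregateId + "@" + sequenceNumber;
        return entries.putIfAbsent(key, payload) == null;
    }
}
```

A second append with the same aggregate id and sequence number returns false, which is exactly the signal I use to detect the concurrency conflict.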
The main drawback is that I cannot execute the statement in a batch, because in the same method I also update another table that exposes the events with the global index as a clustering key (so that the gap-based event tracker works correctly), and Cassandra only allows conditional statements in a batch when every statement targets the same partition.
Anyway, let's go back to the issue.
My service will actually run with more than one instance, but for our first load test we configured it with a single instance while keeping the production configuration, with the Distributed Command Bus active (call it a mistake). That explains why we ran the test with the distributed configuration active on a single instance :-D
Anyway, suppose we're in a production environment with more than one instance of the service (distributed mode: on). If I understand correctly, the Command Gateway, through the ConsistentHash, always routes all the commands for a given aggregate to the same node. Under high load, I expect I could still hit the duplicate-key error I described above. Using an AtomicLong to handle the increment of the sequence number is the first solution that came to my mind to address the problem.
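The AtomicLong idea is roughly this (a sketch under my assumptions, not real code; the class and method names are invented): since the consistent hash sends every command for an aggregate to the same node, that node can hand out sequence numbers locally instead of discovering conflicts only through the duplicate-key check in the event store:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: one AtomicLong per aggregate on the node that owns it,
// so concurrent commands for the same aggregate each receive a
// distinct, monotonically increasing sequence number.
public class SequenceNumberSource {
    private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Returns the next sequence number for the aggregate, starting at 0. */
    public long next(String aggregateId) {
        return counters
                .computeIfAbsent(aggregateId, id -> new AtomicLong(-1))
                .incrementAndGet();
    }
}
```

Of course this only works because of the routing guarantee: if two nodes ever handled the same aggregate (e.g. during a topology change), the counters would diverge, so the IF NOT EXISTS check would still be needed as a safety net.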
I think that we're going to perform a load test with a real distributed configuration (more than one instance) very soon. As soon as I have some results, I will share them with you.
Cheers
Franco