Hi there,
I am taking my first steps in the world of distributed computing and I am creating a small in-house ASP.NET 5.0 web application for scheduling scientific simulations in particle physics and producing plots and reports from the simulation results. The system consists of three components:
1. A simulation service doing all the configuration, scheduling, orchestration and data persistence. It also provides the API for the user-facing front-end SPA.
2. Worker nodes that poll the simulation service for simulation batch jobs, then run said simulations and send the results back.
3. A plotting service that turns the simulation results into colourful images.
Since this is just a small internal system, I want to keep the complexity low. Hence, I would like to use DDD and CQRS for clarity, but on a small beginner's scale.
In particular, for the simulation service, I was thinking of using the same database for both the read side and the write side, where the read model mainly consists of database views and a thin DTO access layer. Also, on the write side I would not use event sourcing, but rely on a more traditional CRUD approach with tables. I would also like to avoid a third-party message bus / message broker, again for simplicity. All components would talk to each other via direct, but often asynchronous, HTTP requests.
Now my question is:
What is the easiest and best way to implement some sort of persistent state machine / process manager / saga in my simulation service without using any kind of complex third-party message broker or infrastructure and without having to adopt event sourcing?
In particular, how would I solve the following questions with standard ASP.NET 5.0 tools and without big third-party frameworks:
1. As far as I understood the concept, process managers transform events into commands using internal state. How would I do that without event sourcing? Are these "process manager events" conceptually different from the "event sourcing events"?
2. Should I persist these "process manager events", which would end up being some sort of event sourcing again? Or can I just use a `SimulationProcess` and a `PlottingProcess` database-table for the book-keeping of the overall process state?
3. How can the process manager decide on its state transitions when I don't use a message broker? Since the process manager only calls command handlers, which have return type void, I suppose I have to inject some sort of callback function into the command handler that informs the process manager of the domain logic outcome, right?
4. How do I implement these process managers? How do they correspond to `System.Threading.Tasks.Task` instances? Do I have one Task per process manager instance, i.e. one Task per individual aggregate root ID?
5. I guess I would then have to hold a list of all those process manager instances in memory. Is that even possible when the database grows and I have thousands of records in my database?
6. How does the "process manager loop" work? Who is scheduling/triggering the process manager cycles? Sometimes there can be both external triggers, such as a `StartPlotting` command sent by a user, and internal triggers, such as timer events or file system polling events. Should I use ASP.NET `IHostedService` instances for this spinning loop?
7. How can a process manager schedule commands that should be executed in the future? Something like "Execute `CheckBatchJobTimeout` command in 10 minutes"?
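To make questions 2 and 7 more concrete, the table-based bookkeeping could look roughly like the following sketch. All names other than `SimulationProcess` and `CheckBatchJobTimeout` are hypothetical, the states are made up for illustration, and a plain in-memory collection stands in for the database table. It is only meant to illustrate the idea of storing the current state plus a "due at" timestamp per process, not to be a working design.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical states of one simulation run (illustrative names only).
public enum SimulationState { Submitted, Running, Completed, TimedOut }

// One row of the imagined SimulationProcess table: current state plus an
// optional point in time at which a timeout check becomes due.
public sealed class SimulationProcess
{
    public Guid SimulationId { get; init; }
    public SimulationState State { get; set; } = SimulationState.Submitted;

    // When set, a poller should issue CheckBatchJobTimeout at or after this time.
    public DateTimeOffset? TimeoutCheckDueAt { get; set; }

    // A worker picked up the batch job: move to Running and arm the timeout.
    public void OnBatchJobStarted(DateTimeOffset now, TimeSpan timeout)
    {
        if (State != SimulationState.Submitted) return; // ignore duplicates
        State = SimulationState.Running;
        TimeoutCheckDueAt = now + timeout;
    }

    // The worker sent results back: finish and disarm the timeout.
    public void OnResultsReceived()
    {
        if (State != SimulationState.Running) return;
        State = SimulationState.Completed;
        TimeoutCheckDueAt = null;
    }

    // Reaction to CheckBatchJobTimeout: only acts if still running and due.
    public void OnTimeoutCheck(DateTimeOffset now)
    {
        if (State == SimulationState.Running && TimeoutCheckDueAt <= now)
        {
            State = SimulationState.TimedOut;
            TimeoutCheckDueAt = null;
        }
    }
}

public static class ProcessQueries
{
    // Stand-in for a query like "SELECT ... WHERE TimeoutCheckDueAt <= @now".
    public static IReadOnlyList<SimulationProcess> DueForTimeoutCheck(
        IEnumerable<SimulationProcess> table, DateTimeOffset now) =>
        table.Where(p => p.TimeoutCheckDueAt <= now).ToList();
}
```

In this picture nothing has to be held in memory between requests: a periodic poller would load the due rows, apply `OnTimeoutCheck`, and save them again. Whether that is a sound way to do it is exactly what I am unsure about.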
I am a bit lost here. I keep reading the recommendation to avoid event sourcing as long as possible for small systems, but I cannot find any implementation of persistent state machines / sagas / process managers in a more traditional CQ(R)S setting.
I would be extremely grateful for any code snippets, blog articles or other resources.
Greetings,
Joerg