Hi everyone,
We've been growing unhappy with how we spin up our jobs and are considering making some changes. I'm curious what everyone else does. The job process is the only single point of failure in our pipeline, and we're looking for more robustness as well as better decoupling and dynamic deployment capabilities.
Our app exposes a web service where users can submit requests for processing with various options. We take those incoming requests and spawn new processes to execute the cascading workflow on the same app server that handles the request. If there is any hiccup with the job process, the entire workflow fails. And with some jobs running for multiple hours, restarting, even with checkpoints, is expensive and time-consuming.
We are considering moving the main process that starts our cascading workflow off of the app servers to somewhere else. Where do you run yours?
John Lavoie