Yesterday, there was a question about spinning up new Agents on demand. We want to go a little further. We need a system that would do the following for each pipeline run:
* create new nodes (VMs with specific amounts of RAM, disk etc.) from a resource pool (e.g. OpenStack or AWS).
* configure the nodes into a small cluster.
* deploy the new software under test across the cluster (e.g. two DB nodes, a few web nodes, a load balancer, ID management etc.).
* run tests (perhaps via yet more temporary nodes).
* save the results.
* tear down all the nodes and release their resources back into the pool.
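The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `InMemoryPool`, `Node`, `temporary_cluster` and `run_job` are hypothetical names, and the pool is an in-memory stand-in rather than a real OpenStack or AWS client. The configure, deploy and save steps are folded into the `test` callable. The part that matters is the shape: teardown sits in a `finally` block, so resources return to the pool even if provisioning or the tests fail.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Node:
    name: str
    role: str
    ram_gb: int


@dataclass
class InMemoryPool:
    """Hypothetical stand-in for a cloud API; tracks RAM only."""
    free_ram_gb: int
    active: dict = field(default_factory=dict)
    _ids: count = field(default_factory=count)

    def create(self, role, ram_gb):
        if ram_gb > self.free_ram_gb:
            raise RuntimeError("pool exhausted")
        self.free_ram_gb -= ram_gb
        node = Node(f"{role}-{next(self._ids)}", role, ram_gb)
        self.active[node.name] = node
        return node

    def release(self, node):
        # Only give RAM back for nodes the pool still knows about.
        if self.active.pop(node.name, None) is not None:
            self.free_ram_gb += node.ram_gb


@contextmanager
def temporary_cluster(pool, spec):
    """Create every node in spec; always release them on exit."""
    nodes = []
    try:
        for role, ram_gb in spec:
            nodes.append(pool.create(role, ram_gb))
        yield nodes
    finally:
        # Runs on success, on test failure, and on partial provisioning.
        for node in nodes:
            pool.release(node)


def run_job(pool, spec, test):
    """Provision the cluster, run the test callable, tear everything down."""
    with temporary_cluster(pool, spec) as nodes:
        return test(nodes)


if __name__ == "__main__":
    pool = InMemoryPool(free_ram_gb=64)
    spec = [("db", 16), ("db", 16), ("web", 4), ("lb", 2)]
    print(run_job(pool, spec, lambda nodes: [n.name for n in nodes]))
    print(pool.free_ram_gb)  # all RAM is back in the pool after the run
```

With a real provider, `create` and `release` would wrap the provider's own instance calls; nothing here depends on which one is used.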
We want to release Agent resources completely between runs to maximize flexibility and minimize cost. Jobs may vary wildly in the resources they need, and systems like AWS can cost a lot less if we release the Agent nodes as soon as the job is done.
One way I was thinking this might work (maybe not very scalable?) would be to have a small number of dedicated Agents watching the job queue. When a new job appears, one of them grabs it and sets up the environment (new instances/Agents) for it. If I could force the new Agents to accept only that one job, then I think I could do this.
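To make the "map new Agents to the one job" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `Job`, `EphemeralAgent`, `dispatch_once` and the `bound_job_id` field are hypothetical names, not part of any existing CI product. A dedicated dispatcher takes one job off the queue and provisions Agents stamped with that job's ID, and each Agent refuses any job with a different ID.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: str
    agents_needed: int


@dataclass
class EphemeralAgent:
    name: str
    bound_job_id: str  # the only job this Agent will ever accept

    def accepts(self, job):
        return job.job_id == self.bound_job_id


def dispatch_once(jobs, provision):
    """Take one job off the queue and provision Agents bound only to it.

    provision is any callable (name, job_id) -> agent, for example a
    wrapper around a cloud API. Returns (None, []) if the queue is empty.
    """
    try:
        job = jobs.get_nowait()
    except queue.Empty:
        return None, []
    agents = [
        provision(f"{job.job_id}-agent-{i}", job.job_id)
        for i in range(job.agents_needed)
    ]
    return job, agents


if __name__ == "__main__":
    jobs = queue.Queue()
    jobs.put(Job("build-1", 2))
    job, agents = dispatch_once(jobs, EphemeralAgent)
    print(job, [a.name for a in agents])
```

The scalability worry would then mostly come down to how many dispatchers poll the queue, since the ephemeral Agents themselves never compete for other jobs.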
I'm searching for ideas here. Am I approaching this the wrong way?
Best,
Kyle