Hello,
One problem I have not worked out yet is having a pipeline on one GoCD server (via an agent) call a pipeline on another GoCD server (via the API). The point would be to relieve the server single point of failure and the server bottleneck we are seeing at our site. As far as I can see, the server is not horizontally scalable, which I think is a big problem with the design. Some things (such as polling Git) can only happen on the server, and if the server locks up (ours does frequently), everything stops. Horizontally scaling the server might help. Calling one server via the API from another might “fake it”.
You could see if setting up a post-commit notification (https://api.gocd.org/current/#notify-materials) works for you. If it does, it usually reduces the load on your GoCD server by a lot, since much of that load tends to come from material polling.
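As a rough illustration of the post-commit notification idea, here is a minimal Python sketch that builds the request a Git post-commit hook could send to the server. It assumes the Git notify endpoint path and the versioned Accept header as I understand them from the API docs linked above; please check both against the docs for your GoCD version. The server URL, the repository URL and the function name are made up for the example, and authentication is left out.

```python
import json
import urllib.request


def build_git_notify_request(server_url, repository_url, api_version="v1"):
    """Build (but do not send) a request telling GoCD that a Git repo changed.

    Assumptions: the endpoint path and the Accept header version below should be
    verified against the API docs for your GoCD version. Authentication is omitted.
    """
    payload = json.dumps({"repository_url": repository_url}).encode("utf-8")
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/go/api/admin/materials/git/notify",
        data=payload,
        method="POST",
        headers={
            "Accept": f"application/vnd.go.cd.{api_version}+json",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values, for illustration only.
request = build_git_notify_request(
    "https://ci.example.com/", "https://git.example.com/team/project.git"
)
print(request.get_method(), request.full_url)
```

A hook script would pass the request to urllib.request.urlopen() after adding credentials. For this to reduce load, polling also has to be turned off for the material in the pipeline config, so that the server relies on the notification instead of checking the repository itself.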
You should also be using PostgreSQL as the database for any large-ish GoCD instance. It usually helps with performance and makes backing up the data easier.
Regards,
Aravind