In the Erlang world, when people ask this question, we often say: "identify the truly concurrent activities in the system; make those into processes". A common mistake is to add concurrency to a system that does not need any. If you start a new goroutine, there has to be a concurrent benefit to doing so: in particular, the current goroutine should be able to do something else in the meantime.
Another common mistake is to build a pipeline of goroutines and channels where you could simply spawn a new goroutine for each incoming request. The pipeline risks reinventing the Go scheduler on top of the Go scheduler, which is usually a losing proposition. Furthermore, a per-request goroutine can track its own state, whereas a pipeline stage has to "impersonate" whatever data it is currently processing. Pipelines also tend to need extra goroutines to handle things on the side, and each node in the pipeline ends up being a hand-coded Node.js-style event loop.
You can also try to think about the system as a set of communicating agents. Imagine a group of human beings and how they would communicate and delegate: sometimes it is more efficient to do a piece of work yourself than to ask someone else to do it. Sometimes it is the other way around. Concurrent programming isn't much different, though there is one big difference: in a concurrent system, you often have shared knowledge among all the agents. In a (truly) distributed system, you have a correspondence to epistemic logic: different agents know different parts of the state and will have to ask for data.