Here's an entirely unscientific method to determine the overhead of profiling. The Go distribution contains a set of basic benchmarks, one of which is a loopback-based HTTP client/server benchmark. Running that benchmark with and without profiling gives a rough estimate of the overhead.
lucky(~/go/test/bench/go1) % go test -run=XXX -bench=HTTPClientServer
BenchmarkHTTPClientServer-4 20000 84296 ns/op
ok _/home/dfc/go/test/bench/go1 4.274s
lucky(~/go/test/bench/go1) % go test -run=XXX -bench=HTTPClientServer -cpuprofile=/tmp/c.p
BenchmarkHTTPClientServer-4 20000 85316 ns/op
ok _/home/dfc/go/test/bench/go1 4.402s
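In this run, enabling the CPU profile raised the cost from 84296 to 85316 ns/op, or about 1.2%. You could use the same approach to experiment with the other kinds of profiles: memory, block, trace, and so on.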
If you wanted to go a step further, you could add profiling to your own project with my github.com/pkg/profile package, then compare the results of an HTTP load test with and without profiling enabled.