A quick note before I attempt to answer your question: I see a couple of extraneous calls to bind() in the main() method. You may have better results if server.bind() is only called once, right before awaitShutdown(). Also, performSerialization() is the default, so you don't need to specify it, although it's OK to be explicit. :)
You mention you're only reaching a concurrency of 5,000. The question is how you're testing it. Many times it's the load tool itself that is maxing out, not the service. For example, ApacheBench (ab) has issues. JMeter mostly works, but you can still have problems spinning up that many threads, especially if the testing tool is on the same box as the service.
Also, if you're only concerned about concurrency for an echo, you don't need the back-end executor threads (you can set them to 0). There's some overhead in the message passing between the front-end I/O worker threads and the back-end executor threads, and the only thing that blocks in an echo is the serialization. That's probably not a real-world workload, though.
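To make the hand-off overhead concrete, here is a minimal sketch in plain Java. It is not RestExpress code: the class and method names are hypothetical, and the fixed thread pool only stands in for the back-end executors. It compares handling an echo directly on the calling thread with submitting it to a pool and waiting for the result.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: hypothetical names, not part of RestExpress.
class EchoDispatch {
    // Handle the request directly on the calling ("I/O") thread.
    static String handleInline(String body) {
        return body;
    }

    // Hand the request to a back-end pool and wait for the result.
    // Each call pays for a queue hop and a thread wake-up.
    static String handleViaExecutor(ExecutorService pool, String body) {
        Future<String> result = pool.submit(() -> handleInline(body));
        try {
            return result.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            int requests = 100_000;
            long t0 = System.nanoTime();
            for (int i = 0; i < requests; i++) {
                handleInline("ping");
            }
            long t1 = System.nanoTime();
            for (int i = 0; i < requests; i++) {
                handleViaExecutor(pool, "ping");
            }
            long t2 = System.nanoTime();
            System.out.printf("inline: %d ms, via executor: %d ms%n",
                    (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        } finally {
            pool.shutdown();
        }
    }
}
```

The timings are only a rough indication (there is no JIT warm-up and the JIT may optimize the inline loop away), but the executor path should be visibly slower for work this trivial. Once a handler does real blocking work, the hand-off usually pays for itself.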
Other than that, it matters whether you're using connection keep-alive or not, and how many ephemeral ports and file descriptors are available (check ulimit -n on *nix machines for the latter).
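As a rough worked example of why keep-alive matters: without it, every request opens a new connection, and the closed connection's port sits in TIME_WAIT for a while before it can be reused. The sketch below assumes typical Linux defaults (an ephemeral range of 32768 to 60999 and a 60-second TIME_WAIT). Those numbers are assumptions, so check your own box (for example with `sysctl net.ipv4.ip_local_port_range`) before relying on them.

```java
// Back-of-the-envelope estimate; the defaults below are assumed, not measured.
class EphemeralPortBudget {
    // Number of ports in an inclusive range.
    static int portsInRange(int low, int high) {
        return high - low + 1;
    }

    // Sustained new connections per second from one client IP to one
    // server IP:port when each closed connection holds its port in TIME_WAIT.
    static int maxNewConnectionsPerSecond(int ports, int timeWaitSeconds) {
        return ports / timeWaitSeconds;
    }

    public static void main(String[] args) {
        int ports = portsInRange(32768, 60999);            // assumed Linux default range
        int rate = maxNewConnectionsPerSecond(ports, 60);  // assumed 60 s TIME_WAIT
        System.out.println(ports + " ports, about " + rate + " new connections/sec");
        // prints "28232 ports, about 470 new connections/sec"
    }
}
```

With keep-alive on, the same connections are reused, so this limit mostly goes away and the number of open file descriptors becomes the thing to watch.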
Enjoy,
--Todd