Gazebo / ROS is a more complete solution, whether in desktop or web mode. SpacegraphJS is still just a prototype; its Ammo.JS + Three.JS browser-based design has significant performance limitations, but it could be optimized (e.g., with WebCL).
The browser-based front-end can be responsible for initiating and controlling the interaction with the cognition engine, probably over websockets for low-latency I/O. The cognition-engine back-end, serving the websocket connection, would be entirely decoupled from the front-end: it receives sensor data and transmits motor commands. RLGlue follows a similar model:
http://glue.rl-community.org/wiki/Main_Page
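The decoupling comes down to agreeing on a small wire format that both sides speak over the socket. A minimal sketch of what such a format might look like, assuming JSON messages with hypothetical `sensors`/`motors` message types and field names (none of this is from RLGlue or any existing protocol):

```python
import json

def encode_sensor_frame(tick, distances):
    """Serialize one sensor reading (e.g., ray-cast distances) for the back-end."""
    return json.dumps({"type": "sensors", "tick": tick, "distances": distances})

def decode_motor_command(message):
    """Parse a motor command sent back by the cognition engine."""
    msg = json.loads(message)
    assert msg["type"] == "motors"
    return msg["turn"], msg["thrust"]

# Round-trip example: the front-end sends a frame, the back-end replies.
frame = encode_sensor_frame(0, [1.5, 2.0, 0.7])
reply = json.dumps({"type": "motors", "turn": -0.1, "thrust": 1.0})
turn, thrust = decode_motor_command(reply)
```

Because the messages are self-describing, either side could be swapped out (a different simulator, a different cognition engine) without touching the other.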
Craig's AngryBotsInAiWorld is worth further development too, though possibly without a server involved. Have you seen Quake and other FPSes ported to WebGL? Those older FPSes do not involve a full physics engine, so they are less computationally intensive and can more easily support larger worlds and multiplayer action. Bots in such a world might overwhelm current versions of NuPIC if their entire "retina" of thousands of pixels were used as sensory input. Instead, they could at first distill the perceived scene to numeric geometry (without cheating), such as the distance to nearby objects like walls and players, measured at discrete intervals (e.g., every 10 degrees rotating around the bot). This is how conventional FPS bots must operate.
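The "distilled retina" idea can be sketched in a few lines: march a ray outward from the bot at each angular interval until it hits a wall, and record the distance. The grid map, step size, and function names below are all illustrative assumptions, not code from any of the projects mentioned:

```python
import math

# Toy 2D world: '#' cells are walls, '.' cells are open floor.
GRID = [
    "#####",
    "#...#",
    "#...#",
    "#...#",
    "#####",
]

def is_wall(x, y):
    return GRID[int(y)][int(x)] == "#"

def ray_distance(px, py, angle_deg, step=0.05, max_dist=10.0):
    """March a ray from (px, py) until it hits a wall; return the distance."""
    dx = math.cos(math.radians(angle_deg))
    dy = math.sin(math.radians(angle_deg))
    d = 0.0
    while d < max_dist:
        if is_wall(px + dx * d, py + dy * d):
            return d
        d += step
    return max_dist

def retina(px, py, interval_deg=10):
    """Wall distances sampled every `interval_deg` degrees around the bot."""
    return [ray_distance(px, py, a) for a in range(0, 360, interval_deg)]

readings = retina(2.5, 2.5)  # bot at the center of the room: 36 distances
```

A 36-element vector like this is a far smaller input than thousands of pixels, yet it still lets the agent learn spatial structure; a real implementation would use the engine's own ray-cast queries rather than this brute-force march.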
I imagine that a 2D, and especially a 3D, physics engine provides interaction subtleties that would test an AI unlike any other method. Very slight movements can result in drastic changes to the world and to the agent's body. In other words, the degrees of freedom in the control-signal space can be arbitrarily large.