VirtualGL is used for server-side OpenGL rendering, i.e. for accessing
a GPU remotely. It assumes that the GPU is on the same machine on
which the OpenGL applications are running, so given that the GPU is on
your client machine, VirtualGL cannot be of much help there. To answer
your question about whether the 2D and 3D X servers can be on the same
machine-- yes, if using an X proxy such as TurboVNC, but generally
there is no point in doing that unless both the 2D and 3D X servers
are on the remote machine. The purpose of that configuration is to
prevent any X11 traffic from transiting the network (since X proxies
convert X11 drawing commands into an image stream). When using
VirtualGL, the 3D X server has to be on the same machine on which the
applications are running. Effectively, the "VirtualGL Server" and the
"Application Server" will always be the same machine, and that machine
must have a GPU and an X server (the 3D X server) attached to that GPU.
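
To make that concrete, here is a rough sketch of how VirtualGL is
normally invoked on the GPU-equipped application server (the display
number ":0" is just a placeholder for whatever display the 3D X server
is actually running on):

    # Run on the application server, which has the GPU and the 3D X server.
    # vglrun redirects the application's OpenGL rendering to the 3D X server
    # specified with -d (equivalent to setting VGL_DISPLAY):
    vglrun -d :0 glxinfo | grep "OpenGL renderer"
    vglrun -d :0 glxgears
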
What you're trying to do is use a remote 2D X server with a local 3D X
server, which is not what VirtualGL is designed to do. With VirtualGL,
the 3D X server is always remote, and the 2D X server can be either
local (if using the VGL Transport) or remote (if using the X11 Transport
with an X proxy). Generally, VirtualGL is used to display 3D
applications from a machine with greater 3D capabilities to a machine
with lesser 3D capabilities. What you are doing would work fine if you
were running the XDMCP server on the machine with the GPU and connecting
to it remotely from the machine without the GPU. The reverse, however,
is not a problem that VirtualGL can solve.
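
As a rough illustration of those two configurations (the hostname
"gpuhost" and the install paths below are placeholders/assumptions
about a typical installation, not taken from your setup):

    # VGL Transport: 2D X server on the client, 3D X server on the remote
    # application server.  vglconnect wraps ssh and sets up the transport:
    vglconnect user@gpuhost
    vglrun glxgears

    # X11 Transport + X proxy: both X servers live on the application
    # server; the X proxy (e.g. TurboVNC) streams images to the client.
    # On gpuhost:
    /opt/TurboVNC/bin/vncserver
    # Then, inside the TurboVNC session (via a VNC viewer):
    vglrun glxgears
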
In short, you are setting up a "silent station" with a lot of RAM and a
high-performance CPU, but that station also needs a GPU in order to run
3D applications remotely from it using VirtualGL. Otherwise, you're
better off logging in locally to your GPU-equipped machine, using SSH
with X11 tunneling to connect to the remote machine, and running 3D
applications without VirtualGL. That would cause the GLX/OpenGL
commands to be sent over the network, which is far from ideal, but it's
the only reasonable way to run OpenGL applications with hardware
acceleration when the client has a GPU but the application server
doesn't. VirtualGL is specifically meant to work around the problems
with that approach, which is why I emphasize that the approach is far
from ideal (refer to
https://virtualgl.org/About/Background), but again,
VirtualGL requires a GPU in the application server.
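
If you do go the SSH route, a minimal sketch looks like this (the
hostname is a placeholder; note that recent X.Org servers ship with
indirect GLX disabled, so your local X server may need to be started
with +iglx for this to work at all):

    # From your GPU-equipped client machine, open an X11-forwarded session
    # to the GPU-less application server:
    ssh -X user@appserver
    # Then, in the resulting remote shell, run the 3D application without
    # VirtualGL.  Its GLX/OpenGL commands travel back over the SSH tunnel
    # and are rendered (indirectly) by your local X server and GPU:
    glxgears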