How to interpret profiler information to improve performance


Simon Lacroix

Aug 26, 2023, 7:35:45 PM
to VirtualGL User Discussion/Support
Hi, 
 
I am trying to run an application using TurboVNC and VirtualGL, and I am looking for a way to improve the performance I am seeing. Running VirtualGL directly on the server with a monitor attached (to take VNC out of the loop), I get the following profiler output:

Readback    -  534.96 Mpixels/sec- 1298.21 fps
Blit        -   46.99 Mpixels/sec-  114.04 fps
total       -   10.99 Mpixels/sec-   26.64 fps

How should I interpret the large difference in throughput between blit and total?

Running glxspheres64 gives me the following:

Readback    -  457.69 Mpixels/sec-  410.12 fps
Blit        -   52.15 Mpixels/sec-   46.73 fps
Total       -   44.79 Mpixels/sec-   40.13 fps

Thanks 

Simon Lacroix

Aug 26, 2023, 7:36:37 PM
to VirtualGL User Discussion/Support
The hardware platform I am using is an NVIDIA Jetson AGX Xavier with an integrated GPU.

Simon Lacroix

Aug 27, 2023, 1:11:56 PM
to VirtualGL User Discussion/Support
A bit more info: the application I am using uses OpenGL to render a GUI to the display, and also to render media content to be recorded or streamed. Using VirtualGL seems to reduce the performance of the whole app quite a bit; I get less than half the frame rate I normally get for the recorded and streamed media. As an experiment, I used x11vnc to mirror the main display instead. The GUI refresh rate isn't great, but the rest of the app performs relatively normally.

Is it possible that VirtualGL calls are blocking the app and slowing it down, and is there a way this could be avoided?

Thanks 

Simon Lacroix

Aug 31, 2023, 6:22:22 PM
to VirtualGL User Discussion/Support
Update: I seem to be getting about 50% better performance when I use the TurboVNC -vgl option as opposed to running through vglrun and TigerVNC. Still, the application that I need to run requires a bit more performance, so any insights as to how I can better profile it and find the bottleneck would be much appreciated!

Thanks 

mtkapl...@gmail.com

Aug 31, 2023, 6:43:18 PM
to virtualgl-users
Hi, 

What about testing with UltraVNC and 256 colors?



Simon Lacroix

Sep 5, 2023, 8:12:47 AM
to VirtualGL User Discussion/Support
Thanks, I can give that a try, but is UltraVNC expected to be that much faster than TurboVNC?

DRC

Sep 5, 2023, 10:34:35 AM
to virtual...@googlegroups.com

No, the UltraVNC suggestion is a red herring.  UltraVNC uses the TurboVNC encoder, but it only supports Windows servers, whereas TurboVNC only supports Un*x servers.  They are orthogonal solutions, not interchangeable solutions.

DRC

Sep 5, 2023, 11:21:13 AM
to virtual...@googlegroups.com

In the interest of being methodical, let me enumerate the things that are unexpected and how to possibly diagnose them:


1. 47-52 Mpixels/sec is way too slow for blitting on a local display that has a GPU connected.

   - Run glxinfo on the local display.  That will tell you whether the local display is using the GPU.  (However, this is a shot in the dark, because you shouldn't be able to get 460-540 Mpixels/sec of readback performance without a GPU, unless modern versions of Mesa are a lot faster than I think they are.)

   - Run fbxtest (from the VirtualGL source) on the local display.  That will tell you whether the raw blitting performance of the local display matches the blitting performance you observed when you ran 'vglrun +pr glxspheres64'.


2. Except for the slow blitting performance, the output of 'vglrun +pr glxspheres64' is otherwise expected, since it shows a readback throughput that is consistent with that of other nVidia GPUs and a total throughput that is limited by the slowest pipeline stage (the blitting thread.)  However, the output of 'vglrun +pr {your_application}' is not expected, since it shows a total throughput that is much less than the throughput of the slowest pipeline stage.

   - Is there any way that your application could print the output of glGetString(GL_RENDERER) or implement some other method to verify that the GPU is actually in use?  The behavior is odd enough that it makes me suspect that the GPU may not be in use in all cases.

   - Is there any way that your application can measure its frame rate when running on the local display without VirtualGL?

   - Do you actually observe about 27 fps in the application (the "total" frame rate reported by the profiler)?  In other words, does the GUI appear to only be refreshing at that rate?

   - If you are using VirtualGL with a 3D X server, then try using it with an EGL device instead (pass '-d /dev/dri/card0' to vglrun) and see if that changes anything.

   - If you are using a 3D X server, then make sure that

     Option "HardDPMS" "false"

     is in either the Device or the Screen section of xorg.conf.  Otherwise, the nVidia drivers will throttle down the GPU to a ridiculously slow level when the screen saver activates.  (However, this usually slows the readback performance to a crawl, which you haven't observed, so I doubt that this is the cause of the issue.  Also, obviously the screen saver wasn't active when you used VGL on the local display, so I mention HardDPMS mostly to ensure that that issue, which is probably unrelated, doesn't interfere with the observations of the issue you reported.)


3. Passing -vgl to /opt/TurboVNC/bin/vncserver (or setting '$useVGL = 1;' in turbovncserver.conf) basically just runs the entire window manager with vglrun, so all applications launched from the window manager will have VirtualGL preloaded into them.  This enables GPU acceleration for the window manager itself, and it also allows you to launch 3D applications with GPU acceleration without invoking them using 'vglrun'.  However, this is not expected to have any effect on blitting performance.

   - Try running fbxtest in a TurboVNC session launched with -vgl and compare the raw blitting performance to that of a TurboVNC session launched without -vgl.

   - If there is a way to verify in your application that the GPU is in use, such as calling glGetString(GL_RENDERER), then please do so and verify that the GPU is in use in both cases (the TurboVNC session launched with -vgl and the TurboVNC session launched without -vgl.)

   - Try using a non-compositing window manager, such as MATE or Xfce, and verify whether that affects the results.
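
To make the comparison in item 2 concrete, here is a minimal Python sketch that parses the two profiler outputs posted earlier in this thread and compares the total throughput with the slowest pipeline stage. The function names (parse_profile, summarize) and the regular expression are assumptions made for illustration. They are not part of VirtualGL, and the line format is inferred only from the output quoted in this thread.

```python
import re

# Profiler output copied from the messages earlier in this thread.
APP_PROFILE = """\
Readback    -  534.96 Mpixels/sec- 1298.21 fps
Blit        -   46.99 Mpixels/sec-  114.04 fps
total       -   10.99 Mpixels/sec-   26.64 fps
"""

GLXSPHERES_PROFILE = """\
Readback    -  457.69 Mpixels/sec-  410.12 fps
Blit        -   52.15 Mpixels/sec-   46.73 fps
Total       -   44.79 Mpixels/sec-   40.13 fps
"""

# Assumed line format: "<stage> - <number> Mpixels/sec- <number> fps".
LINE_RE = re.compile(
    r"^\s*(?P<stage>\w+)\s*-\s*(?P<mpix>[\d.]+)\s*Mpixels/sec\s*-"
    r"\s*(?P<fps>[\d.]+)\s*fps",
    re.IGNORECASE,
)


def parse_profile(text):
    """Return {lowercase stage name: (Mpixels/sec, fps)}."""
    stages = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            stages[m["stage"].lower()] = (float(m["mpix"]), float(m["fps"]))
    return stages


def summarize(stages):
    """Compare the total throughput with the slowest pipeline stage."""
    pipeline = {k: v for k, v in stages.items() if k != "total"}
    slowest = min(pipeline, key=lambda k: pipeline[k][0])
    total_mpix, total_fps = stages["total"]
    return {
        # Mpixels per frame, derived from throughput / frame rate.
        "frame_mpixels": total_mpix / total_fps,
        "slowest_stage": slowest,
        # Close to 1.0 means the slowest stage explains the total.
        "total_vs_slowest": total_mpix / pipeline[slowest][0],
    }


if __name__ == "__main__":
    for name, text in (("application", APP_PROFILE),
                       ("glxspheres64", GLXSPHERES_PROFILE)):
        print(name, summarize(parse_profile(text)))
```

With the figures posted above, blit is the slowest stage in both runs. glxspheres64 reaches roughly 86% of its blit throughput in total, whereas the application reaches only about 23%, which is the gap that the blitting stage alone does not explain.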


Otherwise, I don't have a clue.  I would need to understand more about your environment, such as:

- How is your application performing 3D rendering?  Does it have its own off-screen rendering and blitting mechanism that possibly interferes with VirtualGL's?  (Some applications do their own Pbuffer redirection, readback, and blitting, so it is necessary to set VGL_READBACK=none with such applications.  Effectively that means that VGL is used only for redirecting the OpenGL context to a GPU.  Its readback and transport mechanisms are disabled.)

- What window manager are you using?

- What operating system are you using?
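
For the "is the GPU actually in use" checks suggested above, the following Python sketch shows one way to pull the renderer name out of glxinfo output and flag an obvious software renderer. It assumes that glxinfo prints a line beginning with "OpenGL renderer string:". The helper names and the list of software-renderer hints are hypothetical and not exhaustive, so treat a negative result as a hint rather than proof.

```python
import re
import subprocess

# Heuristic substrings seen in Mesa software renderer names (assumption:
# this list is illustrative, not complete).
SOFTWARE_HINTS = ("llvmpipe", "softpipe", "swrast", "software rasterizer")


def renderer_from_glxinfo(output):
    """Extract the value of the 'OpenGL renderer string' line, or None."""
    m = re.search(r"^OpenGL renderer string:\s*(.+)$", output, re.MULTILINE)
    return m.group(1).strip() if m else None


def looks_like_software(renderer):
    """Return True if the renderer name matches a known software hint."""
    name = renderer.lower()
    return any(hint in name for hint in SOFTWARE_HINTS)


def query_renderer(prefix=()):
    """Run glxinfo, optionally behind a launcher such as ("vglrun",).

    Requires glxinfo (and the launcher, if given) to be on PATH.
    """
    cmd = list(prefix) + ["glxinfo"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return renderer_from_glxinfo(out.stdout)


# Example use on a machine with the tools installed:
#   print(query_renderer())             # renderer of the current display
#   print(query_renderer(("vglrun",)))  # renderer seen through VirtualGL
```

Running the query in each environment being compared (the local display, and TurboVNC sessions started with and without -vgl) gives a quick way to see whether the reported renderer differs between them.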


Note also that one of the professional services I provide is remote diagnosis and resolution of such issues.  Please contact me off-list to discuss this.

DRC


mtkapl...@gmail.com

Sep 8, 2023, 6:50:15 AM
to virtualgl-users
On an average connection, no. But it does support limiting the number of colors. I use this method when faster access is required and many colors are not needed.

Simon Lacroix

Sep 11, 2023, 8:52:34 AM
to VirtualGL User Discussion/Support
Hey DRC, thanks for the thorough response. I got sidetracked and had to put this on hold for a little bit, but I plan on getting back to it this week and will report back!

This gives me a lot to dig into, and hopefully it will move things forward.
