Slow Rendering of Dual GoPro VR180

Leon

Jan 25, 2025, 9:33:59 PM

to Kartaverse

Hello everyone,

I've built myself a VR180 rig with two GoPro Hero 13s and the ultrawide lens mods for 177° FoV, and after following a couple of tutorials and some trial and error, I can successfully generate 8K VR180 videos with Resolve Studio that look good in a Quest 3.

However, I have been hitting a significant bottleneck - rendering takes absolutely forever.

My first attempt with the workflow from Hugh Hou (kvrCreateStereo -> kvrCropStereo -> GlobalAlign -> kvrLensStereo -> kvrViewer) works, but rendering a 1-minute test video took 2 hours and 22 minutes.

So I generated an STMap, and built a pipeline that works just the same, see the following screenshot:

Screenshot 2025-01-25 221911.png

Everything still looks correct with the STMap. However, rendering times have not improved much: I aborted rendering the 1-minute clip after 1 1/2 hours at around 70%.
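
For reference, the two timings above can be converted into an effective frame rate. This is a rough sketch, assuming the clip is exactly 60 seconds long at the 29.97 fps timeline rate and that the aborted render progressed linearly up to 70%; the function name is made up for illustration.

```python
def effective_fps(clip_seconds, timeline_fps, render_seconds, fraction_done=1.0):
    """Frames actually rendered per second of wall-clock time."""
    frames_rendered = clip_seconds * timeline_fps * fraction_done
    return frames_rendered / render_seconds

# First attempt: full 1-minute clip in 2 h 22 min.
fuse_chain = effective_fps(60, 29.97, (2 * 60 + 22) * 60)

# STMap attempt: aborted after 1.5 h at roughly 70% (assumed linear progress).
stmap_chain = effective_fps(60, 29.97, 90 * 60, fraction_done=0.70)

print(f"fuse chain:  {fuse_chain:.2f} FPS, {1 / fuse_chain:.1f} s per frame")
print(f"STMap chain: {stmap_chain:.2f} FPS, {1 / stmap_chain:.1f} s per frame")
```

Under these assumptions both pipelines land at roughly 0.2 FPS, i.e. between 4 and 5 seconds per frame.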

Am I missing something to optimize here?

I tried replacing kvrCreateStereo with the built-in Combiner node (as that shows that the data is on GPU, not Mem), but no difference.

Looking at resources in the Task Manager, I get spikes in CPU and GPU usage to 100%, but neither is utilized consistently; there are only ever spikes. I get the feeling frames are being copied between GPU and Mem too much (there is a copy spike with each GPU utilization spike in Task Manager), but I saw no way to get STMapper to only show GPU. It would always show GPU+Mem, and GlobalAlign also only shows Mem.
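
To put the suspected copy overhead into numbers, here is a back-of-envelope sketch. It assumes a 4-channel RGBA frame at the timeline resolution and an effective host-to-GPU bandwidth of 12 GB/s, which is a guess for a PCIe 3.0 x16 card and not a measured value. The bit depths listed are the common 8-bit integer, 16-bit half-float and 32-bit float choices; which one the comp actually uses is not stated here.

```python
WIDTH, HEIGHT, CHANNELS = 8192, 4096, 4   # timeline resolution, RGBA (assumed)
ASSUMED_BANDWIDTH = 12e9                  # bytes/s, hypothetical effective PCIe rate

def frame_bytes(bytes_per_channel):
    """Uncompressed size of one frame in bytes."""
    return WIDTH * HEIGHT * CHANNELS * bytes_per_channel

for label, depth in [("int8", 1), ("float16", 2), ("float32", 4)]:
    size = frame_bytes(depth)
    copy_ms = size / ASSUMED_BANDWIDTH * 1000
    print(f"{label:8s} {size / 2**20:6.0f} MiB  ~{copy_ms:5.1f} ms per one-way copy")
```

With these assumed values a single float32 frame is 512 MiB and one copy costs a few tens of milliseconds. Several round trips per node would add up, but whether that explains the render times would have to be measured.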

Some background info:

- Resolve Studio 19.1.2 build 3

- Timeline resolution is set to 8192x4096 at 29.97 fps

- CPU: Ryzen 3700X, RAM: 32GB, GPU: RTX 2070 Super, SSD: Samsung 990 EVO

I realize my system is not the most up-to-date, cutting-edge spec anymore, but is that really the bottleneck that I'm hitting here, or is it something I'm doing wrong in my pipeline?

Thanks for any insights,

Leon

Andrew Hazelden

Jan 26, 2025, 7:15:24 PM

to Kartaverse

Hi Leon,

On my macOS system, when I am doing generic VR compositing tasks in the Resolve Fusion page on 8K resolution media I tend to get somewhere between 4 FPS and 8 FPS render performance. It's possible to use approaches like "Render In Place" on the Resolve edit page to bake a Fusion page effect to disk so that it plays back faster when you need to assemble a longer timeline.

If you need real-time performance, it's possible to generate an STMap template and take that into a program like TouchDesigner. You can then use that approach to do live streaming of VR content with fully real-time image processing of the warping and stitching tasks.

Here is a page from the Kartaverse docs that discusses using Fusion created STMaps with TouchDesigner:

https://kartaverse.github.io/Kartaverse-Docs/#/TouchDesigner

The design goal of the Kartaverse kvrFisheyeStereo node was to help support STMap creation.
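
For readers new to the technique: an STMap stores, for every output pixel, the normalized source coordinate to sample, so the whole warp collapses to one lookup per pixel. Below is a minimal pure-Python sketch with nearest-neighbour sampling. The function name and the tiny test image are made up for illustration, and the convention that T = 0 is the bottom row is an assumption that should be checked against the tool that wrote the map. Real implementations filter the samples and run on the GPU.

```python
def apply_stmap(source, stmap):
    """Warp `source` (rows of pixel values) with `stmap` (rows of (s, t) pairs).

    s and t are normalized to [0, 1]; t = 0 is assumed to be the bottom row.
    """
    src_h, src_w = len(source), len(source[0])
    out = []
    for st_row in stmap:
        out_row = []
        for s, t in st_row:
            # Map normalized coordinates to the nearest source pixel index.
            x = min(src_w - 1, max(0, round(s * (src_w - 1))))
            y = min(src_h - 1, max(0, round((1.0 - t) * (src_h - 1))))
            out_row.append(source[y][x])
        out.append(out_row)
    return out

# 2x2 test image and an STMap that mirrors it horizontally.
image = [["a", "b"],
         ["c", "d"]]
mirror = [[(1.0, 1.0), (0.0, 1.0)],
          [(1.0, 0.0), (0.0, 0.0)]]
print(apply_stmap(image, mirror))
```

Because the map is computed once and reused for every frame, the per-frame cost is a single resampling pass regardless of how many warp nodes were used to create it.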

Regards,

Andrew Hazelden

P.S. In your example comp, where you have the "crop" and "instanced_crop1" nodes, you might enjoy checking out the kvrCropStereo node. It combines that task into a single operation.

Also, if you get funky sizing issues on your SBS formatted output, the built-in Fusion "Autodomain" node, when placed at the end of a chain of fuse nodes, can help fix the issues.

Leon

Jan 27, 2025, 5:53:21 AM

to Kartaverse

Hello Andrew,

Thanks for your answer. In the meantime, I have rented a server with an RTX 4090 and still only get around 1 FPS, so there must be something else I'm doing wrong.

As you can see in my screenshot, I'm already using the STMapper-based workflow, and have generated an STMap using the "Dual Fisheye STMap Creation v001" composition.

The resulting STMap.exr is around 116 MB, which seems rather big. Did I maybe do something wrong there?
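
As a rough sanity check on that file size, the uncompressed payload of an STMap can be estimated from its resolution. The sketch below assumes the map matches the 8192x4096 timeline and stores only two channels (S and T). The actual channel count, bit depth and EXR compression of the file are not known here, so the figures are only a plausible range.

```python
def stmap_bytes(width, height, channels=2, bytes_per_channel=4):
    """Uncompressed payload of an STMap, ignoring EXR headers and compression."""
    return width * height * channels * bytes_per_channel

for label, depth in [("16-bit half float", 2), ("32-bit float", 4)]:
    mib = stmap_bytes(8192, 4096, bytes_per_channel=depth) / 2**20
    print(f"{label}: {mib:.0f} MiB uncompressed")
```

Under these assumptions the raw data is 128 to 256 MiB, so a file of about 116 MB is within the expected order of magnitude rather than an obvious sign of a mistake.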

I found your "Render Time Profiler" to better see what's going on, and it seems that GlobalAlign is taking the longest time. That's odd, isn't that a built-in node? Shouldn't that be fast? Similarly, the Loader also takes a significant amount of time.

Screenshot 2025-01-27 111709.png

I have an example project and example footage here: https://drive.google.com/drive/folders/1ZbXgEIW7mJYk7shbki2dqdj3rFidYhDk?usp=sharing

The composition in the screenshot is from the "Birbs (Crop STMap)" Timeline. You will have to adjust the paths to the Video and Audio media.

The STMap was generated with the "Birbs (Crop STMap Gen)" Timeline. And the original workflow without the STMapper can be found in "Birbs".

Could this also be something related to Windows/macOS differences? I don't have a Mac to test this with, so hopefully you can answer that question by taking a look at my example project.

I have also tried for hours to replace nodes in the pipeline with others (e.g. kvrCropStereo instead of the Crop, kvrCreateStereo instead of the Combiner, kvrViewer instead of STMapper, kvrCropStereo instead of GlobalAlign, ...), just to see if that would speed things up, but I only saw marginal improvements, going from somewhere around 6s of Max time in the Render Time Profiler to 3s. Still nowhere near the 4-8 FPS you are getting.

Thanks,

Leon

Jan 27, 2025, 8:06:25 PM

to Kartaverse

Thanks to the tip from Per Hansen, I was able to get up to 2 FPS with the RTX 4090 by removing the GlobalAlign and baking in the stereo alignment via the Center property of the kvrCropStereo in the STMap generation composition. Now the Loader is the slowest part in the Render Time Profiler.

It's now also the only node that shows "Mem", everything else shows "GPU" or "GPU+Mem". I didn't find any setting or different node that would put the STMap also into GPU memory. Am I missing something here?

I also tried a bunch of other things (e.g. different nodes), but this is the fastest I could get it. Still a far cry from the 4 to 8 FPS mentioned by Andrew, but at least somewhat workable now.

Maybe the unified memory on Apple silicon makes the copies between RAM and VRAM unnoticeable.