OpenVDB file load causes noise artifacts in PxrCryptomatte outputs


Paul Kilgo

Nov 12, 2022, 4:36:58 PM
to OpenVDB Forum
Hello,

I am looking for advice on debugging a RenderMan procedural that, merely by loading an OpenVDB file, appears to cause noise artifacts in the output of PxrCryptomatte.

We are using OpenVDB 8.1.0 and RenderMan 24.3. I've stripped the procedural down to the point where it only loads a file. I am compiling both the procedural and OpenVDB with gcc 9.3 on CentOS 7.9, but I can also see the issue with gcc 6.3 and on Rocky Linux 8.6. I've also tried other OpenVDB versions (4.1.0 and 9.1.0). I'm happy to try the latest commit on GitHub, but I'll need to do a little extra work to be able to build it.

Also important to note: this is unfortunately triggered by a proprietary asset, and so far I haven't reproduced it on anything but proprietary assets. As far as I can tell, though, there is nothing special about it. It is 4.5G on disk, was produced by Houdini 19.0, has 7 grids (but I forced OpenVDB to load only the one named "surface"), and the problematic grid is just a normal FloatGrid level set. In case it is helpful, I'm including the vdb_print output at the end of this message.

In the case I'm presenting here, there is actually nothing in the scene. I just have RenderMan invoke my procedural, I run a few lines of OpenVDB code, and stop there. The outputs I get are black images with some noise artifacts. It's behaving as if some memory corruption is going on somewhere, and somehow PxrCryptomatte is the thing that is affected most often. I have seen the procedural crash before on "memory corruption" problems, but not in the particular case I am presenting here.

I systematically disabled parts of the OpenVDB code base to try to narrow down where the error might be originating. After disabling large blocks of code, I eventually arrived at this line; when I disable it, the problem goes away.

https://github.com/AcademySoftwareFoundation/openvdb/blob/ea786c46b7a1b5158789293d9b148b379fc9914c/openvdb/openvdb/tree/LeafNode.h#L1371

It looks like `meta` comes from the result of this function. Best I can tell, it wasn't returning a nullptr or anything like that.

https://github.com/AcademySoftwareFoundation/openvdb/blob/ea786c46b7a1b5158789293d9b148b379fc9914c/openvdb/openvdb/io/Archive.cc#L917

It's important to note that disabling this one line is not a real fix. I think something more fundamental is wrong, and it can be triggered by other parts of the code base as well.

Since I suspect memory corruption, I've also run things through valgrind to see if it can detect anything. I have tried valgrind both on an equivalent simple program and directly on the procedural. The simple program shows innocuous-looking "possible" leaks which seem to stem from TBB. Of course, it detects a lot more when I actually run prman through it. The most relevant report is a use of uninitialized values coming from PxrCryptomatte.so, though this seems to happen whether or not OpenVDB is involved.

==37293== Conditional jump or move depends on uninitialised value(s)
==37293==    at 0x29116BFB: ??? (in /local/prman/24.3.2208291/lib/plugins/PxrCryptomatte.so)
==37293==    by 0x2969059F: ??? (in /local/prman/24.3.2208291/lib/plugins/PxrSampleFilterCombiner.so)
==37293==    by 0x8D1FD9C: ??? (in /local/prman/24.3.2208291/lib/libprman.so)


What does seem to help is forcing TBB (2019 Update 9) to use one thread. But for whatever reason, that doesn't translate into a solution for the original procedural's source code.

tbb::task_scheduler_init tsi(1);

Any ideas on this? I don't expect anyone to be able to reproduce without an asset, but thought someone out there might have some experience integrating OpenVDB into RenderMan.

Here is the example procedural which can produce the corruption:

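#include <ri.h>               // RenderMan types (RtVoid, RtContextHandle, ...)
#include <openvdb/openvdb.h>  // also pulls in openvdb::io::File
//#include <tbb/task_scheduler_init.h>

extern "C" RtVoid
Subdivide2(
    RtContextHandle ctx,
    RtFloat detail,
    RtInt argc,
    RtToken const toks[],
    RtPointer const vals[])
{
    // Uncommenting this limits TBB to one thread, which seems to help here.
    //tbb::task_scheduler_init tsi(1);
    openvdb::initialize();
    // PATH_TO_VDB stands in for the path to the problematic asset.
    openvdb::io::File f(PATH_TO_VDB);
    // Nothing is emitted into the scene; the procedural stops after this call.
    f.open();
}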

Here is the output of vdb_print on the asset that causes the problem:

VDB version: 8.1/224
creator: Houdini 19.0.622/GEO_VDBTranslator

Name: surface
Information about Tree:
  Type: Tree_float_5_4_3
  Configuration:
    Root(1 x 8), Internal(8 x 32^3), Internal(1,713 x 16^3), Leaf(1,118,866 x 8^3)
  Background value: 0.012
  Min value: -0.012
  Max value: 0.012
  Number of active voxels:       309,934,076
  Number of active tiles:        0
  Bounding box of active voxels: [-808, -362, -728] -> [926, 3515, 909]
  Dimensions of active voxels:   1735 x 3878 x 1638
  Percentage of active voxels:   2.81%
  Average leaf node fill ratio:  54.1%
  Number of unallocated nodes:   0 (0%)
Memory footprint:
  Actual:                2.290 GB
  Active leaf voxels:    1.155 GB
  Dense equivalent:     41.056 GB
  Actual footprint is 5.58% of an equivalent dense volume
  Leaf voxel footprint is 50.4% of actual footprint
Additional metadata:
  class: level set
  file_bbox_max: [926, 3515, 909]
  file_bbox_min: [-808, -362, -728]
  file_compression: blosc + active values
  file_mem_bytes: 2458919028
  file_voxel_count: 309934076
  is_local_space: false
  is_saved_as_half_float: false
  name: surface
  value_type: float
  vector_type: invariant
Transform:
  voxel size: 0.002
  index to world:
     [0.002, 0, 0, 0]
     [0, 0.002, 0, 0]
     [0, 0, 0.002, 0]
     [0, 0, 0, 1] 
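
As a quick sanity check on the claim that there is nothing special about the grid, the vdb_print figures for "surface" can be reproduced with plain arithmetic. This is only a sketch: the struct and function names below are made up for illustration (none of this is OpenVDB API), and it assumes 4-byte float voxels and that vdb_print reports GB as 2^30 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Inclusive index-space bounding box, as printed by vdb_print.
// (Hypothetical helper type, not an OpenVDB class.)
struct BBox {
    std::int64_t min[3];
    std::int64_t max[3];
};

// Values copied from the vdb_print output for the "surface" grid.
constexpr BBox kSurfaceBBox = {{-808, -362, -728}, {926, 3515, 909}};
constexpr std::int64_t kActiveVoxels  = 309934076;
constexpr std::int64_t kLeafCount     = 1118866;
constexpr std::int64_t kVoxelsPerLeaf = 8 * 8 * 8;  // Leaf(... x 8^3)
constexpr std::int64_t kFileMemBytes  = 2458919028;
constexpr std::int64_t kBytesPerVoxel = 4;          // assumption: 32-bit float

// Number of voxels along one axis of an inclusive bounding box.
constexpr std::int64_t axisDim(const BBox& b, int axis) {
    return b.max[axis] - b.min[axis] + 1;
}

// Voxel count of the dense box spanned by the bounding box.
constexpr std::int64_t denseVoxelCount(const BBox& b) {
    return axisDim(b, 0) * axisDim(b, 1) * axisDim(b, 2);
}

// Bytes to GB, assuming 1 GB = 2^30 bytes.
constexpr double toGiB(double bytes) {
    return bytes / (1024.0 * 1024.0 * 1024.0);
}

constexpr double percent(double part, double whole) {
    return 100.0 * part / whole;
}

// "Percentage of active voxels" in the report.
double activePercent() {
    return percent(static_cast<double>(kActiveVoxels),
                   static_cast<double>(denseVoxelCount(kSurfaceBBox)));
}

// "Average leaf node fill ratio" in the report.
double leafFillPercent() {
    return percent(static_cast<double>(kActiveVoxels),
                   static_cast<double>(kLeafCount * kVoxelsPerLeaf));
}

// "Dense equivalent" in the report.
double denseGiB() {
    return toGiB(static_cast<double>(denseVoxelCount(kSurfaceBBox) * kBytesPerVoxel));
}

// Integer checks against "Dimensions of active voxels: 1735 x 3878 x 1638".
void checkSurfaceStats() {
    assert(axisDim(kSurfaceBBox, 0) == 1735);
    assert(axisDim(kSurfaceBBox, 1) == 3878);
    assert(axisDim(kSurfaceBBox, 2) == 1638);
}
```

Under those assumptions the derived values come out at roughly 2.81% active voxels, a 54.1% leaf fill ratio, and about 41.06 GB for the dense equivalent, and toGiB(kFileMemBytes) gives about 2.290 GB, all of which agree with the report above.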
