Navisworks File Formats


Lorin Mandaloniz

Aug 5, 2024, 7:13:32 AM

to nodkofiper

The model was clearly far too big. So, goal one was to find some way to interact with a model that was too big to render in real time. Once we solved that, our next challenge was to interact with models that were too big to fit in memory. At the tail end of the 32-bit operating system era, we also had to deal with models that were too big to fit into the available virtual address space.

The solution I came up with was to break down the model into small, self-contained, renderable instances. Instances are loaded and rendered in priority order, most important objects first. Detail drops out while you interact with the model, then fills in when you stop interacting.
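A minimal sketch of the idea, in Python for brevity rather than the C++ that Navisworks is written in. The `priority` and `cost` fields and the fixed per-frame `budget` are assumptions made for illustration; the text does not say how Navisworks scores importance or measures rendering cost.

```python
def render_frame(instances, budget):
    """Draw the most important instances first until the frame budget runs out.

    instances: list of dicts with hypothetical 'name', 'priority' and 'cost' keys.
    budget: how much rendering cost one frame can afford (arbitrary units).
    Returns the names of the instances that made it into this frame.
    """
    drawn = []
    spent = 0
    for inst in sorted(instances, key=lambda i: i["priority"], reverse=True):
        if spent + inst["cost"] > budget:
            break  # out of budget: the remaining detail drops out of this frame
        drawn.append(inst["name"])
        spent += inst["cost"]
    return drawn


scene = [
    {"name": "bolt", "priority": 1, "cost": 4},
    {"name": "building shell", "priority": 10, "cost": 5},
    {"name": "pipe run", "priority": 5, "cost": 3},
]
```

With a small budget (the user is interacting) only the important objects are drawn. With a large budget (the user has stopped) everything fills in.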

This immediately ruled out using existing CAD formats or vendor-neutral formats (not that they existed back then). I needed a format that stored self-contained instances that could be loaded on demand and rendered in any order.

Whatever we do to prepare the data has to be completely automatic, with no user intervention. We have to support all the different source CAD formats that our users work with. We have to support a workflow where the source models change frequently, which means minimizing the turnaround time to update to the latest models. We also need to make sure that any additional metadata the user creates to support the design review process still makes sense as the underlying models change.

Navisworks was created as a single-user desktop product. As such, workflows revolve around files. The overall project is represented by a set of design files. Anything beyond the simplest project will have more than one design file. Large projects typically have hundreds. The project is broken down into separate design files by discipline, by the capacity limits of the design tools and by the need for multiple designers to work independently.

The user may want to share the aggregated project model more widely. They can publish the model as a self-contained NWD file (Navisworks Data) that contains the complete model and metadata. Combining everything into a single file makes the project more convenient to share and faster to load. The NWD file is a snapshot at a particular point in time. To update the model you have to go back to the NWF and publish a new NWD.

Navisworks was written in what was then cutting edge, and is now old school, object-oriented C++. You can think of the state of an application as an arbitrary graph of objects in memory connected by references (C++ pointers). An arbitrary graph can include multiple references to the same object, as well as cycles.

Serialization is the process of turning that graph of objects in memory into a stream of bytes that can be written out as a file. Deserialization is the reverse process of reading from a stream of bytes and using that to reconstruct an equivalent graph of objects in memory. This is a well solved problem with lots of standard algorithms.
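One of the standard approaches can be sketched as follows: give each object an integer id the first time it is visited, and write every reference as the id of its target. This is an illustrative Python sketch, not the Navisworks serializer. The `Node` class and the JSON encoding are assumptions chosen to keep the example short.

```python
import json


class Node:
    """Hypothetical in-memory object that holds references to other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []


def serialize(root):
    """Turn the graph reachable from root into bytes, each object written once."""
    ids = {}    # id(object) -> integer id assigned on first visit
    order = []  # objects in the order their ids were assigned
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in ids:
            continue  # already visited: a shared reference or a cycle
        ids[id(node)] = len(order)
        order.append(node)
        stack.extend(reversed(node.refs))
    records = [
        {"name": n.name, "refs": [ids[id(r)] for r in n.refs]} for n in order
    ]
    return json.dumps(records).encode("utf-8")


def deserialize(data):
    """Rebuild an equivalent graph: create every object, then patch references."""
    records = json.loads(data.decode("utf-8"))
    nodes = [Node(r["name"]) for r in records]
    for node, record in zip(nodes, records):
        node.refs = [nodes[i] for i in record["refs"]]
    return nodes[0]
```

Creating all the objects before wiring up references is what lets the reader cope with cycles: a reference never points at something that does not exist yet.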

There is some underlying structure. The input and output stream classes support versioning, binary or text output, compressed (using zlib) or uncompressed. We used a rigorous versioning policy where every change in the format, no matter how small, resulted in a new file version with code that could read and write both the new and previous versions. Navisworks can, in theory, still read files created by the first versions of Navisworks.

The Navisworks scene graph is a DAG (Directed Acyclic Graph) with three types of nodes: a root node corresponding to a design file, internal groups, and geometry at the leaves. Each node has a user name, a class name and a source id.

The user name is whatever the user named the object (it could be blank). The class name identifies the kind of thing in the original design file that the node represents. Groups and geometries might be Inserts and Meshes in a DWG file, or Instances and Walls in an RVT file. The source id is whatever identifier the source design file uses. It might be an entity handle for a DWG, or an element id from an RVT.

Each node has a set of attributes. Like nodes, each attribute has a name and a class name. There are a few different types of attribute, which fall into three categories. First, attributes that represent transforms used to position objects in 3D space. Second, materials that determine the appearance of objects. Finally, there are property attributes. Each property attribute contains a list of (name, value) pairs. Nodes can have multiple property attributes, with each attribute representing a different category of properties. The properties are whatever the Navisworks file converters can extract from the source design file.

The spatial graph is used for all spatially oriented operations such as rendering, picking, collision detection and clash detection. The leaves are self-contained instances consisting of a bounding box, transform, material and geometry definition. All are stored in a form optimized for rendering. Geometry and materials are shared between instances. Large logical geometric objects are split into multiple instances in the spatial graph to ensure efficiency of spatial operations. Each leaf node in the instance tree has a list of corresponding instances in the spatial graph.

Originally, design file conversion required file converters to build a complete representation of the model using the logical scene graph. Navisworks would then traverse the scene graph, accumulating transform and material attributes along each path and generating spatial graph instances whenever it reached a leaf node. The accumulated transform and material attributes would be combined and converted into rendering optimized transform and material representations.
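That traversal can be sketched roughly like this. The sketch makes several simplifying assumptions that are not from the original: a transform is reduced to a translation offset, an inner material overrides an outer one, and the `SceneNode` and `Instance` types are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SceneNode:
    name: str
    offset: tuple = (0.0, 0.0, 0.0)   # stand-in for a transform attribute
    material: Optional[str] = None    # stand-in for a material attribute
    geometry: Optional[str] = None    # set only on leaf nodes
    children: list = field(default_factory=list)


@dataclass
class Instance:
    geometry: str
    offset: tuple
    material: Optional[str]


def flatten(node, offset=(0.0, 0.0, 0.0), material=None, out=None):
    """Walk every path from the root, emitting one instance per leaf reached."""
    if out is None:
        out = []
    # Accumulate attributes along the current path.
    offset = tuple(a + b for a, b in zip(offset, node.offset))
    material = node.material or material
    if node.geometry is not None:
        out.append(Instance(node.geometry, offset, material))
    for child in node.children:
        flatten(child, offset, material, out)
    return out
```

Because the scene graph is a DAG, a leaf that is shared by two groups is reached along two paths and produces two instances, each with its own accumulated transform and material.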

The file formats for the early versions of Navisworks were simply the result of serializing the model representation and session metadata. The file started with a header that defined the file version and whether the content was binary or text, compressed or uncompressed. The rest of the file was a single serialized stream of data.
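As an illustration only, a header of that kind could look like the sketch below. The field layout (a 16-bit version followed by two one-byte flags) is entirely hypothetical and is not the actual Navisworks header.

```python
import struct
import zlib

# Hypothetical layout: version (uint16), is_binary (uint8), is_compressed (uint8).
HEADER = struct.Struct("<HBB")


def write_file(version, payload, binary=True, compressed=True):
    """Prefix the serialized stream with a header describing how to read it."""
    body = zlib.compress(payload) if compressed else payload
    return HEADER.pack(version, int(binary), int(compressed)) + body


def read_file(data, supported_versions):
    """Read the header first, then decode the rest of the file accordingly."""
    version, binary, compressed = HEADER.unpack_from(data)
    if version not in supported_versions:
        raise ValueError(f"unsupported file version {version}")
    body = data[HEADER.size:]
    payload = zlib.decompress(body) if compressed else body
    return version, bool(binary), payload
```

Everything after the header is one opaque stream, which is why nothing in such a file can be loaded on demand.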

The main parts of the Navisworks data model are written out as separate streams, in the order that they are usually read. There are separate streams for the logical scene graph, the spatial hierarchy and the set of instances. Each feature with its own metadata (clash tests, saved viewpoints, selection sets, etc.) also uses a separate stream.

All geometry definitions are stored in a special chunked stream. Each definition is written as a separate chunk which can be independently loaded. Just like the overall container, the stream includes a directory of chunks. However, unlike the overall container, chunks are not separately compressed. Individual geometry chunks are too small for effective compression. Instead the output stream of chunks is divided into fixed size 64KB segments for compression. Zlib compression operates on 64KB of data at a time, so this scheme gives us compression nearly as good as compressing the whole stream. A segment directory is used to keep track of the segment boundaries.
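The scheme can be sketched with Python's standard `zlib` module. The function names and the shape of the two directories are assumptions made for illustration; only the 64KB segment size comes from the description above.

```python
import zlib

SEGMENT_SIZE = 64 * 1024  # fixed size of each uncompressed segment


def pack(chunks):
    """Concatenate chunks, then compress the result one 64KB segment at a time.

    Returns (blob, chunk_dir, segment_dir):
      chunk_dir[i]   = (offset, length) of chunk i in the uncompressed stream
      segment_dir[j] = (offset, length) of compressed segment j within blob
    """
    chunk_dir = []
    position = 0
    for chunk in chunks:
        chunk_dir.append((position, len(chunk)))
        position += len(chunk)
    raw = b"".join(chunks)

    blob = bytearray()
    segment_dir = []
    for start in range(0, len(raw), SEGMENT_SIZE):
        compressed = zlib.compress(raw[start:start + SEGMENT_SIZE])
        segment_dir.append((len(blob), len(compressed)))
        blob += compressed
    return bytes(blob), chunk_dir, segment_dir


def read_chunk(blob, chunk_dir, segment_dir, index):
    """Decompress only the segments that overlap the requested chunk."""
    offset, length = chunk_dir[index]
    if length == 0:
        return b""
    first = offset // SEGMENT_SIZE
    last = (offset + length - 1) // SEGMENT_SIZE
    data = b"".join(
        zlib.decompress(blob[pos:pos + size])
        for pos, size in segment_dir[first:last + 1]
    )
    start = offset - first * SEGMENT_SIZE
    return data[start:start + length]
```

A chunk that straddles a segment boundary needs two segments decompressed. Most chunks are small, so reading one usually costs a single 64KB decompression. This sketch decompresses in memory and does not cache anything.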

When loading geometry chunks, the compressed segments are decompressed on demand into a temporary file. The geometry chunks are serialized in priority order based on the default view for the model. This ensures that during the initial load, geometry is largely read sequentially.

Luckily, property access is not as time critical as geometry access. There tend to be two types of access. Either properties are being accessed for a single logical instance (an object has been selected), or properties for all objects are being accessed in traversal order (searching). We explicitly serialize attributes from multiple nodes into each chunk by traversing over the logical scene graph and writing them to the same stream. Once we have more than 64KB in the output, we start a new chunk. A directory in the root node keeps track of which range of nodes corresponds to each chunk.
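Here is a rough sketch of that chunking and of the directory lookup. The JSON encoding, the function names and the representation of the directory as a sorted list of first-node indices are assumptions for illustration.

```python
import bisect
import json

CHUNK_TARGET = 64 * 1024


def write_property_chunks(node_properties, target=CHUNK_TARGET):
    """Group per-node properties, given in traversal order, into chunks.

    Returns (chunks, directory) where directory[i] is the index of the
    first node stored in chunk i.
    """
    chunks, directory = [], []
    current, first, size = [], 0, 0
    for index, props in enumerate(node_properties):
        current.append(props)
        size += len(json.dumps(props))
        if size > target:  # enough output accumulated: start a new chunk
            chunks.append(json.dumps(current).encode("utf-8"))
            directory.append(first)
            current, first, size = [], index + 1, 0
    if current:
        chunks.append(json.dumps(current).encode("utf-8"))
        directory.append(first)
    return chunks, directory


def load_properties(chunks, directory, node_index):
    """Find the chunk whose node range contains node_index and decode it."""
    chunk_no = bisect.bisect_right(directory, node_index) - 1
    nodes_in_chunk = json.loads(chunks[chunk_no].decode("utf-8"))
    return nodes_in_chunk[node_index - directory[chunk_no]]
```

Selecting a single object touches one chunk. A search that visits nodes in traversal order reads the chunks sequentially, so both access patterns are served by the same layout.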

Attributes can be shared between multiple nodes. At the time, Material and Transform attributes were involved in maintenance of the spatial graph. To keep things simple, only unshared Property attributes are serialized into the chunked property stream and loaded on demand. The remaining attributes are serialized with the rest of the logical scene graph.

Navisworks already had support for 3D DWF. However, a DWF file can contain multiple sheets, where each sheet is either a 3D model or a 2D drawing. Navisworks would only load the default 3D sheet. The Navisworks file formats could only store a single 3D model.

The first job was to add support for multiple sheets. We already had a multi-stream container format. So, all we had to do was add more streams for each additional sheet plus a stream that stored a list of sheets and the ids of the corresponding streams for each sheet.

Navisworks ignores the additional streams when first loading the file. If the user opens the sheets browser, Navisworks loads the list of sheets from the sheets stream. If the user selects another sheet to display, Navisworks loads the model from the corresponding streams and switches the currently active model to the new one. Previously loaded sheets are kept in memory, unless memory is running low and space needs to be reclaimed.
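The behaviour described here amounts to a lazily populated cache. The sketch below assumes a least-recently-used eviction policy and a fixed `max_loaded` limit as a stand-in for "memory is running low"; the text does not say how Navisworks decides what to reclaim. `SheetCache` and `load_model` are hypothetical names.

```python
from collections import OrderedDict


class SheetCache:
    """Load sheets on first use and keep recently used ones in memory."""

    def __init__(self, sheet_streams, load_model, max_loaded=2):
        self.sheet_streams = sheet_streams  # sheet name -> ids of its streams
        self.load_model = load_model        # callable that reads those streams
        self.max_loaded = max_loaded        # assumed proxy for memory pressure
        self.loaded = OrderedDict()         # least recently used sheet first
        self.active = None

    def activate(self, name):
        """Switch the active model to the named sheet, loading it if needed."""
        if name in self.loaded:
            self.loaded.move_to_end(name)
        else:
            self.loaded[name] = self.load_model(self.sheet_streams[name])
            while len(self.loaded) > self.max_loaded:
                self.loaded.popitem(last=False)  # reclaim the oldest sheet
        self.active = name
        return self.loaded[name]
```

Switching back to a sheet that is still in memory costs nothing. A sheet that was reclaimed is simply loaded again from its streams.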

We handled the strict render ordering requirement by using the Z coordinate to specify the required order. With the view locked down to an orthographic camera looking down the Z axis, we can let Navisworks do its prioritized rendering thing, with the depth buffer ensuring the rendered order looks correct.
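A toy illustration of why this works, using sets of pixels in place of real geometry. The convention that a larger Z is nearer the camera and the function names are assumptions made for the example.

```python
def assign_depths(draw_order, step=1.0):
    """Give each 2D item a Z value from its position in the required draw order.

    Items that must be drawn later (on top) get a larger Z, which this sketch
    treats as nearer to a camera looking down the Z axis.
    """
    return {name: index * step for index, name in enumerate(draw_order)}


def rasterize(shapes, depths, render_order):
    """Render items in any order; a depth buffer keeps the nearest per pixel.

    shapes: name -> set of (x, y) pixels the item covers.
    Returns a dict mapping each covered pixel to the visible item's name.
    """
    depth_buffer = {}
    frame = {}
    for name in render_order:
        z = depths[name]
        for pixel in shapes[name]:
            if pixel not in depth_buffer or z > depth_buffer[pixel]:
                depth_buffer[pixel] = z
                frame[pixel] = name
    return frame
```

The frame comes out the same whatever order the items are rendered in, which is what lets prioritized rendering reorder them freely.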

DWF files can contain sheets which are entirely unrelated, or more commonly, represent different views of the same model. Navisworks aggregation lets you combine an arbitrary set of sheets into a single file.

A common workflow in ADR (Autodesk Design Review) is to select an object in one sheet and then zoom into the same object on another sheet. To do that, you need identifiers which are unique across multiple sheets in a file, and ideally universally unique. DWF used UUIDs for that purpose.

In many cases, metadata is related to instances in the model. For example, a clash result includes the two instances that are clashing. In memory, each instance is represented by a pointer to an instance tree node. When serialized, the instance is represented by an integer id. During serialization of the logical scene graph, Navisworks assigns an incrementing integer id for each instance tree node in traversal order. The metadata streams have access to that mapping, so as metadata is serialized, the corresponding integer id for each instance can be written out.
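A small sketch of that mapping, using a pre-order traversal. The `TreeNode` class and the representation of a clash result as a pair of nodes are assumptions for illustration.

```python
class TreeNode:
    """Hypothetical instance tree node."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def assign_ids(root):
    """Number every node in pre-order traversal.

    Returns (ids, nodes): ids maps id(node) -> integer id for writing,
    nodes maps integer id -> node for reading.
    """
    ids, nodes = {}, []
    stack = [root]
    while stack:
        node = stack.pop()
        ids[id(node)] = len(nodes)
        nodes.append(node)
        stack.extend(reversed(node.children))
    return ids, nodes


def serialize_clash(clash, ids):
    """Write a clash result as the integer ids of the two clashing instances."""
    first, second = clash
    return ids[id(first)], ids[id(second)]


def deserialize_clash(pair, nodes):
    """Turn the stored integer ids back into references to tree nodes."""
    return nodes[pair[0]], nodes[pair[1]]
```

The ids are never stored on the nodes themselves. They are implied by the traversal order, so a reader that traverses the loaded tree in the same order recovers the same numbering.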