The Gst-nvtracker plugin allows the DeepStream pipeline to use a low-level tracker library to track detected objects persistently over time with unique IDs. It supports any low-level library that implements the NvDsTracker API, including the reference implementations provided by the NvMultiObjectTracker library: the IOU, NvSORT, NvDeepSORT, and NvDCF trackers. As part of this API, the plugin queries the low-level library for its capabilities and requirements concerning the input format, memory type, and additional feature support. Based on these queries, the plugin then converts the input frame buffers into the format requested by the low-level tracker library. For example, the NvDeepSORT and NvDCF trackers use NV12 or RGBA, while IOU and NvSORT require no video frame buffers at all.
The Gst-nvtracker plugin supports retrieval of user-defined miscellaneous data from the low-level tracker library through the NvMOT_RetrieveMiscData API. This data includes useful object tracking information beyond the default data for the current frame targets; for example, past-frame object data, targets in shadow tracking mode, full trajectories of terminated targets, and re-identification features. More details on the types of miscellaneous data and what they mean can be found in the Miscellaneous Data Output section. Users can define other types of miscellaneous data in NvMOTTrackerMiscData.
NvSORT: The NvSORT tracker is the NVIDIA-enhanced Simple Online and Realtime Tracking (SORT) algorithm. Instead of a simple bipartite matching algorithm, NvSORT uses a cascaded data association based on bounding box (bbox) proximity for associating bboxes over consecutive frames and applies a Kalman filter to update the target states. It is computationally efficient since it does not involve any pixel data processing.
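Bounding box proximity can be measured in several ways, and the description above does not state which metric NvSORT uses. As a minimal sketch, assuming intersection-over-union (IoU) as the metric, the following C helper scores how strongly two boxes overlap. `SketchBBox` and `sketch_bbox_iou` are hypothetical names used for illustration only; they are not part of the NvDsTracker API.

```c
/* Hypothetical axis-aligned box: top-left corner plus size, in pixels. */
typedef struct {
    float left;
    float top;
    float width;
    float height;
} SketchBBox;

static float sketch_min(float a, float b) { return a < b ? a : b; }
static float sketch_max(float a, float b) { return a > b ? a : b; }

/* Returns the IoU in [0, 1]; 0 when the boxes do not overlap
 * or when both boxes have zero area. */
static float sketch_bbox_iou(SketchBBox a, SketchBBox b)
{
    /* Corners of the intersection rectangle. */
    float x1 = sketch_max(a.left, b.left);
    float y1 = sketch_max(a.top, b.top);
    float x2 = sketch_min(a.left + a.width, b.left + b.width);
    float y2 = sketch_min(a.top + a.height, b.top + b.height);

    /* Clamp to zero so disjoint boxes yield an empty intersection. */
    float iw = sketch_max(0.0f, x2 - x1);
    float ih = sketch_max(0.0f, y2 - y1);
    float inter = iw * ih;
    float uni = a.width * a.height + b.width * b.height - inter;

    return uni > 0.0f ? inter / uni : 0.0f;
}
```

A score like this needs only box coordinates, which is consistent with the point that no pixel data processing is involved.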
NvDCF: The NvDCF tracker is an online multi-object tracker that employs a discriminative correlation filter for visual object tracking, which allows independent object tracking even when detection results are not available. It uses a combination of the correlation filter responses and bounding box proximity for data association.
The source code of the Gst-nvtracker plugin is provided as part of the DeepStream SDK package under sources/gst-plugins/gst-nvtracker/ when installed on a system. This allows users to make direct changes to the plugin whenever needed for their custom applications, and also shows how the low-level libraries are managed and how the metadata is handled in the plugin.
The Gst-nvtracker plugin works in the batch processing mode by default. In this mode, the input frame batch is passed to and processed by a single instance of the low-level tracker library. The advantage of the batch processing mode is that it allows GPUs to work on a larger amount of data at once, potentially increasing the GPU occupancy during execution and reducing the CUDA kernel launch overhead. Depending on the use case, however, the GPU could idle (also referred to as a GPU bubble) in some compute stages in the tracker unless the end-to-end operation within the module is carried out solely on the GPU. This is indeed the case if some of the compute modules in the tracker run on the CPU. If other components in the DeepStream pipeline use the GPU (e.g., GPU-based inference in PGIE and SGIE), such CPU blocks in the tracker can be hidden behind them, not affecting the overall throughput of the pipeline.
The newly-introduced Sub-batching feature allows the plugin to split the input frame batch into multiple sub-batches (for example, a four-stream pipeline can use two sub-batches in the tracker plugin, each of which takes care of two streams). Each sub-batch is assigned to a separate instance of the low-level tracker library, where the input to the corresponding sub-batch is processed separately. Each low-level tracker library instance runs independently on a dedicated thread, allowing parallel processing of sub-batches and minimizing the GPU idling due to CPU compute blocks, which eventually results in higher resource utilization.
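To make the four-stream example concrete, the following C sketch maps each stream index to a sub-batch by splitting the streams into contiguous groups. The helper `sketch_sub_batch_for_stream` and its even-split policy are assumptions made for illustration; the actual stream-to-sub-batch assignment is determined by the plugin configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the plugin: returns the sub-batch
 * index for a stream when streams are split into contiguous groups of
 * ceil(num_streams / num_sub_batches) streams each. Trailing groups
 * may be smaller when the split is uneven. */
static size_t sketch_sub_batch_for_stream(size_t stream_idx,
                                          size_t num_streams,
                                          size_t num_sub_batches)
{
    assert(num_sub_batches > 0 && num_sub_batches <= num_streams);
    assert(stream_idx < num_streams);

    size_t per_sub_batch = (num_streams + num_sub_batches - 1) / num_sub_batches;
    return stream_idx / per_sub_batch;
}
```

With four streams and two sub-batches, streams 0 and 1 map to sub-batch 0 and streams 2 and 3 map to sub-batch 1, so each tracker instance would handle two streams.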
Because sub-batching assigns separate low-level tracker library instances to different sub-batches, it allows the user to configure each sub-batch individually with a different low-level tracker library configuration file. This can be used in multiple ways, such as setting different compute backends across sub-batches, using different tracking algorithms across sub-batches, or modifying any other setting that is supported in the low-level tracker configuration file. More detailed example use cases are discussed in the Setup and Usage of Sub-batching (Alpha) section.
The color formats supported for the input video frame by the NvTracker plugin are NV12 and RGBA. A separate batch of video frames is created from the input video frames based on the color format and the resolution required by the low-level tracker library.
If the tracker algorithm does not generate a confidence value, the tracker confidence is set to the default value (i.e., 1.0) for tracked objects. For the IOU, NvSORT, and NvDeepSORT trackers, tracker_confidence is set to 1.0 because these algorithms do not generate confidence values for tracked objects. The NvDCF tracker, on the other hand, generates a confidence value for each tracked object thanks to its visual tracking capability, and this value is set in the tracker_confidence field of the NvDsObjectMeta structure.
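The fallback rule can be sketched in a few lines of C. The function and macro names below are hypothetical and only illustrate the rule described above; they do not reproduce the plugin's source code.

```c
#include <stdbool.h>

/* Default reported when the tracker produces no confidence of its own. */
#define SKETCH_DEFAULT_TRACKER_CONFIDENCE 1.0f

/* Hypothetical helper: picks the value that would be written to the
 * tracker_confidence field for one tracked object. */
static float sketch_resolve_tracker_confidence(bool has_confidence,
                                               float confidence)
{
    return has_confidence ? confidence : SKETCH_DEFAULT_TRACKER_CONFIDENCE;
}
```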
Supports splitting a batch of frames into sub-batches, which are internally processed in parallel, resulting in higher resource utilization. This feature also enables specification of a different config file for each sub-batch.
The plugin performs this query once during the initialization stage, and its results are applied to all contexts established with the low-level library. If a low-level library configuration file is specified, it is provided in the query for the library to consult. The query reply structure, NvMOTQuery, contains the following fields:
uint8_t numTransforms: The number of color formats required by the low-level library. The valid range for this field is 0 to NVMOT_MAX_TRANSFORMS. Set this to 0 if the library does not require any visual data.
NvBufSurfaceMemType memType: Memory type for the transform buffers. The plugin allocates buffers of this type to store color- and scale-converted frames, and the buffers are passed to the low-level library for each frame. The support is currently limited to the following types:
The context handle is opaque outside the low-level library. In the batch processing mode, the plugin requests a single context for all input streams. In per-stream processing mode, on the other hand, the plugin makes this call for each input stream so that each stream has its own context. This call includes a configuration request for the context. The low-level library has an opportunity to:
Review the configuration and create a context only if the request is accepted. If any part of the configuration request is rejected, no context is created, and the return status must be set to NvMOTStatus_Error. The pConfigResponse field can optionally contain status for specific configuration items.
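The accept-or-reject behavior can be sketched as follows. This is a simplified illustration under stated assumptions: `SketchConfig`, `SketchContext`, `SketchStatus`, `SKETCH_MAX_TRANSFORMS`, and `sketch_init` are hypothetical stand-ins, and the validation rules are invented for the example. They do not reproduce the NvDsTracker API types or signatures.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_TRANSFORMS 4u /* hypothetical limit for this sketch */

typedef enum { SKETCH_STATUS_OK, SKETCH_STATUS_ERROR } SketchStatus;

typedef struct {
    unsigned max_streams;    /* streams the context must handle */
    unsigned num_transforms; /* color formats requested */
} SketchConfig;

typedef struct {
    SketchConfig config;     /* accepted configuration */
} SketchContext;

/* Creates a context only if every part of the request is acceptable.
 * On rejection, no context is created and an error is returned. */
static SketchStatus sketch_init(const SketchConfig *cfg, SketchContext **ctx_out)
{
    *ctx_out = NULL;

    if (cfg == NULL || cfg->max_streams == 0 ||
        cfg->num_transforms > SKETCH_MAX_TRANSFORMS) {
        return SKETCH_STATUS_ERROR;
    }

    SketchContext *ctx = calloc(1, sizeof *ctx);
    if (ctx == NULL) {
        return SKETCH_STATUS_ERROR;
    }

    ctx->config = *cfg;
    *ctx_out = ctx;
    return SKETCH_STATUS_OK;
}
```

Clearing the output handle before validation keeps the caller from ever seeing a partially initialized context when the request is rejected.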
Once a context is initialized, the plugin sends frame data along with detected object bounding boxes to the low-level library whenever it receives such data from upstream. It always presents the data as a batch of frames, although the batch can contain only a single frame in per-stream processing contexts. Note that depending on the frame arrival timings to the tracker plugin, the composition of frame batches could either be a full batch (that contains a frame from every stream) or a partial batch (that contains a frame from only a subset of the streams). In either case, each batch is guaranteed to contain at most one frame from each stream.
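The batch guarantees above can be expressed as simple checks. The following C sketch uses a hypothetical `SketchFrame` type that carries only a stream ID; it is not the input structure defined by the NvDsTracker API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame entry holding only the stream it came from. */
typedef struct {
    uint64_t stream_id;
} SketchFrame;

/* True when no two frames in the batch share a stream ID. */
static bool sketch_batch_streams_unique(const SketchFrame *frames, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < i; ++j) {
            if (frames[i].stream_id == frames[j].stream_id) {
                return false;
            }
        }
    }
    return true;
}

/* True for a full batch: one frame from every stream. Any other valid
 * batch with fewer frames is a partial batch. */
static bool sketch_batch_is_full(const SketchFrame *frames, size_t n,
                                 size_t num_streams)
{
    return n == num_streams && sketch_batch_streams_unique(frames, n);
}
```

A low-level library could use a check like the first one as a debug assertion, since both full and partial batches must satisfy it.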
pParams is a pointer to the input batch of frames to process. The structure contains a list of one or more frames, with at most one frame from each stream. Thus, no two frame entries have the same streamID. Each entry of frame data contains a list of one or more buffers in the color formats required by the low-level library, as well as a list of object attribute data for the frame. Most libraries require at most one color format.
pTrackedObjectsBatch is a pointer to the output batch of object attribute data. It is pre-populated with a value for numFilled, which is the same as the number of frames included in the input parameters.
If a frame has no output object attribute data, it is still counted in numFilled and is represented with an empty list entry (NvMOTTrackedObjList). An empty list entry has the correct streamID set and numFilled set to 0.
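The empty-list convention can be sketched as follows. `SketchObjList` and `SketchObjBatch` are simplified, hypothetical stand-ins for the output structures; they keep only the stream ID and the fill counts discussed above.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t stream_id;   /* stream this list belongs to */
    uint32_t num_filled;  /* number of tracked objects in this list */
} SketchObjList;

typedef struct {
    SketchObjList *lists; /* one entry per input frame */
    uint32_t num_filled;  /* pre-populated with the input frame count */
} SketchObjBatch;

/* Marks every output list as empty while keeping the stream IDs in
 * step with the input frames. The batch-level count is left untouched,
 * so frames without tracked objects are still counted. */
static void sketch_reset_output_lists(const uint64_t *input_stream_ids,
                                      SketchObjBatch *out)
{
    for (uint32_t i = 0; i < out->num_filled; ++i) {
        out->lists[i].stream_id = input_stream_ids[i];
        out->lists[i].num_filled = 0;
    }
}
```

Starting from this state, a library would then fill in only the lists for frames that actually have tracked objects.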
The output object attribute data NvMOTTrackedObj contains a pointer to the detector object (provided in the input) that is associated with a tracked object, which is stored in associatedObjectIn. You must set this to the associated input object only for the frame where the input object is passed in. For a pipeline with PGIE interval=1, for example:
If a video stream source is removed on the fly, the plugin calls the following function so that the low-level tracker library can remove it as well. Note that this API is optional and valid only when the batch processing mode is enabled; it is executed only when the low-level tracker library actually implements it. If called, the low-level tracker library can release any per-stream resources that it may have allocated:
DeepStream SDK provides a single reference low-level tracker library, called NvMultiObjectTracker, that implements all four low-level tracking algorithms (i.e., IOU, NvSORT, NvDeepSORT, and NvDCF) in a unified architecture. It supports multi-stream, multi-object tracking in the batch processing mode for efficient processing on CPU and GPU (and PVA for Jetson). The following sections will cover the unified tracker architecture and the details of each reference tracker implementation.