Using trtexec to generate an engine file from an ONNX model fails with two RTSP input sources

Issue Overview

Users are encountering errors on the NVIDIA Jetson Orin Nano developer kit when using an engine file generated from an ONNX model with the trtexec command. The issue arises specifically when running applications that use multiple RTSP input sources.

Symptoms

  • Engine file generation succeeds, and the application runs correctly with a single RTSP input source.
  • When two or more RTSP input sources are used, warnings are emitted and creation of the inference context fails.

Context

  • Environment Specifications:
    • TensorRT Version: 8.5
    • GPU Type: Jetson Orin Nano (4GB)
    • CUDA Version: 11.4
    • CUDNN Version: 8.6.0
    • Operating System: Ubuntu 20.04
    • Python Version: 3.8.10
    • Baremetal or Container: Baremetal

Frequency and Impact

The issue appears consistently whenever multiple RTSP streams are used, preventing the application from functioning as intended.

Possible Causes

  • Hardware Limitations: The Jetson Orin Nano may not have sufficient resources (e.g., memory or processing power) to handle multiple streams simultaneously.

  • Model Configuration: The ONNX model may have a static batch size of 1, which conflicts with requests for higher batch sizes when multiple inputs are used (a quick way to confirm this is shown in the sketch after this list).

  • Software Bugs or Conflicts: Potential bugs in the DeepStream SDK or TensorRT that affect multi-stream processing.

  • Driver Issues: Incompatibilities between the installed drivers and the TensorRT version could lead to unexpected behavior.

  • Configuration Errors: Incorrect settings in configuration files (e.g., config_face_nvinfer.txt) that do not match the model’s requirements.

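A quick way to confirm whether the exported model really has a static batch dimension of 1 is to print the shapes of its graph inputs. This is a minimal sketch that assumes the onnx package is installed and that the model file is named face.onnx, as in the steps below:

  import onnx

  # Print the name and shape of every graph input; a leading dimension of 1
  # (rather than a symbolic name or -1) indicates a static batch size of 1.
  model = onnx.load("face.onnx")
  for inp in model.graph.input:
      dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
      print(inp.name, dims)
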
Troubleshooting Steps, Solutions & Fixes

Step-by-Step Instructions

  1. Verify Environment Setup:
    Ensure that all software components are correctly installed and compatible:

    • Check TensorRT, CUDA, and CUDNN versions.
    • Ensure that DeepStream SDK is properly configured.
  2. Modify the ONNX Model:
    If the ONNX model has a static batch size of 1, modify it to allow for higher batch sizes:

    • Install necessary dependencies:
      git clone https://github.com/NVIDIA/TensorRT.git
      cd TensorRT/tools/onnx-graphsurgeon/
      make build
      python3 -m pip install dist/onnx_graphsurgeon-*-py2.py3-none-any.whl
      pip3 install onnx
      
    • Use a script to rewrite the batch dimension (note: the Reshape fix assumes the shape input of each Reshape node is a constant whose first value is the batch dimension):
      import onnx
      import onnx_graphsurgeon as gs
      
      batch = 2
      
      # Load the graph and set the batch dimension of every graph input
      graph = gs.import_onnx(onnx.load("face.onnx"))
      
      for input in graph.inputs:
          input.shape[0] = batch
      
      # Also update any hard-coded batch dimension in Reshape nodes: the second
      # input of a Reshape is its target shape, whose first value is the batch
      reshape_nodes = [node for node in graph.nodes if node.op == "Reshape"]
      for node in reshape_nodes:
          node.inputs[1].values[0] = batch
      
      onnx.save(gs.export_onnx(graph), "dynamic.onnx")
      
    • Create a new TensorRT engine:
      /usr/src/tensorrt/bin/trtexec --onnx=dynamic.onnx --saveEngine=face1.engine
      
  3. Update Configuration Files:
    Modify config_face_nvinfer.txt to set the correct batch size:

    batch-size=2
    
  4. Run Tests:
    Test with multiple input sources using the updated engine:

    python3 deepstream_imagedata-multistream.py \
    file:///opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-imagedata-multistream-test/darkface2.mp4 \
    file:///opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-imagedata-multistream-test/darkface2.mp4 frames/
    
  5. Check for Errors:
    Monitor the logs for warnings or errors related to engine creation and inference context initialization; a sketch for confirming that the rebuilt engine actually reports a batch dimension of 2 follows this list.
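
The following is a minimal sketch, assuming the TensorRT Python bindings shipped with JetPack are available and the rebuilt engine was saved as face1.engine, for checking that the engine's input binding now reports a batch dimension of 2 before wiring it into DeepStream:

  import tensorrt as trt

  # Deserialize the engine and print every binding name and shape; the first
  # dimension of the input binding should now be 2 rather than 1.
  logger = trt.Logger(trt.Logger.WARNING)
  runtime = trt.Runtime(logger)
  with open("face1.engine", "rb") as f:
      engine = runtime.deserialize_cuda_engine(f.read())

  for i in range(engine.num_bindings):
      print(engine.get_binding_name(i), engine.get_binding_shape(i))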

Recommended Fixes

  • Utilize the --batch flag while generating the engine file if applicable (it only applies to implicit-batch networks; explicit-batch ONNX models use the shape flags instead).

  • Ensure that --minShapes, --optShapes, --maxShapes, --shapes, and other flags are used correctly according to the TensorRT documentation; an example command is shown after this list.

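If the model is instead exported with a dynamic batch dimension (for example -1 or a named dimension), the engine can be built with an optimization profile. The command below is only a sketch: the input tensor name (input) and dimensions (3x640x640) are placeholders and must be replaced with the values the actual model reports.

  /usr/src/tensorrt/bin/trtexec --onnx=dynamic.onnx \
      --minShapes=input:1x3x640x640 \
      --optShapes=input:2x3x640x640 \
      --maxShapes=input:2x3x640x640 \
      --saveEngine=face1.engine
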
Best Practices

  • Regularly update all software components (TensorRT, CUDA, CUDNN) to their latest versions.

  • Test configurations with different models and input scenarios to isolate issues effectively.

Unresolved Aspects

Further investigation may be needed regarding specific model configurations or potential bugs within TensorRT or DeepStream SDK that could lead to these issues when handling multiple streams.
