Using TensorRT with ONNX Models in Jetson Inference
Issue Overview
Users are experiencing confusion regarding the use of ONNX models with TensorRT in the Jetson Inference framework. Specifically, the main concerns are:
- Understanding if the ONNX models exported by Jetson Inference are compatible with TensorRT.
- Comparing performance between different model formats (ONNX from Jetson Inference vs. TRT from Darknet).
- Clarifying whether Jetson Inference’s ONNX models inherently include TensorRT optimization.
This issue affects users who are training custom models (like SSD-MobileNet v2) using Jetson Inference and want to compare their performance with other frameworks like Darknet’s YOLOv4-tiny.
Possible Causes
- Lack of clear documentation on the relationship between ONNX and TensorRT in Jetson Inference.
- Confusion about file formats (.onnx vs .trt) and their implications for model optimization.
- Misunderstanding of how Jetson Inference handles ONNX models internally.
- Uncertainty about the compatibility of different model architectures (SSD vs. YOLO) within the Jetson Inference framework.
Troubleshooting Steps, Solutions & Fixes
- Understanding ONNX and TensorRT in Jetson Inference:
  - Jetson Inference uses TensorRT as its runtime for ONNX models.
  - The first time an ONNX model is loaded, Jetson Inference parses it and builds a TensorRT engine, which can take several minutes.
  - The optimized engine is cached to disk as a TensorRT engine file (with a .engine extension), so subsequent loads skip the build step.
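As an illustration, here is a minimal Python sketch of loading a custom SSD-MobileNet ONNX model through detectNet. The model and label paths are placeholders, and the keyword-argument form shown here follows recent jetson-inference releases; older releases take the equivalent --model/--labels/--input-blob/--output-cvg/--output-bbox options as an argv list instead.

```python
# Minimal sketch: loading a custom ONNX SSD model with jetson-inference.
# Paths and label file are placeholders -- adjust them to your own model.
from jetson_inference import detectNet

# On the first run, jetson-inference parses the ONNX file and builds a
# TensorRT engine (this can take several minutes). The engine is cached
# to disk, so later runs load in seconds.
net = detectNet(model="models/my-ssd/ssd-mobilenet.onnx",
                labels="models/my-ssd/labels.txt",
                input_blob="input_0",
                output_cvg="scores",
                output_bbox="boxes",
                threshold=0.5)
```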
- File Format Clarification:
  - .onnx: the framework-neutral format the trained model is initially exported to.
  - .engine: the TensorRT-optimized engine that Jetson Inference builds and caches from the .onnx file.
  - .trt: a serialized TensorRT engine produced by other toolchains (such as Darknet conversion scripts); it is typically the same kind of artifact as a .engine file under a different extension.
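If you want to confirm that the engine has been cached, jetson-inference writes it alongside the .onnx model, with a file name that typically encodes the TensorRT version, target device, and precision. A small sketch, with a placeholder model directory:

```python
# Sketch: list any cached TensorRT engines next to the ONNX model.
# The directory is a placeholder; point it at wherever your .onnx file lives.
from pathlib import Path

model_dir = Path("models/my-ssd")
for engine in sorted(model_dir.glob("*.engine")):
    size_mb = engine.stat().st_size / (1024 * 1024)
    print(f"{engine.name}  ({size_mb:.1f} MB)")
```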
- Using ONNX Models with Jetson Inference:
  - Export your trained model to ONNX format.
  - Load the ONNX model with Jetson Inference as you would any built-in model.
  - Jetson Inference automatically builds and caches a TensorRT engine for it on first load, so no separate conversion step is required; a short inference example follows below.
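Once the network is loaded (as in the earlier sketch), running inference is the same whether the engine was just built or loaded from cache. A short sketch using a placeholder test image; note that older releases of the library name these helpers loadImageRGBA/saveImageRGBA rather than loadImage/saveImage.

```python
# Sketch: run a detection pass on a single image with the net loaded above.
# Image paths are placeholders; jetson_utils loads the image into GPU memory.
from jetson_utils import loadImage, saveImage

img = loadImage("test.jpg")
detections = net.Detect(img)          # runs the cached TensorRT engine

for d in detections:
    print(f"class {d.ClassID}  confidence {d.Confidence:.2f}  "
          f"bbox ({d.Left:.0f}, {d.Top:.0f}, {d.Right:.0f}, {d.Bottom:.0f})")

saveImage("result.jpg", img)          # default overlay draws boxes onto img
```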
- Comparing Different Model Architectures:
  - It is reasonable to compare the performance of SSD-MobileNet v2 running under Jetson Inference with YOLOv4-tiny running from a Darknet-derived TensorRT engine.
  - Since both models execute through TensorRT-optimized engines, the runtime comparison is fair.
- Limitations and Considerations:
  - Jetson Inference's built-in pre/post-processing is written for SSD-style networks and is not set up for YOLO models.
  - To run a YOLO model, you will need separate code for YOLO's specific pre-processing (e.g. letterbox resizing) and post-processing (decoding grid outputs and applying non-maximum suppression); see the pre-processing sketch below.
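To give a concrete idea of what "separate pre-processing" means, below is a hedged sketch of the letterbox resize commonly used for YOLO-family inputs. The 416x416 input size, padding value, and [0, 1] normalization are assumptions that must match how your particular YOLOv4-tiny engine was exported, and the post-processing side (decoding predictions and non-maximum suppression) would likewise need its own code.

```python
# Sketch: letterbox pre-processing typically used for YOLO-family models.
# Input size (416x416), pad value (114), and scaling to [0, 1] are assumptions;
# match them to however your YOLOv4-tiny engine was actually exported.
import cv2
import numpy as np

def letterbox(image, size=416, pad_value=114):
    h, w = image.shape[:2]
    scale = min(size / h, size / w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    resized = cv2.resize(image, (new_w, new_h))

    # Pad the shorter side so the aspect ratio is preserved.
    canvas = np.full((size, size, 3), pad_value, dtype=np.uint8)
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized

    # HWC uint8 BGR -> CHW float32 RGB in [0, 1], with a batch dimension.
    blob = canvas[:, :, ::-1].transpose(2, 0, 1).astype(np.float32) / 255.0
    return np.expand_dims(blob, axis=0), scale, (left, top)
```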
- Best Practices for Performance Comparison:
  - Ensure both models (SSD and YOLO) run through TensorRT-optimized engines built at the same precision (e.g. FP16 for both).
  - Use the same input data and the same evaluation metrics for both models.
  - Compare inference time averaged over many iterations after a warm-up, and weigh accuracy (e.g. mAP) and resource usage (GPU/CPU load, memory) alongside raw throughput; a simple timing sketch follows below.
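One way to keep the timing apples-to-apples is to measure both models identically: warm up, then average wall-clock time over many iterations on the same input. The sketch below times the detectNet model loaded earlier; the YOLO engine would be wrapped in the same warm-up and timed loop around its own inference call. The image path and iteration counts are placeholders.

```python
# Sketch: average per-frame latency for the detectNet model loaded earlier.
# 'net' is the detectNet object from the loading sketch above; the same
# warm-up + timed-loop pattern should wrap the YOLO engine's inference call
# so both models are measured identically.
import time
from jetson_utils import loadImage

img = loadImage("test.jpg")        # placeholder test image

# Warm-up iterations let clocks ramp up and allocations settle.
for _ in range(10):
    net.Detect(img)

iterations = 200
start = time.perf_counter()
for _ in range(iterations):
    net.Detect(img)
elapsed = time.perf_counter() - start

print(f"mean latency: {1000 * elapsed / iterations:.2f} ms "
      f"({iterations / elapsed:.1f} FPS)")
```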
- For Further Assistance:
  - Consult the Jetson Inference documentation for detailed information on model compatibility and optimization.
  - If issues persist, reach out to the Jetson developer community or NVIDIA support for more specific guidance.
By following these steps and understanding the relationship between ONNX and TensorRT in Jetson Inference, users can effectively work with their custom models and make accurate performance comparisons across different architectures.