INT8 Calibration Reduces Accuracy of PyTorch MNIST Model on Jetson Orin Nano
Issue Overview
Users have reported a significant drop in accuracy when using INT8 calibration for the PyTorch MNIST model on the Nvidia Jetson Orin Nano Dev board. Specifically, the accuracy plummets to less than 10%, with one user noting that out of 10,000 inferences, only 916 were correct, resulting in an accuracy of 9.16%. In contrast, using FP32 or FP16 data types yields over 97% accuracy.
The issue arises during the inference phase after modifying the sample code at /usr/src/tensorrt/samples/python/network_api_pytorch_mnist/sample.py to support INT8 calibration, using the calibrator from /usr/src/tensorrt/samples/python/int8_caffe_mnist/ as a reference. Users have confirmed that they generated a fresh INT8 calibration cache but still saw the accuracy drop. The reported JetPack version is 4.6.1, and the hardware is identified in the report as a Jetson Nano.
The impact of this problem is substantial for users who rely on accurate model predictions, particularly in academic or research settings.
Possible Causes
- Calibration Cache Issues: If the calibration cache is not generated correctly or does not align with the model architecture, it may lead to poor performance. (A quick way to inspect the cache is sketched after this list.)
- Model Architecture Differences: The PyTorch and Caffe models may have different architectures that affect how INT8 calibration is applied.
- Configuration Errors: Incorrect modifications in the sample code could lead to improper handling of weights and layers during inference.
- Driver or Software Bugs: There may be bugs in the software stack or driver that affect INT8 processing.
- Environmental Factors: Power supply issues or temperature variations could impact performance during inference.
- User Errors: Misconfigurations or incorrect data handling when generating calibration caches may lead to reduced accuracy.
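One quick check for the first cause: the calibration cache TensorRT writes is a small text file. A minimal sketch for eyeballing it, assuming the cache file name used in step 1 below; the decoding reflects the plain-text format observed in TensorRT 8.x caches (a "TRT-...-EntropyCalibration2" header followed by "tensor name: hex-encoded scale" lines) and may differ in other releases:

```python
# Minimal sketch: dump the per-tensor scales stored in an INT8 calibration cache.
# Assumes the cache name from step 1 and the TensorRT 8.x plain-text format.
import struct

with open("mnist_calibration.cache") as f:
    print("header:", f.readline().strip())  # should name EntropyCalibration2
    for line in f:
        name, hex_scale = line.strip().rsplit(":", 1)
        # Each scale is the big-endian hex of an IEEE-754 float32.
        scale = struct.unpack(">f", bytes.fromhex(hex_scale.strip().zfill(8)))[0]
        print(f"{name}: scale = {scale:.6g}")
```

A scale of 0, or an output tensor missing entirely, would point to a cache that was never properly populated.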
Troubleshooting Steps, Solutions & Fixes
- Verify Calibration Cache Generation
  - Ensure that a new INT8 calibration cache is generated specifically for the PyTorch MNIST model.
  - Use the following lines in your modified sample.py to generate and save the cache (a sketch of the calibrator class itself follows this step):

```python
calibration_cache = "mnist_calibration.cache"
calib = MNISTEntropyCalibrator(train_set, cache_file=calibration_cache)
```
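For reference, here is a minimal sketch of what MNISTEntropyCalibrator looks like, modeled on the calibrator.py shipped with the int8_caffe_mnist sample. The method names follow the real trt.IInt8EntropyCalibrator2 interface; the handling of train_set (a NumPy array of normalized 1x28x28 images) is an assumption:

```python
import os

import numpy as np
import pycuda.autoinit  # noqa: F401  (initializes the CUDA context)
import pycuda.driver as cuda
import tensorrt as trt


class MNISTEntropyCalibrator(trt.IInt8EntropyCalibrator2):
    def __init__(self, train_set, cache_file, batch_size=32):
        # A custom constructor on a TensorRT class must call the parent's
        # constructor explicitly.
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.cache_file = cache_file
        self.data = train_set.astype(np.float32)  # assumed shape: (N, 1, 28, 28)
        self.batch_size = batch_size
        self.index = 0
        # One device buffer, reused for every calibration batch.
        self.device_input = cuda.mem_alloc(self.data[0].nbytes * self.batch_size)

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        if self.index + self.batch_size > self.data.shape[0]:
            return None  # tells TensorRT the calibration data is exhausted
        batch = np.ascontiguousarray(
            self.data[self.index : self.index + self.batch_size]
        )
        cuda.memcpy_htod(self.device_input, batch)
        self.index += self.batch_size
        return [int(self.device_input)]

    def read_calibration_cache(self):
        # If a cache exists, reuse it instead of calibrating again.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```

Crucially, the images fed through get_batch() must use the same preprocessing (scaling, normalization, layout) as the images used at inference time; a mismatch here is a classic cause of the near-random accuracy described above.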
- Check Model Architecture Compatibility
  - Confirm that you are using a calibration cache generated specifically for the PyTorch MNIST model rather than one from a different architecture (e.g., Caffe).
- Review Code Modifications
  - Ensure that the modifications made to sample.py are correct and consistent with how weights are assigned across all layers. Pay special attention to the populate_network() function (an abridged sketch follows this step).
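For orientation, an abridged sketch of populate_network(), following the structure of the stock network_api_pytorch_mnist sample: every layer must receive the matching key from the trained PyTorch state_dict, converted to NumPy. The API shown matches the TensorRT 8.x-era sample; newer releases replace add_fully_connected with matrix-multiply layers. The wiring between the layers shown is elided and marked as such:

```python
def populate_network(network, weights):
    # Input shape here is illustrative; the stock sample takes it from
    # ModelData.INPUT_SHAPE, and batch handling differs across releases.
    input_tensor = network.add_input(
        name="data", dtype=trt.float32, shape=(1, 1, 28, 28)
    )

    # conv1: 20 output maps, 5x5 kernel. The state_dict key names must match
    # exactly, and each tensor must be converted with .numpy().
    conv1 = network.add_convolution_nd(
        input=input_tensor,
        num_output_maps=20,
        kernel_shape=(5, 5),
        kernel=weights["conv1.weight"].numpy(),
        bias=weights["conv1.bias"].numpy(),
    )
    conv1.stride_nd = (1, 1)

    # ... pooling, conv2, fc1, and ReLU layers as in the original sample ...

    # Final fully connected layer produces the 10 class scores.
    fc2 = network.add_fully_connected(
        input=conv1.get_output(0),  # placeholder wiring; the real sample chains fc1's output here
        num_outputs=10,
        kernel=weights["fc2.weight"].numpy(),
        bias=weights["fc2.bias"].numpy(),
    )
    fc2.get_output(0).name = "prob"
    network.mark_output(tensor=fc2.get_output(0))
```

A single swapped key (e.g., fc1 weights assigned to fc2) still builds a valid engine but yields near-random predictions, which matches the symptom reported here.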
- Run Calibration and Validation Steps
  - Follow these commands to set up your environment correctly:

```bash
cd /usr/src/tensorrt/data/mnist
sudo wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
sudo wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
sudo gzip -dk t10k-images-idx3-ubyte.gz
sudo gzip -dk train-images-idx3-ubyte.gz
```

  Note that the matching label files (train-labels-idx1-ubyte.gz and t10k-labels-idx1-ubyte.gz, from the same location) are also needed to score accuracy.
  - Install the necessary dependencies:

```bash
sudo apt install python3-pip libboost-all-dev
export CPATH=$CPATH:/usr/local/cuda-11.4/targets/aarch64-linux/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-11.4/targets/aarch64-linux/lib
pip3 install --user pycuda numpy requests pillow
```

  A quick integrity check for the downloaded image files is sketched after this step.
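To rule out corrupt downloads, the idx3-ubyte files can be checked directly: the format is a 16-byte big-endian header (magic 2051, image count, rows, cols) followed by raw uint8 pixels. A minimal sketch, with load_idx3 as a hypothetical helper name:

```python
import struct

import numpy as np


def load_idx3(path):
    """Load an MNIST idx3-ubyte image file into a (count, rows, cols) array."""
    with open(path, "rb") as f:
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
        assert magic == 2051, f"unexpected magic {magic} in {path}"
        return np.frombuffer(f.read(), dtype=np.uint8).reshape(count, rows, cols)


test = load_idx3("/usr/src/tensorrt/data/mnist/t10k-images-idx3-ubyte")
print(test.shape)  # expected: (10000, 28, 28)
```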
- Test with Different Batch Sizes
  - Experiment with different batch sizes during both calibration and inference to see if accuracy improves (an illustrative loop follows this step).
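The calibration batch size is fixed when the calibrator is constructed, so each experiment needs its own calibrator and, importantly, its own cache file, or TensorRT silently reuses the previous run's scales. An illustrative loop, using the MNISTEntropyCalibrator sketch above and build_int8_engine as patched in the next step:

```python
# Illustrative sweep over calibration batch sizes; the file-name scheme is an
# assumption. A fresh cache per run prevents stale scales from being reused.
for bs in (1, 8, 32, 64):
    cache = f"mnist_calibration_bs{bs}.cache"
    calib = MNISTEntropyCalibrator(train_set, cache_file=cache, batch_size=bs)
    engine = build_int8_engine(weights, calib, batch_size=bs)
    # ... run the accuracy test from step 7 against this engine ...
```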
- Use Recommended Sample Code
  - Consider copying the calibrator.py file from int8_caffe_mnist into your working directory and applying the suggested patch to sample.py (a fuller sketch of the resulting build function follows this step):

```diff
diff --git a/samples/python/network_api_pytorch_mnist/sample.py b/samples/python/network_api_pytorch_mnist/sample.py
index e5e95de2..3a5d47f8 100644
--- a/samples/python/network_api_pytorch_mnist/sample.py
+++ b/samples/python/network_api_pytorch_mnist/sample.py
@@ -24,9 +24,12 @@
 import numpy as np
 import pycuda.autoinit
 import tensorrt as trt
+from calibrator import load_mnist_data, load_mnist_labels, MNISTEntropyCalibrator

 sys.path.insert(1, os.path.join(sys.path[0], ".."))
 import common

 # You can set the logger severity higher to suppress messages (or lower to display more messages).
 TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

 def build_int8_engine(weights, calib, batch_size=32):
     # ...
+    config.set_flag(trt.BuilderFlag.INT8)
+    config.int8_calibrator = calib
     # ...
```
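A fuller sketch of what build_int8_engine() might look like after the patch. The elisions ("# ...") in the diff above are filled in here with the helpers from the samples' common.py; details such as network-creation flags and the deprecated build_engine/max_workspace_size calls follow the TensorRT 8.x-era samples and may need adjusting for other releases:

```python
def build_int8_engine(weights, calib, batch_size=32):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(common.EXPLICIT_BATCH)
    config = builder.create_builder_config()
    config.max_workspace_size = common.GiB(1)
    # The two lines added by the patch: enable INT8 and attach the calibrator.
    config.set_flag(trt.BuilderFlag.INT8)
    config.int8_calibrator = calib
    populate_network(network, weights)
    # Calibration runs as part of the engine build itself.
    return builder.build_engine(network, config)
```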
- Conduct Inference Tests
  - After implementing the changes and verifying the configuration, run the inference tests again (a sketch of the accuracy count over the 10,000 test images follows this step):

```bash
python3 sample.py
```
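A hedged sketch of the accuracy check behind the 916/10,000 figure: run the engine over every test image and count argmax matches. It assumes test_images is loaded as in the step 4 integrity check, test_labels comes from the t10k-labels-idx1-ubyte file noted there, and uses the allocate_buffers/do_inference_v2 helpers from the samples' common.py:

```python
correct = 0
with engine.create_execution_context() as context:
    inputs, outputs, bindings, stream = common.allocate_buffers(engine)
    for img, label in zip(test_images, test_labels):
        # Preprocessing here must mirror what the calibrator saw.
        np.copyto(inputs[0].host, img.ravel().astype(np.float32) / 255.0)
        [output] = common.do_inference_v2(
            context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream
        )
        correct += int(np.argmax(output) == label)
print(f"accuracy: {correct / len(test_labels):.2%}")
```

Around 10% accuracy from this loop means the INT8 engine is effectively guessing among the 10 digit classes, which is why a calibration or weight-assignment fault is the prime suspect rather than ordinary quantization loss.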
- Monitor Logs for Errors
  - Pay attention to any warnings or errors logged during execution that may indicate underlying issues with configuration or data handling; raising the logger verbosity, as shown below, helps.
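The sample creates its logger at module scope with severity WARNING; switching it to VERBOSE (or INFO) makes TensorRT print calibration progress and precision decisions that are otherwise hidden:

```python
# More verbose logging; the sample's default is trt.Logger.WARNING.
TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
```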
- Seek Community Support
  - If problems persist, consider sharing your modified files and results with community forums for further assistance.
By following these steps and recommendations, users should be able to diagnose and potentially resolve issues related to INT8 calibration affecting accuracy on the Nvidia Jetson Orin Nano Dev board.