Nvidia Jetson Orin Nano Dev Board: PyTorch 2.1.0 Installed Successfully, but Torchvision Installation Fails with CUDA Version Mismatch Error
Issue Overview
Users are experiencing difficulties installing PyTorch and Torchvision on the Nvidia Jetson Orin Nano Dev Board with Jetpack 6.0 DP. While PyTorch 2.1.0 installs successfully, attempts to install Torchvision result in a CUDA version mismatch error. The error message indicates that the detected CUDA version (11.5) does not match the version used to compile PyTorch (12.2), even though the Jetpack 6.0 DP image only includes CUDA 12.2.
The issue occurs in various contexts, including:
- Building the "Hello AI World" project from source and then running the install-pytorch.sh script
- Installing PyTorch and Torchvision using pip wheels
- Modifying CMake configuration files to resolve the mismatch
Users have verified that CUDA 12.2 is correctly installed on their systems by checking the /usr/local directory. However, the CUDA version mismatch error persists during Torchvision installation, preventing users from completing the setup and utilizing transfer learning features.
Possible Causes
- Incorrect CUDA version detection: The PyTorch and Torchvision installation processes may be incorrectly detecting an older CUDA version (11.5) instead of the installed CUDA 12.2. This could be due to:
  - Misconfigured environment variables or paths
  - Conflicting CUDA installations or remnants of previous versions
  - Issues with the Jetpack 6.0 DP image or the installation process
- Incompatibility between PyTorch and Torchvision versions: The specific versions of PyTorch (2.1.0) and Torchvision being installed may be incompatible with each other, or may need to be built against the same CUDA version. The error message itself reports that PyTorch was compiled with CUDA 12.2, while the toolkit detected during the Torchvision installation is CUDA 11.5.
- CMake configuration issues: The CMake build files for the "Hello AI World" project or other dependencies may have hardcoded references to CUDA 11.5, causing conflicts with the installed CUDA 12.2 version.
- Jetpack 6.0 DP image inconsistencies: There might be inconsistencies or bugs in the Jetpack 6.0 DP image that are causing the CUDA version mismatch, even though CUDA 12.2 is supposed to be included.
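One way to test the "conflicting CUDA installations" cause is to list every nvcc executable reachable through PATH, since a second toolkit earlier in the lookup order would be one explanation for a detected version of 11.5. The following is a minimal sketch; the helper name find_nvcc_on_path is hypothetical and not part of any NVIDIA or PyTorch tooling.

```python
import os
from pathlib import Path


def find_nvcc_on_path(path_value=None):
    """Return (location, resolved_target) for every nvcc on PATH, in lookup order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    found = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue
        candidate = Path(entry) / "nvcc"
        if candidate.is_file() and os.access(candidate, os.X_OK):
            # Resolve symlinks so it is visible which toolkit each entry points at.
            found.append((str(candidate), str(candidate.resolve())))
    return found


if __name__ == "__main__":
    for location, target in find_nvcc_on_path():
        print(f"{location} -> {target}")
```

If more than one entry is printed, the first one is the compiler a plain nvcc call would use, which may not be the CUDA 12.2 copy under /usr/local.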
Troubleshooting Steps, Solutions & Fixes
- Verify CUDA installation:
  - Check the contents of the /usr/local directory to ensure that CUDA 12.2 is properly installed and symlinked.
  - Run nvcc --version to verify the installed CUDA version reported by the NVIDIA CUDA Compiler.
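The nvcc check above can be put side by side with the CUDA version PyTorch reports for itself. This is a minimal sketch: the helper names are hypothetical, and the regular expression assumes the usual "release X.Y" wording in the nvcc --version output. torch.version.cuda is the attribute PyTorch exposes for the CUDA version it was built with.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def detected_nvcc_release():
    """Run the first nvcc found on PATH; return its release, or None if absent."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("nvcc release:", detected_nvcc_release())
    try:
        import torch
        print("PyTorch built with CUDA:", torch.version.cuda)
    except ImportError:
        print("PyTorch is not importable in this interpreter")
```

If the two values differ, the nvcc being picked up is not the toolkit PyTorch was compiled against, which matches the situation described by the error.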
- Clean up previous CUDA installations:
  - Remove any remnants of previous CUDA versions that may be conflicting with CUDA 12.2.
  - Ensure that the PATH, LD_LIBRARY_PATH, and other relevant environment variables are correctly set to point to the CUDA 12.2 installation.
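The environment-variable step can be sketched as a helper that builds an environment for a single build command, rather than editing shell profiles. This is an illustration under assumptions: cuda_build_env is a hypothetical name, the directory /usr/local/cuda-12.2 and its bin and lib64 subdirectories are assumed and should be checked on the actual board, and CUDA_HOME is included because PyTorch's extension build helpers consult it when locating the toolkit.

```python
import os


def cuda_build_env(cuda_home, base_env=None):
    """Return a copy of the environment with cuda_home's bin and lib64 first."""
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = os.path.join(cuda_home, "bin")      # assumed toolkit layout
    lib_dir = os.path.join(cuda_home, "lib64")    # assumed toolkit layout

    def prepend(value, current):
        # Drop empty entries and any earlier copy of value, then put it first.
        parts = [p for p in current.split(os.pathsep) if p and p != value]
        return os.pathsep.join([value] + parts)

    env["CUDA_HOME"] = cuda_home
    env["PATH"] = prepend(bin_dir, env.get("PATH", ""))
    env["LD_LIBRARY_PATH"] = prepend(lib_dir, env.get("LD_LIBRARY_PATH", ""))
    return env


if __name__ == "__main__":
    env = cuda_build_env("/usr/local/cuda-12.2")  # assumed install directory
    for key in ("CUDA_HOME", "PATH", "LD_LIBRARY_PATH"):
        print(f"{key}={env[key]}")
```

The returned dictionary can be passed as the env argument of subprocess.run when launching the Torchvision build, so the change applies to that one process only.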
- Use compatible PyTorch and Torchvision versions:
  - Experiment with different versions of PyTorch and Torchvision that are known to be compatible with each other and with CUDA 12.2.
  - Consider using the PyTorch and Torchvision versions provided in the official NVIDIA NGC containers for Jetpack 6.0 DP.
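To make "known to be compatible" concrete, the sketch below encodes a few PyTorch and Torchvision release pairings as published in the Torchvision project's compatibility table. The table is deliberately partial, the function name is hypothetical, and the local-version string in the example is a made-up placeholder for a vendor build suffix. Any pairing should be confirmed against the current Torchvision documentation before building.

```python
# Partial pairing of PyTorch release series to Torchvision release series.
TORCHVISION_FOR_TORCH = {
    "1.13": "0.14",
    "2.0": "0.15",
    "2.1": "0.16",
    "2.2": "0.17",
}


def matching_torchvision(torch_version):
    """Return the Torchvision series paired with a PyTorch version, or None."""
    # Strip any local build suffix after '+', then keep only 'major.minor'.
    public_part = torch_version.split("+")[0]
    major_minor = ".".join(public_part.split(".")[:2])
    return TORCHVISION_FOR_TORCH.get(major_minor)


if __name__ == "__main__":
    for version in ("2.1.0", "2.1.0a0+abc1234.nvXX"):  # second value is hypothetical
        print(version, "->", matching_torchvision(version))
```

For the PyTorch 2.1.0 build discussed here, this lookup points at the Torchvision 0.16 series.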
- Modify CMake configuration files:
  - Inspect the CMake configuration files for the "Hello AI World" project and any other relevant dependencies.
  - Look for hardcoded references to CUDA 11.5 and update them to match the installed CUDA 12.2 version.
  - Rebuild the project after making the necessary modifications.
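The search for hardcoded CUDA 11.5 references can be automated with a short scan of the source tree. This is a minimal sketch with a hypothetical function name; it only looks at files named CMakeLists.txt or ending in .cmake, and a hit is a candidate for manual review rather than proof of a misconfiguration.

```python
import re
from pathlib import Path


def find_cuda_version_references(root, version="11.5"):
    """List (path, line_number, line) for CMake files mentioning the version."""
    # Avoid matching inside longer numbers such as 111.5 or 11.50.
    pattern = re.compile(r"(?<![\d.])" + re.escape(version) + r"(?!\d)")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.name != "CMakeLists.txt" and path.suffix != ".cmake":
            continue
        text = path.read_text(errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    for path, number, line in find_cuda_version_references("."):
        print(f"{path}:{number}: {line}")
```

Running it from the root of the project checkout prints each file, line number, and line that mentions the version string.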
- Use NVIDIA containers as a workaround:
  - If the above steps do not resolve the issue, consider using the NVIDIA NGC containers for Jetpack 6.0 DP as a temporary workaround.
  - The containers come with pre-installed and compatible versions of PyTorch, Torchvision, and other dependencies.
  - Follow the container usage instructions provided by NVIDIA to run your project within the container environment.
- Report the issue to NVIDIA:
  - If none of the above steps resolve the CUDA version mismatch error, consider reporting the issue to NVIDIA through their official support channels.
  - Provide detailed information about your setup, the steps you have taken, and the specific error messages encountered.
  - Collaborate with NVIDIA’s support team to identify any potential bugs or incompatibilities in the Jetpack 6.0 DP image or the PyTorch/Torchvision installation process.
By following these troubleshooting steps and exploring the suggested solutions, users may be able to overcome the CUDA version mismatch error and successfully install PyTorch and Torchvision on their Nvidia Jetson Orin Nano Dev Board with Jetpack 6.0 DP. If the issue persists, users may need to rely on NVIDIA containers as a temporary workaround while seeking further assistance from the NVIDIA community and support channels.