CUDA 12.4 on Jetson Orin – CUDA driver/runtime API version mismatch
Issue Overview
Users are experiencing a CUDA driver/runtime API version mismatch on the Nvidia Jetson Orin Nano Dev board. The primary symptoms include:
-
Error Messages: Users encounter runtime errors, specifically "RuntimeError: GET was unable to find an engine to execute this computation", indicating a mismatch between the CUDA version used by PyTorch (12.4) and the CUDA version supported by the driver (12.2).
-
Context of Occurrence: This issue arises during the setup of PyTorch 2.4.0, whose pre-built wheels target CUDA 12.4 but fail to run in the JetPack 6 environment, which only supports up to CUDA 12.2.
-
System Specifications: The JetPack version in use is JetPack 6, with the following relevant output from nvidia-smi:
NVIDIA-SMI 540.2.0   Driver Version: N/A   CUDA Version: 12.2   GPU Name: Orin (nvgpu)
-
Frequency of Issue: This problem appears consistently for users trying to run applications requiring CUDA versions higher than what is supported by their current driver.
-
Impact on User Experience: The inability to run specific applications like PyTorch with the required CUDA version significantly hampers development and deployment of AI models on the Jetson Orin Nano.
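The mismatch described above comes down to comparing two version numbers: the CUDA version the application was built for and the highest CUDA version the driver reports. The following is a minimal sketch of that comparison, assuming the simple rule that the driver's version must be at least the application's version. The function names are illustrative and not part of any NVIDIA or PyTorch API.

```python
def parse_cuda_version(text):
    """Turn a 'major.minor' string such as '12.4' into a comparable tuple."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def driver_supports(runtime_version, driver_version):
    """Return True when the driver's CUDA version is at least the runtime's.

    Assumption: a plain >= comparison; compatibility packages may relax it.
    """
    return parse_cuda_version(driver_version) >= parse_cuda_version(runtime_version)


if __name__ == "__main__":
    # Values reported in this issue: PyTorch built for 12.4, driver reports 12.2.
    print(driver_supports("12.4", "12.2"))
    print(driver_supports("12.2", "12.2"))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes once a minor version reaches two digits.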
Possible Causes
The following potential causes have been identified for the issue:
-
Driver Limitations: The current driver only supports up to CUDA version 12.2, leading to incompatibility with applications requiring higher versions.
-
CUDA Toolkit Installation: Although users have installed the CUDA Toolkit 12.4, it does not function correctly due to the underlying driver limitations.
-
Configuration Errors: Users may not have configured their environment correctly, leading to conflicts between installed versions of CUDA and those required by specific libraries.
-
Hardware Compatibility Issues: There may be inherent limitations in the Jetson Orin Nano’s hardware that prevent it from fully supporting newer CUDA versions without appropriate driver updates.
-
Environmental Factors: Insufficient power supply or overheating may exacerbate performance issues but are less likely to be the root cause of version mismatches.
Troubleshooting Steps, Solutions & Fixes
To resolve the CUDA version mismatch issue, follow these comprehensive troubleshooting steps:
-
Verify Installed Versions:
- Check installed CUDA versions using:
nvcc --version
nvidia-smi
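To compare the two versions side by side, the output of both commands can be parsed with a short script. This is a sketch that assumes nvcc prints a "release X.Y" string and nvidia-smi prints "CUDA Version: X.Y" as in the output quoted earlier; the helper names are hypothetical.

```python
import re
import shutil
import subprocess


def toolkit_version(nvcc_output):
    """Extract 'X.Y' from the 'release X.Y' text printed by nvcc --version."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def driver_cuda_version(smi_output):
    """Extract 'X.Y' from the 'CUDA Version: X.Y' text printed by nvidia-smi."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def run(command):
    """Run a command and return its stdout, or '' if it is not on PATH."""
    if shutil.which(command[0]) is None:
        return ""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    print("toolkit:", toolkit_version(run(["nvcc", "--version"])))
    print("driver :", driver_cuda_version(run(["nvidia-smi"])))
```

A value of None means the tool was not found or its output did not contain the expected text.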
-
Install Compatible PyTorch Wheel:
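- Use a PyTorch wheel built for the CUDA version your driver supports. Confirm that the index below actually hosts a wheel for your platform (aarch64) and Python version; if pip cannot find one, look for a PyTorch build that NVIDIA publishes for JetPack 6: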
pip install torch==2.4.0+cu122 --extra-index-url https://download.pytorch.org/whl/cu122
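After installing a wheel, it helps to confirm which CUDA version it was built for. The sketch below reads the "+cuNNN" local version tag and, when torch is importable, reports torch.version.cuda and torch.cuda.is_available(). It assumes the tag is "cu" followed by digits whose last digit is the minor version; the function names are illustrative.

```python
import re


def cuda_from_wheel_tag(version):
    """Map a version string such as '2.4.0+cu122' to '12.2'.

    Assumption: the tag is 'cu' plus digits, the last digit being the minor version.
    """
    match = re.search(r"\+cu(\d+)(\d)$", version)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def installed_torch_info():
    """Return version details for the installed torch, or None if it is absent."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(cuda_from_wheel_tag("2.4.0+cu122"))
    print(installed_torch_info())
```

If "built_for_cuda" is higher than the CUDA version shown by nvidia-smi, the wheel is likely the source of the mismatch.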
-
Use cuda-compat Package:
- Consider installing the cuda-compat package, which allows upgrading to newer CUDA releases without changing JetPack or BSP software:
sudo apt-get install cuda-compat
-
Testing with Different Configurations:
- If issues persist, try testing with different combinations of PyTorch and torchvision versions:
- For torchvision, use a nightly build if necessary:
pip install --force-reinstall torchvision==0.20.0.dev20240703 --no-deps --index-url https://download.pytorch.org/whl/nightly/cu124
-
Check for Triton Installation:
- If using TorchInductor, ensure Triton is correctly installed, since a missing Triton installation can cause errors:
- Check for pre-built Triton packages or consider building from source while managing memory usage during compilation.
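A quick way to see whether Triton is present at all is to ask Python whether the module can be located, without importing it. This is a minimal sketch; it assumes the package is importable under the name triton, and the helper name is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    for name in ("torch", "triton"):
        status = "found" if module_available(name) else "missing"
        print(f"{name}: {status}")
```

This only shows that the package is installed; it does not prove that the installed build works on the Jetson GPU.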
-
Documentation and Updates:
- Regularly check for updates in Jetpack and CUDA documentation to stay informed about compatibility and available patches.
-
Best Practices:
- Always ensure that your software stack (CUDA, PyTorch, etc.) is compatible before installation.
- Maintain a backup of working configurations to quickly revert if issues arise after updates.
-
Unresolved Issues:
- Further investigation may be needed regarding Triton compatibility and availability of pre-built packages for Jetson devices.
- Users should monitor forums and Nvidia’s technical blogs for updates on new releases or patches that may resolve these issues in future iterations of Jetpack or CUDA.
By following these steps, users can effectively troubleshoot and resolve issues related to CUDA version mismatches on their Nvidia Jetson Orin Nano Dev boards.