Torch2trt Installation Issue on Jetson Orin Nano 8GB Development Kit

Issue Overview

Users are experiencing difficulties when building a container with jetson-containers on the Jetson Orin Nano 8GB Development Kit. The build fails while installing the torch2trt package, which prevents the container from being created successfully.

Key details:

  • Device: Jetson Orin Nano 8GB Development Kit
  • L4T Version: 35.4.1
  • JetPack Version: 5.1.2
  • CUDA Version: 11.4.315
  • Ubuntu Version: 20.04 (focal)

The error message indicates that importing the tensorrt Python module fails because the libnvdla_compiler.so shared object file cannot be loaded.

Possible Causes

  1. Incorrect Docker Runtime: The default Docker runtime may not be set to NVIDIA, causing issues with GPU-dependent packages.

  2. Missing or Misplaced Libraries: The libnvdla_compiler.so file is present in the system but may not be in the expected location for the container build process.

  3. Package Ordering: The order of packages specified in the build command may affect the installation process, particularly for ROS and PyTorch.

  4. Incompatible Package Versions: There might be version conflicts between the installed packages and the torch2trt requirements.

  5. BuildKit Compatibility: BuildKit does not apply Docker's configured default runtime during docker build, so libraries that the NVIDIA container runtime normally mounts into the build environment (such as libnvdla_compiler.so) may be unavailable at build time.

Troubleshooting Steps, Solutions & Fixes

  1. Set Default Docker Runtime to NVIDIA:
    Ensure that the default Docker runtime is set to NVIDIA. Edit the /etc/docker/daemon.json file:

    {
        "runtimes": {
            "nvidia": {
                "path": "nvidia-container-runtime",
                "runtimeArgs": []
            }
        },
        "default-runtime": "nvidia"
    }
    

    After editing, restart the Docker service:

    sudo systemctl restart docker
    
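    To confirm the change took effect, you can query Docker's default runtime (the exact output format varies by Docker version):

    docker info | grep -i 'default runtime'

    The output should list nvidia as the default runtime.
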
  2. Verify Library Locations:
    Check if the libnvdla_compiler.so file is in the correct location. If it’s not in the expected path, create a symbolic link:

    sudo ln -s /usr/lib/aarch64-linux-gnu/tegra/libnvdla_compiler.so /usr/lib/libnvdla_compiler.so
    
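    If you are unsure where the library actually lives on your system, search for it first; the tegra directory shown above is typical for JetPack 5, but paths can vary between L4T releases:

    find /usr/lib -name 'libnvdla_compiler.so*' 2>/dev/null
    ldconfig -p | grep nvdla

    After creating the symbolic link, run sudo ldconfig so the linker cache picks it up.
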
  3. Adjust Package Order:
    Try changing the order of packages in the build command. Place PyTorch before ROS:

    ./build.sh --name=arsd pytorch l4t-pytorch cuda-python opencv ros:foxy-desktop realsense
    
  4. Update JetPack and L4T:
    Ensure you are running the latest JetPack and L4T releases that support your hardware. Check NVIDIA's JetPack page for the most recent versions compatible with your device.
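
    To see which versions are currently installed, the following standard Jetson commands can help (the package metadata may differ slightly between releases):

    cat /etc/nv_tegra_release
    apt-cache show nvidia-jetpack | grep Version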

  5. Manual torch2trt Installation:
    If the issue persists, try manually installing torch2trt outside the container:

    # Clone the torch2trt source and install it into the current Python environment
    # (a system-wide install may require sudo or the --user flag)
    git clone https://github.com/NVIDIA-AI-IOT/torch2trt
    cd torch2trt
    python3 setup.py install
    

    If successful, you can then try to include it in your container build.
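
    A quick import test confirms the package is usable (this assumes PyTorch and TensorRT are already importable in the same Python environment):

    python3 -c "import torch2trt; print('torch2trt OK')"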

  6. Check TensorRT Installation:
    Verify that TensorRT is correctly installed and that its Python bindings can be imported. On Jetson, TensorRT ships as part of JetPack, so if it is broken you may need to reinstall the JetPack components.
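
    A minimal sanity check, assuming TensorRT's Python bindings were installed alongside JetPack:

    dpkg -l | grep -i tensorrt
    python3 -c "import tensorrt; print(tensorrt.__version__)"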

  7. BuildKit Considerations:
    If using BuildKit causes issues, you can temporarily disable it by setting an environment variable before building:

    export DOCKER_BUILDKIT=0
    

    Then run your build command as usual.
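
    Alternatively, you can disable BuildKit for a single invocation only (reusing the example build command from step 3):

    DOCKER_BUILDKIT=0 ./build.sh --name=arsd pytorch l4t-pytorch cuda-python opencv ros:foxy-desktop realsense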

  8. Review Log Files:
    Examine the build logs in the logs/ directory of your jetson-containers checkout (e.g. /home/omar/jetson-containers/logs/ in the original report) for more detailed error information that might provide additional insight into the issue.
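
    For example, to surface the most recent error lines across the logs (adjust the path to your own checkout):

    grep -ri "error" ~/jetson-containers/logs/ | tail -n 20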

If the problem continues after trying these solutions, consider reaching out to NVIDIA’s developer forums or the jetson-containers GitHub repository for more specialized assistance. Provide detailed information about your setup, the steps you’ve taken, and the full error logs to help in diagnosing the issue.
