Nvidia Jetson Orin Nano Dev Board Crash Issue After 20 Minutes

Issue Overview

The Nvidia Jetson Orin Nano Developer Kit (8GB) shuts down approximately 20 minutes after booting. The problem appears after flashing JetPack 5.1.1 rev. 1 with SDK Manager. When it happens, only the power LED stays on; the network LED and the fan turn off, and the minicom serial session is lost. Until it is resolved, the board cannot run long-lived applications or support extended development sessions.

Possible Causes

Several factors could be contributing to this unexpected shutdown:

  1. Software Conflict: The installed JetPack version might have compatibility issues with the Orin Nano hardware.

  2. Power Management Issues: There could be a problem with the power management system, causing the device to enter an unintended suspend state.

  3. Thermal Management: The system might be overheating, leading to a protective shutdown.

  4. Hardware Defect: A potential hardware issue could be causing the system to crash after a certain period of operation.

  5. Driver Conflicts: Incompatible or buggy drivers might be triggering the shutdown.

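  6. Display Manager Issues: The GNOME Display Manager (gdm3) has been identified as a potential cause of involuntary suspends. GNOME's default power settings suspend an idle machine after about 20 minutes, which would be consistent with the interval observed here.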

Troubleshooting Steps, Solutions & Fixes

  1. Collect Diagnostic Information:

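    • Use a 3.3 V USB-TTL serial adapter to capture the full boot sequence and crash logs.
    • Ensure correct pin connections on the Orin Nano: Ground (pin 7), TX (pin 4), RX (pin 3). Cross the data lines, so the adapter's RX goes to the board's TX and the adapter's TX goes to the board's RX.
    • Run sudo dmesg -wT to follow kernel messages in real time leading up to the crash (-w waits for new messages, -T prints human-readable timestamps).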
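
Because the board loses its console when it goes down, it can help to record the serial output on the host PC rather than on the Jetson itself. The following is a minimal sketch, assuming the adapter shows up as /dev/ttyUSB0 on the host (check dmesg on the host after plugging it in) and that the console runs at 115200 baud. The function names and the log file name are illustrative, not part of any NVIDIA tooling.

```shell
#!/bin/sh
# Build a timestamped capture file name such as jetson-serial-20240101-120000.log.
capture_name() {
    printf 'jetson-serial-%s.log\n' "$(date +%Y%m%d-%H%M%S)"
}

# Open the serial console at 115200 baud and copy everything shown to a capture file.
# $1 is the adapter's device node on the host; /dev/ttyUSB0 is only an assumed default.
start_capture() {
    minicom -D "${1:-/dev/ttyUSB0}" -b 115200 -C "$(capture_name)"
}
```

Running start_capture on the host before powering the board keeps everything printed up to the moment of the shutdown in the capture file.
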
  2. Remove GNOME Display Manager:

    • This solution has been reported to resolve the issue.
    • Execute the following command:
      sudo apt remove gdm3
      
    • Reboot the system and monitor for improvements.
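
Before removing the package, it may be worth confirming that gdm3 is actually installed on the image. A small sketch, assuming a Debian-style system with dpkg-query available; is_installed is a hypothetical helper name.

```shell
#!/bin/sh
# Return success only if the named package is fully installed
# (a removed package that left its config files behind does not count).
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

if is_installed gdm3; then
    echo "gdm3 is installed"
    # Show which display manager the system is configured to start, if recorded.
    if [ -r /etc/X11/default-display-manager ]; then
        cat /etc/X11/default-display-manager
    fi
else
    echo "gdm3 is not installed"
fi
```

If gdm3 is not installed, this step does not apply and the remaining steps are the better place to look.
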
  3. Check for Overheating:

    • Monitor system temperatures using the tegrastats command.
    • Ensure proper ventilation and consider additional cooling if necessary.
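
As a complement to tegrastats, the kernel's generic thermal interface can be read directly. The sketch below assumes the standard Linux sysfs layout under /sys/class/thermal, where each zone reports its temperature in millidegrees Celsius; the zone names printed depend on the board. to_celsius is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a sysfs reading in millidegrees Celsius to degrees with one decimal.
to_celsius() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print "<zone type> <temperature>" for every readable thermal zone.
for zone in /sys/class/thermal/thermal_zone*; do
    [ -r "$zone/temp" ] || continue
    # Some zones refuse reads while their sensor is powered down; skip those.
    raw=$(cat "$zone/temp" 2>/dev/null) || continue
    printf '%s %s\n' "$(cat "$zone/type" 2>/dev/null)" "$(to_celsius "$raw")"
done
```

Running this in a loop (for example with watch -n 5) during the 20 minutes before a shutdown shows whether temperatures are climbing or staying flat.
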
  4. Update JetPack and Drivers:

    • Check for available updates to JetPack and install them.
    • Update all system packages:
      sudo apt update && sudo apt upgrade -y
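
To see which release is currently installed, the L4T version string can be read from /etc/nv_tegra_release; JetPack 5.1.1 should correspond to L4T 35.3.1. The sketch below assumes the first line of that file has the form "# R35 (release), REVISION: 3.1, ..."; if the layout differs on your image, the function prints nothing. l4t_version is a hypothetical helper name.

```shell
#!/bin/sh
# Read an nv_tegra_release header from stdin and print "<major>.<revision>".
l4t_version() {
    sed -n 's/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\).*/\1.\2/p' | head -n 1
}

# Report the installed L4T version when the release file is present.
if [ -r /etc/nv_tegra_release ]; then
    l4t_version < /etc/nv_tegra_release
fi
```

Comparing the output before and after the upgrade confirms whether the update actually changed the installed release.
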
      
  5. Disable Power Management Features:

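    • Jetson boards running JetPack 5 take their kernel command line from extlinux rather than GRUB, so edit the extlinux configuration file:
      sudo nano /boot/extlinux/extlinux.conf
      
    • Add pcie_aspm=off to the end of the APPEND line of the boot entry in use.
    • Save the file and reboot. There is no update-grub step on this platform.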
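
On Jetson boards the kernel command line is normally set on the APPEND line of /boot/extlinux/extlinux.conf. The sketch below shows one way to add a parameter there without editing by hand; add_kernel_arg is a hypothetical helper, it assumes GNU sed, and it assumes the argument contains no "/" characters. It keeps a .bak copy, but a mistake in this file can stop the board from booting, so keep the serial console attached when trying it.

```shell
#!/bin/sh
# Append a kernel argument to every APPEND line of an extlinux config file.
# Does nothing if any APPEND line already contains the argument.
# The previous version of the file is kept as <file>.bak.
add_kernel_arg() {
    conf=$1
    arg=$2
    if grep -E '^[[:space:]]*APPEND ' "$conf" | grep -Fq -- "$arg"; then
        return 0
    fi
    sed -i.bak "/^[[:space:]]*APPEND /s/[[:space:]]*\$/ $arg/" "$conf"
}
```

A typical call would be sudo sh -c '. ./add_kernel_arg.sh; add_kernel_arg /boot/extlinux/extlinux.conf pcie_aspm=off', where add_kernel_arg.sh is an assumed file name for the snippet above.
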
      
  6. Check for Hardware Issues:

    • Inspect the board for any visible damage or loose connections.
    • If possible, test with a different power supply to rule out power-related issues.
  7. Analyze System Logs:

    • After rebooting, review the logs from the previous boot (the one that ended in the crash) for recurring errors or warnings:
      sudo journalctl -b -1
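
Since an involuntary suspend is one of the suspected causes, it can be useful to narrow the previous boot's journal to lines that mention suspend or sleep. This is a rough sketch: the exact wording of such messages varies between systemd and kernel versions, so the pattern is deliberately broad, and find_suspend_lines is a hypothetical helper name. It also assumes the journal is stored persistently so that the previous boot is still available.

```shell
#!/bin/sh
# Print lines from stdin that mention suspend or sleep, ignoring case.
# Prints nothing (and still succeeds) when there is no match.
find_suspend_lines() {
    grep -Ei 'suspend|sleep' || true
}

# Apply the filter to the previous boot's journal when journalctl is available.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -b -1 --no-pager 2>/dev/null | find_suspend_lines
fi
```

Run it with sudo to see system-wide entries. Matches close to the end of the log, roughly 20 minutes after boot, would point toward a suspend rather than a thermal or hardware fault.
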
      
  8. Perform a Clean Flash:

    • As a last resort, consider re-flashing the system with the latest compatible JetPack version.
    • Use the NVIDIA SDK Manager for a clean installation.
  9. Contact NVIDIA Support:

    • If the issue persists after trying these solutions, reach out to NVIDIA support with detailed logs and system information.

Test the system after each change, leaving it running for well over 20 minutes, so you can tell which change resolves the issue. Removing gdm3 has been reported as an effective fix, but it also removes the graphical login screen, so consider whether your use case needs a desktop session before relying on it.
