Flash.sh (and tegraflash.py) fails to erase QSPI on non-SDK modules (P3767 Orin Nano 8GB)
Issue Overview
Users of the NVIDIA Jetson Orin Nano developer kit report that the flash.sh and tegraflash.py scripts fail to erase QSPI memory on non-SDK P3767 modules. The problem appears when re-flashing a module after an initial successful flash.
Symptoms
- The initial flash operation succeeds, allowing the system to boot to the UEFI bootloader.
- Upon asserting force recovery and attempting a second flash, the process fails, leading to a non-functional state where the board continuously reboots.
- The erase operation appears to complete in an unusually short time (~1 second), suggesting a potential partial erase rather than a full one.
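One way to put a number on the erase time is to compare timestamps in the saved host-side flash log. The sketch below assumes that each log line starts with a bracketed seconds counter, as in the `[ 8.9948 ] Erasing spi` line quoted later in this article, and that the next timestamped line roughly marks the end of the erase. The function name `erase_duration` is hypothetical and not part of L4T.

```shell
#!/bin/sh
# erase_duration: hypothetical helper, not part of L4T.
# Reads a flash log on stdin and, for every "Erasing spi" line, prints the
# seconds elapsed until the next timestamped line.
erase_duration() {
  awk '
    match($0, /^\[ *[0-9]+\.[0-9]+ *\]/) {
      # Strip the brackets; adding 0 converts the text to a number.
      ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
      # A previous erase line is waiting for its end timestamp.
      if (pending) { printf "%.4f\n", ts - start; pending = 0 }
      # Remember when this erase started.
      if ($0 ~ /Erasing spi/) { start = ts; pending = 1 }
    }
  '
}
```

Running `erase_duration < host.txt` prints one duration per erase. A value of about one second would match the symptom described above; what counts as a normal duration for this part is not stated in the source.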
Context
- The issue occurs when flashing with L4T version 35.4.1.
- It has been observed on multiple P3767 modules, regardless of whether the devkit carrier or a custom carrier board is used.
Hardware/Software Specifications
- NVIDIA Jetson Orin Nano 8GB module (P3767), on the devkit carrier or a custom carrier board
- L4T 35.4.1
- QSPI memory type: MX25U51279G
Frequency and Impact
The problem occurs consistently on every flash attempt after the first successful one, which blocks further development or testing on the affected modules.
Possible Causes
- Hardware Incompatibilities or Defects: Non-SDK modules may have different configurations or defects that affect their compatibility with flashing tools.
- Software Bugs or Conflicts: There may be bugs in the mb2 flashing code that fail to handle block protection properly during chip erase operations.
- Configuration Errors: Incorrect parameters or flags used during the flashing process could lead to failures.
- Driver Issues: Outdated or incompatible drivers may contribute to improper functioning of the flashing tools.
- Environmental Factors: Power supply inconsistencies could affect the stability of the flashing process.
- User Errors or Misconfigurations: Misunderstanding of required procedures for flashing non-SDK units could lead users to incorrect steps.
Troubleshooting Steps, Solutions & Fixes
Step-by-Step Instructions
- Initial Flash Attempt
  - Use the following command for a clean initial flash:
    NO_ROOTFS=1 NO_RECOVERY_IMG=1 ./flash.sh jetson-orin-nano-devkit-nvme external
- Check for Block Protection
  - After the first successful flash, boot into Linux and check if block protection is enabled:
    dmesg | grep qspi_mtd
- Use Initrd Flash Method
  - If subsequent flashes fail, try using the initrd method:
    sudo ./tools/kernel_flash/l4t_initrd_flash.sh -p "--no-systemimg -c bootloader/t186ref/cfg/flash_t234_qspi.xml" --network usb0 jetson-orin-nano-devkit nvme0n1p1
- Force Recovery Mode
  - After using initrd flash, force recovery mode and attempt re-flashing with mb2 again:
    # Enter recovery mode as per your hardware setup
- Monitor Erase Operation
  - Pay attention to log outputs during erase operations. A significantly shorter duration may indicate a failure:
    [ 8.9948 ] Erasing spi: 0 ......... [Done]
- Investigate Logs
  - Review logs (host.txt, jetson_console.log, jetson_console2.log) for error messages related to partition tables or memory operations.
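Two of the steps above can be partly scripted on the host. The following is a minimal sketch under stated assumptions: `in_recovery` and `scan_logs` are hypothetical helper names, the keyword list is only a starting guess, and the USB check looks only for NVIDIA's vendor ID (0955). A match shows that an NVIDIA USB device is attached; it does not by itself prove that the module is in force recovery mode.

```shell
#!/bin/sh
# Hypothetical helpers for the steps above; names are illustrative only.

# in_recovery: reads `lsusb` output on stdin and succeeds if a device with
# NVIDIA's USB vendor ID (0955) is listed. The product ID varies by module,
# so it is deliberately not checked here.
in_recovery() {
  grep -qi 'ID 0955:'
}

# scan_logs: prints file:line:text for lines that mention errors, failures,
# erase operations, or partitions. /dev/null forces file names in the output
# even when only one log is given.
scan_logs() {
  grep -inE 'error|fail|eras(e|ing)|partition' "$@" /dev/null
}
```

Typical use would be `lsusb | in_recovery && echo "NVIDIA USB device found"` before starting a flash, and `scan_logs host.txt jetson_console.log jetson_console2.log` afterwards.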
Recommended Fixes
- If block protection is detected, consider clearing it before attempting further flashes.
- Ensure that you are using compatible hardware and updated software versions.
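Before trying to clear block protection, it helps to know which MTD device, if any, the QSPI flash is exposed as on the booted module. The sketch below parses the standard `/proc/mtd` table. It assumes the L4T kernel registers the QSPI chip as an MTD device, which the `qspi_mtd` messages suggest but the source does not confirm. `list_mtd` is a hypothetical helper name.

```shell
#!/bin/sh
# list_mtd: hypothetical helper. Reads /proc/mtd-style text on stdin and
# prints "device size_in_bytes name" for each MTD partition.
list_mtd() {
  # erasesize is read only to keep the columns aligned; it is not printed.
  while read -r dev size erasesize name; do
    [ "$dev" = "dev:" ] && continue   # skip the header row
    name=${name#\"}; name=${name%\"}  # drop the surrounding quotes
    echo "/dev/${dev%:} $((0x$size)) $name"
  done
}
```

On the target, `list_mtd < /proc/mtd` lists the candidates. The mtd-utils package ships a `flash_unlock` tool that takes an MTD device path, but whether the L4T QSPI driver honors it on these modules is not established here, so treat it as something to verify rather than a known fix.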
Best Practices
- Always perform a clean flash on new modules before any modifications.
- Avoid power cycling during critical flashing stages unless necessary.
Unresolved Aspects
Further investigation is needed into why block protection bits are set on non-SDK modules and how this affects subsequent flashing attempts. Better error handling in the mb2 chip erase process would also make failures easier to diagnose.