Disabling IOMMU for PCIe DMA while Enabling MSI on Nvidia Jetson Orin Nano
Issue Overview
Users developing a PCIe driver for a custom Xilinx FPGA-based endpoint on the Nvidia Jetson Orin Nano have had difficulty disabling the IOMMU for PCIe DMA while keeping MSI (Message Signaled Interrupts) enabled. The problem appears when modifying the device tree node pcie@14160000.
Removing all IOMMU-related lines from the node causes MSI interrupts from the FPGA to be lost. Removing only some of them keeps MSI working but produces errors and incorrect DMA data. The errors include unhandled context faults in the ARM SMMU (System Memory Management Unit) and MC (Memory Controller) errors related to VPR (Video Protection Region) violations.
Possible Causes
-
Incorrect Device Tree Configuration: The device tree modifications are not correctly balancing the IOMMU and MSI requirements.
-
SMMU Bypass Issues: The default SMMU configuration might be preventing the desired behavior.
-
Kernel Configuration Mismatch: The kernel may not be configured to support the desired IOMMU and MSI settings.
-
Driver Compatibility: The custom PCIe driver may not be fully compatible with the Jetson Orin Nano’s IOMMU and MSI implementation.
-
JetPack Version Incompatibility: The issue might be related to a specific JetPack version that doesn’t support the required configuration.
Troubleshooting Steps, Solutions & Fixes
-
Modify Kernel Configuration:
- Locate the kernel configuration file at Linux_for_Tegra/source/public/kernel_src/kernel/kernel-5.10/arch/arm64/defconfig.
- Add the following line. It turns off the option that makes the SMMU block bypass by default, so streams without an SMMU mapping are passed through instead of faulting:
# CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is not set
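Before rebuilding, it can help to confirm that the configuration file really carries the line shown above. The following is a minimal sketch, not part of any Nvidia tooling: the helper name kconfig_state is made up for illustration, and it only relies on the usual Kconfig text convention in which an enabled option appears as CONFIG_X=y and a disabled one as "# CONFIG_X is not set".

```python
import re

OPTION = "CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT"


def kconfig_state(config_text, option=OPTION):
    """Return 'set', 'not set', or 'absent' for one Kconfig option.

    If the option appears several times, the last occurrence is reported.
    """
    assigned = re.compile(r"^%s=(.+)$" % re.escape(option))
    disabled = re.compile(r"^#\s*%s is not set$" % re.escape(option))
    state = "absent"
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        match = assigned.match(line)
        if match:
            # "=n" is treated the same as "is not set".
            state = "not set" if match.group(1) == "n" else "set"
        elif disabled.match(line):
            state = "not set"
    return state
```

Calling kconfig_state(open(path).read()) on the defconfig should report "not set" once the line has been added. Running the same check against the .config produced by the build shows whether the setting survived the build.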
-
Update Device Tree:
- In the pcie@14160000 node of the device tree, remove only these two lines:
iommus = <&smmu_niso0 TEGRA_SID_NISO0_PCIE4>;
dma-coherent;
- Keep the following lines to maintain proper IOMMU mapping:
iommu-map = <0x0 &smmu_niso0 TEGRA_SID_NISO0_PCIE4 0x1000>;
iommu-map-mask = <0x0>;
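Because the difference between the working and non-working configurations comes down to which four properties remain in the node, a small check can catch editing mistakes before a rebuild. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the input is the text of the pcie@14160000 node only, and the parser is deliberately naive (it strips C-style comments and splits on ';', '{' and '}', so it does not handle those characters inside string values).

```python
import re

REMOVE = {"iommus", "dma-coherent"}
KEEP = {"iommu-map", "iommu-map-mask"}


def active_properties(node_text):
    """Return the set of property names that are active in a DTS node body."""
    # Drop /* ... */ and // ... comments so commented-out lines are ignored.
    text = re.sub(r"/\*.*?\*/", "", node_text, flags=re.S)
    text = re.sub(r"//[^\n]*", "", text)
    names = set()
    for statement in re.split(r"[;{}]", text):
        # The property name is whatever precedes '=' (or the whole
        # statement for boolean properties such as dma-coherent).
        name = statement.split("=", 1)[0].strip()
        if name:
            names.add(name)
    return names


def check_node(node_text):
    """Return (properties still present that should go, kept properties that are missing)."""
    present = active_properties(node_text)
    return sorted(REMOVE & present), sorted(KEEP - present)
```

Both returned lists are empty when the node matches the configuration described above. This only inspects source text; decompiling the built blob with dtc -I dtb -O dts and checking that output is a more definitive test, since includes and overlays can reintroduce properties.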
-
Rebuild and Flash the Kernel:
- After making the changes, rebuild the kernel and flash it to your Jetson Orin Nano.
- Follow the standard Nvidia documentation for rebuilding and flashing the kernel on Jetson devices.
-
Verify JetPack Version:
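- Ensure you are using a compatible JetPack version. The kernel-5.10 source path used above corresponds to JetPack 5.x releases; on other releases the kernel version and paths differ.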
- If issues persist, consider updating to the latest JetPack version available for the Jetson Orin Nano.
-
Check FPGA Configuration:
- Verify that your FPGA design correctly implements the PCIe endpoint and MSI functionality.
- Ensure that the FPGA is configured to use MSI interrupts and not legacy interrupts.
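Whether MSI is actually enabled on the endpoint can be read from the host side as well. The sketch below assumes the verbose output format of lspci from pciutils, where the MSI capability is printed as "MSI: Enable+" or "MSI: Enable-". The function names are hypothetical, and the device address passed to read_lspci is whatever bus:device.function your FPGA enumerates at.

```python
import re
import subprocess


def read_lspci(bdf):
    """Return `lspci -vv` output for one device (run as root to see capabilities)."""
    result = subprocess.run(
        ["lspci", "-vv", "-s", bdf],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def msi_state(lspci_text):
    """Return True/False for 'MSI: Enable+'/'MSI: Enable-', or None if no MSI capability line is found."""
    match = re.search(r"\bMSI: Enable([+-])", lspci_text)
    if match is None:
        return None
    return match.group(1) == "+"
```

A result of False while the driver is loaded suggests the driver fell back to legacy interrupts or never enabled MSI. None suggests the endpoint does not advertise an MSI capability at all, which points at the FPGA PCIe core configuration. MSI-X capability lines are intentionally not matched.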
-
Analyze Error Logs:
- The arm-smmu errors indicate unhandled context faults. These may be resolved by the kernel configuration change.
- The mc-err messages suggest VPR violations. This might be addressed by the device tree modifications.
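To judge whether a given change helped, it is useful to compare how many of each error class appear in the kernel log before and after. The following sketch simply groups log lines by the two tags named above. The function name is hypothetical, and it matches on the tag substrings only, so it makes no assumption about the rest of the message format.

```python
TAGS = ("arm-smmu", "mc-err")


def fault_lines(log_text, tags=TAGS):
    """Group kernel log lines by the error tag they contain."""
    grouped = {tag: [] for tag in tags}
    for line in log_text.splitlines():
        for tag in tags:
            if tag in line:
                grouped[tag].append(line)
    return grouped


def summarize(log_text):
    """Return a {tag: count} dictionary for quick before/after comparison."""
    return {tag: len(lines) for tag, lines in fault_lines(log_text).items()}
```

Feed it the saved output of dmesg captured after a DMA transfer. If the arm-smmu count drops to zero after the kernel configuration change but mc-err lines remain, the remaining problem is more likely on the device tree side, and vice versa.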
-
Custom Driver Modifications:
- Review your custom PCIe driver code to ensure it’s compatible with the Jetson Orin Nano’s PCIe implementation.
- Consider adding error handling for IOMMU-related issues in your driver.
-
IOMMU Debugging:
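- Enable IOMMU debugging in the kernel to get more detailed information about IOMMU transactions.
- To make faulty DMA mappings surface as SMMU faults while debugging, add iommu.passthrough=0 iommu.strict=1 to the kernel command line. These parameters do not add logging by themselves: iommu.passthrough=0 forces DMA through IOMMU translation, and iommu.strict=1 invalidates unmapped addresses immediately rather than lazily.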
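After rebooting, it is worth confirming that the running kernel actually received these parameters, since bootloader configuration on Jetson can override what you expect. The sketch below parses a kernel command line string; the function name is hypothetical, and on a live system the input would come from /proc/cmdline.

```python
def iommu_params(cmdline):
    """Return the iommu.* parameters found on a kernel command line.

    Parameters given without a value map to None.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key.startswith("iommu."):
            params[key] = value if sep else None
    return params
```

On the target, iommu_params(open("/proc/cmdline").read()) should return both keys with the values you set. An empty dictionary means the arguments never reached the kernel.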
-
Consult Nvidia Developer Forums:
- If the issue persists, consider posting detailed logs and your configuration on the Nvidia Developer Forums for further assistance from the Jetson community.
By following these steps, you should be able to configure your Jetson Orin Nano to disable IOMMU for PCIe DMA while keeping MSI enabled for your custom FPGA-based PCIe endpoint device. If you continue to experience issues, further investigation into the specific hardware configuration and driver implementation may be necessary.