GPU Acceleration Performance on Jetson Orin Nano 8GB Significantly Lower Than Expected

Issue Overview

Users are experiencing unexpectedly low GPU acceleration on the Jetson Orin Nano 8GB compared to laptop GPUs. In a vector addition benchmark, the GPU was only about 2.4 times faster than the CPU, while the same code on a laptop (Ryzen 7 5800H CPU, RTX 2060 GPU) achieved nearly an 8x speedup. This discrepancy is concerning for users expecting higher GPU performance from the Jetson Orin Nano 8GB.

Specific details:

  • CPU time: 42 ms
  • GPU time: 17 ms
  • JetPack version: 5.1.1 (inferred from CUDA 11.4)
  • CUDA version: 11.4.315
  • Test program: Vector addition using CUDA
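
For context, the benchmark is a straightforward vector-addition timing test. The sketch below shows the assumed structure: the kernel and variable names mirror the launch snippet later in this article, but the problem size, timing method, and CPU implementation are assumptions, not the original poster's code.

      // Assumed structure of the vector-addition benchmark (a sketch, not the
      // poster's exact code): time the CPU loop, then time the GPU path
      // including host<->device copies.
      #include <chrono>
      #include <cstdio>
      #include <cuda_runtime.h>

      __global__ void vector_add_gpu(const float* a, const float* b, float* c, int n) {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i < n) c[i] = a[i] + b[i];
      }

      void vector_add_cpu(const float* a, const float* b, float* c, int n) {
          for (int i = 0; i < n; ++i) c[i] = a[i] + b[i];
      }

      int main() {
          const int n = 1 << 24;                  // assumed problem size (~16M floats)
          const size_t bytes = n * sizeof(float);
          float *a = new float[n], *b = new float[n], *c = new float[n];
          for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

          // Time the CPU implementation.
          auto t0 = std::chrono::steady_clock::now();
          vector_add_cpu(a, b, c, n);
          auto t1 = std::chrono::steady_clock::now();
          printf("CPU: %.2f ms\n", std::chrono::duration<double, std::milli>(t1 - t0).count());

          // Time the GPU path, including the cudaMemcpy transfers.
          float *dev_a, *dev_b, *dev_c;
          cudaMalloc(&dev_a, bytes); cudaMalloc(&dev_b, bytes); cudaMalloc(&dev_c, bytes);
          const int block_size = 256;
          const int grid_size = (n + block_size - 1) / block_size;
          t0 = std::chrono::steady_clock::now();
          cudaMemcpy(dev_a, a, bytes, cudaMemcpyHostToDevice);
          cudaMemcpy(dev_b, b, bytes, cudaMemcpyHostToDevice);
          vector_add_gpu<<<grid_size, block_size>>>(dev_a, dev_b, dev_c, n);
          cudaMemcpy(c, dev_c, bytes, cudaMemcpyDeviceToHost);  // also synchronizes
          t1 = std::chrono::steady_clock::now();
          printf("GPU: %.2f ms\n", std::chrono::duration<double, std::milli>(t1 - t0).count());

          cudaFree(dev_a); cudaFree(dev_b); cudaFree(dev_c);
          delete[] a; delete[] b; delete[] c;
          return 0;
      }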

Possible Causes

  1. Power management settings: The Jetson Orin Nano may be operating in a lower power mode, limiting its performance.

  2. Dynamic clock frequencies: The GPU clock might not be locked to its maximum frequency, causing inconsistent performance.

  3. Workload characteristics: The specific workload may not be optimized for the Jetson Orin Nano’s GPU architecture.

  4. Memory bandwidth limitations: The Jetson Orin Nano’s shared memory architecture might be causing bottlenecks.

  5. CUDA kernel configuration: The chosen grid and block sizes may not be optimal for the Jetson Orin Nano’s GPU.

  6. Comparison discrepancy: Comparing embedded GPU performance to a discrete laptop GPU may not be a fair comparison due to architectural differences.

Troubleshooting Steps, Solutions & Fixes

  1. Maximize device performance:

    • Set the power mode to maximum:
      sudo nvpmodel -m 0
      
    • Lock clocks to maximum frequency:
      sudo jetson_clocks
      
  2. Verify current power mode:

    • Check the current power mode:
      sudo nvpmodel -q
      
    • Ensure it’s set to the highest available mode (e.g., 15W on the Jetson Orin Nano 8GB)
  3. Optimize CUDA kernel configuration:

    • Experiment with different grid and block sizes in the kernel launch:
      // Try different values for grid and block size
      vector_add_gpu<<<grid_size, block_size>>>(dev_a, dev_b, dev_c, n);
      
    • Use CUDA occupancy calculators to find optimal launch configurations
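    • For example, the runtime occupancy API can suggest a block size directly (a sketch; it assumes the vector_add_gpu kernel and the n, dev_a, dev_b, dev_c variables shown above):
      // Sketch: ask the occupancy API for a block size, then size the grid to cover n.
      int min_grid_size = 0, block_size = 0;
      cudaOccupancyMaxPotentialBlockSize(&min_grid_size, &block_size, vector_add_gpu, 0, 0);
      int grid_size = (n + block_size - 1) / block_size;
      vector_add_gpu<<<grid_size, block_size>>>(dev_a, dev_b, dev_c, n);
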
  4. Profile the application:

    • Use NVIDIA Nsight Systems to profile the application and identify potential bottlenecks
    • Look for memory transfer overheads, kernel launch times, and GPU utilization
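    • As a lightweight complement to Nsight Systems, CUDA events can separate kernel time from copy time (a sketch, reusing the launch from the snippet above):
      // Sketch: time only the kernel with CUDA events, excluding host<->device copies.
      cudaEvent_t start, stop;
      cudaEventCreate(&start);
      cudaEventCreate(&stop);
      cudaEventRecord(start);
      vector_add_gpu<<<grid_size, block_size>>>(dev_a, dev_b, dev_c, n);
      cudaEventRecord(stop);
      cudaEventSynchronize(stop);
      float kernel_ms = 0.0f;
      cudaEventElapsedTime(&kernel_ms, start, stop);
      printf("Kernel only: %.3f ms\n", kernel_ms);
      cudaEventDestroy(start);
      cudaEventDestroy(stop);
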
  5. Optimize memory transfers:

    • Use pinned memory for host allocations to improve transfer speeds
    • Consider using unified memory if appropriate for the workload
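    • A sketch of both options (buffer names and sizes are assumptions):
      // Sketch: pinned host memory speeds up cudaMemcpy; unified memory avoids
      // explicit copies entirely, which can be a good fit on Jetson where the
      // CPU and GPU share the same physical DRAM.
      float* h_a = nullptr;
      cudaMallocHost(&h_a, n * sizeof(float));     // pinned (page-locked) host buffer
      // ... fill h_a, then cudaMemcpy to the device as usual ...
      cudaFreeHost(h_a);

      float* u_a = nullptr;
      cudaMallocManaged(&u_a, n * sizeof(float));  // unified memory, visible to CPU and GPU
      // ... CPU writes u_a directly; the kernel reads it with no cudaMemcpy ...
      cudaFree(u_a);
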
  6. Benchmark with different data types:

    • Test with both single-precision (float) and double-precision (double) to see if there’s a significant difference
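    • A templated kernel makes the comparison easy (a sketch):
      // Sketch: one templated kernel instantiated for float and for double.
      template <typename T>
      __global__ void vector_add(const T* a, const T* b, T* c, int n) {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i < n) c[i] = a[i] + b[i];
      }
      // vector_add<float><<<grid_size, block_size>>>(...);   // single precision
      // vector_add<double><<<grid_size, block_size>>>(...);  // double precision: twice the bytes per element
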
  7. Compare with other Jetson Orin Nano benchmarks:

    • Run standard benchmark suites such as Rodinia or Parboil to compare your device’s performance with published results
  8. Check for thermal throttling:

    • Monitor device temperatures during extended runs (e.g., with tegrastats) to ensure thermal limits are not being reached
  9. Update software:

    • Ensure you’re running the latest JetPack and CUDA versions available for the Jetson Orin Nano
  10. Optimize CPU code:

    • Ensure OpenMP is properly configured and utilizing all available CPU cores
    • Consider using vectorized instructions (e.g., NEON for ARM) in the CPU implementation
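    • For example, an OpenMP-parallelized CPU loop (a sketch; compile with -fopenmp so the pragma takes effect, and with -O3 so the compiler can auto-vectorize the loop with NEON on the Orin’s Cortex-A78AE cores):
      // Sketch: CPU vector add spread across all CPU cores with OpenMP.
      void vector_add_cpu(const float* a, const float* b, float* c, int n) {
          #pragma omp parallel for
          for (int i = 0; i < n; ++i) {
              c[i] = a[i] + b[i];
          }
      }
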
  11. Adjust expectations:

    • Understand that embedded GPUs like those in Jetson devices may not achieve the same speedups as discrete GPUs in laptops or desktops
    • Focus on relative performance improvements within the Jetson ecosystem rather than comparing to non-embedded systems

If these steps do not resolve the issue, consider reaching out to NVIDIA developer forums with detailed benchmark results and system information for further assistance.
