Error on “Tutorial – Small Language Models (SLM)”
Issue Overview
Users are encountering a subprocess.CalledProcessError when following the Small Language Models (SLM) tutorial on the NVIDIA Jetson Orin Nano Developer Kit (8GB) with a 128GB SD card. The error occurs while running a command that launches a model: the screen freezes for several minutes before the error is displayed. The failing command is:
jetson-containers run $(autotag nano_llm) \
python3 -m nano_llm.chat --api=mlc \
--model princeton-nlp/Sheared-LLaMA-2.7B-ShareGPT
The error log shows the process was killed with SIGKILL, which on Linux typically means the kernel's Out of Memory (OOM) killer terminated it. The model being loaded exceeds the available memory of the device, which has only 8GB of RAM shared between the CPU and GPU. The failure is consistent across multiple users, pointing to a genuine memory limitation rather than an isolated misconfiguration.
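Before launching the container, a quick pre-flight check of free RAM plus swap can indicate whether the load is likely to be OOM-killed. A minimal sketch; the 6 GB requirement is an assumption (roughly a 2.7B-parameter model in fp16 plus runtime overhead), not a figure from the tutorial:

```shell
#!/bin/sh
# Pre-flight check: is there enough free memory (RAM + swap) to load the model?
# mem_ok AVAILABLE_KB REQUIRED_KB -> exit status 0 if enough, 1 otherwise.
mem_ok() {
  avail_kb=$1
  required_kb=$2
  [ "$avail_kb" -ge "$required_kb" ]
}

# Sum MemAvailable and SwapFree (both reported in kB) from /proc/meminfo.
read_available_kb() {
  awk '/^MemAvailable:|^SwapFree:/ { total += $2 } END { print total }' /proc/meminfo
}

# Assumption: ~6 GB needed for a 2.7B fp16 model plus overhead.
REQUIRED_KB=$((6 * 1024 * 1024))
if mem_ok "$(read_available_kb)" "$REQUIRED_KB"; then
  echo "enough memory to attempt the model load"
else
  echo "likely to be OOM-killed; add swap or reduce context length first"
fi
```

Running this immediately before the jetson-containers command gives an early warning instead of a multi-minute freeze followed by SIGKILL.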
Possible Causes
- Hardware Limitations: The Jetson Orin Nano has only 8GB of RAM, shared between the CPU and GPU, which may not be sufficient for running large models like Sheared-LLaMA-2.7B once additional memory is required for processing.
- Out of Memory (OOM) Killer: The OOM killer in Linux terminates processes when the system runs out of memory; the SIGKILL it sends is what surfaces as the subprocess.CalledProcessError.
- Configuration Errors: Incorrect command-line parameters or environment settings could exacerbate memory usage.
- Software Bugs: Potential bugs in the software stack related to Docker or the specific libraries being used could lead to inefficient memory usage.
- Driver Issues: Incompatibilities or bugs in GPU drivers might affect how memory is allocated and managed during execution.
- User Errors: Misconfigured Docker containers or commands run without appropriate flags could lead to excessive memory consumption.
Troubleshooting Steps, Solutions & Fixes
- Check Available Memory:
  - Use the command free -h to check current memory usage and availability before running the model.
- Increase Swap Space:
  - If not already done, increase swap space to help mitigate memory limitations:
    sudo fallocate -l 16G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
  - This creates a 16GB swap file; adjust the size based on available storage.
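The four swap commands can be wrapped in a small idempotent sketch that skips creation if /swapfile is already active; the size-to-bytes helper is only a sanity check and is an assumption of this sketch, not part of the tutorial:

```shell
#!/bin/sh
# Sketch: create and enable a swap file unless one is already active.
# Assumption: fallocate-style sizes (e.g. 16G, 512M) and the path /swapfile.

# Convert a size like 16G or 512M to bytes, for sanity-checking arguments.
swap_bytes() {
  size=$1
  num=${size%[GMgm]}
  case "$size" in
    *G|*g) echo $((num * 1024 * 1024 * 1024)) ;;
    *M|*m) echo $((num * 1024 * 1024)) ;;
    *)     echo "$size" ;;   # already a plain byte count
  esac
}

setup_swap() {
  size=$1
  if grep -q '^/swapfile' /proc/swaps 2>/dev/null; then
    echo "/swapfile already active, nothing to do"
    return 0
  fi
  sudo fallocate -l "$size" /swapfile
  sudo chmod 600 /swapfile
  sudo mkswap /swapfile
  sudo swapon /swapfile
}

# Usage (needs root): setup_swap 16G
```

Re-running the script after a reboot is safe because the /proc/swaps check prevents a second fallocate over an existing file.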
- Disable ZRAM:
  - Disable ZRAM if it is enabled, as it might conflict with swap settings:
    sudo systemctl stop zram-config.service
- Run Without Desktop UI:
  - If applicable, disable the graphical user interface (GUI) to free up additional RAM.
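On a systemd-based JetPack install (an assumption of this sketch), booting to a console-only target is one way to reclaim the desktop's RAM; the --dry-run mode below is a convenience added here so the command can be previewed without root:

```shell
#!/bin/sh
# Sketch: switch the default boot target to console-only to free the RAM
# used by the desktop. Assumption: a systemd-based JetPack install.
# Pass --dry-run as the second argument to print the command instead of running it.
set_boot_target() {
  target=$1
  mode=$2
  if [ "$mode" = "--dry-run" ]; then
    echo "sudo systemctl set-default $target"
  else
    sudo systemctl set-default "$target"
  fi
}

# Boot to console on next reboot: set_boot_target multi-user.target
# Restore the desktop later:      set_boot_target graphical.target
```

The change takes effect on the next reboot, so run it before rebooting into a memory-constrained model-loading session.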
- Modify Command-Line Parameters:
  - Reduce memory usage by adjusting command parameters, for example:
    --max-context-len=512
  - A shorter context length lowers the amount of memory required for processing.
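Applied to the tutorial command, the reduced context length is appended as one extra flag. A sketch that assembles the full invocation; everything except --max-context-len comes from the command quoted at the top of this document:

```shell
#!/bin/sh
# Assemble the nano_llm.chat invocation with a reduced context length.
MODEL="princeton-nlp/Sheared-LLaMA-2.7B-ShareGPT"
CTX_LEN=512   # smaller context -> less memory needed during processing

CHAT_ARGS="python3 -m nano_llm.chat --api=mlc --model $MODEL --max-context-len=$CTX_LEN"
echo "$CHAT_ARGS"

# To actually run it inside the container (requires a Jetson with jetson-containers):
# jetson-containers run $(autotag nano_llm) $CHAT_ARGS
```

Keeping the arguments in one variable makes it easy to lower CTX_LEN further if the load is still killed.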
- Monitor Resource Usage:
  - Use tools like htop or tegrastats (the Jetson equivalent of nvidia-smi, which is not available on Jetson devices) to monitor CPU and GPU resource usage during execution and identify bottlenecks.
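For a lightweight console view while the model loads, available memory can be polled in a loop. A sketch with a small parser for meminfo-formatted text; the one-second interval is an arbitrary choice:

```shell
#!/bin/sh
# Extract a field such as MemAvailable from meminfo-formatted text (value in kB).
# Reads from stdin so it can be tested without touching /proc.
meminfo_kb() {
  field=$1
  awk -v f="$field:" '$1 == f { print $2 }'
}

# Poll available memory once per second while the model loads:
# while true; do
#   echo "MemAvailable: $(meminfo_kb MemAvailable < /proc/meminfo) kB"
#   sleep 1
# done
```

A steadily shrinking MemAvailable followed by the process dying is the classic signature of the OOM kill described above.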
- Test with Different Model Sizes:
  - If possible, test with smaller models to verify that the setup works correctly without hitting memory limits.
- Update Software and Drivers:
  - Ensure that all relevant software packages and drivers are up to date by checking NVIDIA’s documentation and repositories for updates related to Jetson Orin Nano.
- Consult Documentation:
  - Refer to NVIDIA’s official documentation on mounting swap and other configurations that can help optimize performance on memory-constrained hardware.
- Community Support:
  - Engage with community forums for additional insights from other users who may have encountered and resolved the same issue.
Code Snippets & Commands
- To check current memory usage:
free -h
- To create a swap file:
  sudo fallocate -l 16G /swapfile
  sudo chmod 600 /swapfile
  sudo mkswap /swapfile
  sudo swapon /swapfile
Unresolved Aspects
- There may be unresolved issues related to specific library versions or Docker configurations that could still be contributing factors.
- Further investigation into whether similar problems occur with different models or configurations would be beneficial for comprehensive troubleshooting.
This document serves as a guide for users experiencing similar issues with their Nvidia Jetson Orin Nano while following tutorials for running large language models, particularly focusing on managing memory effectively during execution.