How to Run Voice Chat Agent with NanoLLM on Nvidia Jetson Orin Nano
Issue Overview
Users are experiencing difficulties running the voice_chat agent of NanoLLM on the Nvidia Jetson Orin Nano developer board. The primary symptom is that there is no output when speaking into the microphone, indicating that the voice chat functionality is not working as expected. Additionally, after some modifications, users encountered an exception with the phi-2 model: a RuntimeError stating that the model at the specified path does not have an embed() function.
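The embed() RuntimeError indicates that the loaded phi-2 model object does not expose an embedding entry point. Before swapping models or backends, a quick attribute check on the loaded model can confirm this locally. This is a minimal sketch: supports_embedding and the fake model classes are illustrative names, not part of the NanoLLM API.

```python
# Hypothetical helper: check whether a model object exposes a callable embed()
# method before wiring it into the voice chat pipeline.
def supports_embedding(model) -> bool:
    """Return True if the model object provides a callable embed() method."""
    return callable(getattr(model, "embed", None))

class FakeModel:
    def embed(self, text):
        return [0.0] * 8  # dummy embedding vector

class NoEmbedModel:
    pass

print(supports_embedding(FakeModel()))     # True
print(supports_embedding(NoEmbedModel()))  # False
```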
Possible Causes
- Incorrect setup: The initial configuration might not be properly set up, leading to the microphone input not being recognized or processed.
- Pipeline issues: The voice chat pipeline may not be correctly configured, causing a breakdown in the audio processing chain.
- Model compatibility: The phi-2 model seems to have compatibility issues with the current setup, possibly due to missing or incompatible functions.
- Context length limitations: The error might be related to the maximum context length being exceeded during conversations.
- Software version mismatch: The current version of MLC/TVM might not be fully compatible with the phi-2 model, leading to runtime errors.
Troubleshooting Steps, Solutions & Fixes
- Use Agent Studio for visual inspection:
  - Use Agent Studio to set up the pipeline and visually inspect what's happening.
  - Independently test the ASR (Automatic Speech Recognition), LLM (Large Language Model), and TTS (Text-to-Speech) components.
- Test individual components:
  - Run the tests under nano_llm/test to confirm ASR and TTS functionality:

    python3 NanoLLM/nano_llm/test/asr.py
    python3 NanoLLM/nano_llm/test/tts.py
- Modify the voice_chat pipeline:
  - If the initial setup doesn't work, try modifying the voice_chat pipeline.
  - Ensure all components are properly connected and configured.
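To see why a mis-wired pipeline produces silence rather than an error, consider a minimal stand-in for a plugin chain. The Stage class and the lambdas below are hypothetical stand-ins, not the actual NanoLLM plugin API; they only show the shape of an ASR → LLM → TTS chain and how a missing connection silently drops output.

```python
# Minimal stand-in for a plugin pipeline (illustrative, not the NanoLLM API).
class Stage:
    def __init__(self, name, fn):
        self.name = name
        self.fn = fn
        self.outputs = []

    def connect(self, other):
        """Route this stage's results into another stage; return it for chaining."""
        self.outputs.append(other)
        return other

    def process(self, data):
        result = self.fn(data)
        for out in self.outputs:
            out.process(result)
        return result

heard = []
asr = Stage("asr", lambda audio: "hello")            # pretend transcription
llm = Stage("llm", lambda text: f"reply to {text}")  # pretend generation
tts = Stage("tts", heard.append)                     # pretend audio output

asr.connect(llm).connect(tts)   # if connect() is skipped, nothing reaches tts
asr.process(b"\x00\x01")        # feed fake audio through the chain
print(heard)                    # ['reply to hello']
```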
- Adjust context length:
  - Try changing the maximum context length to see if it affects the behavior:

    sudo jetson-containers run -v ~/NanoLLM/:/opt/NanoLLM eb86 python3 -m nano_llm.agents.voice_chat --api mlc --model /data/models/phi-2 --quantization q4f16_ft --asr=whisper --tts=piper --max-context-len=512
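The reason --max-context-len matters is that accumulated chat history can exceed the model's context window, at which point generation fails or truncates. A sketch of the underlying idea, with illustrative names (this is not how NanoLLM manages its history internally):

```python
# Drop the oldest messages until the total token estimate fits the window.
# count_tokens here is a crude word-count stand-in for a real tokenizer.
def trim_history(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    msgs = list(messages)
    while msgs and sum(count_tokens(m) for m in msgs) > max_tokens:
        msgs.pop(0)
    return msgs

history = ["you are a helpful assistant", "hi there", "tell me about jetson orin nano"]
print(trim_history(history, max_tokens=8))  # oldest message is dropped
```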
- Try alternative models:
  - If issues persist with the phi-2 model, test with other models like Llama-3-8B-Instruct to isolate model-specific problems.
- Use a different LLM backend:
  - Attempt to run the voice chat agent with a different LLM backend:

    sudo jetson-containers run -v ~/NanoLLM/:/opt/NanoLLM eb86 python3 -m nano_llm.agents.voice_chat --api hf --model /data/models/phi-2 --quantization q4f16_ft --asr=whisper --tts=piper
- Update MLC/TVM version:
  - Consider upgrading to the latest version of MLC/TVM to incorporate recent fixes and improvements.
  - Check the NanoLLM repository for any available updates or patches.
- Check hardware compatibility:
  - Ensure that the USB microphone and USB speaker are compatible with the Nvidia Jetson Orin Nano developer board.
  - Verify that the devices are properly recognized by the system (for example, with the standard ALSA tools arecord -l and aplay -l).
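On Linux, recognized sound cards also appear in /proc/asound/cards, which can be read directly when checking whether the USB devices enumerated. A small sketch (the function name is illustrative; it simply returns an empty list when ALSA is unavailable):

```python
from pathlib import Path

def list_alsa_cards(proc_path="/proc/asound/cards"):
    """Return the ALSA sound-card listing lines, or [] if ALSA isn't available."""
    p = Path(proc_path)
    if not p.exists():
        return []
    return [line.rstrip() for line in p.read_text().splitlines() if line.strip()]

for line in list_alsa_cards():
    print(line)
```

If the USB microphone or speaker does not show up here, the problem is at the OS/hardware level rather than in NanoLLM.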
- Inspect log files:
  - Check system logs and application-specific log files for any error messages or warnings that might provide additional insight into the issue.
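When logs are long, filtering for the relevant lines first saves time. A minimal sketch (find_issues is an illustrative helper, not part of any tool mentioned above):

```python
import re

def find_issues(log_text, keywords=("error", "warning", "exception")):
    """Return log lines that mention any keyword, matched case-insensitively."""
    rx = re.compile("|".join(map(re.escape, keywords)), re.IGNORECASE)
    return [ln for ln in log_text.splitlines() if rx.search(ln)]

sample = "loading model\nRuntimeError: model does not have embed()\naudio device opened"
print(find_issues(sample))  # ['RuntimeError: model does not have embed()']
```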
- Community support:
  - If problems persist, consider reaching out to the NanoLLM community or the Nvidia Jetson forums for further assistance and to check for any known issues or workarounds.