This project is a Voice-Powered AI Chatbot that allows users to interact with an AI assistant through voice input. The chatbot records the user's speech, transcribes it into text, sends it to a large language model (LLM) via the Groq API, and then speaks the response aloud using Text-to-Speech (TTS).
- Real-time Audio Recording: Captures the user's voice input using PyAudio (or SoundDevice as an alternative when PyAudio is unavailable).
- Speech-to-Text (ASR): Uses a pre-trained Wav2Vec2 model from Hugging Face to transcribe the recorded speech into text.
- AI Response Generation: Sends the transcribed text to the Groq API (LLM) to generate a meaningful response.
- Text-to-Speech (TTS): Converts the AI-generated response into speech using pyttsx3.
- Interruptible Response: Users can stop the chatbot mid-speech by pressing Enter.
- Multi-threading: Keeps recording, processing, and speech output running smoothly in parallel.
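The features above form a single record → transcribe → respond → speak loop. A minimal sketch of one conversational turn, with the four stages passed in as callables (in the real script these are wired to PyAudio/SoundDevice, Wav2Vec2, the Groq client, and pyttsx3; the function and parameter names here are illustrative):

```python
def chatbot_turn(record, transcribe, generate, speak):
    """Run one voice-chat turn: capture audio, transcribe it,
    query the LLM, and speak the reply."""
    audio = record()          # raw audio (e.g. a WAV path or sample buffer)
    text = transcribe(audio)  # ASR step (Wav2Vec2 in this project)
    reply = generate(text)    # LLM step (Groq API in this project)
    speak(reply)              # TTS step (pyttsx3 in this project)
    return text, reply

# Example with stubbed stages:
if __name__ == "__main__":
    text, reply = chatbot_turn(
        record=lambda: b"fake-audio",
        transcribe=lambda audio: "hello",
        generate=lambda text: f"You said: {text}",
        speak=lambda reply: print(reply),
    )
```

Factoring the loop this way also makes each stage swappable, e.g. replacing Wav2Vec2 with another ASR backend later.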
```
project/
├── main.py            # Main script for running the chatbot
├── requirements.txt   # Required Python dependencies
├── .env               # API keys and environment variables
├── output.wav         # Recorded audio file
├── transcription.txt  # Transcribed text file
└── README.md          # Project documentation
```
- Handles audio recording using either PyAudio or SoundDevice.
- Processes recorded audio and saves it in WAV format.
- Uses a speech-to-text model to transcribe audio into text.
- Interacts with the Groq API to generate AI-based responses.
- Uses TTS to read the response aloud to the user.
- Allows interruption via a keyboard event to stop speech playback.
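The WAV-saving step can be sketched with the standard-library `wave` module. This example writes a buffer of 16-bit mono samples at 16 kHz (the rate Wav2Vec2 expects); a synthetic sine tone stands in for live microphone input, and the helper name is illustrative:

```python
import math
import struct
import wave

def save_wav(path, samples, rate=16000):
    """Write 16-bit mono PCM samples (ints in [-32768, 32767]) to a WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 16-bit samples
        wf.setframerate(rate)
        wf.writeframes(struct.pack(f"<{len(samples)}h", *samples))

# A 0.5 s, 440 Hz tone in place of recorded speech:
rate = 16000
tone = [int(20000 * math.sin(2 * math.pi * 440 * n / rate))
        for n in range(rate // 2)]
save_wav("output.wav", tone, rate)
```

In the actual script the sample buffer would come from PyAudio or SoundDevice rather than being generated.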
Contains all dependencies required for the project, such as:

```
pyaudio
sounddevice
scipy
keyboard
pyttsx3
python-dotenv
transformers
groq
```

Note that the `wave` module is part of the Python standard library and does not need to be listed, and the `dotenv` module is installed via the `python-dotenv` package.
Stores API keys and other environment variables.
```
GROQ_API_KEY=your_api_key_here
```
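In the script itself, the key is loaded with `python-dotenv`'s `load_dotenv()` and read via `os.getenv`. The minimal stand-in parser below shows what that loading amounts to (illustrative only; the real project should simply call `load_dotenv()`):

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: KEY=value lines, '#' comments,
    already-set environment variables take precedence."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()
api_key = os.getenv("GROQ_API_KEY")
```

Keeping the key in `.env` (and out of version control) avoids hard-coding credentials in `main.py`.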
The recorded audio file that is transcribed into text.
Stores the transcribed text from the recorded audio.
```bash
git clone https://github.com/your-repo/voice-chatbot.git
cd voice-chatbot
pip install -r requirements.txt
```

Create a `.env` file in the project directory and add your Groq API key:

```
GROQ_API_KEY=your_api_key_here
```

Run the chatbot:

```bash
python main.py
```
- PyAudio vs SoundDevice: Initially, PyAudio had installation issues in CS50.dev, so SoundDevice was used as an alternative. However, PyAudio provided better recording stability.
- Interruptible Speech: A separate thread was created to allow users to stop the chatbot’s speech using Enter.
- Groq API Integration: The chatbot requests a longer response from the AI model for more detailed answers.
- Transcription Accuracy: Wav2Vec2 was chosen for its high accuracy in recognizing speech.
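The interruptible-speech design can be sketched with `threading.Event`: the speaking thread checks a stop flag between chunks of output, and the main thread sets the flag when Enter is pressed. Here the pyttsx3 and keyboard parts are replaced by a plain callback so the sketch is self-contained (the names are illustrative):

```python
import threading

def speak_interruptibly(chunks, emit, stop_event):
    """Emit chunks of a response one at a time, stopping early if
    stop_event is set (in main.py, pressing Enter sets the event
    and pyttsx3 does the emitting)."""
    for chunk in chunks:
        if stop_event.is_set():
            break
        emit(chunk)

stop = threading.Event()
spoken = []
t = threading.Thread(
    target=speak_interruptibly,
    args=(["Hello,", "this", "is", "the", "reply."], spoken.append, stop),
)
t.start()
t.join()  # the real script's main thread instead blocks waiting for Enter
```

An `Event` is used rather than killing the thread because Python threads cannot be forcibly terminated safely; cooperative checking between chunks is the idiomatic pattern.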
- Improve UI/UX with a web-based interface using Flask or React.
- Add support for multiple languages in speech-to-text and text-to-speech.
- Use OpenAI's Whisper model for better transcription accuracy.
- Optimize real-time response processing.
This project demonstrates how voice interaction can enhance AI-powered chatbots. By combining Speech Recognition, LLM-based text generation, and Text-to-Speech, the chatbot provides a seamless and intuitive user experience.