This project implements a lightweight Retrieval-Augmented Generation (RAG) pipeline using Hugging Face Transformers (powered by PyTorch), scikit-learn, and Python. The pipeline:
- Retrieves relevant context from a corpus using semantic similarity
- Generates coherent responses using GPT-2 Large
- Ensures complete, well-formatted sentences
- Removes duplicate content and repeated phrases
- Handles errors gracefully
The implementation aims for both accuracy and response quality through careful parameter tuning and post-processing. As with any language model, however, outputs may occasionally diverge from the expected response or include hallucinated content.
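For illustration, below is a minimal sketch of the retrieve-then-generate flow described above. The embedding model (`all-MiniLM-L6-v2`), the toy corpus, the prompt format, and the post-processing heuristic are assumptions chosen for the example; they are not necessarily what `lightweight_rag.py` does.

```python
"""Illustrative sketch only -- model names, corpus, and post-processing are assumptions."""
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import pipeline

# Toy corpus standing in for the real document collection (assumption).
corpus = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Photosynthesis converts light energy into chemical energy in plants.",
    "GPT-2 is a transformer-based language model released by OpenAI.",
]

# 1. Retrieval: embed the corpus and the query, rank by cosine similarity.
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
corpus_embeddings = embedder.encode(corpus)

def retrieve(query: str, top_k: int = 1) -> list[str]:
    query_embedding = embedder.encode([query])
    scores = cosine_similarity(query_embedding, corpus_embeddings)[0]
    top_indices = scores.argsort()[::-1][:top_k]
    return [corpus[i] for i in top_indices]

# 2. Generation: condition GPT-2 Large on the retrieved context.
generator = pipeline("text-generation", model="gpt2-large")

def generate(query: str) -> str:
    context = " ".join(retrieve(query))
    prompt = f"Context: {context}\nQuestion: {query}\nAnswer:"
    output = generator(
        prompt,
        max_new_tokens=80,
        do_sample=True,
        temperature=0.7,
        repetition_penalty=1.3,  # discourages repeated phrases
        pad_token_id=generator.tokenizer.eos_token_id,
    )[0]["generated_text"]
    answer = output[len(prompt):].strip()
    # 3. Post-processing: crude heuristic that keeps only complete sentences.
    if "." in answer:
        answer = answer[: answer.rfind(".") + 1]
    return answer

if __name__ == "__main__":
    print(generate("Where is the Eiffel Tower?"))
```

In this sketch, `repetition_penalty` and the final truncation step loosely mirror the goals listed above (removing repeated phrases and ensuring complete sentences), but the actual script may implement these differently.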
Follow the steps below to get the project initialised, configured, and its dependencies installed.
- Clone the repository:

      git clone <repository-url>
      cd <repository-directory>

- Create and activate a virtual environment:

      python3 -m venv rag_env
      source rag_env/bin/activate

- Install the necessary dependencies:

      pip install transformers scikit-learn torch sentence-transformers
Run the RAG pipeline with a sample query:
    python lightweight_rag.py
You may see an OpenSSL warning when running the script. It can be safely ignored; alternatively, switch to a Python distribution built against the required OpenSSL version, recompile Python against it, or pin urllib3 to an older (pre-2.0) release.