Skip to content

sprnjt/mini-rag

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Mini-RAG: Simple Retrieval-Augmented Generation Pipeline

A lightweight RAG system that retrieves relevant document chunks and generates grounded answers using LLMs.

Features

  • Document Chunking: Splits documents into meaningful segments
  • Semantic Embeddings: Uses sentence-transformers/all-MiniLM-L6-v2
  • FAISS Vector Search: Fast similarity search
  • Grounded Generation: LLM answers strictly based on retrieved content
  • Multiple LLM Options: OpenRouter API or Local HuggingFace models

Project Structure

mini-rag/
├── documents/              # Your documents (.txt or .md files)
├── src/
│   ├── chunker.py          # Document chunking
│   ├── embedder.py         # Embedding generation
│   ├── indexer.py          # FAISS vector indexing
│   ├── retriever.py        # Semantic retrieval
│   ├── generator.py        # Local LLM generator
│   ├── generator_openrouter.py  # OpenRouter API generator
│   └── rag_pipeline.py     # Main orchestrator
├── main.py                 # CLI interface
├── requirements.txt
└── README.md

Installation

cd mini-rag

# Create virtual environment (recommended)
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Linux/Mac

# Install dependencies
pip install -r requirements.txt

Option 1: Run with OpenRouter LLM (Recommended)

OpenRouter provides access to various LLMs via API. Free models available!

Step 1: Get API Key

  1. Go to https://openrouter.ai/keys
  2. Sign up / Login
  3. Create a new API key
  4. Copy the key

Step 2: Set API Key

PowerShell:

$env:OPENROUTER_API_KEY = "sk-or-v1-your-key-here"

Command Prompt:

set OPENROUTER_API_KEY=sk-or-v1-your-key-here

Linux/Mac:

export OPENROUTER_API_KEY="sk-or-v1-your-key-here"

Step 3: Run RAG

# Single query
python main.py --query "What packages does Indecimal offer?" --use-openrouter

# Interactive mode
python main.py --interactive --use-openrouter

# Use different model
python main.py --query "Your question" --use-openrouter --openrouter-model "meta-llama/llama-3.2-3b-instruct:free"

Available Free Models on OpenRouter:

Model ID
Google Gemma 3n google/gemma-3n-e2b-it (default)
Meta Llama 3.2 3B meta-llama/llama-3.2-3b-instruct:free
Mistral 7B mistralai/mistral-7b-instruct:free
Microsoft Phi-3 Mini microsoft/phi-3-mini-128k-instruct:free

Option 2: Run with HuggingFace Local LLM

Run LLMs locally on your machine. Requires more RAM (~4GB+).

Step 1: Run RAG with Local LLM

# Single query (uses TinyLlama by default)
python main.py --query "What packages does Indecimal offer?" --use-local-llm

# Interactive mode
python main.py --interactive --use-local-llm

# Use different local model
python main.py --query "Your question" --use-local-llm --llm-model "microsoft/phi-2"

Recommended Local Models:

Model Size RAM Required
TinyLlama-1.1B-Chat (default) 1.1B ~3GB
Microsoft Phi-2 2.7B ~6GB
Mistral-7B-Instruct 7B ~16GB

First Run Note:

The first run will download the model (~2-4GB). This only happens once.


Option 3: Run without LLM (Simple Mode)

For testing without LLM - returns raw retrieved context.

python main.py --query "What packages does Indecimal offer?"
python main.py --interactive

CLI Options

Flag Description
-q, --query Question to ask
-i, --interactive Interactive mode
-d, --docs Documents directory (default: documents)
-k, --top-k Number of chunks to retrieve (default: 3)
--use-openrouter Use OpenRouter API
--openrouter-model OpenRouter model ID
--openrouter-key API key (or use env var)
--use-local-llm Use local HuggingFace model
--llm-model Local model name
--chunk-size Chunk size in characters (default: 500)

Example Output

======================================================================
QUERY: What packages does Indecimal offer?
======================================================================

RETRIEVED CONTEXT (Top 3 chunks):
----------------------------------------------------------------------

[1] Source: doc2.md (Score: 0.85)
    Package Pricing: Essential ₹1,851/sqft, Premier ₹1,995/sqft...

[2] Source: doc1.md (Score: 0.78)
    Indecimal provides end-to-end home construction support...

======================================================================
GENERATED ANSWER:
======================================================================

Based on the provided documents, Indecimal offers four packages:
1. Essential: ₹1,851/sqft
2. Premier (Most Popular): ₹1,995/sqft
3. Infinia: ₹2,250/sqft
4. Pinnacle: ₹2,450/sqft

----------------------------------------------------------------------
⏱ Retrieval: 25.3ms | Generation: 1234.5ms
----------------------------------------------------------------------

Adding Your Own Documents

  1. Place .txt or .md files in the documents/ folder
  2. Run the pipeline - documents are automatically indexed
python main.py --query "Your question about your documents" --use-openrouter

License

MIT License

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages