Mini-RAG: Simple Retrieval-Augmented Generation Pipeline

A lightweight RAG system that retrieves relevant document chunks and generates grounded answers using LLMs.

Features

Document Chunking: Splits documents into meaningful segments
Semantic Embeddings: Uses sentence-transformers/all-MiniLM-L6-v2
FAISS Vector Search: Fast similarity search
Grounded Generation: LLM answers strictly based on retrieved content
Multiple LLM Options: OpenRouter API or Local HuggingFace models

Project Structure

mini-rag/
├── documents/              # Your documents (.txt or .md files)
├── src/
│   ├── chunker.py          # Document chunking
│   ├── embedder.py         # Embedding generation
│   ├── indexer.py          # FAISS vector indexing
│   ├── retriever.py        # Semantic retrieval
│   ├── generator.py        # Local LLM generator
│   ├── generator_openrouter.py  # OpenRouter API generator
│   └── rag_pipeline.py     # Main orchestrator
├── main.py                 # CLI interface
├── requirements.txt
└── README.md

Installation

cd mini-rag

# Create virtual environment (recommended)
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Linux/Mac

# Install dependencies
pip install -r requirements.txt

Option 1: Run with OpenRouter LLM (Recommended)

OpenRouter provides access to various LLMs via API. Free models available!

Step 1: Get API Key

Go to https://openrouter.ai/keys
Sign up / Login
Create a new API key
Copy the key

Step 2: Set API Key

PowerShell:

$env:OPENROUTER_API_KEY = "sk-or-v1-your-key-here"

Command Prompt:

set OPENROUTER_API_KEY=sk-or-v1-your-key-here

Linux/Mac:

export OPENROUTER_API_KEY="sk-or-v1-your-key-here"

Step 3: Run RAG

# Single query
python main.py --query "What packages does Indecimal offer?" --use-openrouter

# Interactive mode
python main.py --interactive --use-openrouter

# Use different model
python main.py --query "Your question" --use-openrouter --openrouter-model "meta-llama/llama-3.2-3b-instruct:free"

Available Free Models on OpenRouter:

Model	ID
Google Gemma 3n	`google/gemma-3n-e2b-it` (default)
Meta Llama 3.2 3B	`meta-llama/llama-3.2-3b-instruct:free`
Mistral 7B	`mistralai/mistral-7b-instruct:free`
Microsoft Phi-3 Mini	`microsoft/phi-3-mini-128k-instruct:free`

Option 2: Run with HuggingFace Local LLM

Run LLMs locally on your machine. Requires more RAM (~4GB+).

Step 1: Run RAG with Local LLM

# Single query (uses TinyLlama by default)
python main.py --query "What packages does Indecimal offer?" --use-local-llm

# Interactive mode
python main.py --interactive --use-local-llm

# Use different local model
python main.py --query "Your question" --use-local-llm --llm-model "microsoft/phi-2"

Recommended Local Models:

Model	Size	RAM Required
TinyLlama-1.1B-Chat (default)	1.1B	~3GB
Microsoft Phi-2	2.7B	~6GB
Mistral-7B-Instruct	7B	~16GB

First Run Note:

The first run will download the model (~2-4GB). This only happens once.

Option 3: Run without LLM (Simple Mode)

For testing without LLM - returns raw retrieved context.

python main.py --query "What packages does Indecimal offer?"
python main.py --interactive

CLI Options

Flag	Description
`-q, --query`	Question to ask
`-i, --interactive`	Interactive mode
`-d, --docs`	Documents directory (default: `documents`)
`-k, --top-k`	Number of chunks to retrieve (default: 3)
`--use-openrouter`	Use OpenRouter API
`--openrouter-model`	OpenRouter model ID
`--openrouter-key`	API key (or use env var)
`--use-local-llm`	Use local HuggingFace model
`--llm-model`	Local model name
`--chunk-size`	Chunk size in characters (default: 500)

Example Output

======================================================================
QUERY: What packages does Indecimal offer?
======================================================================

RETRIEVED CONTEXT (Top 3 chunks):
----------------------------------------------------------------------

[1] Source: doc2.md (Score: 0.85)
    Package Pricing: Essential ₹1,851/sqft, Premier ₹1,995/sqft...

[2] Source: doc1.md (Score: 0.78)
    Indecimal provides end-to-end home construction support...

======================================================================
GENERATED ANSWER:
======================================================================

Based on the provided documents, Indecimal offers four packages:
1. Essential: ₹1,851/sqft
2. Premier (Most Popular): ₹1,995/sqft
3. Infinia: ₹2,250/sqft
4. Pinnacle: ₹2,450/sqft

----------------------------------------------------------------------
⏱ Retrieval: 25.3ms | Generation: 1234.5ms
----------------------------------------------------------------------

Adding Your Own Documents

Place .txt or .md files in the documents/ folder
Run the pipeline - documents are automatically indexed

python main.py --query "Your question about your documents" --use-openrouter

License

MIT License

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Mini-RAG: Simple Retrieval-Augmented Generation Pipeline

Features

Project Structure

Installation

Option 1: Run with OpenRouter LLM (Recommended)

Step 1: Get API Key

Step 2: Set API Key

Step 3: Run RAG

Available Free Models on OpenRouter:

Option 2: Run with HuggingFace Local LLM

Step 1: Run RAG with Local LLM

Recommended Local Models:

First Run Note:

Option 3: Run without LLM (Simple Mode)

CLI Options

Example Output

Adding Your Own Documents

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
documents		documents
src		src
.gitignore		.gitignore
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

Mini-RAG: Simple Retrieval-Augmented Generation Pipeline

Features

Project Structure

Installation

Option 1: Run with OpenRouter LLM (Recommended)

Step 1: Get API Key

Step 2: Set API Key

Step 3: Run RAG

Available Free Models on OpenRouter:

Option 2: Run with HuggingFace Local LLM

Step 1: Run RAG with Local LLM

Recommended Local Models:

First Run Note:

Option 3: Run without LLM (Simple Mode)

CLI Options

Example Output

Adding Your Own Documents

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages