📘 RAG System Design Guide
The single-page reference for designing, evaluating, and operating production RAG systems.
Table of Contents
- RAG Ecosystem
- The Problem
- What It Does
- Demo
- Built On
- Quickstart
- Architecture
- Run Locally
- Deploy Your Own
- Why This Is Different
RAG Ecosystem
This repo is part of a broader RAG toolkit:
| Repo | What it covers |
|---|---|
| rag-auditor | Evaluate your RAG pipeline |
| multi-llm-rag-agent-chat | Build a production RAG chatbot with multi-LLM routing |
| rag-system-design-guide ← you are here | Design reference — architecture patterns and trade-offs |
Start with the design guide, build with the chatbot, evaluate with the auditor.
The Problem
Most RAG explainers stop at isolated concepts.
You can find plenty of material on chunking, embeddings, or vector search — but almost nothing that connects those decisions to evaluation, observability, security, and production operations. When you're designing a real system, that missing connective tissue is exactly what matters.
This guide puts the full picture in one place.
What It Does
Input: A team or individual planning, reviewing, or debugging a RAG architecture
Output: A decision-oriented guide covering what to build, what to avoid, and how to run it in production
| Part | Focus | Highlights |
|---|---|---|
| Part I | Foundations | Foundation models, LLM pitfalls, how RAG works, RAG vs. prompt engineering vs. fine-tuning |
| Part II | System Design | Problem framing, failure scenarios, ingestion, chunking, embeddings, search, retrieval, reranking, prompting, generation, hallucination reduction |
| Part III | Operations & Architecture | Evaluation metrics, observability, scaling, Kubernetes, security, enterprise RAG architecture |
| Part IV | Advanced Topics | RAG vs. MCP vs. AI agents, HyDE, CRAG, Self-RAG, Adaptive RAG, GraphRAG, multi-modal RAG, guardrails, agentic RAG |
| Appendices | Practical Reference | The 2026 RAG Developer Stack and recommended tools |
Demo
- Live site: amitgambhir.github.io/rag-system-design-guide
- Source guide: RAG System Design - Complete Q&A Guide.md
Open the site and you land on a single-page reference with:
Part I → Foundations
Part II → System Design
Part III → Operations & Architecture
Part IV → Advanced Topics
Plus:
- Design pitfalls & best practices
- The 2026 RAG Developer Stack
- Recommended tools & technologies
Built On
| Technology | Role |
|---|---|
| Markdown | Keeps the guide easy to edit and version |
| MkDocs Material | Gives the site a clean docs layout, navigation, and search |
| GitHub Pages | Hosts the published documentation site |
| GitHub Actions | Deploys the site automatically on every push to main |
Site config lives in mkdocs.yml. The deploy workflow is .github/workflows/deploy-pages.yml. The docs/ directory contains symlinks back to the root Markdown files so content stays in one place.
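A minimal sketch of that symlink layout, run in a throwaway directory so it touches nothing real. The mapping of docs/index.md to README.md and docs/guide.md to the Q&A guide is an assumption based on the file names in this README, and the stand-in file contents are placeholders.

```shell
# Sketch only: recreate the docs/ symlink layout in a scratch directory.
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# Stand-ins for the two root Markdown files (placeholder content).
printf '# README (stand-in)\n' > README.md
printf '# Guide (stand-in)\n' > "RAG System Design - Complete Q&A Guide.md"

mkdir -p docs
# Relative targets are resolved from docs/, so the links stay valid
# wherever the repository is cloned.
ln -s ../README.md docs/index.md
ln -s "../RAG System Design - Complete Q&A Guide.md" docs/guide.md

# Reading through the links returns the root files' content.
cat docs/index.md
cat docs/guide.md
```

Because the links are relative, editing a root file changes what the docs build sees with no copy step in between.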
Quickstart
git clone https://github.com/amitgambhir/rag-system-design-guide.git
cd rag-system-design-guide
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
mkdocs serve
Then open http://127.0.0.1:8000/.
Architecture
README.md + RAG System Design - Complete Q&A Guide.md ← source of truth
│
▼
docs/index.md + docs/guide.md ← symlinks, not copies
│
▼
mkdocs.yml ← site config
│
▼
GitHub Actions on push to main ← CI/CD
│
▼
gh-pages branch deploy
│
▼
Published GitHub Pages site
Run Locally
Deploy Your Own
- Create a new GitHub repository.
- Push this folder's contents to the main branch.
- In GitHub, open Settings → Pages.
- Set the source to Deploy from a branch and choose gh-pages → / (root).
- Push to main. The workflow in .github/workflows/deploy-pages.yml will publish the site.
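If you want to publish without waiting for the workflow, a manual equivalent might look like the sketch below. It is an assumption about what a gh-pages deploy typically involves, not a copy of deploy-pages.yml. The run helper and the DRY_RUN variable are hypothetical names introduced for illustration; mkdocs build --strict and mkdocs gh-deploy --force are standard MkDocs commands.

```shell
# Hypothetical manual equivalent of the deploy workflow (an assumption,
# not a copy of deploy-pages.yml). Dry-run by default: commands are
# printed, not executed. Set DRY_RUN=0 inside a real clone to run them.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

run pip install -r requirements.txt   # same dependencies as the Quickstart
run mkdocs build --strict             # abort the build on any warning
run mkdocs gh-deploy --force          # build and push the site to gh-pages
```

The dry-run default lets you read the command sequence before anything is pushed. The gh-deploy step needs a clone with a configured Git remote.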
Why This Is Different
- Focuses on system design trade-offs, not just definitions.
- Connects retrieval, generation, and operations in one continuous reference.
- Covers both foundations and production realities: evals, observability, security, and scaling.
License
Released under the MIT License.
I wrote this as the reference I wish I had when turning RAG ideas into production architecture.