Production RAG System with Hybrid Search & Evaluation Framework
Production-ready RAG system using FAISS, BM25, Docker, and evaluation metrics.
A production-ready RAG system that combines semantic vector search (FAISS) with keyword matching (BM25) to deliver high-quality, citation-backed answers. Built with comprehensive evaluation metrics, monitoring, and one-command Docker deployment.
Production RAG System with Hybrid Search & Evaluation Framework
Enterprise-grade Retrieval-Augmented Generation (RAG) system combining semantic vector search and keyword retrieval for accurate, citation-backed responses.
Overview
This project implements a production-ready RAG pipeline using:
- FAISS vector search
- BM25 keyword search
- OpenAI LLM integration
- Automated evaluation framework
- Docker-based deployment
The system retrieves relevant context from ArXiv ML/AI papers and generates grounded answers with source citations.
Key Features
- Hybrid retrieval (FAISS + BM25) + Reranking
- Citation-backed answer generation
- <2s query latency
- Automated evaluation metrics
- Docker deployment support
- FastAPI
- Monitoring and performance tracking
Results
- Indexed 7,700+ documents
- Achieved 94% faithfulness
- Average latency: ~1.5s
- Low-cost inference using GPT-4.1-mini
- Improved retrieval precision using hybrid search and reranking
Tech Stack
- Python
- FastAPI
- FAISS
- BM25
- SentenceTransformers
- OpenAI API
- Docker
- PyTorch
Applications
- Enterprise knowledge assistants
- AI-powered search engines
- Research paper QA systems
- Internal documentation retrieval
- Customer support automation