Retrieval-Augmented Generation (RAG)
A technique that combines LLMs with retrieval of external information to ground responses in facts.
Retrieval-Augmented Generation is a technique where an LLM is paired with a retrieval system (such as a vector database or search index) that pulls in relevant context before the model generates a response. Grounding the model in retrieved text can substantially improve factual accuracy, and it lets LLMs answer questions about proprietary or recent information that was not in their training data.
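The retrieve-then-generate flow can be sketched in a few lines of Python. This is a minimal illustration, not a production design: it assumes a simple word-count cosine similarity in place of learned vector embeddings, and the names `retrieve`, `build_prompt`, and `DOCS` are hypothetical. The call to an actual LLM is left out; the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents that share vocabulary with the query, best first.

    Assumption: word counts stand in for the embeddings a real system would use.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Place the retrieved passages ahead of the question for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical knowledge base, invented for illustration.
DOCS = [
    "Refunds are available within 30 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes 3 to 5 business days.",
]

if __name__ == "__main__":
    question = "How long does shipping take?"
    print(build_prompt(question, retrieve(question, DOCS)))
```

In a real deployment the word-count scoring would typically be replaced by an embedding model and a vector database, but the shape of the pipeline (score, select the top passages, prepend them to the question) stays the same.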
RAG is the foundation of many production LLM applications, such as chatbots over company documentation, support assistants, and research tools.
A law firm's AI assistant uses RAG: when asked a legal question, it retrieves relevant sections from the firm's case database, then asks the LLM to answer using that retrieved context. Because each answer is based on retrieved passages, responses can be traced back to source documents and checked against them.
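One way to make answers traceable is to tag each retrieved passage with a source identifier and ask the model to cite those identifiers. The sketch below assumes that convention; the `Passage` class, the helper names, the case IDs, and the sample answer string are all hypothetical and stand in for a real case database and a real LLM response.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source_id: str  # hypothetical identifier of the source document
    text: str


def build_cited_prompt(question, passages):
    """Build a prompt that labels every passage with its source id."""
    context = "\n".join(f"[{p.source_id}] {p.text}" for p in passages)
    return (
        "Answer using only the passages below and cite the ids you rely on "
        "in square brackets.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


def cited_sources(answer, passages):
    """Return the ids of the passages that the answer cites as [id]."""
    return [p.source_id for p in passages if f"[{p.source_id}]" in answer]


# Invented passages, for illustration only.
passages = [
    Passage("case-001", "The lease required 60 days written notice."),
    Passage("case-002", "Notice sent by email was held to be valid."),
]

if __name__ == "__main__":
    print(build_cited_prompt("Is email notice valid?", passages))
    # Stand-in for an LLM response; a real system would call a model here.
    answer = "Email notice was held to be valid [case-002]."
    print(cited_sources(answer, passages))
```

Checking the cited ids against the retrieved passages gives a reviewer a direct path from each answer back to the documents it relied on.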
Related terms
Large Language Model (LLM): A neural network trained on vast text data to understand and generate human language.
AI Agent: An AI system that can autonomously plan and execute multi-step tasks to achieve a goal.
Vector Database: A database optimized for storing and searching high-dimensional vector embeddings.