What Is Retrieval-Augmented Generation (RAG)?

Retrieval‑Augmented Generation (RAG) is an architecture in which a retriever first pulls relevant documents from a knowledge base, and a generator (usually an LLM) then answers the query using those documents as context.
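
The retrieve-then-generate flow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the in-memory KNOWLEDGE_BASE, the word-overlap scoring in retrieve (a stand-in for a real search index or vector store), and the generate placeholder, which only echoes the prompt where a real system would call an LLM.

```python
import re
from collections import Counter

# Hypothetical in-memory knowledge base; a real system would query
# a search index or vector store instead.
KNOWLEDGE_BASE = [
    "Employees accrue 20 vacation days per year.",
    "Expense reports are due by the fifth business day of each month.",
    "The VPN client must be updated every quarter.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, k=2):
    """Return up to k documents that share at least one token with the query.

    Toy lexical scoring (shared-token count), assumed here only to keep
    the example self-contained.
    """
    query_tokens = Counter(tokenize(query))
    scored = []
    for doc in documents:
        doc_tokens = Counter(tokenize(doc))
        overlap = sum((query_tokens & doc_tokens).values())
        scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, context_docs):
    """Put the retrieved documents into the prompt as numbered sources."""
    numbered = "\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(context_docs, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"Sources:\n{numbered}\n"
        f"Question: {query}\n"
    )


def generate(prompt):
    """Placeholder for an LLM call; it only echoes the prompt it was given."""
    return "(model output would go here)\n" + prompt


def answer(query):
    """Run the two stages in order: retrieve first, then generate."""
    docs = retrieve(query, KNOWLEDGE_BASE)
    return generate(build_prompt(query, docs))


if __name__ == "__main__":
    print(answer("How many vacation days do employees get?"))
```

The point of the sketch is the ordering: the generator never sees the whole knowledge base, only the documents the retriever selected for this query.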

The goals are to:

  • Ground answers in fresh, organization‑specific data.
  • Reduce hallucinations by grounding answers in retrieved passages that the model can cite as sources.
  • Keep sensitive data in your own store instead of training it into the model weights.

For architecture trade‑offs, evaluation metrics, and testing guidance, return to the RAG Systems pillar page.