RAG (Retrieval-Augmented Generation)

Definition: RAG (Retrieval-Augmented Generation) is an AI architecture that combines semantic search over a knowledge base with text generation by LLMs, enabling up-to-date answers grounded in specific sources.

RAG is the technology behind Perplexity, ChatGPT Search, Google AI Overviews, and most AI systems connected to the web. Instead of relying only on what it learned during training, the LLM consults documents at query time and cites its sources in the answer.

How it works

  1. User asks a question.
  2. System converts the question into an embedding (numeric vector).
  3. Semantic search in a vector index finds the most relevant chunks.
  4. Retrieved chunks are injected into the LLM's context.
  5. LLM generates an answer citing the sources.
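
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product implements RAG: the `CHUNKS` knowledge base, its source paths, and the function names are hypothetical, the word-count "embedding" stands in for a real embedding model, and the final LLM call is left out, so the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base: (source, chunk text) pairs.
CHUNKS = [
    ("docs/rag.html", "RAG combines semantic search over a knowledge base with text generation."),
    ("docs/llm.html", "An LLM is a model trained on vast amounts of text."),
    ("docs/chunking.html", "Semantic chunking divides content into self-contained blocks."),
]

def embed(text):
    """Step 2 (toy version): word counts instead of a learned embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, k=2):
    """Step 3: rank chunks by similarity to the question and keep the top k."""
    q = embed(question)
    ranked = sorted(CHUNKS, key=lambda c: cosine(q, embed(c[1])), reverse=True)
    return ranked[:k]

def build_prompt(question, retrieved):
    """Step 4: inject the retrieved chunks, numbered so the answer can cite them."""
    context = "\n".join(
        f"[{i}] ({src}) {text}" for i, (src, text) in enumerate(retrieved, 1)
    )
    return (
        "Answer using only the sources below and cite them as [n].\n\n"
        f"{context}\n\nQuestion: {question}"
    )

if __name__ == "__main__":
    question = "What is semantic chunking?"
    # Step 5 would pass this prompt to an LLM; here it is only printed.
    print(build_prompt(question, retrieve(question)))
```

A production system would replace `embed` with a real embedding model and the linear scan in `retrieve` with a vector index, but the flow from question to cited context stays the same.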

Implications for SEO/GEO

To be retrieved by a RAG system, your content needs to be:

  • Chunkable: made of self-contained passages of 200-500 words that make sense on their own.
  • Semantically clear: one central idea per paragraph.
  • Indexable: accessible HTML, no heavy JavaScript dependency.
  • Citable: published by a source with enough authority, since RAG systems typically weigh source authority before citing.
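
To make the "chunkable" criterion concrete, here is a rough sketch of how a pipeline might pack paragraphs into passages of about 200-500 words. The function name, its parameters, and the greedy strategy are assumptions for illustration only. Real chunkers differ, and many split on headings or sentence boundaries instead.

```python
from typing import List

def chunk_paragraphs(text: str, min_words: int = 200, max_words: int = 500) -> List[str]:
    """Greedily pack blank-line-separated paragraphs into passages.

    Simplifications: a single paragraph longer than max_words is kept whole,
    and a short final passage is merged into the previous one even if that
    pushes it past max_words.
    """
    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        n = len(para.split())
        # Close the current passage if this paragraph would overflow it.
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        tail = "\n\n".join(current)
        if chunks and count < min_words:
            # Avoid a tiny trailing passage with too little context.
            chunks[-1] += "\n\n" + tail
        else:
            chunks.append(tail)
    return chunks
```

Content already written as one idea per paragraph survives this kind of splitting well, because each passage still reads as a complete thought after it is cut from the page.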

Related terms

  • LLM (Large Language Model)

    LLM (Large Language Model) is an AI model trained on vast amounts of text to understand, generate, and reason about natu…

  • GEO (Generative Engine Optimization)

    GEO (Generative Engine Optimization) is the discipline of optimizing content and digital entities to be understood, cite…

  • Semantic Chunking

    Semantic chunking is the technique of dividing content into coherent, self-contained blocks optimized for retrieval by R…