# How RAG Works

> How A2V2.ai grounds every answer in your own content — so your agent cites its sources instead of guessing.

A visitor asks your agent, "What's your refund window for opened products?" A
generic chatbot would improvise a plausible-sounding policy from whatever it
learned during training. Your A2V2.ai agent does something different: it looks up
the exact passage in *your* returns policy, answers from that, and shows you which
source it used.

That difference is **Retrieval-Augmented Generation (RAG)** — the technique behind
every answer your agent gives. This page explains how it works and why it makes
your agent trustworthy. It's background reading; you don't have to configure any of
it.

## Why grounding matters

Large language models are fluent, but on their own they have two problems for
business use:

* **They don't know your specifics.** A model wasn't trained on your pricing, your
  onboarding steps, or last week's policy update.
* **They guess confidently.** Asked something they don't know, models tend to
  *hallucinate* — produce an answer that sounds right but isn't.

RAG fixes both. Instead of asking the model to answer from memory, A2V2.ai first
**retrieves** the most relevant passages from your [knowledge
base](/knowledge-base/overview), then asks the model to answer **using only those
passages**. The model becomes a writer working from your source material, not an
oracle inventing facts.

The payoff:

* **Accurate, on-brand answers** drawn from content you control.
* **Citations** — every answer can show the sources behind it.
* **Fewer hallucinations** — when your content doesn't cover a question, the agent
  is instructed to say so rather than make something up.
* **Fast updates.** Change a source and the agent answers from the new version as
  soon as the source finishes processing. There's no slow "retraining" of the model
  itself.
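
As a rough illustration of the retrieve-then-answer idea above, the sketch below assembles a prompt that restricts the model to retrieved passages. The `Passage` type, the `build_grounded_prompt` function, the prompt wording, and the sample policy text are all hypothetical and invented for this example; they are not A2V2.ai's actual prompt or API.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical label for where the passage came from
    text: str


def build_grounded_prompt(question: str, passages: list[Passage]) -> str:
    """Assemble a prompt that limits the model to the retrieved passages."""
    # Number each passage so an answer can point back to its source.
    context = "\n".join(
        f"[{i}] ({p.source}) {p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the passages below. "
        "If they do not contain the answer, say you don't have that information.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up passage text, for illustration only.
passages = [
    Passage("returns-policy.pdf", "Opened products can be returned within 14 days."),
]
prompt = build_grounded_prompt(
    "What's your refund window for opened products?", passages
)
print(prompt)
```

Numbering the passages is one simple way to let an answer carry citations: the model can refer to `[1]`, and the application can map that back to the source label.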

## How it works

There are two halves to RAG: getting your content *ready* to search (which happens
once, when you add a source), and *using* it to answer (which happens on every
question).

### Part 1 — Preparing your content

When you add a source — a file, a website, a YouTube video, or a Q\&A pair —
A2V2.ai doesn't just file it away. It transforms it into something searchable:

<Steps>
  <Step title="Extract">
    The text is pulled out of your file or URL — paragraphs from a PDF, the
    transcript from a video, the body of a web page.
  </Step>

  <Step title="Chunk">
    That text is split into small, self-contained passages. Chunking matters
    because a question usually only needs one paragraph of a 40-page handbook —
    retrieving small passages lets the agent find and use exactly the right part.
  </Step>

  <Step title="Embed">
    Each passage is converted into an **embedding** — a list of numbers that
    captures its *meaning*, not just its words. Passages about "cancelling a plan"
    and "ending my subscription" end up close together, even though they share no
    keywords.
  </Step>

  <Step title="Index">
    The embeddings are stored in a search index that's private to that agent. This
    is the step that lets the agent later find the right passage for any question
    in a fraction of a second.
  </Step>
</Steps>

A source moves through statuses while this runs — from unprocessed, to
**processing**, to **Completed**. Only **Completed** sources are used in answers,
so an agent never quotes a half-processed document. (See [Manage
sources](/knowledge-base/manage-sources) for handling failed or stuck sources.)
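
To make the Chunk and Embed steps above more concrete, here is a minimal sketch under stated assumptions. `chunk_text` packs whole paragraphs into passages up to a size limit, which is only one of many possible chunking strategies. The three-number vectors are hand-written stand-ins for what an embedding model would produce; real embeddings have far more dimensions and come from a trained model. None of these names or values come from A2V2.ai.

```python
import math


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into passages, packing whole paragraphs up to max_chars each."""
    chunks: list[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = paragraph  # an oversized paragraph becomes its own chunk
    if current:
        chunks.append(current)
    return chunks


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-written stand-in vectors; an embedding model would normally produce these.
embeddings = {
    "cancelling a plan": [0.9, 0.1, 0.0],
    "ending my subscription": [0.8, 0.2, 0.1],
    "shipping times": [0.0, 0.1, 0.9],
}

handbook = "Plans renew monthly.\n\nYou can cancel at any time.\n\nShipping takes 3-5 days."
chunks = chunk_text(handbook, max_chars=60)

close = cosine_similarity(
    embeddings["cancelling a plan"], embeddings["ending my subscription"]
)
far = cosine_similarity(embeddings["cancelling a plan"], embeddings["shipping times"])
print(chunks)
print(f"related: {close:.2f}, unrelated: {far:.2f}")
```

With these stand-in vectors, the two cancellation phrases score much higher against each other than either does against the shipping phrase, which is the "close together" property the Embed step describes.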

### Part 2 — Answering a question

When a visitor sends a message, the agent runs the *retrieval* and *generation*
steps that give RAG its name:

<Steps>
  <Step title="Understand the question">
    The agent first works out what the visitor actually wants — a real question to
    answer, versus a greeting or something unclear. Greetings get a friendly reply
    without a knowledge-base lookup.
  </Step>

  <Step title="Retrieve the best passages">
    For a real question, the agent searches your knowledge base and pulls the
    passages most likely to contain the answer. A2V2.ai uses **both** meaning-based
    search (finding passages that are *about* the same thing) and keyword search
    (catching exact terms, product names, and codes), then runs a final relevance
    ranking pass to put the strongest passages first. Combining the two means the
    agent finds the right content whether the visitor phrases things your way or
    their own.
  </Step>

  <Step title="Generate a grounded answer">
    The top passages are handed to the AI model as context, along with the recent
    conversation. The model writes a natural-language answer based on that
    context — and A2V2.ai records which sources it drew from.
  </Step>

  <Step title="Stream and cite">
    The answer streams back to the visitor word by word, with the sources behind it
    available to inspect.
  </Step>
</Steps>
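
The retrieval step above combines a meaning-based ranking with a keyword ranking. One common way to merge two ranked lists is reciprocal rank fusion, sketched below. This is an illustrative assumption: this page does not say which fusion or ranking method A2V2.ai uses, and the passage IDs and the constant `k = 60` are hypothetical choices for the example.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of passage IDs into one ranking.

    Each passage earns 1 / (k + rank) from every list it appears in,
    so passages ranked well by more than one search rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] = scores.get(passage_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: scores[pid], reverse=True)


# Hypothetical results from the two searches for the same question.
semantic_ranking = ["returns-policy", "warranty", "shipping"]
keyword_ranking = ["sku-list", "returns-policy", "warranty"]

fused = reciprocal_rank_fusion([semantic_ranking, keyword_ranking])
print(fused)
```

In this example `returns-policy` ends up first because both searches rank it highly, even though the keyword search on its own put a different passage at the top.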

### What happens when your content doesn't cover the question

This is where grounding earns its keep. If the search turns up nothing relevant,
the agent isn't handed any context to work from — and it's explicitly instructed to
tell the visitor it doesn't have that information, rather than invent an answer.

That's usually a *feature*, not a failure: it keeps your agent honest. When you see
an agent say it can't help with something, the fix is almost always to [add a
source](/knowledge-base/overview) that covers the topic.
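
The no-context behaviour can be pictured with a small sketch. It simplifies things by returning a fixed fallback message without calling the model at all, whereas the description above has the model itself instructed to decline. The relevance threshold, the `FALLBACK` text, and the `generate` callable are hypothetical.

```python
from typing import Callable

FALLBACK = "I don't have that information."


def select_context(
    scored_passages: list[tuple[str, float]], min_score: float = 0.5
) -> list[str]:
    """Keep only passages whose relevance score clears the threshold."""
    return [text for text, score in scored_passages if score >= min_score]


def answer(
    question: str,
    scored_passages: list[tuple[str, float]],
    generate: Callable[[str, list[str]], str],
) -> str:
    """Answer from context when there is some; otherwise decline."""
    context = select_context(scored_passages)
    if not context:
        return FALLBACK  # nothing relevant was retrieved, so do not guess
    return generate(question, context)


def fake_generate(question: str, context: list[str]) -> str:
    # Stand-in for the model call: simply echoes the first passage.
    return f"Based on our docs: {context[0]}"


covered = answer("Can I cancel?", [("You can cancel at any time.", 0.8)], fake_generate)
uncovered = answer("Do you ship to Mars?", [("You can cancel at any time.", 0.1)], fake_generate)
print(covered)
print(uncovered)
```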

## What this means for you

Understanding RAG changes how you get the most from your agent:

* **Answer quality follows source quality.** The agent can only answer from what
  you give it. Clear, well-structured, up-to-date sources produce clear, accurate
  answers. Vague or contradictory sources produce vague or contradictory answers.
* **To fix a wrong answer, fix the source.** If an agent answers incorrectly, the
  culprit is usually a missing, outdated, or ambiguous passage — not the model.
  Update the content and the next answer reflects it.
* **Specific beats sprawling.** A focused [Q\&A pair](/knowledge-base/qa) for a
  common question often retrieves more reliably than the same answer buried in a
  long document.
* **Citations are your quality-check tool.** Reviewing the sources behind answers in
  [Conversations](/conversations/history) tells you exactly where the agent is
  getting its information, and where your knowledge base has gaps.

<Tip>
  If an agent keeps missing a question you *know* is covered, the passage may be hard
  to retrieve. Try restating that information as a short, direct Q\&A pair — it gives
  the retrieval step a clean, unambiguous target to match against.
</Tip>

## Related

<CardGroup cols={2}>
  <Card title="Knowledge base overview" icon="database" href="/knowledge-base/overview">
    Add and manage the sources your agent answers from.
  </Card>

  <Card title="Data privacy & isolation" icon="shield-halved" href="/concepts/data-privacy-and-isolation">
    How each agent's knowledge stays separate and private.
  </Card>

  <Card title="Choosing a model" icon="microchip" href="/concepts/choosing-a-model">
    Which AI model writes your agent's answers.
  </Card>

  <Card title="Review conversations" icon="comments" href="/conversations/history">
    See answers, their sources, and where to improve.
  </Card>
</CardGroup>
