How RAG Works - A2V2.ai Docs

A visitor asks your agent, “What’s your refund window for opened products?” A generic chatbot would improvise a plausible-sounding policy from whatever it learned during training. Your A2V2.ai agent does something different: it looks up the exact passage in your returns policy, answers from that, and shows you which source it used. That difference is Retrieval-Augmented Generation (RAG) — the technique behind every answer your agent gives. This page explains how it works and why it makes your agent trustworthy. It’s background reading; you don’t have to configure any of it.

Why grounding matters

Large language models are fluent, but on their own they have two problems for business use:

They don’t know your specifics. A model wasn’t trained on your pricing, your onboarding steps, or last week’s policy update.
They guess confidently. Asked something they don’t know, models tend to hallucinate — produce an answer that sounds right but isn’t.

RAG fixes both. Instead of asking the model to answer from memory, A2V2.ai first retrieves the most relevant passages from your knowledge base, then asks the model to answer using only those passages. The model becomes a writer working from your source material, not an oracle inventing facts. The payoff:

Accurate, on-brand answers drawn from content you control.
Citations — every answer can show the sources behind it.
Fewer hallucinations — when your content doesn’t cover a question, the agent is instructed to say so rather than make something up.
Fast updates: change a source and, once it has finished reprocessing, the agent answers from the new version. There’s no slow “retraining” of the model itself.

How it works

There are two halves to RAG: getting your content ready to search (which happens once, when you add a source), and using it to answer (which happens on every question).

Part 1 — Preparing your content

When you add a source — a file, a website, a YouTube video, or a Q&A pair — A2V2.ai doesn’t just file it away. It transforms it into something searchable:

Extract

The text is pulled out of your file or URL — paragraphs from a PDF, the transcript from a video, the body of a web page.

Chunk

That text is split into small, self-contained passages. Chunking matters because a question usually only needs one paragraph of a 40-page handbook — retrieving small passages lets the agent find and use exactly the right part.
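As a rough illustration of chunking, the sketch below splits text on blank lines and packs paragraphs into passages that stay under a size limit. The function name `chunk_text`, the 80-character limit, and the sample text are assumptions made for this example; they are not A2V2.ai's actual chunking rules.

```python
def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Pack paragraphs into passages, aiming for at most max_chars each.

    Hypothetical helper for illustration only.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate  # paragraph still fits in the current passage
        else:
            if current:
                chunks.append(current)
            current = para  # start a new passage (an oversized paragraph stays whole)
    if current:
        chunks.append(current)
    return chunks


# Made-up sample text standing in for a longer handbook.
handbook = (
    "Refunds are accepted within 30 days.\n\n"
    "Opened products can be exchanged.\n\n"
    "Shipping takes 3-5 days."
)
passages = chunk_text(handbook, max_chars=80)
```

Here the first two paragraphs fit together in one passage and the third starts a new one, so a question about shipping can retrieve just the shipping passage.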

Embed

Each passage is converted into an embedding — a list of numbers that captures its meaning, not just its words. Passages about “cancelling a plan” and “ending my subscription” end up close together, even though they share no keywords.
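To make "close together" concrete, here is a minimal sketch of cosine similarity, a common way to compare embeddings. The three-number vectors are invented for illustration and do not come from any real model, and the source does not say which similarity measure A2V2.ai uses.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional vectors; real embeddings are much longer lists.
embeddings = {
    "cancelling a plan": [0.9, 0.1, 0.0],
    "ending my subscription": [0.8, 0.2, 0.1],
    "shipping times": [0.0, 0.2, 0.9],
}

close = cosine_similarity(
    embeddings["cancelling a plan"], embeddings["ending my subscription"]
)
far = cosine_similarity(
    embeddings["cancelling a plan"], embeddings["shipping times"]
)
```

With these invented vectors, `close` comes out far higher than `far`, which is the property that lets a search match passages by meaning.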

Index

The embeddings are stored in a search index that’s private to that agent. This is the step that lets the agent later find the right passage for any question in a fraction of a second.
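The idea of a per-agent index can be sketched with a small in-memory store. The class `AgentIndexes`, the agent IDs, and the brute-force dot-product search are all assumptions for illustration; a production index would typically use a dedicated vector database rather than a Python list.

```python
from collections import defaultdict


class AgentIndexes:
    """Hypothetical in-memory store that keeps one passage index per agent."""

    def __init__(self) -> None:
        self._by_agent: dict[str, list[tuple[list[float], str]]] = defaultdict(list)

    def add(self, agent_id: str, vector: list[float], passage: str) -> None:
        self._by_agent[agent_id].append((vector, passage))

    def search(self, agent_id: str, query: list[float], k: int = 3) -> list[str]:
        # Only this agent's entries are considered; other agents' data is never read.
        entries = self._by_agent.get(agent_id, [])
        ranked = sorted(
            entries,
            key=lambda entry: sum(q * v for q, v in zip(query, entry[0])),
            reverse=True,
        )
        return [passage for _, passage in ranked[:k]]


indexes = AgentIndexes()
indexes.add("agent-a", [1.0, 0.0], "Refund policy passage")
indexes.add("agent-a", [0.0, 1.0], "Shipping passage")
indexes.add("agent-b", [1.0, 0.0], "Another agent's passage")

hits = indexes.search("agent-a", [0.9, 0.1], k=1)
```

Because every search is scoped by `agent_id`, a query against one agent cannot return another agent's passages.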

A source moves through statuses while this runs — from unprocessed, to processing, to Completed. Only Completed sources are used in answers, so an agent never quotes a half-processed document. (See Manage sources for handling failed or stuck sources.)
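The status rule can be pictured as a simple filter. This is a sketch under assumptions: the `SourceStatus` enum and `Source` class are hypothetical names, and only the three statuses named above are modeled.

```python
from dataclasses import dataclass
from enum import Enum


class SourceStatus(Enum):
    UNPROCESSED = "unprocessed"
    PROCESSING = "processing"
    COMPLETED = "completed"


@dataclass
class Source:
    name: str
    status: SourceStatus


def searchable_sources(sources: list[Source]) -> list[Source]:
    """Keep only sources that have finished processing."""
    return [s for s in sources if s.status is SourceStatus.COMPLETED]


sources = [
    Source("returns.pdf", SourceStatus.COMPLETED),
    Source("pricing page", SourceStatus.PROCESSING),
    Source("intro video", SourceStatus.UNPROCESSED),
]
ready = [s.name for s in searchable_sources(sources)]
```

In this example only `returns.pdf` would be eligible to appear in an answer.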

Part 2 — Answering a question

When a visitor sends a message, the agent runs the retrieval and generation steps that give RAG its name:

Understand the question

The agent first works out what the visitor actually wants — a real question to answer, versus a greeting or something unclear. Greetings get a friendly reply without a knowledge-base lookup.
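The routing decision might look something like the sketch below. The keyword list and the function names are assumptions for illustration; a real agent would more likely use a model than a fixed list to classify messages.

```python
# Hypothetical greeting list, used only for this example.
GREETINGS = {"hi", "hello", "hey", "good morning", "good afternoon", "good evening"}


def classify_message(message: str) -> str:
    """Label a message as 'greeting', 'unclear', or 'question' (crude heuristic)."""
    text = message.strip().lower().rstrip("!.?")
    if not text:
        return "unclear"
    if text in GREETINGS:
        return "greeting"
    return "question"


def needs_retrieval(message: str) -> bool:
    """Only real questions trigger a knowledge-base lookup."""
    return classify_message(message) == "question"
```

For example, `needs_retrieval("Hello!")` is `False`, while `needs_retrieval("What's your refund window for opened products?")` is `True`.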

Retrieve the best passages

For a real question, the agent searches your knowledge base and pulls the passages most likely to contain the answer. A2V2.ai uses both meaning-based search (finding passages that are about the same thing) and keyword search (catching exact terms, product names, and codes), then runs a final relevance ranking pass to put the strongest passages first. Combining the two means the agent finds the right content whether the visitor phrases things your way or their own.
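This page does not say how A2V2.ai merges the two result lists. One common technique is reciprocal rank fusion (RRF), shown here as a stand-in. The passage IDs are invented, `k=60` is a conventional default rather than a documented setting, and the final relevance ranking pass is not shown.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists; items ranked high in more lists score higher."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] = scores.get(passage_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: scores[pid], reverse=True)


# Hypothetical results from the two searches, best match first.
semantic_hits = ["returns-policy", "refund-faq", "warranty"]
keyword_hits = ["refund-faq", "sku-list", "returns-policy"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
# fused == ["refund-faq", "returns-policy", "sku-list", "warranty"]
```

Passages that both searches agree on rise to the top, while a passage found by only one search is still kept as a candidate.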

Generate a grounded answer

The top passages are handed to the AI model as context, along with the recent conversation. The model writes a natural-language answer based on that context — and A2V2.ai records which sources it drew from.
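A minimal sketch of this step, assuming a simple text prompt: the passages and recent conversation are assembled into one prompt, and the sources are tracked alongside it. The `Passage` class, the `build_prompt` function, and the prompt wording are hypothetical, not A2V2.ai's actual prompt.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str
    text: str


def build_prompt(
    question: str, passages: list[Passage], history: list[str]
) -> tuple[str, list[str]]:
    """Return the prompt for the model and the list of sources it draws on."""
    context = "\n".join(f"[{i}] {p.text}" for i, p in enumerate(passages, start=1))
    conversation = "\n".join(history)
    prompt = (
        "Answer using only the numbered context passages. "
        "If they do not contain the answer, say you don't have that information.\n\n"
        f"Context:\n{context}\n\n"
        f"Recent conversation:\n{conversation}\n\n"
        f"Question: {question}"
    )
    # De-duplicate sources while keeping their original order.
    sources = list(dict.fromkeys(p.source for p in passages))
    return prompt, sources
```

Keeping the source list next to the prompt is what makes it possible to show citations later without asking the model where its answer came from.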

Stream and cite

The answer streams back to the visitor word by word, with the sources behind it available to inspect.

What happens when your content doesn’t cover the question

This is where grounding earns its keep. If the search turns up nothing relevant, the agent isn’t handed any context to work from — and it’s explicitly instructed to tell the visitor it doesn’t have that information, rather than invent an answer. That’s usually a feature, not a failure: it keeps your agent honest. When you see an agent say it can’t help with something, the fix is almost always to add a source that covers the topic.
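The fallback behavior can be sketched as a relevance cutoff followed by a check for empty context. This is a simplification under assumptions: the `0.5` threshold, the function names, and the fallback wording are invented, and this sketch returns the fallback directly, whereas the text above describes the model being instructed to give it.

```python
from typing import Callable

FALLBACK = "I don't have that information."


def select_context(
    scored_passages: list[tuple[str, float]], min_score: float = 0.5
) -> list[str]:
    """Keep only passages whose relevance score clears the (assumed) threshold."""
    return [text for text, score in scored_passages if score >= min_score]


def answer_or_fallback(
    scored_passages: list[tuple[str, float]],
    generate: Callable[[list[str]], str],
) -> str:
    """Generate from context when there is any; otherwise admit the gap."""
    context = select_context(scored_passages)
    if not context:
        return FALLBACK
    return generate(context)
```

With only a weak match such as `[("unrelated passage", 0.1)]`, the function returns the fallback message and never calls `generate`.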

What this means for you

Understanding RAG changes how you get the most from your agent:

Answer quality follows source quality. The agent can only answer from what you give it. Clear, well-structured, up-to-date sources produce clear, accurate answers. Vague or contradictory sources produce vague or contradictory answers.
To fix a wrong answer, fix the source. If an agent answers incorrectly, the culprit is usually a missing, outdated, or ambiguous passage — not the model. Update the content and the next answer reflects it.
Specific beats sprawling. A focused Q&A pair for a common question often retrieves more reliably than the same answer buried in a long document.
Citations are your QA tool. Reviewing the sources behind answers in Conversations tells you exactly where the agent is getting its information — and where your knowledge base has gaps.

If an agent keeps missing a question you know is covered, the passage may be hard to retrieve. Try restating that information as a short, direct Q&A pair — it gives the retrieval step a clean, unambiguous target to match against.

Related

Knowledge base overview

Add and manage the sources your agent answers from.

Data privacy & isolation

How each agent’s knowledge stays separate and private.

Choosing a model

Which AI model writes your agent’s answers.

Review conversations

See answers, their sources, and where to improve.