The three things you’re trading off
Picking a model means balancing three factors. You rarely get all three at once.

- Quality: how well the model understands nuance, follows your instructions, reasons across multiple sources, and writes a clean answer. Heavier models are stronger on hard, multi-step, or ambiguous questions.
- Speed: how fast the answer comes back. Lighter models tend to respond faster, which matters when a visitor is watching the reply stream in.
- Credit cost: each answer consumes credits, and the cost per message varies by model. A heavier model can cost several times more per reply than a light one. At low volume that’s negligible; at thousands of conversations a month it adds up.
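To make the volume effect concrete, here is a minimal sketch of the arithmetic. The per-message costs (1 and 5 credits) and the message volumes are made-up illustrative numbers, not A2V2.ai’s actual pricing; the model selector shows the real values.

```python
def monthly_credits(credits_per_message: int, messages_per_month: int) -> int:
    """Estimate monthly credit use: cost per answer times number of answers."""
    return credits_per_message * messages_per_month


# Hypothetical per-message costs, for illustration only.
LIGHT_COST = 1
HEAVY_COST = 5

for volume in (200, 20_000):
    light = monthly_credits(LIGHT_COST, volume)
    heavy = monthly_credits(HEAVY_COST, volume)
    print(f"{volume} messages: light={light}, heavy={heavy}, difference={heavy - light}")
```

With these assumed numbers, the gap is 800 credits at 200 messages a month but 80,000 credits at 20,000 messages a month.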
How A2V2.ai labels models
In the model selector on the Sandbox, each model shows its credits per message and may carry one or both badges:

| Badge | Meaning |
|---|---|
| HIPAA | The model is eligible for use in HIPAA-regulated workflows. Use these if you’re handling health-related information — see Data privacy & isolation. |
| Premium | A higher-tier model — typically stronger on quality, and often higher in credit cost per message. |
The available models and their credit costs change over time as providers release
new versions. The full reference table lives on Model & Temperature, but the
selector in the app is always the source of truth for what’s available and what
it costs.
When to reach for a heavier model
Start light, and step up only when testing shows you need to. Signs a heavier model is worth the extra credits:

- Complex, multi-step questions: the agent has to combine several sources, do light reasoning, or weigh conditions (“if the customer is on the annual plan and past their renewal date…”).
- Nuanced tone or judgement: sensitive topics, negotiation, or carefully worded responses where a clumsy answer costs you.
- Long or dense source material: handbooks, contracts, or technical docs where the agent must synthesize rather than quote.
- Instructions aren’t being followed: if a light model keeps ignoring parts of your instructions, a stronger model often holds the line better.

Signs a light model is enough:

- High-volume, repetitive questions with answers that live cleanly in your knowledge base, such as hours, pricing, or “where’s my order.”
- Speed is the priority and the questions aren’t subtle.
- Cost control at scale: a one-credit model across thousands of chats keeps your credit burn predictable.
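The “start light, step up only when testing shows you need to” approach can be sketched as a simple selection rule: among the models you tested, pick the cheapest one whose answers met your bar. The model labels, credit costs, pass rates, and the 90% threshold below are all hypothetical. The pass rates stand in for whatever you observe when you run your own test questions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelResult:
    name: str                 # hypothetical label, not a real model name
    credits_per_message: int  # as read from the model selector
    pass_rate: float          # share of test questions answered acceptably (0.0 to 1.0)


def cheapest_passing(models: list[ModelResult], min_pass_rate: float) -> Optional[ModelResult]:
    """Return the lowest-cost model that meets the bar, or None if none does."""
    passing = [m for m in models if m.pass_rate >= min_pass_rate]
    return min(passing, key=lambda m: m.credits_per_message, default=None)


# Illustrative results from a manual test run; all numbers are made up.
results = [
    ModelResult("light", 1, 0.82),
    ModelResult("mid", 3, 0.93),
    ModelResult("premium", 8, 0.97),
]

choice = cheapest_passing(results, min_pass_rate=0.90)
print(choice.name if choice else "no model met the bar")
```

With these assumed results, the light model falls short of the bar, so the rule steps up one tier and stops there instead of jumping straight to the most expensive option.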
A starting point by use case
| Your agent mostly… | A reasonable starting point |
|---|---|
| Answers routine FAQs at high volume | A light, low-cost model (the default is a good start) |
| Handles a mix of simple and tricky questions | A mid-tier model; step up only if testing shows gaps |
| Reasons through complex or sensitive topics | A Premium model |
| Works with health-related information | A model with the HIPAA badge |
A few things to keep in mind
- Model choice affects file uploads. If you let visitors attach files in chat, what’s supported (PDFs, images, text) depends on the model — the Sandbox tells you what the selected model accepts.
- Quality isn’t only the model. A grounded, accurate answer also depends on good sources, clear instructions, and a sensible temperature setting. A heavier model won’t rescue a thin knowledge base — see How RAG works.
- You can change your mind anytime. Switching models takes effect on the agent’s next answers; it doesn’t require retraining your knowledge base.
Related
Model & Temperature
Set the model, tune temperature, and see the full model reference.
Compare in the Playground
Run the same questions against two models side by side.
How credits work
What each answer costs and how credits are metered.
Data privacy & isolation
What the HIPAA badge does and doesn’t mean.