E2E Testing for AI Voice Agents

Stop testing your voice agent manually.

vspec simulates real phone calls against any AI phone agent: Vapi, Retell, Bland, Fonio, Synthflow, LiveKit, Pipecat, or a fully custom stack. Just point it at a phone number.

Dashboard preview (vspec.studio/dashboard): overview of your testing workspace
Projects: 4 · Specs: 18 · Runs: 47 · Results: 312 · Pass Rate: 84% · Active Runs: 0
Test Results per Day (chart; ranges 7d, 14d, 30d, 90d)
Recent Runs: #14 sales-agent (Finished) · #13 support-bot (Failed) · #12 sales-agent (Finished)
Recent Results: aggressive-caller (✗ Fail) · friendly-customer (✓ Pass) · pricing-question (✓ Pass)

Manual testing doesn't scale.

✗ Without vspec

Call. Listen. Repeat.

You change a prompt. You call your agent. You listen for 3 minutes. Something's off. You tweak. You call again.

40 times a week. For every scenario. For every agent — regardless of which platform it runs on. You miss edge cases every time.

✓ With vspec

Define once. Test forever.

Define your test scenarios once — aggressive callers, pricing questions, edge cases. vspec runs them automatically and tells you exactly what passed, what failed, and why.

Change a prompt, run your specs, ship with confidence.

How it works.

01

Point at a phone number

Connect any AI voice agent to vspec — Vapi, Retell, Bland, Twilio, or fully custom. If it has a phone number, vspec can test it. Setup in under 2 minutes.

02

Define scenarios

Create test specs via the Web UI: describe who's calling and what they want, then set a single expectation — what the agent must achieve for the test to pass.

03

Run & iterate

Trigger runs manually in the UI — or automate via API call in your CI/CD pipeline. vspec evaluates outcomes and surfaces exactly where your agent fails.
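To make the CI/CD option concrete, here is a minimal sketch of how a pipeline step might prepare a "trigger run" request. This page does not document the vspec API, so the base URL, the `/runs` route, the `project_id` field, and the bearer-token header are all assumptions made for illustration; check your account's API docs for the real values.

```python
import json
import urllib.request

# Hypothetical values: the real vspec base URL, route, and payload fields
# are not given on this page. Replace them with the ones from your account.
API_BASE = "https://api.example.invalid/v1"  # placeholder, not a real vspec URL


def build_run_request(project_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would trigger a test run."""
    body = json.dumps({"project_id": project_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/runs",  # assumed route
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_run_request("sales-agent", "YOUR_API_KEY")
    # A CI step would send this with urllib.request.urlopen(req);
    # here we only show what would be sent.
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the step easy to test without placing a real call or spending credits.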

Example spec — Booking agent
Prompt
You are calling a restaurant booking agent. Say you'd like a table for two on Friday evening. If asked for a name, say "Alex". Accept the first available slot offered.
Expectation
Agent confirms a specific time slot and repeats the booking details back before ending the call.
Example spec — De-escalation
Prompt
You are an angry customer who received the wrong item. Start frustrated, use short replies, and refuse the first resolution offered. Escalate if needed.
Expectation
Agent de-escalates the caller within 3 turns and offers a concrete resolution without transferring to a human.
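If you keep specs in version control alongside your prompts, the two examples above map naturally onto a small record: a caller prompt plus a single expectation. The sketch below is one way to model that. The `Spec` class, its field names, and the spec names are hypothetical; vspec itself defines specs in the Web UI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """A test scenario: who is calling, and the one outcome that must hold."""
    name: str         # hypothetical identifier, not a vspec field
    prompt: str       # instructions for the simulated caller
    expectation: str  # what the agent must achieve for the test to pass

    def validate(self) -> None:
        # Both parts are required; a spec without an expectation cannot be judged.
        if not self.prompt.strip():
            raise ValueError(f"spec {self.name!r} has an empty prompt")
        if not self.expectation.strip():
            raise ValueError(f"spec {self.name!r} has an empty expectation")


BOOKING = Spec(
    name="booking-agent",
    prompt=(
        "You are calling a restaurant booking agent. Say you'd like a table "
        "for two on Friday evening. If asked for a name, say \"Alex\". "
        "Accept the first available slot offered."
    ),
    expectation=(
        "Agent confirms a specific time slot and repeats the booking "
        "details back before ending the call."
    ),
)

DE_ESCALATION = Spec(
    name="de-escalation",
    prompt=(
        "You are an angry customer who received the wrong item. Start "
        "frustrated, use short replies, and refuse the first resolution "
        "offered. Escalate if needed."
    ),
    expectation=(
        "Agent de-escalates the caller within 3 turns and offers a concrete "
        "resolution without transferring to a human."
    ),
)

for spec in (BOOKING, DE_ESCALATION):
    spec.validate()
```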

Common questions.

vspec works with any AI phone agent that has a phone number — Vapi, Retell, Bland, Fonio, Synthflow, LiveKit, Pipecat, Twilio, or fully custom stacks. If it can receive a call, vspec can test it.
No credit card is needed to get started. The free tier gives you 5 test credits, and you can run your first test in under 2 minutes from signup.
Each spec has a plain-language expectation — for example, "agent confirms a booking and repeats the details before hanging up." After the call, vspec evaluates the transcript against that expectation and returns a clear pass or fail with reasoning.
vspec can run in your CI/CD pipeline. On the Solo plan and above you get API access to trigger runs programmatically. Kick off a test suite on every deploy, merge, or prompt change and get results back automatically.
Setup takes under 2 minutes. Sign up, create a project, paste in your agent's phone number, write a scenario, and hit run. No SDK to install, no sales call, no demo required.
Solo (€19/mo) adds CI/CD API access, inbound webhook-triggered runs, and higher credit limits — ideal for a single developer. Team (€59/mo) adds multi-seat access, shared workspaces, and priority support for small teams shipping together.
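The answers above describe per-spec pass or fail results with reasoning, and running suites on every deploy. A pipeline still has to turn those results into a go or no-go decision. The sketch below shows one way to do that. The `Result` shape, the threshold parameter, and the sample reasoning strings are assumptions for illustration, not the format vspec returns.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Result:
    spec_name: str
    passed: bool
    reasoning: str  # hypothetical field holding the evaluator's explanation


def pass_rate(results: Iterable[Result]) -> float:
    """Fraction of results that passed; 0.0 when there are no results."""
    items = list(results)
    if not items:
        return 0.0
    return sum(1 for r in items if r.passed) / len(items)


def ci_exit_code(results: Iterable[Result], min_pass_rate: float = 1.0) -> int:
    """Return 0 (success) if the pass rate meets the threshold, else 1."""
    return 0 if pass_rate(results) >= min_pass_rate else 1


# Sample data shaped like the "Recent Results" panel; reasoning text is made up.
SAMPLE = [
    Result("aggressive-caller", False, "Agent transferred to a human."),
    Result("friendly-customer", True, "Booking confirmed and repeated back."),
    Result("pricing-question", True, "Agent quoted the price."),
]

if __name__ == "__main__":
    for r in SAMPLE:
        print(("PASS" if r.passed else "FAIL"), r.spec_name, "-", r.reasoning)
    raise SystemExit(ci_exit_code(SAMPLE))
```

The default threshold of 1.0 fails the build on any failing spec, and an empty result set also fails, so a run that silently produced no results cannot pass the gate.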

Your agent is ready to be tested.

Free tier. No credit card.

Start testing for free →