Metrics
22 predefined metrics across 4 categories
Accuracy (7 metrics)

| Metric | Description |
| --- | --- |
| `expected_outcome` | Whether the agent achieved the scenario's intended outcome |
| `hallucination` | Agent fabricated information not grounded in tool responses |
| `tool_call_success` | All expected tools were called with correct arguments |
| `transcription_accuracy` | Accuracy of the agent's speech-to-text understanding |
| `relevancy` | Response relevance to user queries |
| `response_consistency` | Consistent answers across the conversation |
| `voicemail_detection` | Agent correctly detected a voicemail greeting |

Conversation Quality (6 metrics)

| Metric | Description |
| --- | --- |
| `ai_interrupting_user` | Count of times the agent interrupted the user |
| `stop_time_after_interruption_ms` | Milliseconds for the agent to stop speaking after a user interruption |
| `latency_ms` | Response latency with P50/P90/P95/P99 percentiles |
| `infrastructure_issues` | No connection drops, audio gaps, or timeout errors |
| `silence_detection` | Total silence duration in the conversation |
| `unnecessary_repetition` | Agent avoided repeating the same information |

Customer Experience (4 metrics)

| Metric | Description |
| --- | --- |
| `csat` | Customer satisfaction score (0-100) |
| `sentiment` | Overall conversation sentiment: positive/neutral/negative |
| `topic_of_call` | Categorization of the call subject |
| `dropoff_node` | Where in the conversation the user dropped off |

Speech Quality (5 metrics)

| Metric | Description |
| --- | --- |
| `average_pitch_hz` | Mean pitch frequency of the agent's voice |
| `talk_ratio` | Ratio of agent speaking time to total time |
| `speaking_rate_wpm` | Agent words per minute |
| `ringing_duration` | Time before the agent picks up |
| `gibberish` | Agent did not produce unintelligible speech |
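The `latency_ms` metric reports P50/P90/P95/P99 percentiles. The exact method is not specified here; a minimal sketch using the nearest-rank definition (the sample latencies are made up for illustration) might look like:

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    ranked = sorted(values)
    k = max(0, math.ceil(p / 100 * len(ranked)) - 1)  # 1-based rank -> 0-based index
    return ranked[k]

# Hypothetical per-turn response latencies in milliseconds
latencies = [820, 640, 710, 1430, 980, 560, 1210, 890, 760, 1020]

summary = {f"p{p}": percentile(latencies, p) for p in (50, 90, 95, 99)}
print(summary)  # {'p50': 820, 'p90': 1210, 'p95': 1430, 'p99': 1430}
```

With small samples the tail percentiles (P95/P99) collapse onto the maximum, as here; interpolating definitions give smoother values on larger datasets.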
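Similarly, `talk_ratio` and `speaking_rate_wpm` can be derived from timestamped agent speech segments. The segment format below (start, end, transcript) is an assumption for illustration, not this product's actual data model:

```python
# Hypothetical agent speech segments: (start_s, end_s, transcript)
segments = [
    (0.0, 12.0, "Hi thanks for calling how can I help"),
    (20.0, 32.0, "Sure I can check that order for you"),
]
call_duration_s = 60.0  # total conversation length

agent_speech_s = sum(end - start for start, end, _ in segments)
talk_ratio = agent_speech_s / call_duration_s            # agent speaking time / total time

words = sum(len(text.split()) for _, _, text in segments)
speaking_rate_wpm = words / (agent_speech_s / 60.0)      # words per minute of speech

print(talk_ratio, speaking_rate_wpm)  # 0.4 40.0
```

Note the rate is normalized by speaking time, not call time, so silences do not drag the WPM figure down.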