All Evaluation Posts

Evaluation

Evaluating ASR Systems, Part 2: Accuracy and Robustness

In Part 1, we introduced the "Advertisement Only" problem: claims like "best accuracy" or "trained on diverse data" that sound impressive but lack the transparency to mean anything.

Here's the thing: what's in a training dataset shouldn't matter to you. What matters is how the system performs on your data, in your conditions, for your use case. Understanding how accuracy is measured helps you ask the right questions, interpret vendor claims, and validate that a system actually works for your scenario.


Now let’s dig into accuracy and robustness: what they mean, how to measure them, and why a single number like “98% accuracy” rarely tells the full story.
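The standard metric behind such single-number claims is word error rate (WER): substitutions, deletions, and insertions divided by the number of reference words. As a minimal sketch (not the post's or KeenASR's implementation, and omitting the text normalization a real evaluation needs), WER can be computed via word-level edit distance:

```python
# Minimal WER sketch: WER = (S + D + I) / N, computed as the word-level
# Levenshtein distance between reference and hypothesis, divided by the
# reference length. Illustrative only; real pipelines also normalize
# casing, punctuation, and number formatting before scoring.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,        # deletion
                dp[i][j - 1] + 1,        # insertion
                dp[i - 1][j - 1] + cost, # substitution (or match)
            )
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution ("the" -> "a") over six reference words: WER = 1/6
print(round(wer("the cat sat on the mat", "the cat sat on a mat"), 3))
```

Note that 1 - WER is what vendors often report as "accuracy", which is why a single headline number can hide where the errors actually occur.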

Author: Ognjen Todic | February 26, 2026
Evaluation

Evaluating ASR Systems, Part 1: The Big Picture

In speech recognition, and tech in general, bold claims are everywhere:

“Best accuracy.”

“Diverse training data.”

“Real-world performance.”

They sound great, but too often, there’s little transparency or data behind them. Without metrics, it’s impossible to know whether a claim reflects real capability or just clever marketing.

It reminds me of a story a friend once told me. She used to travel constantly for work and always struggled to keep up with laundry. One weekend, she spotted a dry cleaner with a big sign that read “Same Day Service!” Relieved, she dropped off her clothes and asked when she could pick them up.

Author: Ognjen Todic | November 11, 2025

We provide professional services for SDK integration, proof-of-concept development, customization of language and acoustic models, and porting to custom hardware platforms.

Try KeenASR SDK