QContact Delivers World-Class Call Transcription Accuracy with NVIDIA

To truly understand what is going on inside your call centre and use tools such as automated quality assurance or AI insights, you need accurate transcription. The problem with call recordings, though, is that they often come from busy environments, are captured through muffled microphones, or have been heavily compressed by mobile carriers long before they reach your contact centre.

The standard industry measure of transcription accuracy is Word Error Rate (WER), a simple formula that counts how many mistakes a transcription makes. For example, imagine the customer said “The cat sat on the mat” but the transcription came out as “The rat sat on the mat”:

Said:         The  cat  sat  on  the  mat
Transcribed:  The  rat  sat  on  the  mat

To calculate the WER, you add up the words that had to be substituted, the words that are missing (deletions) and the words that were never actually said (insertions), then divide by the number of words that were actually spoken. In this example one word out of 6 is substituted, with no insertions or deletions, giving a WER of 1 out of 6, or 16.7%.
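For readers who want to try this themselves, here is a minimal Python sketch of the calculation. It is an illustration only, not QContact's or NVIDIA's scoring code: the function name `word_error_rate` is our own, and it assumes plain whitespace-separated words compared case-insensitively. It finds the smallest number of word edits between what was said and what was transcribed, then divides by the number of words spoken.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()   # what was actually said
    hyp = hypothesis.lower().split()  # what the transcription produced
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = fewest edits to turn the reference words seen so far
    # into the first j hypothesis words (word-level edit distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: a spoken word is missing
                curr[j - 1] + 1,     # insertion: a word that was never said
                prev[j - 1] + cost,  # exact match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("The cat sat on the mat", "The rat sat on the mat")
print(f"{wer:.1%}")  # 16.7%
```

Because the count is divided by the number of words spoken, a transcript that adds many extra words can score above 100%.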

Other examples include filler utterances like “Umm” being missed, or “$123” being written as “Hundred Twenty-Three”. As you can see, WER is very sensitive to tiny differences, but it works well as a measure for comparing different models.
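One common way to stop cosmetic differences from inflating the score is to normalise both texts before comparing them. The sketch below is an assumption about how that might look rather than a description of any particular benchmark: the `FILLERS` list, the `normalise` helper and the sample sentence are all made up for illustration. It lowercases the text, strips punctuation and drops filler words. It does not attempt to reconcile number formats such as “$123”, which needs a dedicated normaliser.

```python
import re

# Hypothetical filler list for illustration; real normalisers use longer ones.
FILLERS = {"um", "umm", "uh", "er", "erm"}


def normalise(text: str) -> list[str]:
    """Lowercase, strip punctuation and drop filler words."""
    words = re.findall(r"[a-z0-9$']+", text.lower())
    return [w for w in words if w not in FILLERS]


reference = "Umm, I'd like to check my balance."
hypothesis = "I'd like to check my balance"

print(reference.split() == hypothesis.split())        # False
print(normalise(reference) == normalise(hypothesis))  # True
```

Without normalisation the missed “Umm” and the trailing full stop would both count as errors, even though the meaning of the transcript is intact.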

For years the market was dominated by the large players: Microsoft Azure (formerly Dragon Speech) and Google Cloud. However, there is a large difference between Google generating subtitles for videos recorded with studio microphones and perfectly clear audio, and transcribing a telephone call! Even Google’s telephone-tailored speech recognition achieves 14.29% WER, meaning that on average there are roughly 14 errors for every 100 words spoken. Now, this is a great result, but speech recognition models have improved rapidly over the past 12 to 24 months. How good can we get in the second half of 2025?

QContact uses the latest technology from NVIDIA, publicly released this week, to achieve a WER of just 6.05%. That is a 58% improvement (a relative reduction in errors) compared to Google’s speech solution, and a 15% improvement on even the latest Whisper models from OpenAI.
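The 58% figure is a relative reduction in errors, not a difference in percentage points. As a quick check, using only the two WER figures quoted in this post (the helper name `relative_improvement` is ours):

```python
def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline's errors that the new model removes."""
    return (baseline_wer - new_wer) / baseline_wer


# 14.29% is the Google telephony figure and 6.05% the QContact figure quoted in this post.
print(f"{relative_improvement(14.29, 6.05):.0%}")  # 58%
```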

We’ve also expanded our automatic call transcription to cover over 20 European languages, including Spanish (3.72% WER), German (4.9% WER), French (5.38% WER) and Portuguese (5.95% WER), using what is now the second most accurate and the fastest multilingual speech recognition model in the world according to the Open ASR Leaderboard (https://huggingface.co/spaces/hf-audio/open_asr_leaderboard).

Another thing you should be asking your CCaaS provider is whether your audio is being shipped overseas to countries without GDPR/POPIA equivalence or, more worryingly, being used for training without your consent. Given that most third-party transcription services offer a discount if you consent to training, you may find your technology provider is allowing them to train on your data. Do you have your customers’ consent to use their personal information for training, or to export it to countries like the US?

QContact guarantees our EMEA customers that we will never ship your call recordings outside of the EU, and that we won’t use your recordings for training. What’s more, call transcription is included as standard and is completely free to all our customers. Yet again, QContact demonstrates its ability to rapidly evolve its platform with the latest and best technologies while giving these new capabilities away to our customers for free. By not being tied to a single AI model, we have the flexibility to adopt newer models that deliver superior performance without any disruption or configuration changes required from our customers.

So, if you want a provider who understands the need to adapt quickly to new technologies, who was awarded the Best Mid-Market Contact Centre in the world in 2025, and who was the first CCaaS provider in the world to launch WhatsApp calling, give QContact a call!
