Surprising impact of noise reduction on ASR

Asma Trabelsi
May 28, 2024

An ALE study reveals that noise reduction techniques can negatively impact transcription accuracy in Automatic Speech Recognition (ASR) applications.

In today's digital age, the quality of communications technology shapes the way we connect and collaborate. Automatic Speech Recognition (ASR) technology has improved significantly in recent years, particularly through open-source engines such as Vosk and Whisper, which are now pivotal in sectors that require precise and efficient transcription services.

This blog post highlights the ASR research carried out by Alcatel-Lucent Enterprise researchers Asma Trabelsi, Laurent Werey, Sébastien Warichet and Emmanuel Helbert, which was published and presented at the international scientific conference ICAART’24. The team’s study focuses on how noise reduction techniques affect the transcription quality of open-source ASR engines.

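The research compares two leading open-source ASR tools, Vosk and Whisper, using the Word Error Rate (WER) metric, which measures the share of words in a reference transcript that the engine gets wrong (lower is better). The findings suggest that Whisper generally outperforms Vosk in transcription accuracy.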
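To make the metric concrete, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the engine's output, divided by the number of words in the reference. The function name `word_error_rate` and the sample sentences are illustrative assumptions, and the text normalization (lowercasing and splitting on whitespace) is a simplification; the study itself may have prepared its transcripts differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are compared after lowercasing and whitespace
    splitting; punctuation handling is left out of this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences for illustration: one reference word is dropped.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"
    print(f"WER = {word_error_rate(reference, hypothesis):.3f}")  # WER = 0.167
```

Because insertions count as errors, WER can exceed 1.0 when the engine outputs many more words than the reference contains.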

The team also studied the effects of applying noise reduction models such as RNNoise and ASTEROID to the audio before transcription. Numerical experiments revealed that, surprisingly, noise reduction can degrade ASR performance and cause important information to be lost.

The team’s results point to the need for continuous improvement and adjustment as the demands on ASR applications evolve. They also highlight the potential for further refining noise reduction technologies and their integration into ASR systems to meet diverse user needs.

For businesses and developers, choosing the right ASR tool is crucial for maintaining data sovereignty and achieving high-quality transcription. The ALE research not only guides users in selecting suitable ASR tools but also underscores the importance of ongoing innovation in speech recognition technologies.

As we move forward, embracing advancements in ASR and noise reduction technologies will be key to unlocking more seamless, efficient and accurate communication solutions across various industries.

Access the noise reduction study for more detailed insights.

Asma Trabelsi

Senior Data Scientist, Alcatel-Lucent Enterprise

As a Data Scientist at ALE, Asma leads a working group that aims to integrate Artificial Intelligence (AI) into Rainbow by Alcatel-Lucent Enterprise.

Prior to joining ALE, Asma worked at Expleo Group on a number of projects focused on applying Machine Learning in industry and transportation (autonomous vehicles and trains, chatbots) for well-known French companies like Renault, PSA and the RATP.

Asma holds a Bachelor’s Degree in Business Computing from the Faculty of Sciences and Management of Nabeul, Tunisia, and a Master’s and PhD in Data Science co-supervised by the Institute of Management of Tunis (ISG) and Artois University in France.
