Residual Echo and Noise Cancellation with Feature Attention Module and Multi-domain Loss Function
Jianjun Gu (CAS, China), Longbiao Cheng (CAS, China), Xingwei Sun (CAS, China), Junfeng Li (CAS, China), Yonghong Yan (CAS, China)
For real-time acoustic echo cancellation in noisy environments, classical linear adaptive filters (LAFs) remove only the linear components of the acoustic echo. To further attenuate the non-linear echo components and background noise, this paper proposes a deep learning-based residual echo and noise cancellation (RENC) model, in which multiple inputs are weighted by a feature attention module. More specifically, input features extracted from the far-end reference and from the echo estimated by the LAF are scaled with time-frequency attention weights, according to their correlation with the residual interference in the LAF's output. Moreover, a multi-domain loss function combining a scale-independent mean square error and a perceptual loss is proposed for training the RENC model. Experimental results validate the efficacy of the proposed feature attention module and multi-domain loss function, which achieve improvements of 8.4%, 14.9%, and 29.5% in perceptual evaluation of speech quality (PESQ), scale-invariant signal-to-distortion ratio (SI-SDR), and echo return loss enhancement (ERLE), respectively.
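The abstract leaves the exact attention scoring and loss formulation to the paper itself, so the sketch below only illustrates the two ideas under explicit assumptions: a sigmoid time-frequency gate on the far-end and echo-estimate features (the scoring weights `w` and the gating form are hypothetical, not the authors' learned module), and a scale-independent MSE paired with a log-magnitude spectral distance standing in for the perceptual term (the mixing weight `alpha` and the spectral proxy are likewise assumptions).

```python
import numpy as np

def tf_feature_attention(laf_out, far_end, echo_est, w):
    """Hypothetical time-frequency attention gate.

    laf_out, far_end, echo_est: magnitude spectrograms of shape [T, F].
    w: three assumed scoring weights; the paper's module is learned, not fixed.
    """
    w = np.asarray(w, dtype=float)
    aux = np.stack([far_end, echo_est])                 # [2, T, F] auxiliary features
    scores = w[0] * laf_out + w[1:, None, None] * aux   # per-feature TF scores (assumed form)
    gate = 1.0 / (1.0 + np.exp(-scores))                # sigmoid weights in (0, 1)
    # Auxiliary features are scaled bin by bin before entering the RENC network.
    return np.concatenate([laf_out[None], gate * aux])  # [3, T, F] network input


def multi_domain_loss(est, ref, alpha=0.5, eps=1e-8):
    """Sketch of a multi-domain loss: scale-independent MSE plus a
    log-spectral distance used here as a stand-in for the perceptual term.

    est, ref: time-domain estimate and target of equal length.
    """
    # Scale-independent MSE: project the target onto the estimate's scale
    # so a global gain mismatch is not penalized.
    scale = np.dot(est, ref) / (np.dot(ref, ref) + eps)
    si_mse = np.mean((est - scale * ref) ** 2)
    # Spectral term (simplified here to one full-signal FFT instead of an STFT).
    mag = lambda x: np.abs(np.fft.rfft(x)) + eps
    log_spec_dist = np.mean((np.log(mag(est)) - np.log(mag(ref))) ** 2)
    return si_mse + alpha * log_spec_dist
```

The point of the scale-independent term in such a loss is that the network gets no credit for merely rescaling its output, which complements a perceptually motivated spectral term; the exact balance between the two domains is a training hyperparameter in this sketch.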