Publications

An MVDR-Embedded U-Net Beamformer for Effective and Robust Multichannel Speech Enhancement

Published

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Date

2024.04.14

Research Areas

Abstract

In multichannel speech enhancement (SE) systems based on beamforming, deep neural networks (DNNs) are often used to estimate beamformer weights directly. This approach, however, may not generalize well to new acoustic conditions. Alternatively, DNNs can predict time-frequency (T-F) masks for speech and noise patterns that can be used with statistical beamforming. This approach is robust, but its performance is constrained by the latter component, which relies on certain modeling assumptions, e.g., covariance-based modeling in the minimum-variance-distortionless-response (MVDR) beamformer. In this paper, we propose a novel integration of the two types of methodology by introducing an intra-MVDR module embedded in the U-Net architecture that combines the merits of both, i.e., effectiveness and robustness. Simulation results show that the proposed MVDR-embedded U-Net leads to SE improvements that are not achievable by simply enlarging the network with baseline approaches.
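For readers unfamiliar with the mask-based statistical beamforming that the abstract contrasts with direct weight estimation, the following is a minimal NumPy sketch of a conventional mask-driven MVDR beamformer. It is not the paper's intra-MVDR module, whose details are not given here. It assumes the widely used reference-channel formulation w(f) = (Φ_n⁻¹ Φ_s) u / tr(Φ_n⁻¹ Φ_s), where Φ_s and Φ_n are mask-weighted spatial covariance matrices and u selects a reference microphone. The function names (`masked_covariance`, `mvdr_weights`, `apply_beamformer`), the array layout, and the diagonal-loading constant are illustrative assumptions, and the T-F masks are assumed to come from some DNN that is not shown.

```python
import numpy as np

def masked_covariance(Y, mask, eps=1e-8):
    """Mask-weighted spatial covariance per frequency bin.

    Y:    complex STFT, shape (C, T, F) = (channels, frames, bins)  [assumed layout]
    mask: real T-F mask in [0, 1], shape (T, F)
    Returns an array of shape (F, C, C).
    """
    num = np.einsum("tf,ctf,dtf->fcd", mask, Y, Y.conj())
    den = mask.sum(axis=0)[:, None, None] + eps
    return num / den

def mvdr_weights(phi_s, phi_n, ref=0, diag_load=1e-6):
    """MVDR weights in the reference-channel form, shape (F, C).

    diag_load is a hypothetical regularization constant that keeps the
    noise covariance well conditioned before it is inverted.
    """
    C = phi_n.shape[-1]
    load = diag_load * np.trace(phi_n, axis1=-2, axis2=-1).real / C
    phi_n = phi_n + load[:, None, None] * np.eye(C)
    numer = np.linalg.solve(phi_n, phi_s)              # Phi_n^{-1} Phi_s, (F, C, C)
    tr = np.trace(numer, axis1=-2, axis2=-1)[:, None]  # (F, 1)
    return numer[..., ref] / (tr + 1e-12)              # column `ref`, normalized

def apply_beamformer(w, Y):
    """Apply w(f)^H y(t, f) to every T-F bin; returns shape (T, F)."""
    return np.einsum("fc,ctf->tf", w.conj(), Y)
```

In use, a speech mask and a noise mask would each be passed to `masked_covariance`, and the resulting matrices to `mvdr_weights`. The covariance step is where the modeling assumption mentioned in the abstract enters: the beamformer only sees second-order statistics pooled over time, whatever the quality of the masks.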