Publications

TFPSNET: TIME-FREQUENCY DOMAIN PATH SCANNING NETWORK FOR SPEECH SEPARATION

Published

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Date

2022.05.22

Abstract

Speech separation has been very successful with deep learning techniques. In this paper, we propose a Time-Frequency (T-F) domain path scanning network (TFPSNet) for the speech separation task. By introducing a T-F scan path into the network, frequency bins in different frames and at different frequencies can interact directly. We also introduce a T-F path loss to further improve performance. The proposed TFPSNet can learn more details of the frequency structure and separate the features in the T-F domain. Experiments on the public WSJ0-2mix dataset show that our approach outperforms the current state-of-the-art (SOTA) methods and achieves a 21.1 dB SISDR improvement (SISDRi). Our approach also generalizes well: the model trained on the WSJ0-2mix dataset achieves 18.8 dB SISDRi on the Libri-2mix test set without any fine-tuning.
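
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how the scale-invariant signal-to-distortion ratio (SISDR, often written SI-SDR) and its improvement over the unprocessed mixture (SISDRi) are commonly computed. It follows the widely used definition of the metric and is not taken from the paper's evaluation code. The function names `si_sdr` and `si_sdr_improvement` and the `eps` stabilizer are illustrative assumptions, and plain Python lists are used only to keep the example dependency-free.

```python
import math


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two equal-length 1-D signals.

    Hypothetical helper for illustration; `eps` is an assumed stabilizer
    that avoids division by zero and log of zero.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to obtain the scaled target.
    scale = sum(e * r for e, r in zip(est, ref)) / (sum(r * r for r in ref) + eps)
    target = [scale * r for r in ref]
    # Everything in the estimate that is not the scaled target counts as distortion.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(d * d for d in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def si_sdr_improvement(estimate, mixture, reference):
    """SISDRi: gain in dB of the separated estimate over the raw mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)
```

Under this definition, rescaling the estimate by a nonzero constant leaves the score unchanged, which is why the metric is called scale-invariant. For a two-speaker mixture, the figure is typically computed per speaker and averaged over the test set.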