Publications

MIB: Mixed Information Bottleneck for Out-of-Distribution Keyword Spotting

Published

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Date

2024.12.20

Abstract

Deep keyword spotting (KWS) systems continuously process audio streams to detect keywords. However, the performance of deep neural networks degrades when the input data diverges from the training data, which is referred to as the Out-of-Distribution (OOD) data problem. In this paper, we show that existing State-of-the-Art (SOTA) keyword spotting models perform worse on OOD data than on in-domain test data, and we propose a training mechanism to improve performance on OOD data. Specifically, we propose a novel combination of Mixup and Information Bottleneck, called MIB, to achieve SOTA performance on OOD data. With on-device applications in mind, we show across multiple models, ranging in size from 12.5K to 350K parameters, that MIB achieves as much as 2.5% (absolute) improvement in performance on OOD data. Further, in the more realistic case where OOD keywords are uttered in the presence of OOD noise, MIB achieves as much as 10% (absolute) performance improvement over SOTA models. The proposed MIB is model-agnostic, i.e., it can be applied to enhance the training of any deep keyword spotting model.
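
The abstract does not spell out how MIB combines its two ingredients, so the following is only an illustrative sketch of the two generic building blocks it names: standard mixup (convex combinations of inputs and labels) and a variational information-bottleneck-style penalty (a KL term on a Gaussian latent). The function names, the Gaussian latent parameterization (`mu`, `log_var`), and the hyperparameters `alpha` and `beta` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def mixup(x, y_onehot, alpha=0.2, rng=None):
    """Standard mixup: blend a batch with a shuffled copy of itself.

    alpha is a hypothetical default for the Beta(alpha, alpha) distribution.
    """
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))        # random pairing within the batch
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam

def gaussian_kl(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), averaged over the batch."""
    per_example = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)
    return float(np.mean(per_example))

def soft_cross_entropy(logits, y_soft):
    """Cross-entropy against soft (mixed) labels, averaged over the batch."""
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(np.mean(-np.sum(y_soft * log_probs, axis=1)))

def mib_style_loss(logits, y_mix, mu, log_var, beta=1e-3):
    """Assumed combination: classification loss on mixed labels plus a
    beta-weighted bottleneck penalty. Not the paper's exact objective."""
    return soft_cross_entropy(logits, y_mix) + beta * gaussian_kl(mu, log_var)
```

In a training loop under these assumptions, a batch of audio features would pass through `mixup`, the model would produce `logits` together with latent statistics `mu` and `log_var`, and `mib_style_loss` would be minimized. Because nothing here depends on the model architecture, the sketch is at least consistent with the model-agnostic claim above.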