A More Accurate Internal Language Model Score Estimation for the Hybrid Autoregressive Transducer
Language model (LM) adaptation in the hybrid autoregressive transducer (HAT) is justified only when the transducer logits and the sum of the speech and text logits in the label estimation sub-networks are approximately equal.
A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization
Research in Automatic Speech Recognition (ASR) continues to show that larger models yield better results. But as state-of-the-art networks grow to billions of parameters, deploying them on device becomes increasingly difficult.
pMCT - Patched Multi-Condition Training
The clashing of pots and pans as you cook and ask your voice assistant what you can use to replace eggs in the recipe. The excited, overlapping conversations as you ask which of Henry VIII's wives survived, trying to settle a bet.