Pretrained Bidirectional Distillation for Machine Translation
Initializing parameters from a pretrained masked language model (LM) is a knowledge transfer method widely applied to natural language processing tasks. Following its success, pretrained neural machine translation (NMT) models have attracted growing research interest [2,3,4,5].
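The initialization idea can be sketched as copying every pretrained parameter whose name matches into the downstream model, leaving the rest (e.g. the decoder) freshly initialized. This is a minimal illustration with toy parameter dicts; the names and shapes are hypothetical, not the paper's actual architecture.

```python
def init_from_pretrained(model_params, pretrained_params):
    """Copy each pretrained parameter whose name matches into the model;
    parameters absent from the pretrained LM keep their fresh values."""
    return {
        name: pretrained_params.get(name, value)
        for name, value in model_params.items()
    }

# Toy example: the NMT encoder shares its embedding and first layer
# with the pretrained masked LM; the decoder stays freshly initialized.
nmt_params = {
    "encoder.embed": [0.0],
    "encoder.layer0": [0.0],
    "decoder.layer0": [0.0],
}
masked_lm_params = {
    "encoder.embed": [0.7],
    "encoder.layer0": [0.3],
}

params = init_from_pretrained(nmt_params, masked_lm_params)
```

In practice the same name-matching logic is applied to framework state dicts (e.g. loading a pretrained encoder checkpoint before fine-tuning on parallel data).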
[INTERSPEECH 2022 Series #4] Cross-Modal Decision Regularization for Simultaneous Speech Translation
In today’s world of virtual meetings, conferences, and multimedia, automatic speech translation offers a wide variety of applications. Traditional offline speech translation models used a cascade of speech recognition and text translation. In our prior work, we developed efficient techniques for end-to-end speech translation that outperform traditional cascaded approaches.