Blog(1)
Deep learning techniques have driven major progress on the speech separation task. The current leading methods are based on the time-domain audio separation network (TasNet) [1]. TasNet replaces the fixed time-frequency (T-F) domain transformation with a learnable encoder and decoder. It takes waveform inputs, directly reconstructs the sources, and computes a time-domain loss with utterance-level permutation invariant training (uPIT). Several approaches build on the TasNet framework, such as Conv-TasNet [2], the dual-path recurrent neural network (DPRNN) [3], the dual-path Transformer network (DPTNet) [4], the RNN-free Transformer-based neural network (SepFormer) [5], and Sandglasset [6], a self-attentive network with a novel sandglass shape.
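The uPIT time-domain loss mentioned above can be illustrated with a minimal sketch. It assumes the loss is the negative scale-invariant signal-to-noise ratio (SI-SNR), a common choice in TasNet-style systems that the text here does not confirm. The function names `si_snr` and `upit_loss` are hypothetical, and waveforms are plain Python lists rather than tensors.

```python
import itertools
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length waveforms."""
    n = len(target)
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target: that projection is the "signal".
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    signal = [scale * t for t in tgt]
    # Whatever is left after the projection counts as "noise".
    noise = [e - s for e, s in zip(est, signal)]
    signal_energy = sum(s * s for s in signal)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


def upit_loss(estimates, targets):
    """Return (loss, permutation) for the best utterance-level assignment.

    Assumption: loss = negative mean SI-SNR over sources, evaluated over the
    whole utterance for every estimate-to-target permutation.
    """
    best_loss, best_perm = math.inf, None
    for perm in itertools.permutations(range(len(targets))):
        # perm[i] is the index of the target assigned to estimate i.
        mean_si_snr = sum(
            si_snr(estimates[i], targets[p]) for i, p in enumerate(perm)
        ) / len(perm)
        loss = -mean_si_snr
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Two synthetic "sources" and estimates that come out in swapped order.
s1 = [math.sin(0.10 * t) for t in range(200)]
s2 = [math.sin(0.37 * t + 1.0) for t in range(200)]
estimates = [[0.5 * x for x in s2], list(s1)]  # scaled s2 first, then s1
loss, perm = upit_loss(estimates, [s1, s2])
print(perm, round(loss, 1))
```

Because the search enumerates every permutation, its cost grows factorially with the number of sources; for the two- or three-speaker mixtures typical of this task that stays cheap.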
Focus Areas(0)
Research Achievements(181)
PoseKernelLifter: Metric Lifting of 3D Human Pose using Sound
Author: Zhijian Yang, Xiaoran Fan, Volkan Isler, Hyun Soo Park
Published: IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR)
Date: 2022-06-20
Look and Listen: A Multi-Sensory Pouring Network and Dataset for Granular Media from Human Demonstrations
Author: Alexis Burns, Siyuan Xiang, Daewon Lee, Larry Jackel, Shuran Song, Volkan Isler
Published: IEEE International Conference on Robotics and Automation (ICRA)
Date: 2022-05-23
Pouring by Feel: An Analysis of Tactile and Proprioceptive Sensing for Accurate Pouring
Author: Pedro Piacenza, Daewon Lee, Volkan Isler
News(25)
Samsung R&D Institute Poland (SRPOL) was recognized as one of the leading teams at the Detection and Classification of Acoustic Scenes and Events (DCASE) 2022 challenge, held by the Institute of Electrical and Electronics Engineers (IEEE), which aims to use state-of-the-art artificial intelligence (AI) technology to understand and interpret audio signals.
At the recent IEEE CONECCT 2022 conference (International Conference on Electronics, Computing and Communication Technologies), SRI-B researchers, together with Samsung PRISM collaborators, published papers and received two Best Paper Awards in emerging technologies such as Generative AI and Emotion AI.
The Computer Vision and Pattern Recognition Conference (CVPR), which has been running since 1983, is a world-renowned international Artificial Intelligence (AI) conference co-hosted by the Institute of Electrical and Electronics Engineers (IEEE) and the Computer Vision Foundation (CVF).
Others(0)