Blog(1)
Temporal sequences (e.g., videos) are an appealing data source because they carry rich information and provide additional constraints to leverage in learning. By far the main focus of temporal sequence analysis in computer vision has been on learning representations (i.e., compact abstractions of the input data) that target high-level distinctions between signals (e.g., action classification, “What action is present in the video?”).
Research Areas(0)
Publications(23)
Language-Aware Soft Prompting: Text-to-text optimization for few- and zero-shot adaptation of V&L models
Author: Adrian Bulat, Georgios Tzimiropoulos
Published: International Journal of Computer Vision
Date: 2023-09-01
MOCKS 1.0: Multilingual Open Custom Keyword Spotting Testset
Author: Mikolaj Pudo, Mateusz Wosik, Adam Cieslak, Justyna Krzywdziak, Bozena Lukasiak, Artur Janicki
Published: Annual Conference of the International Speech Communication Association (INTERSPEECH)
Date: 2023-08-20
LAFD: Local-differentially Private and Asynchronous Federated Learning with Direct Feedback Alignment
Author: Kijung Jung, Incheol Baek, Soohyung Kim, Yon Dohn Chung
Published: IEEE Access
Date: 2023-08-14
News(5)
Text-enrolled keyword spotting (KWS) with phoneme-level deep metric learning and adversarial training. We align acoustic and text embeddings across modalities and achieve state-of-the-art results on WSJ and LibriPhrase.
Handwriting has been the primary method of recording information for much of history [1]. Its development begins in early education, with a strong emphasis on writing accuracy.
Others(0)