Blog(1)
Temporal sequences (e.g., videos) are an appealing data source because they provide rich information and additional constraints to leverage in learning. By far the main focus of temporal sequence analysis in computer vision has been learning representations (i.e., compact abstractions of the input data) that target high-level distinctions between signals (e.g., action classification: “What action is present in the video?”).
Research Areas(0)
Publications(26)
ADIEE: Automatic Dataset Creation and Scorer for Instruction-Guided Image Editing Evaluation
Author: Sherry X. Chen, Yi Wei, Luowei Zhou, Suren Kumar
Published: International Conference on Computer Vision (ICCV)
Date: 2025-07-01
Augmenting Perceptual Super-Resolution via Image Quality Predictors
Author: Fengjia Zhang, Samrudhdhi B. Rangrej, Tristan Ty Aumentado-Armstrong, Afsaneh Fazly, Alex Levinshtein
Published: Computer Vision and Pattern Recognition (CVPR)
Date: 2025-06-11
Better Exploiting Spatial Separability in Multichannel Speech Enhancement with an Align-and-Filter Framework
Author: Chinghua Lee, Chouchang Yang, Yashas Malur Saidutta, Retiree, Yilin Shen, Hongxia Jin
Published: International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Date: 2024-12-21
News(5)
Text-enrolled keyword spotting (KWS) with phoneme-level deep metric learning and adversarial training. We align acoustic and text embeddings across modalities and achieve state-of-the-art results on WSJ and LibriPhrase.
Handwriting has long served as the primary method of recording information [1]. Its development begins in early education, accompanied by a strong emphasis on writing accuracy.
Others(0)