Blog(3)
In real-world scenarios, both face images and videos may suffer from unknown and varied types of degradation, such as down-sampling, noise, blur, and compression.
Diffusion models have established themselves as the state-of-the-art for generative modeling, producing high-fidelity images through iterative denoising.
A Text-To-Music (TTM) generation model generates music tracks from text descriptions such as “A rock and roll song played by guitar”.
Research Areas(0)
Publications(10)
Hearable Image: On-Device Image-Driven Sound Effect Generation for Hearing What You See
Author: Deokjun Eom, Nahyun Kim, Woohyun Nam, Kyung-Rae Kim, Chaebin Im, Jungwon Park
Published: International Conference on Information and Knowledge Management (CIKM)
Date: 2025-11-10
EDiT: Efficient Diffusion Transformers with Linear Compressed Attention
Author: Abhinav Mehrotra, Ruchika Chavhan, Malcolm Chadwick, Luca Morreale, Mehdi Noroozi, Alberto Gil Ramos
Published: International Conference on Computer Vision / European Conference on Computer Vision (ICCV)
Date: 2025-10-21
RestoreGrad: Signal Restoration Using Conditional Denoising Diffusion Models with Jointly Learned Prior
Author: Chinghua Lee, Chouchang Yang, Retiree, Yashas Malur Saidutta, Yilin Shen, Hongxia Jin
Published: International Conference on Machine Learning (ICML)
Date: 2025-05-01
News(5)
Have you ever wished you could easily add or remove objects in your photos—like adding a candle on the top of your birthday cake, or removing that background person from your photo?
Stable Diffusion [1] for Super Resolution (SD-SR) has been shown to produce substantial improvements over previous SR approaches.
Others(0)