Blog(3)
Video Object Segmentation (VOS) is a core problem in computer vision, with applications in video editing, surveillance, autonomous driving, and augmented reality.
Speaker verification (SV) is the task of determining whether two speech utterances belong to the same speaker. Recent advances have been driven by high-capacity deep neural networks trained on large-scale datasets.
We study the significance of Feature Learning in Neural Networks (NNs) for Knowledge Distillation (KD), a popular technique for improving an NN model’s generalisation performance using a teacher NN model. We propose a principled framework, Feature Kernel Distillation (FKD), which performs distillation directly in the feature space of NNs and is therefore able to transfer knowledge across different datasets.
Publications(23)
EDiT: Efficient Diffusion Transformers with Linear Compressed Attention
Author: Abhinav Mehrotra, Ruchika Chavhan, Malcolm Chadwick, Luca Morreale, Mehdi Noroozi, Alberto Gil Ramos
Published: International Conference on Computer Vision / European Conference on Computer Vision (ICCV)
Date: 2025-10-21
Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP
Author: Shell Xu Hu
Published: Conference on Empirical Methods in Natural Language Processing (EMNLP)
Date: 2024-11-13
You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation
Author: Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos
Published: European Conference on Computer Vision (ECCV)
Date: 2024-09-30
News(6)
Instance-level image retrieval aims to identify exact object or scene matches from large image collections, unlike object-level retrieval which targets broader categories. While deep models have significantly improved retrieval performance, deploying them in practical settings requires balancing accuracy with computational efficiency.
Scene Text Recognition (STR) aims to automatically transcribe text in natural scenes, enabling applications in autonomous driving, augmented reality, language translation, and assistive technologies.