Blog(2)
In this research blog post, we look at vision and language models. In recent years, large-scale pre-training of neural networks has driven major advances in Vision & Language (V&L) understanding.
Large-scale contrastive image-text pre-training remains the prevalent method for training Vision-Language Models (VLMs) [1,2].