Large vision-language models (VLMs) have achieved impressive performance across a wide range of multimodal tasks, from visual question answering (VQA) to reasoning over images and text [1, 2]. However, these models often suffer from hallucinations and poor grounding when faced with knowledge-intensive queries.