CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries

Jun 11, 2025
Shudong Liu, Yiqiao Jin, Cheng Li, Derek F. Wong, Qingsong Wen, Lichao Sun, Haipeng Chen, Xing Xie, Jindong Wang
Abstract
Vision-language models (VLMs) are deployed globally but exhibit substantial cultural blind spots. We introduce CultureVLM, a framework that characterizes and improves cultural understanding of VLMs across more than 100 countries. CultureVLM provides culturally grounded benchmarks and training procedures that meaningfully reduce cultural bias while preserving general capability.
Type
Publication
CVPR 2025 VLMs4All Workshop

Yiqiao Jin
Authors
Ph.D. Candidate in Computer Science
My research focuses on adaptive and efficient AI systems, with emphasis on LLM agents, agent memory, self-distillation, multimodal LLMs, and structured multi-agent intelligence.