CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries

Jun 11, 2025

Shudong Liu

Yiqiao Jin

Cheng Li

Derek F. Wong

Qingsong Wen

Lichao Sun

Haipeng Chen

Xing Xie

Jindong Wang


Abstract

Vision-language models (VLMs) are deployed globally but exhibit substantial cultural blind spots. We introduce CultureVLM, a framework that characterizes and improves the cultural understanding of VLMs across more than 100 countries. CultureVLM provides culturally grounded benchmarks and training procedures that meaningfully reduce cultural bias while preserving general capability.

Type

Conference paper

Publication

CVPR 2025 VLMs4All Workshop
