MM-SOC: Benchmarking Multimodal Large Language Models in Social Media Platforms

Jan 1, 2024 · Yiqiao Jin, Minje Choi, Gaurav Verma, Jindong Wang, Srijan Kumar
Figure: Model architecture and key components
Abstract
Social media platforms are hubs for multimodal information exchange, encompassing text, images, and videos, making it challenging for machines to comprehensively understand the information. Multimodal Large Language Models (MLLMs) have shown promise in addressing these challenges, yet they struggle with accurately interpreting the intertwined multimodal cues in social media content. We introduce MM-Soc, a comprehensive benchmark designed to evaluate MLLMs’ understanding of multimodal social media content.
Publication
ACL (Findings) 2024

Keywords

Multimodal Learning, Large Language Models, Social Media Analysis, Benchmarking

Authors

Yiqiao Jin
Ph.D. Candidate in Computer Science
My research focuses on adaptive and efficient AI systems, with emphasis on LLM agents, agent memory, self-distillation, multimodal LLMs, and structured multi-agent intelligence.