MedHalu: Hallucinations in Responses to Healthcare Queries by Large Language Models
Jun 1, 2026
Vibhor Agarwal
Yiqiao Jin
Mohit Chandra
Munmun De Choudhury
Srijan Kumar
Nishanth Sastry
Abstract
Large language models are increasingly used for consumer healthcare queries, but their responses can contain subtle hallucinations with serious implications for patient safety. We introduce MedHalu, a benchmark for studying hallucinations in LLM responses to healthcare queries, with fine-grained annotations of hallucination types and severity. We analyze hallucination patterns across LLMs, query types, and medical specialties, and discuss implications for safer deployment in consumer healthcare.
Publication
International AAAI Conference on Web and Social Media (ICWSM) 2026