MedHalu is a fine-grained benchmark for studying hallucinations in LLM responses to consumer healthcare queries. It analyzes hallucination patterns across models, query types, and medical specialties.
Jun 1, 2026
A cross-lingual evaluation framework that exposes substantial multilingual gaps in LLM healthcare responses. Featured by Scientific American, The World, and Georgia Tech News. WebConf 2024 Oral.
Apr 1, 2024

We present a framework and benchmark to evaluate LLMs' multilingual capabilities on healthcare queries, revealing significant performance gaps across languages and providing insights for improving hea...
Jan 1, 2024