A cross-lingual evaluation framework that exposes substantial gaps across languages in LLM responses to healthcare queries. Featured by Scientific American, The World, and Georgia Tech News. WebConf 2024 Oral.
Apr 1, 2024

We present a framework and benchmark for evaluating LLMs' multilingual capabilities on healthcare queries, revealing significant performance gaps across languages and providing insights for improving hea...
Jan 1, 2024