Protein Large Language Models: A Comprehensive Survey
Nov 1, 2025
Yijia Xiao
Wanjia Zhao
Junkai Zhang
Yiqiao Jin
Han Zhang
Zhicheng Ren
Renliang Sun
Haixin Wang
Guancheng Wan
Pan Lu
Xiao Luo
Yu Zhang
James Zou
Yizhou Sun
Wei Wang
Abstract
Protein large language models (Protein LLMs) have rapidly emerged as a transformative paradigm for protein understanding, generation, and design. This survey provides a comprehensive overview of Protein LLMs, organizing the field along architectures, training objectives, datasets, downstream tasks, and applications across biology, chemistry, and medicine. We discuss key challenges, open problems, and future research directions, and provide a unified taxonomy for navigating this rapidly evolving area. EMNLP 2025 acceptance rate: 22.2%.
Type
Publication
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025