Survey | Yiqiao Jin CS PhD @ Georgia Tech

Protein Large Language Models: A Comprehensive Survey

Sat, 01 Nov 2025 00:00:00 +0000

Abstract

Protein large language models (Protein LLMs) have rapidly emerged as a transformative paradigm for protein understanding, generation, and design. This survey provides a comprehensive overview of Protein LLMs, organizing the field along architectures, training objectives, datasets, downstream tasks, and applications across biology, chemistry, and medicine.

Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

Wed, 13 Aug 2025 00:00:00 +0000

Abstract

This survey deconstructs the ethics of large language models, mapping long-standing issues such as bias and misinformation to newly emerging dilemmas including agentic behavior, environmental cost, and cross-cultural alignment.

A Survey on Efficient LLM Training: From Data-centric Perspectives

Thu, 31 Jul 2025 00:00:00 +0000

Abstract

Efficient training of large language models has become a central concern as model and data scales grow. This survey reviews efficient LLM training from a data-centric perspective, organizing techniques around data selection, mixing, ordering, and synthesis.
