UniSD: Towards a Unified Self-Distillation Framework for Large Language Models

May 30, 2026

Yiqiao Jin

Yiyang Wang

Lucheng Fu

Yijia Xiao

Yinyi Luo

Haoxin Liu

B. Aditya Prakash

Josiah Hester

Jindong Wang

Srijan Kumar

Abstract

Self-distillation has emerged as a powerful technique for improving large language models without external teacher signals, but existing approaches are fragmented across diverse objectives, training signals, and model components. We introduce UniSD, a unified self-distillation framework that consolidates these directions into a single, modular formulation. UniSD enables systematic comparison of self-distillation variants and supports new combinations across data, representation, and decoding levels, providing a principled foundation for efficient and adaptive LLM training.

Type

Preprint

Publication

Under Review at NeurIPS 2026 (Preprint)
