A Survey on Efficient LLM Training: From Data-centric Perspectives
A survey of efficient LLM training organized around data-centric techniques (selection, mixing, ordering, and synthesis) and their trade-offs with compute and downstream performance.
Jul 31, 2025