Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective

Published in ICLR, 2025

Recommended citation: Zeyu Gan, Yong Liu. Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective. In The Thirteenth International Conference on Learning Representations (ICLR), 2025. https://arxiv.org/abs/2410.01720