CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning
Published on arXiv, 2025
This paper presents CoT-Space, a theoretical framework that recasts LLM reasoning from discrete token-level prediction to an optimization process within a continuous, reasoning-level semantic space, offering an analysis of internal slow-thinking driven by reinforcement learning.
Recommended citation: Zeyu Gan, Hao Yi, Yong Liu. CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning. arXiv preprint arXiv:2509.04027, 2025. https://arxiv.org/abs/2509.04027