Publications
You can also find my articles on my Google Scholar profile.
Published in arXiv, 2026
This survey paper provides a structured roadmap for transitioning LLM development from engineering heuristics toward a principled scientific discipline.
Recommended citation: Zeyu Gan, Ruifeng Ren, Wei Yao, Xiaolin Hu, Gengze Xu, Chen Qian, Huayi Tang, Zixuan Gong, Xinhao Yao, Pengwei Tang, Zhenxing Dou, Yong Liu. Beyond the Black Box: Theory and Mechanism of Large Language Models. arXiv preprint arXiv:2601.02907, 2026. https://arxiv.org/abs/2601.02907
Published in arXiv, 2025
This paper proposes Information Gain-based Policy Optimization (IGPO), a simple yet effective RL framework that provides dense and intrinsic supervision for multi-turn agent training.
Recommended citation: Guoqing Wang, Sunhao Dai, Guangze Ye, Zeyu Gan, Wei Yao, Yong Deng, Xiaofeng Wu, Zhenzhe Ying. Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents. arXiv preprint arXiv:2510.14967, 2025. https://arxiv.org/abs/2510.14967
Published in arXiv, 2025
This paper introduces CoT-Space, a novel theoretical framework that recasts LLM reasoning as a continuous optimization problem, providing a coherent explanation for empirical phenomena such as overthinking.
Recommended citation: Zeyu Gan, Hao Yi, Yong Liu. CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning. arXiv preprint arXiv:2509.04027, 2025. https://arxiv.org/abs/2509.04027
Published in ICML, 2025
This paper theoretically analyzes external slow-thinking methods in LLMs, linking snowball errors to reasoning accuracy and providing insights to enhance the interpretability of existing approaches.
Recommended citation: Zeyu Gan, Yun Liao, Yong Liu. Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning. In The 42nd International Conference on Machine Learning (ICML), 2025. https://arxiv.org/abs/2501.15602
Published in ICLR, 2025
This paper explores the critical role of synthetic data in enhancing the post-training performance of large language models (LLMs) from a novel reverse-bottleneck perspective.
Recommended citation: Zeyu Gan, Yong Liu. Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective. In The Thirteenth International Conference on Learning Representations (ICLR), 2025. https://arxiv.org/abs/2410.01720
Published in TPAMI, 2025
This paper theoretically analyzes the excess risk of the empirically optimal solution relative to the population-level optimal solution for semi-supervised learning under class distribution mismatch.
Recommended citation: Pan Du, Suyun Zhao, Puhui Tan, Zisen Sheng, Zeyu Gan, Hong Chen, Cuiping Li. Towards a Theoretical Understanding of Semi-Supervised Learning Under Class Distribution Mismatch. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 47, 6 (June 2025), 4853-4868. https://dl.acm.org/doi/10.1109/TPAMI.2025.3545930