Posts by Collection

portfolio

publications

Superclass Learning with Representation Enhancement

Published in CVPR, 2023

This paper introduces a novel coarse-grained classification setting called superclass learning and proposes an attention-based framework (SCLRE) to extract superclass-aware representations.

Recommended citation: Zeyu Gan, Suyun Zhao, Jinlong Kang, Liyuan Shang, Hong Chen, Cuiping Li. Superclass Learning With Representation Enhancement. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24060-24069, 2023. https://openaccess.thecvf.com/content/CVPR2023/html/Kang_Superclass_Learning_With_Representation_Enhancement_CVPR_2023_paper.html

Towards a Theoretical Understanding of Semi-Supervised Learning Under Class Distribution Mismatch

Published in TPAMI, 2025

This paper theoretically analyzes the excess risk of the empirically optimal solution relative to the population-level optimal solution for semi-supervised learning under class distribution mismatch.

Recommended citation: Pan Du, Suyun Zhao, Puhui Tan, Zisen Sheng, Zeyu Gan, Hong Chen, Cuiping Li. Towards a Theoretical Understanding of Semi-Supervised Learning Under Class Distribution Mismatch. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 47(6): 4853-4868, 2025. https://dl.acm.org/doi/10.1109/TPAMI.2025.3545930

Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective

Published in ICLR, 2025

This paper explores the critical role of synthetic data in enhancing the post-training performance of large language models (LLMs) from a novel reverse-bottleneck perspective.

Recommended citation: Zeyu Gan, Yong Liu. Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective. In The Thirteenth International Conference on Learning Representations (ICLR), 2025. https://arxiv.org/abs/2410.01720

Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning

Published in ICML, 2025

This paper theoretically analyzes external slow-thinking methods in LLMs, linking snowball errors to reasoning accuracy and providing insights to enhance the interpretability of existing approaches.

Recommended citation: Zeyu Gan, Yun Liao, Yong Liu. Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning. In The 42nd International Conference on Machine Learning (ICML), 2025. https://arxiv.org/abs/2501.15602

CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning

Published on arXiv, 2025

This paper introduces CoT-Space, a novel theoretical framework that recasts LLM reasoning as a continuous optimization problem, providing a coherent explanation for empirical phenomena such as overthinking.

Recommended citation: Zeyu Gan, Hao Yi, Yong Liu. CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning. arXiv preprint arXiv:2509.04027, 2025. https://arxiv.org/abs/2509.04027

Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents

Published on arXiv, 2025

This paper proposes Information Gain-based Policy Optimization (IGPO), a simple yet effective RL framework that provides dense and intrinsic supervision for multi-turn agent training.

Recommended citation: Guoqing Wang, Sunhao Dai, Guangze Ye, Zeyu Gan, Wei Yao, Yong Deng, Xiaofeng Wu, Zhenzhe Ying. Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents. arXiv preprint arXiv:2510.14967, 2025. https://arxiv.org/abs/2510.14967

Beyond the Black Box: Theory and Mechanism of Large Language Models

Published on arXiv, 2026

This survey paper provides a structured roadmap for transitioning LLM development from engineering heuristics toward a principled scientific discipline.

Recommended citation: Zeyu Gan, Ruifeng Ren, Wei Yao, Xiaolin Hu, Gengze Xu, Chen Qian, Huayi Tang, Zixuan Gong, Xinhao Yao, Pengwei Tang, Zhenxing Dou, Yong Liu. Beyond the Black Box: Theory and Mechanism of Large Language Models. arXiv preprint arXiv:2601.02907, 2026. https://arxiv.org/abs/2601.02907

talks

teaching
