Publications

Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective

Published in ICLR, 2025

This paper explores the critical role of synthetic data in enhancing the post-training performance of large language models (LLMs) from a novel reverse-bottleneck perspective.

Recommended citation: Zeyu Gan, Yong Liu. Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective. In The Thirteenth International Conference on Learning Representations, 2025. https://arxiv.org/abs/2410.01720

Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning

Published as an arXiv preprint, 2025

This paper theoretically analyzes external slow-thinking methods in LLMs, linking snowball errors to reasoning accuracy and providing insights to enhance the interpretability of existing approaches.

Recommended citation: Zeyu Gan, Yun Liao, Yong Liu. Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning. arXiv preprint arXiv:2501.15602, 2025. https://arxiv.org/abs/2501.15602

Superclass Learning with Representation Enhancement

Published in CVPR, 2023

This paper introduces a novel coarse-grained classification scenario called superclass learning and proposes an attention-based framework (SCLRE) to extract superclass-aware representations.

Recommended citation: Zeyu Gan, Suyun Zhao, Jinlong Kang, Liyuan Shang, Hong Chen, Cuiping Li. Superclass Learning With Representation Enhancement. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24060-24069, 2023. https://openaccess.thecvf.com/content/CVPR2023/html/Kang_Superclass_Learning_With_Representation_Enhancement_CVPR_2023_paper.html