Advances in machine learning are reshaping the foundations of science and technology. Despite the empirical success of machine learning algorithms, a rigorous theoretical understanding of when and why they succeed or fail remains limited. My research aims to bridge this gap by developing theoretical frameworks that explain the underlying mechanisms of machine learning and guide the design of principled and effective algorithms. In particular, I aim to understand the limitations of existing methods and to develop theoretically grounded solutions that address them. My current research focuses on:
Language Models
Representation Learning
Synthetic Data Generation
Ongoing Work
How to Correctly Report LLM-as-a-Judge Evaluations [arxiv] [github] (under review)
Chungpa Lee, Thomas Zeng, Jongwon Jeong, Jy-yong Sohn, Kangwook Lee
Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models [arxiv] (under review)
Chungpa Lee, Jy-yong Sohn, Kangwook Lee
Poisson Regression with Additive Exponential Mean: Statistical Modeling and Insurance Applications (under review)
Chungpa Lee, Joseph H.T. Kim
Publications
On the Similarities of Embeddings in Contrastive Learning [paper] [arxiv] [github]
Chungpa Lee, Sehee Lim, Kibok Lee, Jy-yong Sohn
In Proceedings of the 42nd International Conference on Machine Learning (ICML), 2025
A Generalized Theory of Mixup for Structure-Preserving Synthetic Data [paper] [arxiv] [github]
Chungpa Lee, Jongho Im, Joseph H.T. Kim
In Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
A Theoretical Framework for Preventing Class Collapse in Supervised Contrastive Learning [paper] [arxiv] [github]
Chungpa Lee, Jeongheon Oh, Kibok Lee, Jy-yong Sohn
In Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Analysis of Using Sigmoid Loss for Contrastive Learning [paper] [arxiv] [github]
Chungpa Lee, Joonhwan Chang, Jy-yong Sohn
In Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS), 2024