Research Topics
Advances in machine learning are reshaping the foundations of science and technology. Despite these successes and their profound impact, we still lack a rigorous theoretical understanding of why machine learning algorithms succeed in some cases and fail in others. My research addresses this gap by developing theory that explains when and why machine learning works and guides the design of more effective algorithms. My current work focuses on theoretical foundations of machine learning for:
Language Modeling
Representation Learning
Self-Supervised Learning, Multi-modal Learning, Contrastive Learning
Synthetic Data Generation
Publications [Full publication list]
On the Similarities of Embeddings in Contrastive Learning [arXiv] [github]
Chungpa Lee, Sehee Lim, Kibok Lee, Jy-yong Sohn
In Proceedings of the 42nd International Conference on Machine Learning (ICML), 2025
A Generalized Theory of Mixup for Structure-Preserving Synthetic Data [paper] [arXiv] [github]
Chungpa Lee, Jongho Im, Joseph H.T. Kim
In Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
A Theoretical Framework for Preventing Class Collapse in Supervised Contrastive Learning [paper] [arXiv] [github]
Chungpa Lee, Jeongheon Oh, Kibok Lee, Jy-yong Sohn
In Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Analysis of Using Sigmoid Loss for Contrastive Learning [paper] [arXiv] [github]
Chungpa Lee, Joonhwan Chang, Jy-yong Sohn
In Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS), 2024