CS 201 | Haipeng Luo, USC

Convergence of Self-Play in Games

Abstract:
Self-play has emerged as a powerful paradigm for training agents in complex multi-agent environments, underpinning some of the most striking successes in AI, such as AlphaGo and superhuman AI for poker. It enables agents to improve iteratively by interacting with copies of themselves, offering a natural path toward equilibria in games. However, even for the simplest games, the convergence of standard self-play dynamics is not fully understood. In this talk, I will discuss two recent results on this front. In the first part, I will cover a surprising separation between different notions of convergence in self-play for the well-known Optimistic Multiplicative Weights Update (OMWU) algorithm: while it is often considered the best algorithm in terms of average-iterate convergence, it suffers from arbitrarily slow last-iterate convergence due to its lack of forgetfulness. On the positive side, I will show that it still enjoys polynomial best-iterate convergence. In the second part, motivated by the well-known accelerated convergence rate when learning with gradient feedback, I will address the question of whether acceleration is also possible under only noisy reward feedback, and provide an affirmative answer using a simple algorithm that enjoys instance-dependent convergence rates.

Bio:
Haipeng Luo is an associate professor and IBM Early Career Chair in the Thomas Lord Department of Computer Science at the University of Southern California. He obtained his PhD from Princeton University in 2016, after which he spent a year at Microsoft Research as a postdoctoral researcher. His research interests are in developing practical machine learning algorithms with strong theoretical guarantees, with a focus on online learning, bandit algorithms, reinforcement learning, and learning in games, among other topics. He has received several awards over the years, including the NSF CAREER Award, the NSF CRII Award, the Google Faculty Research Award, the Google Research Scholar Award, Best Paper Awards at ICML’15, NeurIPS’15, and COLT’21, and the Best Student Paper Award at COLT’18.

Date/Time:
May 6, 2025
4:00 pm - 5:45 pm

Location:
3400 Boelter Hall
420 Westwood Plaza, Los Angeles, California 90095