CS 201 | Robin Jia, USC

Auditing, Understanding, and Leveraging Large Language Models

Abstract:
The rise of large language models offers opportunities to both scientifically study these complex systems and apply them in novel ways. In this talk, I will describe my group’s recent work along these lines. First, I will discuss data watermarks, a statistically rigorous technique for auditing a language model’s training data based only on black-box model queries. Then, we will investigate how language models memorize training data: based on results from two complementary benchmarks, I will demonstrate the viability of localizing memorized data to a sparse subset of neurons. Next, I will provide a mechanistic account of how pre-trained language models use Fourier features to solve arithmetic problems, and how pre-training plays a critical role in these mechanisms. Finally, I will show how to leverage the complementary strengths of large language models and symbolic solvers to handle complex planning tasks.

Bio:
Robin Jia is an Assistant Professor of Computer Science at the University of Southern California. He received his Ph.D. in Computer Science from Stanford University, where he was advised by Percy Liang. He has also spent time as a visiting researcher at Facebook AI Research, working with Luke Zettlemoyer and Douwe Kiela. He is interested broadly in natural language processing and machine learning, with a focus on scientifically understanding NLP models in order to improve their reliability. Robin’s work has received best paper awards at ACL and EMNLP.

Date/Time:
Nov 05, 2024
4:00 pm - 5:45 pm

Location:
3400 Boelter Hall
420 Westwood Plaza, Los Angeles, California 90095