CS 201 | Chenguang Wang, UC Santa Cruz


“Towards Safe and Secure LLM Agents”

The emergence of large language model (LLM) agents such as OpenClaw and Claude Cowork, systems capable of planning, acting, and interacting with external environments, has enabled increasingly autonomous workflows. While these agentic systems demonstrate impressive capabilities, their autonomy also amplifies safety and security risks, ranging from unintended behaviors to vulnerabilities in the real world. As LLM agents move from controlled settings into open-world deployments, ensuring their safe and secure behavior becomes a central challenge. In this talk, I will present our recent research on understanding and improving the safety and security of LLM agents. I will begin by examining the foundations of agent behavior, including recent work on peer preservation, as well as evaluations for reasoning about agent reliability, such as HLE. Next, I will introduce approaches for improving agent robustness through post-training, highlighting our work on rLLM, an open-source framework for post-training agents. Finally, I will briefly discuss real-world applications and deployment challenges of agents. I will conclude by outlining our vision for building a community around these issues, including the organization of “Agents in the Wild: Safety, Security, and Beyond” workshops at ICLR 2026 and ICML 2026.

Chenguang Wang is an Assistant Professor in the Department of Computer Science and Engineering at UC Santa Cruz and a Research Advisor at Scale AI. Previously, he was an assistant professor at WashU, a postdoc at UC Berkeley, and a research scientist at Amazon AI. He received his Ph.D. from Peking University and was a visiting Ph.D. student at UIUC. His recent work focuses on trustworthy agentic AI. He has created and contributed to several influential research efforts and open-source systems, including rLLM, peer-preservation, and HLE. He is the recipient of several academic awards, such as the 2024 Google Research Scholar Award and the 2026 Thinking Machines Lab Tinker Research Grant, and his work has garnered media attention, including coverage in MIT Technology Review and Fortune.

Date/Time:
Date(s) - Apr 16, 2026
4:00 pm - 5:45 pm

Location:
3400 Boelter Hall
420 Westwood Plaza Los Angeles California 90095