CS 201 | Representation and Control of Meanings in Large Language Models, STEFANO SOATTO, Amazon Web Services – UCLA | Computer Science

Speaker: Stefano Soatto
Affiliation: Amazon Web Services - UCLA | Computer Science

ABSTRACT:

Information, Knowledge, Meaning, and Understanding are being talked about informally in reference to large-scale generative models. Large Language Models and World Models are alternatively described as stochastic parrots or human-like reasoning agents, in both cases without sound definitions. I will describe a notion of accessible information in large-scale trained models that works for instantiated datasets and deterministic trained maps at scale, and relate it to classical notions from Fisher, Shannon, Kolmogorov, and Solomonoff. I will also show how the learning dynamics challenge traditional notions of regularization and generalization, and engender complex behavior such as the emergence of critical learning periods and latent topological and algebraic structures in the representation (“neuralese”). I will then describe how today’s generative models, viewed as stochastic dynamical systems, represent and instantiate “meanings,” which are equivalence classes of expressions (tokenized data sequences), and view them alternatively as vectors or distributions over continuations of trajectories. These views allow representing asymmetric relations such as entailment and containment, and formalize the notion of control of the “state of mind” of AI bots, which is key to their safe and secure realization and deployment. Along the way, I will point to how current models, viewed mechanistically, can help reframe some old epistemological questions for the modern age. Finally, I will describe how these ideas can be used to address current challenges in measuring conceptual similarity in the context of privacy and attribution, and to ensure safety through disgorgement of information and knowledge stored in the models’ weights.

BIO:

Stefano Soatto is Vice President at AWS, where he has led the teams that have developed AWS AI applications in the areas of Vision, Speech, Language, and Verticals, including, most recently, Foundation Models as a Service. These include Amazon Bedrock, Amazon Titan Models, Amazon CodeWhisperer, and Amazon Q, in addition to Amazon Comprehend, Amazon DevOps Guru, Amazon Forecast, Amazon Kendra, Amazon Lex, Amazon Lookout for Vision, Lookout for Metrics, Lookout for Equipment, Amazon Personalize, Amazon Rekognition, Amazon Textract, Amazon Translate, and Amazon Transcribe. He is also a Professor of Computer Science and Electrical Engineering at UCLA, where he is the founding director of the UCLA Vision Lab, and a Fellow of the IEEE and the ACM.

Date/Time:
Date(s) - Feb 27, 2024
4:15 pm - 5:45 pm

Location:
3400 Boelter Hall
420 Westwood Plaza, Los Angeles, California 90095