CS 201 | Yezhou Yang, Arizona State University

“Compositional, efficient, and robust: Visual Concept Learning in the #GenAI era”

Abstract:
The goal of Computer Vision, as coined by Marr, is to develop algorithms to answer What are Where at When from visual appearance. The speaker, among others, recognizes the importance of studying underlying entities and relations beyond visual appearance, following an Active Perception paradigm. This talk will present the speaker’s efforts over the last decade, ranging from 1) reasoning beyond appearance for vision and language tasks (VQA, captioning, T2I, etc.), and addressing their evaluation misalignment (SMURF, ConceptBed), through 2) reasoning about implicit properties (such as spatial consistency – SPRIGHT and REVISION), to 3) their roles in efficient (ECLIPSE) and reliable (WOUAF, R.A.C.E.) image #GenAI models.

Date/Time:
Date(s) - Oct 03, 2024
4:00 pm - 5:45 pm

Location:
3400 Boelter Hall
420 Westwood Plaza, Los Angeles, California 90095