Professor Kim has taken a leadership role in defining the emerging area of Software Engineering for Data Analytics (SE4DA) and was invited to give a Distinguished Lecture at both UIUC Computer Science and the University of Minnesota’s Cray Distinguished Speaker Series. UIUC’s Distinguished Speaker Series invites CS leaders to promote conversations about important challenges in computing. UMN’s Cray Colloquium Series was established in 1981 by an endowment from Cray Research and brings distinguished visitors to the Department of Computer Science & Engineering every year. In her lecture, Professor Kim discussed how we are at an important juncture where software engineering meets the data-centric world of big data, machine learning (ML), and artificial intelligence (AI). Based on her large-scale study of almost 800 professional data scientists in the software industry, she argued for retargeting software engineering research to address new challenges in data-centric software development. She then showcased her group’s research on productivity tools for debugging and testing data-intensive applications, namely data provenance, symbolic-execution-based test generation, and automated fuzz testing in Apache Spark. She concluded by discussing open problems in SE4DA.

