CS 201: Program Analysis in the Era of Software 2.0, XIN ZHANG, MIT

Speaker: Xin Zhang
Affiliation: MIT

ABSTRACT:

Automated program analysis has been widely adopted to ensure software quality. With the software industry experiencing a major shift to machine learning, the program analysis community is facing both opportunities and challenges. While advances in machine learning provide new toolkits to build better analyses, unique properties of ML programs pose challenges to design new analyses that are effective on them.

In the first half of my talk, I will present an online-learning-based approach for generating optimal program approximations for individual scenarios. Deciding what approximations to apply is a key challenge in designing an effective program analysis, as approximations control soundness, precision, and scalability of an analysis. The state-of-the-art relies on an expert analysis designer to choose fixed approximations. These approximations, however, often fail to meet the needs of individual usage scenarios as software artifacts and analysis clients become increasingly diverse. To address this challenge, we propose a framework that adapts a given analysis to user feedback, individual assertions of interest, and the characteristics of the program at hand by learning from information it obtains at runtime. To enable learning, our approach converts a conventional program analysis to a probabilistic analysis that is expressed using Markov Logic Networks. Our empirical evaluation shows that this approach substantially reduces the number of false alarms produced by real-world analyses on large and widely used Java programs.

In the second half, I will talk about novel analyses to ensure interpretability and verify fairness of ML programs. These two properties are attracting growing interest as ML programs are increasingly being used in making critical decisions. For ensuring interpretability, I will present a new analysis to generate actionable feedback to judgments made by a neural network with ReLU activations in the form of minimal, stable, and symbolic corrections. Our analysis reduces the problem to a series of linear constraint satisfaction problems and is evaluated on three neural networks with upto 4,096 neurons. For verifying fairness, I will present an efficient analysis based on adaptive concentration inequalities that provides probabilistic guarantees. We show that our algorithm can scale to problem instances more than 10^6 larger than the state-of-the-art with very low probability of error (10^{-10}).

BIO:

Xin Zhang is a postdoctoral associate at MIT CSAIL. His research areas are programming languages and software engineering. His work has received Distinguished Paper Awards from PLDI’14 and FSE’15. Xin received his Ph.D. from Georgia Tech in 2017 which was partly supported by a Facebook Fellowship.

Hosted by Professor Harry Xu

Date/Time:
Date(s) - Dec 05, 2019
4:15 pm - 5:45 pm

Location:
3400 Boelter Hall
420 Westwood Plaza Los Angeles California 90095