Faculty Candidate Talk – Shumo Chu – Automated Reasoning of Database Queries – University of Washington

ABSTRACT: From booking air tickets to analyzing astronomy datasets, data management systems are pervasive in people’s work and life. In this talk, I argue that applying formal methods techniques can help us to build more reliable and secure data management systems. My talk will focus on a fundamental milestone towards this goal, automated reasoning of database queries. I will present Cosette, the first tool for reasoning the semantic equivalence of SQL queries, which leverages both an interactive theorem prover and a SMT solver. To reason SQL queries in these proof systems, we developed a formal semantics of SQL based on semirings, which covers the major SQL features, including sophisticated ones such as grouping, aggregate, correlated subqueries, and integrity constraints. As a result, the reasoning of SQL equivalences is lifted to reasoning equivalences of extended semiring expressions, which is much easier. For a pair of SQL queries, Cosette will either certify their equivalences by automatically finding a mechanized proof and validating the proof in an interactive theorem prover, or show their inequivalence by providing a counter-example that is found by a SMT solver. Cosette can decide the equivalence or provide counter example for a wide range of practical SQL queries collected from database literature, real-world optimizer rules and bugs, and data management class homework assignment from UW. Last, I will conclude with future directions on how to apply formal methods to build more reliable, secure, and privacy preserving future data management systems.

BIO: Shumo Chu is a doctorate student at the University of Washington. His research interests are in data management and formal methods. Shumo’s dissertation focus on developing theories, algorithms, and systems for automated reasoning of database queries. This requires leveraging both interactive theorem prover and constraint solver techniques. In the past, he had worked on large scale graph processing and query processing for distributed database systems. He also did internships with DMX group at Microsoft Research and Google’s Spanner team. He had received the best demo award from SIGMOD and “best of” citation of KDD.

Date/Time:
Date(s) - Apr 04, 2019
12:00 pm - 1:00 pm

Location:
Engineering VI – Room 289
404 Westwood Plaza Los Angeles California 90095