Speaker: Alon Halevy
Affiliation: Recruit Institute of Technology
ABSTRACT: In recent years, the major search engines have started displaying answers from structured data sources in response to Web search queries. These answers are based on vast knowledge bases about notable entities in the world, their properties, and the relationships between them. They focus on head queries, are required to be highly accurate, and must be non-controversial. In this talk I will describe a few efforts to expand the breadth of answers from structured data. First, I will describe recent results from the WebTables Project, in which we mine a corpus of HTML tables on the Web that contain high-quality data about long-tail content and present them in response to relevant queries. Second, I will describe the Biperpedia Project, which aims to considerably expand the schema on which the Google Knowledge Graph is based. Biperpedia mines attribute names from the query stream and from Web text, and contains the long and heavy tail of attributes that are of interest to users. This unique collection of attribute names presents new opportunities to understand the space of queries concerning structured data. Finally, I will describe the Surveyor System, which discovers high-confidence subjective facts about properties of entities. Surveyor uses a probabilistic model of content generation on the Web to analyze the content of Web documents and decide whether there is a dominant opinion about such subjective facts.
BIO: Alon Halevy is the Executive Director of the Recruit Institute of Technology. From 2005 to 2015 he headed the Structured Data Management Research group at Google. Prior to that, he was a professor of Computer Science at the University of Washington in Seattle, where he founded the Database Group. In 1999, Dr. Halevy co-founded Nimble Technology, one of the first companies in the Enterprise Information Integration space, and in 2004 he founded Transformic, a company that created search engines for the deep web and was acquired by Google. Dr. Halevy is a Fellow of the Association for Computing Machinery, received the Presidential Early Career Award for Scientists and Engineers (PECASE) in 2000, and was a Sloan Fellow (1999-2000). He is the author of the book “The Infinite Emotions of Coffee”, published in 2011, and serves on the board of the Alliance of Coffee Excellence. He is also a co-author of the book “Principles of Data Integration”, published in 2012. Dr. Halevy received his Ph.D. in Computer Science from Stanford University in 1993 and his bachelor's degree from the Hebrew University in Jerusalem.
Hosted by Professor Carlo Zaniolo
REFRESHMENTS at 3:45 pm, SPEAKER at 4:15 pm
Date(s) - Feb 04, 2016
4:15 pm - 5:45 pm