CS240B
Spring 2008


ADVANCED DATA BASES and KNOWLEDGE BASES

Instructor: Carlo Zaniolo

Office Hours: Tuesdays 3:00-5:00pm



Research Presentations

June 4

Burst Detection in Data Streams

Association Rule Mining by Cheuk Yu Yeung


May 28

Gigascope: A Stream Database for Network Applications by Swathi Ganapathi

Mining for Frequent Sequences & Closed Itemsets by Grace Shih


May 21

XML Query Systems by Yung-Cheng Chen

Load Shedding Techniques in Aurora by Zhen Huang

May 19

Continuous Query Support (in DBMS and DSMS) by Alexander Shkapsky

Clustering Data Streams: Birch, Stream, Clu-Stream by MyungWon Ham

May 14

Filtering XML Documents with XPath by Nickolaus Phan

XPath on Streaming XML using YFilter and Xaos by Daniel Carney

May 12

Density-Based Clustering & Visualization of Cluster Structure by Nikolay Pavlovich Laptev




Week 4. Assignment (Due Wednesday, April 30)

Download the Weka system and use it to do the following project.

Week 3. Assignment (Due Monday, April 21)

Although DSMS are now taking a leadership role in the Complex Event Processing (CEP) field, many other approaches are actively being pursued by commercial vendors and researchers. Your assignment is to compare the following CEP systems and products:

  • DSMS startups, including Streambase, Coral8, Apama, and Truviso (and any other startup company you might know or discover through the web, e.g., Oracle CEP extensions)
  • Systems and products that support CEP using languages other than SQL and XML. For that you might find the following references useful:
  1. www.unix.com/blogimgs/EVENT.PROCESSING.LANGUAGE.SURVEY.V14.OCT.15.2006.pdf
  2. Dagstuhl Seminar Proceedings 07191: Event Processing
  3. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems by David Luckham

Please write a short report (1400 words or less) to describe the history, techniques and application forte of the different approaches to CEP, and the systems implementing them.

Week 2. Assignment. Solutions can be found here.

Read the following three papers:

* Study K. Guion's overview on OLAP Functions.

* Study the proposed Match-Recognize ANSI Standards for SQL (SQL-MR for short). See also the discussion blog on SQL-MR.

* Study Models and Issues in Data Stream Systems, by B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Invited paper in Proc. of PODS 2002, June 2002.

Task 1.1. Given the temporal table:

emp-history( E#, Dept, Sal, Tstart, Tend)

Use SQL:2003 OLAP functions to compute the relation

emp-history( E#, Dept, Tstart, Tend)

obtained from the previous relation by projecting out Sal and coalescing the time intervals. (You can find a solution to a similar problem in these slides and this paper.)
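To make the expected result of Task 1.1 concrete, here is a minimal Python sketch of what coalescing means. It is not the requested SQL:2003 solution, only an illustration of the target output. It assumes, beyond what the task states, that Tstart and Tend are comparable values such as integers, and that two intervals for the same (E#, Dept) are merged whenever the later one starts at or before the point where the earlier one ends. The function name `coalesce` and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def coalesce(rows):
    """Project out Sal and merge adjacent or overlapping intervals.

    rows: iterable of (emp, dept, sal, tstart, tend) tuples.
    Returns a list of (emp, dept, tstart, tend) tuples.
    Assumption: intervals that touch (start == previous end) are merged.
    """
    # Drop Sal, then sort so each (emp, dept) group is contiguous
    # and ordered by start time.
    projected = sorted((e, d, s, t) for e, d, _sal, s, t in rows)

    result = []
    for (emp, dept), group in groupby(projected, key=itemgetter(0, 1)):
        cur_start = cur_end = None
        for _, _, start, end in group:
            if cur_start is None:
                # First interval of this (emp, dept) group.
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Adjacent or overlapping: extend the current interval.
                cur_end = max(cur_end, end)
            else:
                # Gap found: emit the finished interval, start a new one.
                result.append((emp, dept, cur_start, cur_end))
                cur_start, cur_end = start, end
        result.append((emp, dept, cur_start, cur_end))
    return result


# Hypothetical sample data: employee 1 gets a raise while staying in
# Sales, moves to R&D, then returns to Sales.
sample = [
    (1, "Sales", 50, 1, 5),
    (1, "Sales", 60, 5, 9),
    (1, "R&D", 70, 9, 12),
    (1, "Sales", 80, 12, 15),
    (2, "R&D", 40, 3, 8),
]
coalesced = coalesce(sample)
```

With this sample, the two consecutive Sales rows for employee 1 collapse into a single interval from 1 to 9, while the later Sales period from 12 to 15 stays separate because the R&D period lies in between.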

Task 1.2. Express in SQL-MR the example SQL-TS queries in the slides.

Task 1.3. Write a short (one page or less) report on SQL-MR explaining the parts that, in your view, have problems or require clarification (e.g., my view is that outputs are not clear in the presence of alternative patterns).