Speaker: Xin Zhou
Title: Unifying the Processing of XML Streams and Relational Data Streams
Time: 12:30-2:00
Room: BH 4549
Abstract
Relational data streams and XML streams have previously
provided two separate research foci, but their unified
support by a single Data Stream Management System
(DSMS) is very desirable from an application viewpoint. In
this paper, we propose a simple approach to extend relational
DSMSs to support both kinds of streams efficiently.
In our Stream Mill system, XML streams, expressed as SAX
events, can be easily transformed into relational streams,
and vice versa. This enables a close cooperation of their
query languages, resulting in great power and flexibility.
For instance, XQuery can call functions defined in our SQL-based
Expressive Stream Language (ESL) using the logical/physical
windows that have proved so useful on relational
data streams. Many benefits are also gained at the
system level, since relational DSMS techniques for load
shedding, memory management, query scheduling, approximate
query answering, and synopsis maintenance can now
be applied to XML streams. Moreover, the many FSA-based
optimization techniques developed for XPath and XQuery
can be easily and efficiently incorporated in our system. Indeed,
we show that YFilter, which is capable of efficiently
processing multiple complex XML queries, can be easily integrated
in Stream Mill via ESL user-defined and system-defined
aggregates. This approach produces a powerful and
flexible system where relational and XML streams are unified
and processed efficiently.
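
As a rough illustration of the SAX-to-relational idea described above, the following Python sketch flattens SAX events into tuples that a relational engine could consume as a stream. The tuple layout (seq, event, name, value), the class name SaxToTuples, and the sample document are assumptions made for this example; they are not the schema or code used by Stream Mill.

```python
import xml.sax


class SaxToTuples(xml.sax.ContentHandler):
    """Collects SAX events as flat (seq, event, name, value) tuples.

    The tuple layout is a hypothetical choice for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.rows = []

    def _emit(self, event, name, value):
        # seq preserves document order, so the XML can be rebuilt later.
        self.rows.append((len(self.rows), event, name, value))

    def startElement(self, name, attrs):
        self._emit("start", name, None)
        for attr_name in attrs.getNames():
            self._emit("attr", attr_name, attrs.getValue(attr_name))

    def characters(self, content):
        text = content.strip()
        if text:  # skip whitespace-only text nodes
            self._emit("text", None, text)

    def endElement(self, name):
        self._emit("end", name, None)


def xml_to_tuples(document: str):
    """Parse an XML string and return its SAX events as a list of tuples."""
    handler = SaxToTuples()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.rows


rows = xml_to_tuples('<quote sym="IBM"><price>81.5</price></quote>')
for row in rows:
    print(row)
```

Because every tuple carries a sequence number, the reverse direction (relational tuples back to XML) amounts to replaying the rows in order.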
Speaker: Feng Qiu
Title: Automatic Identification of User Interest For Personalized Search
Time: 12:30-2:00
Room: BH 4549
Abstract
One hundred users, one hundred needs. As more and more
topics are being discussed on the web and our vocabulary
remains relatively stable, it is increasingly difficult to let the
search engine know what we want. Coping with ambiguous
queries has long been an important part of the research on
Information Retrieval, but still remains a challenging task.
Personalized search has recently received significant attention
in addressing this challenge in the web search community,
based on the premise that a user's general preference may
help the search engine disambiguate the true intention of a
query. However, studies have shown that users are reluctant
to provide any explicit input on their personal preference.
In this paper, we study how a search engine can learn a
user's preference automatically based on her past click history
and how it can use the user preference to personalize
search results. Our experiments show that users' preferences
can be learned accurately even from little click-history data
and personalized search based on user preference yields significant
improvements over the best existing ranking mechanism
in the literature.
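
To make the idea concrete, here is a minimal Python sketch of learning a preference from click history and using it to re-rank results. It is not the method of the paper: the topic labels, the linear blending weight, and the function names learn_preference and personalize are all assumptions chosen for illustration.

```python
from collections import Counter


def learn_preference(clicked_topics):
    """Estimate a topic preference as the fraction of past clicks per topic.

    Assumes each clicked page already carries a topic label (hypothetical).
    """
    counts = Counter(clicked_topics)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


def personalize(results, preference, weight=0.5):
    """Re-rank (url, base_score, topic) results.

    The final score blends the engine's base score with the user's
    preference for the result's topic; `weight` is an assumed parameter.
    """
    def score(item):
        _url, base, topic = item
        return (1 - weight) * base + weight * preference.get(topic, 0.0)

    return sorted(results, key=score, reverse=True)


# Hypothetical click history for a user who mostly reads about cars.
history = ["cars", "cars", "animals", "cars"]
pref = learn_preference(history)

# Hypothetical results for the ambiguous query "jaguar".
results = [
    ("jaguar-cats.example", 0.9, "animals"),
    ("jaguar-cars.example", 0.8, "cars"),
]
ranked = personalize(results, pref)
print(ranked[0][0])
```

With this history, the car-related page moves ahead of the animal page even though its base score is lower, which is the kind of disambiguation the abstract describes.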
Speaker: Hyun Jin Moon
Title: Support for Historical Queries and Schema Evolution in XML and Relational DBMS 
Time: Friday (Mar. 3) 12:30-2:00pm
Room: BH4549
Abstract
Schema is the interface between the database and the applications:
the database is organized under a schema, and application queries
are written against the schema. For this reason, it is desirable that
the schema remain unchanged. However, in real-world scenarios,
schemas do change many times during their lifetime, posing a host of
challenging schema evolution problems in information system
research. In this paper, we first consider the problem in archival
information systems, i.e., systems that preserve the history of the
database content and support temporal queries on such history. We
discuss existing approaches to archival databases and temporal
queries in the situation where the schema has remained unchanged,
and only the database has evolved over time. Then, we concentrate on
the more difficult problem of supporting temporal queries when the
schema has also evolved over time, resulting in multiple versions of
schema, and multiple versions of the database under each schema
version. To address this challenging problem, we propose an
XML-based approach to represent the combined history of database
schema and content, and mapping techniques to translate queries
between different versions of schemas. Then, we turn to the problem
of schema evolution in current databases, and explore the use of
similar techniques to support a more gradual transition of the
database and applications from the old schema to the current one.
Our objective is to address these two independent, but
closely-related problems within a unified framework.
Speaker: Yan-Nei Law
Title: Models and Operators for Continuous Queries on Data Streams 
Time: Friday (Feb. 24) 12:30-2:00pm
Room: BH4549
Abstract
A new generation of data-intensive applications is emerging for
managing and querying information that, rather than residing in
databases, flows continuously through the network in the form of
massive data streams. Hence, there is much research work on
designing Data Stream Management Systems, and the approach favored
by many research projects consists in extending database languages
and technology for data streams. However, the new computational
environment brings significant research challenges in areas such
as query languages, query processing, and advanced applications.

In this talk, we first focus on the limitations of
relational languages in expressing continuous stream queries. A
main limitation follows from the fact that only nonblocking
operators can be used in continuous queries, which makes
relational languages incomplete on these queries. To address this
problem, we investigate user-defined aggregates natively defined
in SQL itself, and prove that these make SQL (i) Turing-complete
on stored data, and (ii) complete on data streams. Furthermore, we
illustrate the effectiveness of the proposed extensions on complex
applications involving time-series queries, and mining queries.
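
The blocking versus nonblocking distinction above can be sketched in a few lines of Python. This is only an illustration of the concept, not the SQL-based aggregate definitions discussed in the talk; the function names blocking_avg and running_avg are assumptions.

```python
from itertools import count, islice
from typing import Iterable, Iterator


def blocking_avg(stream: Iterable[float]) -> float:
    """Blocking: must consume the whole input before returning anything,
    so it never produces a result on an unbounded stream."""
    values = list(stream)
    return sum(values) / len(values)


def running_avg(stream: Iterable[float]) -> Iterator[float]:
    """Nonblocking: emits an updated result after every incoming tuple,
    keeping only constant-size state (count and total)."""
    n, total = 0, 0.0
    for value in stream:
        n += 1
        total += value
        yield total / n


readings = [10.0, 20.0, 60.0]
print(list(running_avg(readings)))  # [10.0, 15.0, 30.0]

# The nonblocking version also works on an unbounded stream:
# take the first three outputs over 1, 2, 3, ...
first_three = list(islice(running_avg(count(1)), 3))
print(first_three)  # [1.0, 1.5, 2.0]
```

Calling blocking_avg on count(1) would never return, which is why only operators of the second kind can appear in continuous queries.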

For advanced applications, we focus on data-stream mining
algorithms, which must now be redesigned to make lighter demands
on resources and display greater adaptability than those on stored
data. In this talk, we introduce ANNCAD, which uses
multi-resolution data representation to classify new test points
using the nearest-neighbors principle. The incremental property
and very fast update speed make ANNCAD very suitable for mining
data streams. Our experiments show that ANNCAD is adaptive and
works well in many applications, including image recognition and
sensor surveying.

We then study the problem of stream query processing. We propose a
load-shedding technique for multi-join, called Msketch, which
makes decisions based on the productivity of tuples, rather than
only the content of the joined pair stream. A thorough study shows
that Msketch outperforms other existing algorithms. Finally, we
propose general techniques for optimizing the accuracy of window
aggregates, statistical aggregates and mining queries in the
presence of sampling. Our method incorporates prior knowledge into
an error model that is used to reduce the uncertainty introduced
by sampling. We also extend the method to adjust to concept
shifts.
Speaker: Professors Arne and Ingeborg Solvberg
Title: Conceptual Modelling for the Semantic Web
Time: Friday (Feb. 10) 12:00-2:00pm
Room: BH4549
Abstract
Over the last 10 years the web has emerged as a primary vehicle for
disseminating information among people and among businesses. Along with
this comes an increased need for interoperability, e.g., in order to
support value chain organised business. In technical terms this
translates to an increased need for enterprise modelling and for
information content modelling. For both of these purposes ontologies are
of central importance.

The talk will present a view of what constitutes the central issues.
Ongoing research at NTNU in Trondheim will be presented on model
management, semantic annotation of process models, and ontology alignment.

The talk will be rounded off by a short overview of a new initiative of
establishing a field test laboratory for developing and testing out
mobile information services. The laboratory comprises a wireless
broadband service covering a substantial part of downtown Trondheim.
About the speakers
Professors Arne and Ingeborg Solvberg of NTNU (Norwegian
University of Science and Technology) at Trondheim, Norway
http://www.ntnu.no/indexe.php are staying with UCLA's CS department during
the spring of 2006. They are hosted by Wes Chu.

Arne's research area is Information Systems
http://www.idi.ntnu.no/grupper/is/. Ingeborg's area is Digital
Libraries http://www.idi.ntnu.no/grupper/if/.