July 11, 2000
Conference tracks advances in computer vision
By R. Colin Johnson

Intel Corp. (Santa Clara, Calif.) also chose the conference to announce
and distribute its proposed open-source-code library of computer vision
algorithms. So far the library includes more than 400 entries for
everything from calibrating cameras to recognizing hand gestures.
"We think that an open-source computer vision library will provide the
infrastructure for integrating computer vision into everyday
applications," said Mark Holler, manager at Intel's Microprocessor
Research Lab. Among the nuggets already in the library are camera
calibration functions, licensed from the California Institute of Technology
(Pasadena, Calif.), that let developers capture a large field of view with
a wide-angle lens and then correct for the lens distortion.
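The correction those calibration functions perform can be sketched with the common two-coefficient radial distortion model. The function name, the coefficients k1 and k2, and the iteration scheme below are illustrative assumptions, not the library's actual API:

```python
import numpy as np

def undistort_points(pts, k1, k2):
    """Correct radial lens distortion for normalized image points.

    Assumes the common two-coefficient radial model, in which a
    distorted point is the undistorted point scaled by
    (1 + k1*r^2 + k2*r^4), and inverts it by fixed-point iteration.
    """
    pts = np.asarray(pts, dtype=float)
    undistorted = pts.copy()
    for _ in range(10):  # a few iterations suffice for typical k values
        r2 = np.sum(undistorted**2, axis=1, keepdims=True)
        undistorted = pts / (1.0 + k1 * r2 + k2 * r2**2)
    return undistorted
```

Applying the forward distortion model to the returned points should reproduce the original distorted coordinates, which is a convenient self-check.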
"We've also licensed a face recognition routine from Georgia Tech that
should be immediately useful to researchers," said Holler. At the
conference, Intel gave away 260 free CDs with the complete library,
including source code and optimized compiled code for Intel's CPUs. It can
be downloaded from the Intel
Web site.
Stealing the show at this year's conference were live demonstrations
showing off the fruits of research on tracking and rendering 3-D models.
Some of the most interesting were designed to extract three-dimensional
models from informal sequences of 2-D images that were shot with a video
camera, such as creating a wire frame of a chess board by just waving it
in front of a video camera. Both Sarnoff Corp. (Princeton, N.J.) and
Geometrix Inc. (San Jose, Calif.) showed such real-time estimation
algorithms, which extract multiview 3-D models from ad hoc 2-D image
sequences.
Stereo vision

Point Grey Research (Vancouver, British Columbia) showed an algorithm
that tracked and counted people from a stereo-video camera setup.
Another demo used stereo vision to track human heads alone, and a third
demonstrated how to create geometrically correct 3-D models of buildings
by using preexisting knowledge about urban environments.
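All of these stereo demos rest on the same triangulation relation: a feature's depth is the focal length times the camera baseline divided by the feature's disparity between the two views. A minimal sketch, with an illustrative function name and example numbers:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard stereo triangulation: z = f * B / d.

    disparity_px -- pixel offset of the feature between left/right views
    focal_px     -- camera focal length, in pixels
    baseline_m   -- distance between the two cameras, in meters
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

Near objects produce large disparities and small depths; as disparity shrinks toward zero, depth (and its uncertainty) grows, which is why such systems work best at close to moderate range.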
Stefano Soatto's coup was a new algorithm that makes sophisticated
tracking possible on a PC.
"For a long time, researchers have been trying to track 3-D objects in
real-time, but their mistake was not to require that their models be
observable," he said. "Observability means that the initial condition is
uniquely determined in my model by the current state."
Where previous models statistically regress from archived data,
Soatto's algorithm uses a Kalman filter to predict the future location
of feature points from their current location, which the observability
requirement guarantees is uniquely determined.
Soatto's algorithm addresses a classical problem known in the jargon as
shape-from-motion estimation. According to Soatto, almost all other
purely software approaches to this problem collect their data ahead of
time, so that "future" events can be regressed back to the observed
initial conditions, thereby inferring a non-causal model. Real-time
tracking, he said, must begin with a causal model that predicts future
events, then corrects itself when the future arrives and the model's
predictions are in error.
Soatto created his causal model of real-time motion by configuring the
problem's parameters as a nonlinear Kalman filter. It predicts future
responses from current control actions.
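The predict-then-correct cycle described above can be illustrated with a plain linear Kalman filter tracking one feature point under a constant-velocity assumption. Soatto's actual filter is nonlinear and estimates 3-D structure; the state layout, noise values, and function names here are simplified assumptions for illustration only:

```python
import numpy as np

# State for one feature point in the image plane: [x, y, vx, vy].
F = np.array([[1, 0, 1, 0],   # transition model: position += velocity
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],   # we observe only the point's position
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 1e-4          # process noise (assumed)
R = np.eye(2) * 1e-2          # measurement noise (assumed)

def predict(x, P):
    """Causal step: predict the next state before seeing the image."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Correction step: fold in the measurement once it arrives."""
    y = z - H @ x                        # innovation: prediction error
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P
```

Fed a feature drifting at constant velocity, the filter quickly locks onto both its position and its (unobserved) velocity, which is exactly what lets it predict where the feature will appear next.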
Soatto's algorithm assumes only that images from a video camera are
collected in a relatively smooth sequence. With that scant prerequisite,
he was able to prove his model reliable and robust even when objects
change orientation or become occluded.
"We handled missing feature points by setting their variance to
infinity or by merely deleting them from the Kalman filter matrix," said
Soatto. This can be done because each row of Soatto's implementation of
the filter matrix is decoupled from the others, thereby allowing a feature
and all its past states to be simultaneously deleted.
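Both tricks Soatto describes can be sketched in the same simplified measurement-covariance terms as the filter above; the function names and the large-variance constant are illustrative, not his implementation:

```python
import numpy as np

def deweight_occluded(R, occluded_rows, big=1e12):
    """Give an occluded feature's measurement (near-)infinite variance.

    The Kalman gain for that measurement row then goes to ~zero, so the
    filter effectively stops listening to the missing feature without
    restructuring any matrices."""
    R = R.copy()
    for i in occluded_rows:
        R[i, i] = big
    return R

def drop_feature(H, R, rows):
    """Alternatively, delete the feature's measurement rows outright --
    legal when, as in Soatto's implementation, each row of the filter
    matrix is decoupled from the others."""
    H2 = np.delete(H, rows, axis=0)
    R2 = np.delete(np.delete(R, rows, axis=0), rows, axis=1)
    return H2, R2
```

Deweighting keeps the matrices a fixed size, which is simpler; deletion shrinks the per-frame computation, which matters when many features vanish at once.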
More than 300 papers were presented at this year's conference on
subjects such as tracking objects, rendering 3-D models from 2-D video
cameras, retrieving unindexed images and recognizing shapes, faces and
postures. The event also featured tutorials on illumination, color and
motion rendering. More than 500 researchers attended.