Declarative Failure Recovery for Sensor Networks
Proceedings of the Sixth International Conference on Aspect-Oriented Software Development (AOSD 2007), Vancouver, British Columbia, March 12-16, 2007.
Ramakrishna Gummadi, Nupur Kothari, Todd Millstein, Ramesh Govindan
Wireless sensor networks consist of a system of distributed sensors
embedded in the physical world, and promise to allow observation of
previously unobservable phenomena. Since they are exposed to
unpredictable environments, sensor-network applications must handle
a wide variety of faults: software errors, node and link failures,
and network partitions. The code to manually detect and recover from
faults crosscuts the entire application, is tedious to implement
correctly and efficiently, and is fragile in the face of program
modifications. We investigate language support for modularly
managing faults. Our insight is that such support can be naturally
provided as an extension to existing "macroprogramming" systems
for sensor networks. In such a system, a programmer describes a
sensor network application as a centralized program; a compiler then
produces equivalent node-level programs. We describe a simple
checkpoint API for macroprograms, which can be automatically
implemented in a distributed fashion across the network. We also
describe declarative annotations that allow programmers to specify
checkpointing strategies at a higher level of abstraction. We have
implemented our approach in the Kairos macroprogramming
system. Experiments show that it improves application availability by
an order of magnitude while incurring low messaging overhead.
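
To make the checkpoint idea concrete, here is a minimal single-process sketch of checkpoint and rollback around a centralized-style loop over sensor nodes. It is an illustration only, not the Kairos API: `CheckpointStore`, `NodeFailure`, `read_sensor`, and `sum_readings` are hypothetical names, node failures are simulated with a set of failed node IDs, and the distributed implementation the abstract describes is not modeled.

```python
import copy


class NodeFailure(Exception):
    """Raised when a (simulated) sensor node cannot be reached."""


class CheckpointStore:
    """Hypothetical checkpoint/rollback helper; not the Kairos API."""

    def __init__(self):
        self._saved = None

    def checkpoint(self, state):
        # Deep-copy so later mutations do not alter the snapshot.
        self._saved = copy.deepcopy(state)

    def rollback(self):
        if self._saved is None:
            raise RuntimeError("no checkpoint taken")
        return copy.deepcopy(self._saved)


def read_sensor(node_id, readings, failed):
    # Assumption: a failed node is modeled as membership in `failed`.
    if node_id in failed:
        raise NodeFailure(node_id)
    return readings[node_id]


def sum_readings(readings, failed):
    """Centralized-style loop over nodes that recovers from node failures."""
    store = CheckpointStore()
    state = {"total": 0, "visited": []}
    for node_id in sorted(readings):
        store.checkpoint(state)  # snapshot before touching this node
        try:
            state["visited"].append(node_id)  # partial update before the read
            state["total"] += read_sensor(node_id, readings, failed)
        except NodeFailure:
            # Discard the partial update and continue with the next node.
            state = store.rollback()
    return state


result = sum_readings({1: 10, 2: 20, 3: 30}, failed={2})
```

In this sketch the recovery logic sits in one place (the `except` branch) rather than being threaded through every step of the application, which is the modularity concern the abstract raises.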