Jacques J. Vidal, UCLA Computer Science Department and UCLA Brain Research Institute
It is a safe prediction that, not very far into the next century, computer technology will have dragged a significant portion of human activities into the world we are now beginning to call cyberspace, a world without historical precedent, with a structure that almost invalidates all the familiar space and time boundaries that have been the bedrock of experience for millenniums.
At this time, many still perceive the arrival of computers in the past few decades as just the last of a series of the industrial events that have shaped modern life, such as railroads, automobiles or airplanes, or as a continuation of the revolution in communications created by the telegraph and the telephone. Certainly, the distant ears and eyes of Radio and Television brought the sounds and the vistas of other places and other cultures inside our home and enriched life even in distant and backward villages. Yet it can be argued that these have extended rather than transformed the range of human experience.
By all indications the computer revolution is different and considerably more socially subversive. Computers have progressively expanded their stature from their initial actuarial, library or scientific roles to invade more and more intimate aspects of human life, giving birth to a world of their own. The value added by electronic communication to almost all human endeavor is making this network a mandatory presence in modern life. Indeed, for cyberspace citizens this virtual world of electronic information will often be the environment where one might find gainful work, cultural nutrition, learning, entertainment and even social intercourse. Computers are busy progressively reshaping the way we experience the world.
The technological phenomenon that supports this revolution is the wrapping of the world into an increasingly dense electronic communication network, where information flows at the speed of light and for which national borders, time zones and physical distances are irrelevant details. The most prominent manifestation of this phenomenon is the Internet, a continually expanding and evolving computer network that nearly doubles in size every year, spans the whole world and progressively claims and molds to its needs all available transmission media, from ordinary phone lines to satellites or fiber-optic links. The Internet makes interactive information distribution, information gathering, commercial transactions and personal communication instantly available across the world for the benefit of governments, academies and commercial companies. At this writing, some fifty million human users already rely on Internet resources to communicate with each other and to access information and services at over ten million hosts across the world.
The precursor of the Internet, Arpanet, was created around 1969 and was sponsored by the U.S. Advanced Research Projects Agency (ARPA) mainly as a research tool for computer scientists and US military and government agencies. The Computer Science Department at the University of California Los Angeles (UCLA) played a key role in Arpanet development and maintained one of the initial servers of the nascent network.
By the nineties, Arpanet had become the Internet and was no longer confined to government and academia. It became a phenomenon of society, an outcome that very few of those who had witnessed its birth had anticipated.
The computers or "hosts" that form the Internet use a common set of communication standards or "protocols" referred to as TCP/IP. Computer messages are split into chunks of binary symbols of limited size called packets, each containing the host of origin, the destination host and a slice of the information being transmitted. Packets are sent through the network as separate units which will often reach their destination by different routes. Reassembly takes place at the destination site. These protocols and this packet switching form the critical technologies that the present Internet has inherited from Arpanet.
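The splitting and reassembly just described can be illustrated with a minimal Python sketch. It is not an implementation of TCP/IP: the `Packet` fields, the host names and the explicit `sequence` number (added here so that slices can be put back in order) are assumptions made for illustration only.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Packet:
    source: str       # host of origin
    destination: str  # destination host
    sequence: int     # assumed field: position of the slice in the message
    payload: bytes    # slice of the information being transmitted


def split_message(message: bytes, source: str, destination: str,
                  size: int = 8) -> List[Packet]:
    """Cut a message into packets carrying at most `size` payload bytes."""
    return [
        Packet(source, destination, seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Rebuild the message at the destination, whatever the arrival order."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    return b"".join(p.payload for p in ordered)


# Hypothetical hosts; shuffling stands in for packets taking different routes.
packets = split_message(b"Packets may take different routes.", "host-a", "host-b")
random.shuffle(packets)
restored = reassemble(packets)
```

Because each packet carries its own addressing and ordering information, the network is free to route every packet independently, which is the essence of packet switching.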
Agents and Cognitive Robots
A recent new frontier in the field of Artificial Intelligence is that of Intelligent Software Agents, a computer program category that is still searching for a precise definition or consensus (Wooldridge M. and Jennings N.R., 1995, Rieker D., 1994). Software agents perform services for human computer users, but possess some qualities of autonomy, reasoning and decision power, and are able to take initiatives and perform actions. Current implementations of that concept are still limited and typically confined to narrow fields of specialization. It is likely however that the near future will see a proliferation of autonomous software agents.
A specific characteristic of agents is mobility, in contrast with programs running in a fixed host environment (Ferguson I.A., 1992). The distributed world of the Internet, this network of networks, epitomizes the kind of playground suitable for deploying intelligent agents. Recent technologies like Java and ActiveX have created much excitement precisely because they are suited for the autonomy, portability and mobility that software agents will require. The difficulties involved are considerable and the ultimate goals are among the most ambitious challenges facing Artificial Intelligence. Hence progress has been relatively slow.
It is now recognized that one impediment to agent development is the present communication procedure used on the web, the so-called client-server protocol. This protocol implies directional constraints on message passing and regular interruption of connection. It is inadequate to implement agents that need to communicate with other agents over open paths, in a peer-to-peer fashion. The future evolution of Internet protocols is expected to remedy that situation.
Again the US Department of Defense has taken notice and launched a large program called I*3, for "Intelligent Integration of Information". Its goal is to master new technologies to deal with dynamically changing, potentially inconsistent, incomplete and heterogeneous data sources. In other words, it calls for agents capable of accommodating conflicting goals, choosing among them and reasoning about the means to reach those that have been chosen. This involves a number of difficult problems in Artificial Intelligence, such as the representation of beliefs about present and future situations.
The virtual world into which agents must evolve will also be populated with other agents, whose cooperation will be necessary in order to accomplish the goal. Cooperation between agents will require the autonomous generation of deliberate messages. In yet another twist of such scenarios, some of those roaming agents can also be hostile, in the fashion of already well known destructive software like viruses, Trojan horses and logic bombs. One sees the eventual emergence of a shadow society of software robots, with its own indigenous sociology, that would become deeply threaded into human society in significant ways, albeit still difficult to predict.
In the context of human-computer interfaces, one particular class of intelligent agents assumes a special significance: that which will embody a model of the user to perform its personalized services. These User Agents would operate as the images or alter egos of their human counterparts. To realistically emulate human intelligent behavior, a user's agent will likely consist of a whole society of sub-agents interacting and communicating with each other. In "The Society of Mind", Marvin Minsky attempts to explain human intelligence with precisely this distributed multi-agent concept. The book provides some good insight into the feasibility as well as the complexity of the task.
Again, existing examples of user agents tend also to be limited to narrow fields and to the discovery and embodiment of only static characteristics of their masters. One can find server programs on the World Wide Web that, for instance, will advise a user on a choice of musical titles to listen to or buy. This requires a model of the user, and his or her tastes obtained from an interactive inquiry. The future is likely to reveal much more sophisticated autonomous agents, capable of taking into account dynamically the current mental state of the user. These provide one of the motivations for some of the bionic technology to be discussed later.
A more subtle, but just as profound and irreversible aspect of the invasion of computers in the cognitive environment of modern men is that the human-computer dialogue is progressively becoming more intimate and natural, to the point of competing with and often even displacing what heretofore would have been reserved to human communication.
Only a decade or so ago, dialoguing with a computer usually meant the keyboard typing of instructions into a terminal, using string oriented and often arcane script language. Because of efficiency and low overhead, this is still the option of choice in many situations, especially with experienced users and with people who typically have invested considerable time in learning to address computers on their own terms, namely professional programmers.
The arrival of graphic user interfaces (GUIs) using mice, tablets or joysticks was a breakthrough developed and nurtured in the seventies by a team at the Xerox PARC research center. With a graphic interface, part of the computer information is presented in the form of windows and icons which can be visually identified. The icons themselves can be moved and acted upon using natural hand movements and a mouse. The lessons learned then made possible the recruiting of a whole population of new computer users in the eighties, following Apple's introduction of the Lisa and the Macintosh. The new style of human-computer communication possesses considerably more appeal because it relies on more natural human behaviors and reduces the requirement for memorizing a collection of command names.
Creating an effective bridge between the human brain processes responsible for perception and inductive problem solving and the general symbol-manipulating capabilities of the computer is the goal that drives the developers of human-machine communication systems. Their goal is to narrow the gap that remains between the natural boundaries of the human body, namely limbs, special senses and tactile skin, and the input and display mechanisms that can be attached to computers. Indeed the avowed aim of human-computer interfacing is to blur awareness of the frontier between the real and the virtual world.
The major success story of the Internet, the World Wide Web, illustrates the trend. The Web is an immense distributed depository of mostly free information on essentially any subject. Documents on the Web, which were once limited to text, are increasingly becoming multimedia, i.e., they contain images, both still and moving, as well as sounds. Access is obtained by clicking labels or buttons on the screen rather than laboriously typing in queries. The multimedia technologies that proliferate on the Web have made human-computer interaction much more natural and also less threatening for many users. New tools are regularly appearing on the Web, bringing continually improved images and sounds into the unified realm of the Internet and recently allowing communication to take place in real time.
Several research groups are working on exotic interfacing projects financed by multinational companies. One case in point is a project of the MIT Media Lab called "Things That Think". It is a research consortium involving several MIT professors and some forty corporate sponsors. One of the motivations for this project is that the present arsenal of interfacing tools is often a redundant, heterogeneous collection of gadgets. These should shrink and be replaced by more capable unimedia components that are compact and portable. One particular concept is that of wearable computers, where ordinary articles like footwear and glasses are endowed with intelligent sensors and made part of the human-computer interface. Another is the use of intrabody signaling, using the human body itself as the electrical infrastructure of a local area network. This development and that of appropriate body net protocols is an important part of the MIT program, along with that of power sources using body motion rather than batteries. Affective computing, the integration of emotional states in the interchange, which will be evoked later, is also on the project agenda.
Virtual Reality (VR) can be viewed as a step beyond the generic graphic interfaces just mentioned. It forms a special class of applications where the human-machine interaction is a specific perceptual world designed to create a compelling illusion into which the user can become immersed (Pimentel K. and Teixeira K., 1993, Gigante M.A., 1993). Successful VR systems deliver an interactive sensorimotor experience that can be realistic enough to cause "suspension of disbelief" (Psotka J., 1993).
VR has existed for years and with considerable sophistication in flight simulators and many other military and civilian applications such as trainers for truck driving, tank warfare or missile launches. However, by their very nature, these applications did not concern large categories of users.
In the nineties however, civilian VR applications became a common sight at computer conventions and gained the attention of the popular media. The emergence was for a great part made possible by the vertiginous increases in computer power that had taken place in the decade, along with dramatic decreases in cost.
New peripheral devices have appeared to facilitate access into the virtual world and keep out the distractions of ordinary reality. Head-mounted displays are helmet-like devices that attach to the user's head and project the images created by the computer. Projection can be made from small cathode ray tubes (CRTs), light emitting diodes (LEDs) or liquid crystal displays (LCDs). Natural stereopsis can be emulated by presenting separately to each eye images whose viewpoints are horizontally shifted, letting the user's mind produce realistic three-dimensional views of objects. A process under exploration at the University of Washington, the Laser Retinal Scanner, would scan the image directly onto the retina, bypassing the display entirely. While still very experimental, this type of research suggests that the roster of possible display technologies is still far from closed.
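The stereoscopic principle mentioned above, two viewpoints shifted horizontally, can be made concrete with a small sketch. It assumes an idealized pinhole model for each eye, a 64 mm separation between the eyes and a unit focal length; none of these values come from the systems described here.

```python
def project_x(point_x: float, point_z: float, eye_x: float,
              focal_length: float = 1.0) -> float:
    """Horizontal image coordinate of a point for a pinhole eye.

    The eye sits at (eye_x, 0, 0) and looks down the +z axis; point_z must
    be positive (the point is in front of the viewer).
    """
    return focal_length * (point_x - eye_x) / point_z


def stereo_pair(point_x: float, point_z: float, eye_separation: float = 0.064,
                focal_length: float = 1.0):
    """Return the (left, right) image coordinates of the same point."""
    half = eye_separation / 2.0
    left = project_x(point_x, point_z, -half, focal_length)
    right = project_x(point_x, point_z, +half, focal_length)
    return left, right


def disparity(point_x: float, point_z: float, eye_separation: float = 0.064,
              focal_length: float = 1.0) -> float:
    """Horizontal offset between the two images; it shrinks with distance."""
    left, right = stereo_pair(point_x, point_z, eye_separation, focal_length)
    return left - right


near = disparity(0.0, 0.5)  # object half a meter away
far = disparity(0.0, 5.0)   # object five meters away
```

In this model the disparity equals the focal length times the eye separation divided by depth, so nearby objects are shifted more between the two images than distant ones. That difference is the cue the visual system turns into a sense of depth.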
By using position sensors outside the head mounted display, the view can be made to respond to changes in the user's visual point of view. This brings a strong sense of reality to the virtual space by making the projected scenery respond to the user's head movements.
Earphone delivery of truly multidirectional sound (i.e., fore and aft, up and down as well as sideways) dramatically adds to the realism of acoustical information (Brewster S.A. et al., 1994). The sound is filtered so that it appears to emanate from a specified direction and still is delivered through ordinary headphones. The filtering emulates the distortion of the sound caused by the body, head, and pinna, fooling the neurological auditory system. There again, user head position can be fed back to the sound direction, for instance to make it appear as coming from a fixed position in virtual space (Bodden M., 1993).
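As a rough illustration of feeding head position back into the sound direction, the sketch below keeps a virtual source fixed in space by subtracting the head yaw from the source azimuth, then derives one directional cue, the interaural time difference. It uses Woodworth's spherical-head approximation, which is a simplification introduced here and not the filtering used in the cited work; the head radius is an assumed value, and the much richer spectral effects of the pinna are left out entirely.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature
HEAD_RADIUS = 0.0875    # m, assumed average head radius


def relative_azimuth(source_azimuth_deg: float, head_yaw_deg: float) -> float:
    """Direction of the source relative to the nose, wrapped to [-180, 180)."""
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


def interaural_time_difference(azimuth_deg: float) -> float:
    """Woodworth approximation of the arrival-time difference between ears.

    Valid for sources in the frontal hemisphere (|azimuth| <= 90 degrees);
    positive values mean the sound reaches the right ear first.
    """
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS / SPEED_OF_SOUND * (theta + math.sin(theta))


# A source fixed at 30 degrees to the right stays put in virtual space:
# when the listener turns the head 30 degrees toward it, it is dead ahead.
before_turn = interaural_time_difference(relative_azimuth(30.0, 0.0))
after_turn = interaural_time_difference(relative_azimuth(30.0, 30.0))
```

Recomputing the relative azimuth whenever the head tracker reports a new orientation is what makes the source appear anchored in the virtual room rather than attached to the listener's head.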
Many body motion detection devices are becoming available. Three-dimensional position trackers can detect the position and velocity of limb movements, allowing "gesture" input or communicating hand signals such as those of the Sign Languages for the deaf (Koons, D.B. et al., 1993). Control gloves and wands, capable of capturing complex movement of hand and fingers, can be used for pushing and grasping virtual objects (Massie T.H. and Salisbury J.K., 1994).
These user-controlled devices can also provide sensory feedback. For example, a force-reflecting joystick can be equipped with motors that apply forces in either of two directions (Akamatsu M. and Sato S., 1994). The method can be extended to an entire exoskeleton (Jau B.M., 1988). Tactile-haptic stimulation can be achieved by air jets, vibrotactile devices such as blunt pins, voice coils, or piezoelectric crystals. Electrotactile stimulation can also be achieved with electrical pulses from small electrodes applied to the user's fingers.
At this time, these somatic communication devices have found a number of uses in applications of virtual reality where entertainment and video games occupy a place of choice. But VR's most significant impact will come from other directions. In what came to be known as "telepresence", virtual "hands-on" operation can be conducted in hostile environments such as the sea bottom, damaged nuclear reactor chambers or even distant planets by controlling human-like robots dispatched to the site. VR has considerable potential in many areas still in the research stage such as microsurgery, where a surgeon's hand movements can be scaled down to operate on small, and even microscopic, body parts while hand and eye feedback is returned to the surgeon at a macroscopic level suitable to normal hand motion.
For years an important motivation for biologically assisted interfacing has been the quest for compensatory means of physical access for handicapped persons. Many notable examples exist of control prosthetics that are dramatically enabling for the handicapped, but often unusable by normal subjects, who lack the need and motivation to master what are often complex and cumbersome devices. Dependency on cumbersome machinery is evidently also an obstacle to the wide proliferation of VR environments, and the invention of devices that are convenient and unobtrusive is one of the major challenges facing this field.
Biofeedback and Biocybernetics Control
By the early seventies, several funding agencies of the US Department of Defense had also become interested in technologies that would permit a more immersed and intimate interaction between humans and computers and would include so-called bionic applications. These concerns again converged on ARPA. One outcome was a program proposed and directed by Dr. George Lawrence, whose vision guided its evolution during the subsequent years. Initially, its named focus was autoregulation and cognitive biofeedback. Its goal was developing biofeedback techniques that would improve human performance, in particular that of military personnel engaged in tasks demanding high mental loads. The autoregulation research produced some valuable insights on biofeedback, but only indecisive results on its relevance or practicality as a means to reach the stated goals. A new direction, under the more general label of Biocybernetics, was then defined and became the main source of support for bionics research during the ensuing years.
One of the Biocybernetics program directives was to evaluate the potential of measurable biological signals, aided by real-time computer processing, to assist in the control of vehicles, weaponry, or other systems. These more exotic extensions of interface technology will be reviewed below.
Eye Gaze Direction and Eye Movement
The voluntary pointing of eye gaze toward an item of interest is a most effortless way to designate the location of a target in the field of vision. For instance, eye gaze can act as a mouse or tablet input for selecting, dragging or scrolling items on a computer screen. Furthermore, subconscious eye movements are also of great significance for human-machine communication. For instance, when scanning pictures, the viewer's gaze automatically lingers over key areas, including, in particular, items that elicit emotional interest. Eye-controlled systems have also been designed to provide access to computers to persons with severe motor handicaps whose eye movements are sometimes the only available motor channel input (White et al., 1993). The eye tracking is typically used to control menu-driven applications from a control screen.
Several technologies are available for dynamic eye position tracking (Young L.A. et al., 1975) using either remote or head-mounted apparatus (Schroeder, W.E., 1993, Mendel et al., 1993, White et al., 1993). Sophisticated re-calibration systems have been designed to maintain accurate point-of-regard estimates in the presence of significant head displacements (Mendel, M. J., 1993). Some eye tracking systems are technologically quite mature and have made an appearance in at least one consumer product, the Canon 8800 video camera.
Another source of position information is the EOG or electro-oculogram, the bioelectrical signature generated by the movement of the eyeball. This signal can be measured by electrodes placed next to the eye. The eyeball is polarized and acts like a battery rotating in the eye socket. It creates a large field on the skull that reflects eye position. EOG accuracy as a position indicator is limited by drifts, but it does provide sensitive movement information, including the small saccades that are always present. It thus has an advantage of speed over some of the other methods such as video cameras, but is best used in combination with other approaches.
It should be noted that EOG fluctuations are called "artifacts" by brain wave researchers because they powerfully overshadow the much smaller brain signals that we will discuss later. The best laboratories have now developed very sophisticated computer methods to filter out this EOG interference.
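One of the simplest ways to reduce EOG interference, shown here only as an illustration and not as the method of any particular laboratory, is regression: a separate EOG channel is recorded, the fraction of it that leaks into an EEG channel is estimated by least squares, and that fraction is subtracted. The function name and the toy signals below are assumptions.

```python
from typing import List, Tuple


def remove_eog(eeg: List[float], eog: List[float]) -> Tuple[List[float], float]:
    """Subtract the least-squares estimate of EOG leakage from an EEG channel.

    Returns the cleaned signal and the estimated propagation gain, i.e. the
    fraction of the EOG amplitude that appears in the EEG recording.
    """
    n = len(eeg)
    mean_eeg = sum(eeg) / n
    mean_eog = sum(eog) / n
    covariance = sum((x - mean_eeg) * (y - mean_eog) for x, y in zip(eeg, eog))
    variance = sum((y - mean_eog) ** 2 for y in eog)
    gain = covariance / variance if variance else 0.0
    cleaned = [x - gain * (y - mean_eog) for x, y in zip(eeg, eog)]
    return cleaned, gain


# Toy data: a small "brain" signal contaminated by 20% of a large eye signal.
brain = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
eye = [0.0, 0.0, 100.0, 100.0, 0.0, 0.0, 100.0, 100.0]
recorded = [b + 0.2 * e for b, e in zip(brain, eye)]
cleaned, gain = remove_eog(recorded, eye)
```

The approach assumes that the brain signal is uncorrelated with the eye signal and that the leakage is a fixed proportion. Neither assumption holds exactly in practice, which is part of the reason more elaborate methods were developed.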
There exist also intimate channels, that in some contexts, are capable of probing significant aspects of human emotion. All the interface enhancements described earlier use increasingly natural, but still voluntary and deliberate behaviors to issue commands to a computer. A more intriguing and more controversial direction is the monitoring of behavioral clues and biological signals to acquire unconscious or sub-conscious information reflecting emotional states.
Emotions produce rapid changes in a number of measurable physiological indicators such as blood pressure, heart rate, dilatation or contraction of the pupil in the eye, galvanic skin resistance, and respiration. These indicators can be easily monitored by special equipment using individual probes or perhaps distributed in a computerized body suit. Other approaches may combine emotional with cognitive clues such as tracking facial expressions as described in the next section.
A rather unfortunate precedent to instrumental emotional monitoring needs a special mention. The practice of polygraph or "lie detector" testing, a rather silly practice widely used in the US (over one million tests a year, according to some estimates), is an embarrassment to many Americans. It is to their credit that many legislatures, including those of several American states, ban the practice in civilian life, but conservative influences have retained its use in some federal agencies such as the US National Security Agency, despite ample evidence that there is no reliable signature of truth or falsehood showing on polygraph displays. Polygraph measurements usually include blood pressure, heart rate, galvanic skin resistance, and abdominal and thoracic respiration. These biological signs are related to emotional arousal in a complex and very subject-specific way, but their relation to the goals of a polygraph examination is founded on general assumptions and remains inconclusive. Polygraph results are best viewed as the subjective judgment of the examiner. The fact remains, however, that the introduction of emotional states in the human-machine dialogue is readily feasible and has potential benefits in a number of areas, especially if the process is part of a closed loop with the user. For instance, the provision of emotional feedback in the context of intelligent software tools can be a key to improved self-knowledge and self-control.
Face Tracking, Lip Reading
In ordinary life, many informative nonverbal clues are transmitted by facial expression or by head attitude or motion (Ekman and Friesen, 1977). Indicators of mental states, and especially of emotions, can be voluntary or involuntary as well as sometimes intentionally deceptive. Head movements are also commonly used to express hesitation, agreement or refusal.
A considerable body of literature exists that provides a classification and quantification basis for this aspect of nonverbal communication (Ekman, 1977). Other research groups have developed the complex computer techniques needed for acquiring facial data. For instance, one team at the CMU Interactive Systems Laboratory has developed a system that can track the face of a subject who moves freely in an experimental room. Tracking relies, in particular, on an elaborate skin color model that allows for changes in lighting and other viewing conditions. The model assumes a Gaussian distribution in 2D chromatic space, and estimates its six parameters for a given subject and lighting environment. The system is to be expanded to the simultaneous tracking of multiple subjects.
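A Gaussian skin color model of this general kind can be sketched in a few lines. The version below is a guess at the idea, not the CMU system: it assumes the chromatic space is the normalized (r, g) plane, fits a mean and covariance to sample skin pixels, and accepts a pixel when its squared Mahalanobis distance falls under an arbitrary threshold. The pixel values and the threshold are made up for the example, and the symmetric 2x2 covariance is stored as its three distinct entries.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def chromaticity(red: float, green: float, blue: float) -> Point:
    """Map an RGB pixel to normalized (r, g), discarding overall brightness."""
    total = red + green + blue
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)
    return (red / total, green / total)


def fit_gaussian(samples: List[Point]):
    """Estimate the mean and covariance (sxx, sxy, syy) of 2D samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples) / n
    syy = sum((y - mean_y) ** 2 for _, y in samples) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
    return (mean_x, mean_y), (sxx, sxy, syy)


def mahalanobis_sq(point: Point, mean: Point, covariance) -> float:
    """Squared Mahalanobis distance of a point from the fitted Gaussian."""
    sxx, sxy, syy = covariance
    determinant = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / determinant


def is_skin(pixel, mean: Point, covariance, threshold: float = 9.0) -> bool:
    """Accept pixels within roughly three standard deviations of the mean."""
    return mahalanobis_sq(chromaticity(*pixel), mean, covariance) <= threshold


# Made-up skin samples for one subject under one lighting condition.
training_pixels = [(200, 140, 120), (180, 120, 100), (220, 160, 140),
                   (190, 135, 110), (210, 150, 125)]
mean, covariance = fit_gaussian([chromaticity(*p) for p in training_pixels])
```

Working in normalized color removes most of the dependence on brightness, and refitting the Gaussian for each subject and lighting environment is what lets such a model follow changes in viewing conditions.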
Other groups at CMU and UC Berkeley are applying Neural Network technology to track the lip contour dynamically during speech, and merge this information with that from the acoustic track in a visual-acoustic speech recognition system (Campbell, 1988, Bregler et al. 1993,1994a, 1994b). To port these and many other such behaviors into a human computer interface can be viewed as a component of a larger research agenda, which aims to address every significant component of a comprehensive modeling of the user with all of his or her physical, cognitive and social characteristics. In some circles, this perspective elicits serious concerns regarding individual freedom and privacy, but that subject requires a separate forum. One general statement would be that a user model brought to a given degree of intimacy would, in computer communication, play the same role as similar knowledge of one's interlocutor in ordinary human interplay.
Relating the electrical signals emitted by the human brain to the mental state of its owner has been one of the central purposes of neurophysiological research for over sixty years. It has been a long and still frustrating journey for brain researchers. The discovery by Berger in 1929 of electrical brain waves recorded from the intact skull caused considerable excitement. The recordings became known as the Electroencephalogram or EEG. The enthusiasm progressively gave way to the realization that these electrical signals were considerably more elusive and unpredictable than had been anticipated. Current research, which is critically dependent on modern computer technology, has only recently reached a point that allows a revival of the initial excitement. The results, while still limited, have nevertheless come a very long way toward decoding brain wave signatures in useful ways.
Brain tissue contains a myriad of active current sources that cause the local electrical potential to endlessly fluctuate with a great deal of variability. Some of the characteristics of the wave trains can be somewhat predicted in relation to the electrode site, the activity of the subject, and the presence and type of sensory stimulation. Some of those could be readily identified by eye, and these were for a long time the principal object of EEG research: the recognition of the 10Hz alpha activity and the observation of alpha blocking, of sleep versus wake, of barbiturate induced "spindles", and of the 3/sec spike and wave complex of petit mal epilepsy. Indeed, traditionally, the clinical EEG usefulness has been mostly limited to assessing the overall condition of the brain such as identifying wakefulness versus REM, or deep sleep, or else to detect and localize epileptic seizures.
The electrical fluctuations detected over the scalp must be attributed mostly to brain tissue located at or near the skull, i.e., their source is the electrical activity of the cerebral cortex, a significant portion of which lies on the outer surface of the brain below the scalp. The cerebral cortex is a thin layer containing nerve cells or neurons and dendrites, long tubular extensions of neuron bodies which extend toward the surface and branch out laterally for some distance to connect with adjacent neurons and dendrites. Dendrites are electrolytic connectors that propagate electrical fields to the neuron body, where they eventually trigger the nerve impulses.
The surface potentials observed are generated mainly at the dendrites and at the bodies (soma) of brain cells. The peaks and valleys of the wave forms betray polarization and depolarization that occur somewhat in synchrony. A positive variation recorded at the surface would correspond to a region of synchronized depolarization (greater excitability) underneath and vice versa. To account for the observed amplitudes, one must assume that underlying neurons in large number are driven from synchrony to disorder on a relatively slow schedule (compared with the time constant of a single neuron), in order to account for the power spectrum, which displays most of its energy at frequencies around and below 10 Hz. It is generally believed that the neuronal firings themselves are not significant contributors to EEG waves, and in fact these waves are present even when all the cells concerned are prevented from firing altogether (Marshall et al., Li & Jasper). By contrast, correspondence between individual waves in the EEG signal and post-synaptic potentials recorded intracellularly in adjacent neurons has been well established (Landau).
This spontaneous EEG activity is also often rhythmic, i.e., its spectral density shows peaks at characteristic frequencies. The analysis of these brain rhythms has retained much of the early attention paid to EEG.
Considerable efforts were spent in the late sixties and early seventies to pry more subtle information from EEG, with the help of computers. Spectral densities and spectral coherence between pairs of channels were measured on relatively short (2 to 10 sec) EEG epochs, in order to track short shifts of mental activity. This produced an abundant literature and some interesting results. For instance, it is known that human subjects' ability to sustain their initial level of performance during continuous auditory or visual monitoring tasks is limited. After only a few minutes on task, particularly in low-arousal environments, performance can deteriorate substantially while the subject fights drowsiness. This in turn causes changes in EEG spectra. A recent study, using tone detection in auditory stimuli, showed that human performance affected by drowsiness tends to fluctuate over periods of 4 minutes and longer and that these performance lapses are accompanied by characteristic EEG changes in the 4 Hz delta and 4-6 Hz theta bands as well as around 14 Hz (the sleep spindle frequency). Furthermore, the transient changes that occur before and after target presentations are good predictors of correct versus incorrect detection of targets (Makeig S. and Jung T., 1996).
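The kind of short-epoch spectral measurement described above can be sketched with a plain discrete Fourier transform. The example is illustrative only: the 64 Hz sampling rate, the synthetic two-second epoch and the band limits (4 to 8 Hz for theta, 8 to 13 Hz for alpha) are assumptions, and a practical system would use a fast Fourier transform with windowing rather than this direct summation.

```python
import cmath
import math
from typing import List, Tuple


def power_spectrum(epoch: List[float],
                   sampling_rate: float) -> Tuple[List[float], List[float]]:
    """Return (frequencies in Hz, power) for one epoch, using a direct DFT."""
    n = len(epoch)
    mean = sum(epoch) / n
    centered = [x - mean for x in epoch]  # remove the DC offset
    frequencies, power = [], []
    for k in range(n // 2 + 1):
        coefficient = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                          for t, x in enumerate(centered))
        frequencies.append(k * sampling_rate / n)
        power.append(abs(coefficient) ** 2 / n)
    return frequencies, power


def band_power(frequencies: List[float], power: List[float],
               low: float, high: float) -> float:
    """Total power in the half-open band [low, high) Hz."""
    return sum(p for f, p in zip(frequencies, power) if low <= f < high)


def peak_frequency(frequencies: List[float], power: List[float]) -> float:
    """Frequency of the largest spectral component."""
    best = max(range(len(power)), key=lambda i: power[i])
    return frequencies[best]


# Synthetic 2-second epoch: strong 10 Hz rhythm plus a weaker 5 Hz component.
RATE = 64.0
epoch = [math.sin(2 * math.pi * 10 * t / RATE)
         + 0.3 * math.sin(2 * math.pi * 5 * t / RATE) for t in range(128)]
frequencies, power = power_spectrum(epoch, RATE)
alpha = band_power(frequencies, power, 8.0, 13.0)
theta = band_power(frequencies, power, 4.0, 8.0)
```

Tracking how the ratio of such band powers evolves from one epoch to the next is one simple way to follow the drowsiness-related spectral shifts mentioned above.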
A light flash, brief sound, or light touch of the skin generates activity to pathways specific to the sense involved and, in particular, in the corresponding sensory cortex (visual, auditory, or somesthetic). This electrical response as measured by an electrode on the cortical surface is reflected by a nonperiodic wave form buried in the ongoing background activity and covering roughly one third of a second. Again, this results from the synchronous contribution of post synaptic potentials in a large number of neurons in the vicinity of the electrode. The cortical neurons are distributed in layers, each layer in a given area presumably having a particular integrative function. In a direction perpendicular to the cortical surface, cells above one another seem to serve various sub functions for a given sensory modality, while in a lateral direction, the functional properties of the cells exhibit sharp transitions. This columnar organization was revealed early in the visual and the somatosensory areas by Huben & Wiesel and in the auditory cortex by Gerstein & Kiang . Function and modality vary with the cortical position. Functional specificity in relation to cortical sites is also reflected in the ongoing EEG: sensory stimuli of a given modality will desynchronize localized areas of the cortical surface. Various feature-extracting functions have been found to map on the surface of the cortex, and indeed different stimuli are found to evoke distinct electrical "signatures" on the cortical surface and more diffusely on the scalp beyond. By averaging half a second or so of the EEG signal that followed many repetitions of the same stimulus, such as flashed visual patterns or brief sounds, variations thought of as "noise" were eliminated. The scalp response to the brief flashing of vertical lines would yield a wave form different from that obtained from a set of circles. 
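The averaging procedure can be demonstrated on simulated data. In the sketch below the evoked wave form, the noise level and the number of repetitions are invented for illustration; the point is only that background activity uncorrelated with the stimulus shrinks roughly as the square root of the number of epochs averaged, while the stimulus-locked response is preserved.

```python
import math
import random
from typing import List


def average_evoked_potential(epochs: List[List[float]]) -> List[float]:
    """Average stimulus-locked epochs sample by sample."""
    count = len(epochs)
    return [sum(samples) / count for samples in zip(*epochs)]


def simulate_epoch(template: List[float], noise_sd: float,
                   rng: random.Random) -> List[float]:
    """One trial: the evoked wave form buried in Gaussian background noise."""
    return [value + rng.gauss(0.0, noise_sd) for value in template]


def rms_error(signal: List[float], reference: List[float]) -> float:
    """Root-mean-square difference between two wave forms."""
    squared = [(s - r) ** 2 for s, r in zip(signal, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Invented evoked response: a single smooth bump, much smaller than the noise.
template = [math.exp(-((t - 20) / 6.0) ** 2) for t in range(64)]
rng = random.Random(0)  # fixed seed so the example is reproducible
epochs = [simulate_epoch(template, 5.0, rng) for _ in range(400)]
averaged = average_evoked_potential(epochs)
single_trial_error = rms_error(epochs[0], template)
averaged_error = rms_error(averaged, template)
```

With 400 repetitions the residual noise is expected to fall by a factor of about twenty, which is why a response invisible in any single trial emerges clearly in the average.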
The presence and stability of these correlates of the of sensory modality in the average evoked wave form has been abundantly demonstrated in the early days of brain wave research. (White & Eason , Harter & White , Clynes & Cohn, and Spehlmann)
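The averaging procedure described above can be sketched in a few lines. The example is illustrative only: the evoked wave form (a damped oscillation), the sampling rate and the noise level are all hypothetical, and the background EEG is crudely modeled as independent Gaussian noise, which real EEG is not. Under that assumption, averaging N epochs reduces the residual noise by roughly the square root of N.

```python
import math
import random


def evoked_template(n_samples, fs):
    """Hypothetical evoked wave form: a damped 8 Hz oscillation."""
    return [math.exp(-(t / fs) / 0.1) * math.sin(2 * math.pi * 8 * t / fs)
            for t in range(n_samples)]


def simulate_epoch(template, noise_sd, rng):
    """One post-stimulus epoch: the template buried in Gaussian noise."""
    return [s + rng.gauss(0.0, noise_sd) for s in template]


def average_epochs(epochs):
    """Sample-by-sample average of epochs time-locked to the stimulus."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


def rms_error(a, b):
    """Root-mean-square difference between two equal-length wave forms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


rng = random.Random(0)               # fixed seed for repeatability
fs = 200                             # assumed sampling rate in Hz
template = evoked_template(100, fs)  # half a second of signal
epochs = [simulate_epoch(template, 5.0, rng) for _ in range(400)]

single_err = rms_error(epochs[0], template)
avg_err = rms_error(average_epochs(epochs), template)
print("single-epoch error:", single_err, "averaged error:", avg_err)
```

With 400 repetitions the averaged wave form should lie about twenty times closer to the underlying response than any single epoch does, which is why the early studies relied on many repetitions of the same stimulus.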
In the seventies, this Average Evoked Potential or AEP became a source of considerable excitement for behavioral psychologists as well. The wave forms exhibited tantalizing regularities that became associated with such concepts as selective attention or novelty. A particular positive deflection of the signal, occurring roughly 300 ms after the stimulus, the "P300", became the object of much attention as it appeared reliably when task-significant stimuli were presented at infrequent intervals in a stream of similar events for which the subjects had no task to perform. Deeper analysis proved more elusive, but the research showed that large groups of neurons are abruptly called into some level of synchrony when a sudden, relevant or potentially threatening event occurs.
At about the same time, the author headed the Brain Computer Interface Project at UCLA, a component of the ARPA Biocybernetics program mentioned earlier. Using computer-generated visual simulation and sophisticated signal processing, the research successfully demonstrated the possibility of using single-epoch evoked potentials, without averaging, in human-computer interaction, for instance to control a robot device. Over the picture of a maze, a cursor was controlled by a succession of turns in four different directions. The decision was transmitted in real time, solely from the evoked brain signal of the user.
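The idea of deciding among four directions from a single epoch can be illustrated with a toy sketch. It is not the signal processing used in the project described above: the four "signature" templates, the noise level and the correlation-based decision rule are all assumptions made for illustration. Each incoming epoch is compared with one stored template per direction, and the best match moves the cursor one step.

```python
import math
import random

# Cursor displacement for each of the four possible decisions.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def classify_epoch(epoch, templates):
    """Return the label whose template best matches one unaveraged epoch."""
    return max(templates, key=lambda label: correlation(epoch, templates[label]))


def step(position, direction):
    """Move the cursor one cell in the decoded direction."""
    dx, dy = MOVES[direction]
    return (position[0] + dx, position[1] + dy)


rng = random.Random(1)
n_samples = 100
# Hypothetical response templates: one distinct wave form per direction.
templates = {
    label: [math.sin(2 * math.pi * cycles * t / n_samples)
            for t in range(n_samples)]
    for label, cycles in zip(MOVES, (3, 5, 7, 9))
}

intended = ["right", "right", "down", "left"]
position = (0, 0)
decoded = []
for target in intended:
    # Simulated single epoch: the intended signature plus noise.
    epoch = [s + rng.gauss(0.0, 0.5) for s in templates[target]]
    direction = classify_epoch(epoch, templates)
    decoded.append(direction)
    position = step(position, direction)

print(decoded, position)
```

The difficulty in practice lies in the signal-to-noise ratio of real single epochs, which is far less favorable than in this synthetic example.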
A better grasp of the cognitive signatures present in brain waves had to wait until the development of powerful topographic techniques addressing simultaneously the spatial and temporal distribution of the brain waves on the surface of the brain.
Brain wave topography was pioneered many years ago, with interesting results, by A. Remond at the Hospital of the Salpêtrière in Paris, but a quantitative assessment of key cognitive signatures required both considerable advances in computer technology and new efforts in refining the signal processing methodology and experiment design.
A breakthrough came from a team at the EEG Laboratory headed by A. Gevins in San Francisco. It took many years for this group to achieve results which are now nothing short of spectacular. The first realization was that the spatial resolution, initially limited to a few electrodes, had to be increased enormously to reach the simultaneous recording of 124 channels of EEG signals. In addition, the position of each electrode is first calibrated directly from landmarks on cerebral tissue, using Magnetic Resonance Imaging. The recorded signals are de-convolved with an empirical kernel, to minimize the spatial spreading due to bone and scalp, and to lower the focus of the recording down to the underlying brain surface. Brain activity is reconstructed in this manner at half-second intervals.
A typical experiment involves subjects given a visuomotor task involving both hands. The task dissection distinguishes successive phases: "prepare to receive information", followed by a stimulus to be recognized, a decision, a presetting of the motor system for the chosen action and finally the motor act itself. A phase of recalibration is added, immediately following feedback informing the subject of his performance level. Multichannel processing consists of time covariance calculations between pairs of channels, and the results are classified with an Artificial Neural Network in relation to each phase of the task. The results, obtained in the form of colored graphs superimposed on the skull images, can articulate through graphic animation, both spatially and temporally, the succession of cognitive events inherent to the task as connected graphs.
This real-time mapping of brain states brings brain waves firmly into the realm of human-computer interfacing, although its practical deployment is still somewhat over the horizon. Given the general acceleration of the technology, it is unlikely to remain there much longer.
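The time covariance between a pair of channels mentioned above can be sketched as follows. This is a simplified illustration and not the San Francisco group's actual processing: the two synthetic channels, the delay and the noise level are assumptions, and the neural network classification stage is not shown. The covariance is evaluated at several time lags, and the lag with the largest magnitude suggests how long activity at one site takes to be echoed at the other.

```python
import random


def lagged_covariance(x, y, lag):
    """Covariance between x[t] and y[t + lag] over the overlap (lag >= 0)."""
    n = len(x) - lag
    xs, ys = x[:n], y[lag:lag + n]
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n


def best_lag(x, y, max_lag):
    """Lag in 0..max_lag at which the covariance magnitude is largest."""
    return max(range(max_lag + 1),
               key=lambda lag: abs(lagged_covariance(x, y, lag)))


rng = random.Random(2)
n_samples = 300
delay = 5  # assumed delay, in samples, between the two sites

# Channel A: random activity. Channel B: a delayed copy of A plus noise.
channel_a = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
channel_b = [(channel_a[t - delay] if t >= delay else 0.0) + rng.gauss(0.0, 0.2)
             for t in range(n_samples)]

lag = best_lag(channel_a, channel_b, 20)
print("estimated lag in samples:", lag)
```

Repeating this calculation for every pair of channels and every phase of the task yields the kind of feature set that a classifier could then relate to each phase.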
Epilogue: Economic and Political Implications
The world is clearly in the midst of an unprecedented technological revolution as the century is coming to an end. But the political context in which this revolution is taking place is disturbingly unstable. As Eric Hobsbawm noted in "The Age of Extremes":
"No one who looks back on a century in which no more than a handful of states existing at present have come into being or survived without passing through revolution, armed counter-revolution, military coups or armed civil conflicts would bet much money on the universal triumph of peaceful and constitutional change, as predicted in 1989 by some euphoric believers in liberal democracy"
Similar predictions are being aired in these late nineties by euphoric believers in the redemptive powers of universally accessible information. Yet it appears unlikely that the prevailing unrest and ubiquitous conflict will subside any time soon. Irrationality is on the rise everywhere, including in the developed world. Many parts of the world experience a social breakdown of unprecedented scale, epitomized by the situation in the ex-USSR. Even more ominously, in many places an unfocused rejection of modernity and a regression to medieval barbarism is emerging and has taken an acute form in places like Iran, Algeria and Afghanistan.
Those are strange conditions indeed to accompany the radical and unstoppable revolution in computers and telecommunications that is taking place in the developed world. The capability of governments to control the flow of information and the movement of money is rapidly shrinking. Multinational finance and trade companies acquire a dominance which bypasses borders and makes short shrift of the protective barriers that the states had erected in an earlier era. Internet commerce, still in infancy, is certain to expand exponentially. Instant trading at the speed of light is a novel phenomenon that is viewed as the promise of the future by some and by others as the harbinger of financial crashes that could dwarf any from the past. Whatever the outcome, in the near future the most likely prospect is the emergence of an enormously powerful supra-national club, deeply involved with the Internet and at ease with advanced technologies. This group will derive its membership predominantly from developed countries. Its numbers will be bare millions in a world population counted in billions. Social cleavage probably cannot be avoided.
In this context, what can be expected from the increasingly intimate interdependence between humans and technology and, in particular, from bionic enhancements to the human-computer interface? It is likely that, at least in the near future, exotic enhancements will play a relatively minor role. Areas that affect commerce, such as biometric personal identification, are likely to receive increased attention. Biometric systems are being studied in many places for such purposes as owner authentication in banking, social security entitlement, immigration control and even election validation. Modern pattern recognition algorithms and fast database scanning can make the traditionally awkward fingerprint technique both unobtrusive and nearly instantaneous. Hand geometry, face recognition and retinal scans are other approaches that, although still experimental, are almost ready for large scale implementation. Finally, and in view of our earlier remarks on the fragility of our most cherished humanistic assumptions, one should not discard the concept of chip implantation as unthinkable. It is, after all, currently a most favored, reliable and even least intrusive way to securely identify domestic animals.
Despite the dangers and the problems looming, these developments and many unforeseen others are on their way and will not stop. These are interesting times. Enlightened human intervention can, sometimes and at least locally, moderate the perverse effects that almost always accompany change. This is the best society can hope for.
References
Scott Makeig and Tzyy-Ping Jung, 1996. Tonic, Phasic, and Transient EEG Correlates of Auditory Awareness in Drowsiness, Cognitive Brain Research 4:15-25.
Young, Lawrence R. and Sheena, 1975, Methods and Design, Survey of Eye Movement Recording Methods, Behavior Research Methods and Instrumentation, Vol: 7, 397-429.
Schroeder, W.E., 1993, Head-mounted computer interface based on eye tracking, Proceedings of the SPIE - The International Society for Optical Engineering, Vol: 2094 /3, 1114-1124.
Mendel, Mark J.; Van Toi, Vo; Riva, Charles E., 1993. Eye-tracking laser Doppler velocimeter stabilized in two dimensions, Journal of the Optical Society of America, Optics and Image Science (ISSN 0740-3232)10, 663-669.
C.Bregler, H.Hild, S.Manke, and A.Waibel, Improving Connected Letter Recognition by Lipreading, in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Minneapolis, 1993.
C.Bregler, Y.Konig, "Eigenlips" for Robust Speech Recognition, in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Adelaide, Australia, 1994.
M. Akamatsu and S. Sato. A multi-modal mouse with tactile and force feedback. Int. Journ. of Human-Computer Studies, 40:443--453, 1994.
R. Balakrishnan, C. Ware, and T. Smith. Virtual hand tool with force feedback. In C. Plaisant, editor, Proc. of the Conf. on Human Factors in Computing Systems, CHI'94, Boston, 1994. ACM/SIGCHI.
M. Bodden, 1993. Modeling Human Sound Source Localization and the Cocktail-Party-Effect. Acta Acustica, 1:43--55.
S. A. Brewster, P. C. Wright, and A. D. N. Edwards. A detailed investigation into the effectiveness of earcons. In G. Kramer, editor, Auditory Display, pages 471--498, Reading, Massachusetts, 1994. Santa Fe Institute, Addison Wesley.
F. P. Brooks, Jr. et al. Project GROPE - Haptic Displays for Scientific Visualization. ACM Computer Graphics, 24(4):177--185, Aug. 1990.
R. Campbell. Tracing lip movements: making speech visible. Visible Language, 8(1):33--57, 1988.
P. Ekman and W. V. Friesen. Facial Action Coding System. Consulting Psychologists Press, Stanford University, Palo Alto, 1977.
I. A. Ferguson. TouringMachines: An Architecture for Dynamic, Rational, Mobile Agents. PhD thesis, University of Cambridge, 1992.
M. A. Gigante. Virtual Reality: Definitions, History and Applications. In R. A. Earnshaw, M. A. Gigante, and H. Jones, editors, Virtual Reality Systems, chapter 1. Academic Press, 1993.
B. M. Jau. Anthropomorphic Exoskeleton dual arm/hand telerobot controller. pages 715--718, 1988.
D. B. Koons, C. J. Sparrel, and K. R. Thorisson. Integrating Simultaneous Output from Speech, Gaze, and Hand Gestures. In M. Maybury, editor, Intelligent Multimedia Interfaces, pages 243--261. Menlo Park: AAAI/MIT Press, 1993.
T. H. Massie and J. K. Salisbury. The PHANToM Haptic Interface: a Device for Probing Virtual Objects. In Proc. of the ASME Winter Annual Meeting, Symp. on Haptic Interfaces for Virtual Environment and Teleoperator Systems, Chicago, 1994.
K. Pimentel and K. Teixeira. Virtual Reality: through the new looking glass. Windcrest Books, 1993.
J. Psotka, S. A. Davison, and S. A. Lewis. Exploring immersion in virtual space. Virtual Reality Systems, 1(2):70--92, 1993.
J. Rasmussen. Information Processing and Human-Machine Interaction. An Approach to Cognitive Engineering. North-Holland, 1986.
J. Rhyne. Dialogue Management for Gestural Interfaces. Computer Graphics, 21(2):137--142, 1987.
D. Riecken, editor. Special Issue on Intelligent Agents, volume 37 of Communications of the ACM, 1994.
K. B. Shimoga. A Survey of Perceptual Feedback Issues in Dexterous Telemanipulation: Part II. Finger Touch Feedback. In Proc. of the IEEE Virtual Reality Annual International Symposium. Piscataway, NJ : IEEE Service Center, 1993.
R. J. Stone. Virtual Reality & Telepresence -- A UK Initiative. In Virtual Reality 91 -- Impacts and Applications. Proc. of the 1st Annual Conf. on Virtual Reality, pages 40--45, London, 1991. Meckler Ltd.
C. C. Tappert, C. Y. Suen, and T. Wakahara. The State of the Art in On-line Handwriting Recognition. IEEE Trans. on Pattern Analysis & Machine Intelligence, 12:787--808, 1990.
D. Varner. Olfaction and VR. In Proceedings of the 1993 Conference on Intelligent Computer-Aided Training and Virtual Environment Technology, Houston, TX, 1993.
M. Wooldridge and N. R. Jennings. Intelligent Agents: Theory and Practice. (submitted to:) Knowledge Engineering Review, 1995.
A. Remond, Integrated and Topological Analysis of the EEG, EEG & Clinical Neurophysiol., Supp. no 20, "Computer Techniques in EEG Analysis", pp. 64-67, 1961.
A. Remond, EEG Field Mapping, EEG & Clinical Neurophysiol., 45, pp. 417-421, 1978.