Issues in Haptic Rendering
Acquisition of Models
There are several commercial 3D digitizing cameras available for acquiring models of objects, such as the ColorScan and the Virtuoso shape cameras mentioned earlier. The latter uses six digital cameras: five black-and-white cameras that capture shape information and one color camera that acquires texture information, which is layered onto the triangle mesh. At USC's IMSC one of the approaches to the digitization process begins with models acquired from photographs, using a semiautomatic system to infer complex 3D shapes from photographs (Chen & Medioni, 1997, 1999, 2001). Images are used as the rendering primitives, and multiple input pictures are allowed, taken from viewpoints that differ in position, orientation, and camera focal length. The direct output of the IMSC program is volumetric but is converted to a surface representation for the purpose of graphic rendering. The reconstructed surfaces are quite large, on the order of 40 MB; they are decimated with a modified version of a program for surface simplification using quadric error metrics written by Garland and Heckbert (1997). The LightScribe system (formerly known as the 3Scan system) incorporates stereo vision techniques developed at IMSC, and the process of matching points between images has been fully automated. Other comparable approaches to digitizing museum objects (e.g., Synthonics) use an older version of shape-from-stereo technology that requires the cameras to be recalibrated whenever the focal length or relative position of the two cameras is changed.
Volumetric data are used extensively in medical imaging and scientific visualization. Currently the GHOST SDK, the development toolkit for the PHANToM, construes the haptic environment as scenes composed of geometric primitives. Huang, Qu, and Kaufman of SUNY-Stony Brook have developed a new interface, based on volumetric objects, that supports volume rendering with haptic interaction. The APSIL library (Huang, Qu, & Kaufman, 1998) is an extension of GHOST. The Stony Brook group has developed successful demonstrations of volume rendering with haptic interaction from computed tomography data of a lobster, a human brain, and a human head, simulating stiffness, friction, and texture solely from the voxel density of the volume. The development of the new interface may facilitate working directly with the volumetric representations of objects obtained through view synthesis methods.
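As a rough illustrative sketch (not the APSIL implementation; the function, its parameters, and the nearest-voxel lookup are all invented for illustration), a contact force can be derived solely from voxel density by scaling with the local density and pointing along the negative density gradient, so that denser regions feel stiffer and the probe is pushed toward empty space:

```python
import numpy as np

def volume_force(density, probe, k=0.5):
    """Toy volume-haptics force from voxel density alone: magnitude scales
    with local density, direction follows the negative density gradient
    (i.e., out of the dense region)."""
    i, j, l = (int(round(c)) for c in probe)   # nearest-voxel lookup
    g = np.array(np.gradient(density))[:, i, j, l]
    d = density[i, j, l]
    if d == 0 or not np.any(g):
        return np.zeros(3)                      # free space or flat region
    return -k * d * g / np.linalg.norm(g)
```

A probe placed inside a dense slab is pushed toward the slab's surface; in empty voxels the force is zero.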
The surface texture of an object can be displacement mapped with thousands of tiny polygons (Srinivasan & Basdogan, 1997), although the computational demand is such that force discontinuities can occur. More commonly, a "texture field" can be constructed from 2D image data. For example, as described above, Ikei, Wakamatsu, and Fukuda (1997) created textures from images converted to grayscale and then enhanced to heighten brightness and contrast, such that the level and distribution of intensity correspond to variations in the height of texture protrusions and retractions.
Surface texture may also be rendered haptically through techniques like force perturbation, in which the direction and magnitude of the force vector are altered using the local gradient of the texture field to simulate effects such as coarseness (Srinivasan & Basdogan, 1997). Synthetic textures, such as wood, sandpaper, cobblestone, rubber, and plastic, may also be created using mathematical functions for the height field (Anderson, 1996; Basdogan, Ho, & Srinivasan, 1997). The ENCHANTER environment (Jansson, Faenger, Konig, & Billberger, 1998) has a texture mapper that can render sinusoidal, triangular, and rectangular textures, as well as textures provided by other programs, for any haptic object provided by the GHOST SDK.
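A minimal sketch of the force-perturbation idea over a synthetic sinusoidal height field (all function names and constants are illustrative, not from the cited systems): the nominal surface normal is tilted by the local gradient of the height field, so a geometrically flat wall "feels" corrugated.

```python
import numpy as np

def texture_height(x, y, amp=0.001, period=0.01):
    # Synthetic sinusoidal height field h(x, y): a corrugated surface.
    return amp * np.sin(2 * np.pi * x / period) * np.sin(2 * np.pi * y / period)

def perturbed_normal(x, y, eps=1e-6):
    # Tilt the nominal normal (0, 0, 1) by the local height-field
    # gradient (central differences), then renormalize.
    dhdx = (texture_height(x + eps, y) - texture_height(x - eps, y)) / (2 * eps)
    dhdy = (texture_height(x, y + eps) - texture_height(x, y - eps)) / (2 * eps)
    n = np.array([-dhdx, -dhdy, 1.0])
    return n / np.linalg.norm(n)

def texture_force(x, y, penetration, k=500.0):
    # Spring force along the perturbed normal instead of the true normal.
    return k * penetration * perturbed_normal(x, y)
```

On a texture peak or valley the gradient vanishes and the force reduces to the ordinary surface normal force.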
In many applications of haptics, it is desirable to be able to explore and manipulate deformable objects as well as rigid-body objects like vases and teapots. One area that IMSC researchers are beginning to explore is the development of reliable vision-based control systems for robotic applications such as the acquisition of images for 3D modeling. Two topics that have been identified as crucial for the development of such systems for robotic applications (e.g., 3D and 4D modeling for haptics) are the development of self-calibrating control algorithms (Hager, Chang, & Morse, 1995) and the use of single-camera image acquisition systems in feedback control. One can use images of an object taken from multiple viewpoints to construct a 3D model of the object to be used for haptics. To automate the collection of the multiple views, one needs a camera mounted on a computer-controlled robot arm. This is particularly important for constructing 4D models of objects whose shape evolves over time (e.g., a work of art as it is being produced). From a controls perspective the research problem is to build algorithms to position the camera. The desired position can be specified directly in terms of its Cartesian coordinates or indirectly in terms of desired locations of parts of the object in the image. The latter falls in the area of vision-based control and is much more interesting, because the use of vision in the feedback loop allows for great accuracy even with imprecise, and therefore relatively inexpensive, robotic manipulators.
The realism of haptic rendering can be adversely affected by slow update rates, as can occur in the case of the extreme computation time required by real-time rendering of deformable objects, or the delays induced by network congestion and bandwidth limitations in distributed applications.
Floyd (1999) deals with the issue of computational latency and haptic fidelity in bitmapped virtual environments. In traditional systems with some latency, fidelity suffers if, say, the user penetrates a virtual object and the lag is such that there is no immediate feedback of force to indicate that a collision has occurred and that penetration is not possible. Floyd proposes that the server inform the haptic client when the user has penetrated a surface in the environment, and where that contact occurred. The client uses this information to offset the coordinate system in which the user is operating, so that instead of having significantly penetrated the surface, the user is just within it. The client then computes an appropriate force response and caches the constraint implicit in the existence of that surface, so that forces impeding further progress in that direction are computed on the client alone.
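Floyd's scheme might be sketched as follows (class and method names are hypothetical, and the server protocol is reduced to a single callback): on late notice of a contact, the client shifts its coordinate frame so the user sits just inside the surface, then enforces the cached constraint locally.

```python
import numpy as np

class HapticClient:
    """Sketch of Floyd-style latency compensation: offset the user's
    frame on delayed contact notification, cache the surface constraint,
    and compute subsequent constraint forces client-side only."""

    def __init__(self, stiffness=800.0):
        self.offset = np.zeros(3)     # correction applied to user coords
        self.constraints = []         # cached (point, normal) planes
        self.k = stiffness

    def on_server_contact(self, user_pos, contact_point, surface_normal):
        # Shift the frame so a deep penetration becomes a shallow one.
        depth = np.dot(contact_point - user_pos, surface_normal)
        if depth > 0:                 # user is inside the surface
            self.offset += (depth - 1e-4) * surface_normal
        self.constraints.append((contact_point, surface_normal))

    def local_force(self, user_pos):
        # Forces against cached constraints need no server round trip.
        pos = user_pos + self.offset
        f = np.zeros(3)
        for p, n in self.constraints:
            depth = np.dot(p - pos, n)
            if depth > 0:
                f += self.k * depth * n
        return f
```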
Mark and his colleagues (Mark, Randolph, Finch, van Verth, & Taylor, 1996) have proposed a number of solutions to recurring problems in haptics, such as improving the update rate for forces communicated back to the user. They propose the use of an intermediate representation of force through a "plane and probe" method: a local planar approximation to the surface is computed at the user's hand location, and when the probe, or haptic tool, penetrates the plane, the force is updated at approximately 1 kHz by the force server, while the application recomputes the position of the plane and updates it at approximately 20 Hz. Balaniuk (1999) has proposed a buffer model to transmit information to the PHANToM at the necessary rate. The buffer can also be used to implement a proxy-based calculation of the haptic forces.
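The division of labor between the slow application loop and the fast force loop can be sketched as follows (a simplified stand-in for the intermediate representation, not the authors' code; names and rates are illustrative):

```python
import numpy as np

class PlaneProbeServer:
    """Sketch of a plane-and-probe intermediate representation: the slow
    application supplies a local plane; the fast force loop renders
    a spring force against that plane."""

    def __init__(self, k=1000.0):
        self.k = k
        self.plane_point = np.zeros(3)
        self.plane_normal = np.array([0.0, 0.0, 1.0])

    def update_plane(self, point, normal):
        # Called by the application at roughly 20 Hz with a fresh local
        # planar approximation of the surface near the probe.
        self.plane_point = np.asarray(point, float)
        n = np.asarray(normal, float)
        self.plane_normal = n / np.linalg.norm(n)

    def force(self, probe_pos):
        # Called by the force server at roughly 1 kHz.
        depth = np.dot(self.plane_point - probe_pos, self.plane_normal)
        if depth <= 0:
            return np.zeros(3)        # probe is outside the plane
        return self.k * depth * self.plane_normal
```

Because the force loop touches only the cached plane, its cost is independent of scene complexity.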
Networked virtual reality (VR) applications may require that force and positional data be transmitted over a communication link between computers where significant and unpredictable delays are the norm, resulting in instability in the haptic system. The potential for significant harm to the user exists in such circumstances due to the forces that the haptic devices can generate. Buttolo, Oboe, Hannaford, and McNeely (1996) note that the addition of force feedback to multiuser environments demands low latency and high collision detection sampling rates. Local area networks (LANs), because of their low communication delay, may be conducive to applications in which users can touch each other, but for wide area networks, or any environment where the demands above cannot be met, Buttolo et al. propose a "one-user-at-a-time" architecture. While some latency can be tolerated in "static" applications with a single user and no effect of the user's action on the 3D object, in collaborative environments where users make modifications to the environment it is important to make sure that any alterations from individual clients are coordinated through the server. In effect the server can queue the users so that only one can modify the object at a time and can lock the object until the new information is uploaded to the server and incorporated into the "official" version of the virtual environment. Then and only then can the next user make a modification. Delay can be tolerated under these conditions because the haptic rendering is done on a local copy of the virtual environment at each user's station.
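The server-side queueing and locking described above might look like this in outline (a hypothetical sketch, not Buttolo et al.'s implementation):

```python
from collections import deque

class ObjectLockServer:
    """Sketch of a one-user-at-a-time architecture: modification requests
    are queued, and the object stays locked until the current holder
    commits its changes to the official version."""

    def __init__(self):
        self.holder = None       # user currently allowed to modify
        self.waiting = deque()   # users queued for the lock
        self.version = 0         # version of the official environment

    def request_modify(self, user):
        if self.holder is None:
            self.holder = user
            return True          # lock granted immediately
        self.waiting.append(user)
        return False             # queued; keep rendering the local copy

    def commit(self, user, changes):
        # Only the lock holder may alter the official environment.
        assert user == self.holder, "only the lock holder may commit"
        self.version += 1        # changes become the official state
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.version
```

Haptic rendering itself proceeds against each client's local copy, which is why the queueing delay is tolerable.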
Hespanha, McLaughlin, and Sukhatme (Chapter 8, this volume) note that latency is a critical factor that governs whether two users can truly share a common haptic experience. They propose an algorithm where the nature of the interaction between two hosts is decided dynamically based on the measured network latency between them. Users on hosts that are near each other (low communication latency) are dynamically added to fast local groups. If the communication latency is high, users are allowed a slower form of interaction where they can touch and feel objects but cannot exert forces on them. Users within a fast local group experience true haptic collaboration since the system is able to resolve the interaction forces between them quickly enough to meet stability criteria.
Fukuda and Matsumoto (Chapter 7, this volume) have also addressed the issue of the impact of network delay on collaborative haptic environments. They conducted a study of a multiuser environment with force feedback. They found that the performance of the PHANToM is sensitive to network delay, and that their SCES (Sharing Contact Entity's State) solution demonstrated good performance, as compared to taking no countermeasure against delay. Other approaches for dealing with random time delays, including Transmission Line Modeling and Haptic Dead Reckoning, are considered in Wilson et al. (1999).
A fundamental problem in haptics is to detect contact between the virtual objects and the haptic device (a mouse, a PHANToM, a glove, etc.). Once this contact is reliably detected, a force corresponding to the interaction physics is generated and rendered using the probe. This process usually runs in a tight servo loop within a haptic rendering system. Lin et al. (1998, 1999) have proposed an extensible framework for contact detection that decomposes the workspace into regions and at runtime identifies the region(s) of potential contact. The algorithm takes advantage of temporal and spatial coherence by caching the contact geometry from the immediately preceding step to perform incremental computations. Mascarenhas et al. (Chapter 5, this volume) report on a recent application of this system to the visualization of polygonal and scientific datasets. The contact detection problem is well studied in computer graphics; the reader is referred to Held (1995) and to Lin and Gottschalk (1998) for surveys.
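A toy version of region decomposition with temporal-coherence caching, assuming sphere primitives and a uniform grid (both assumptions are for illustration and are far simpler than the cited framework):

```python
import math
from collections import defaultdict

class GridContactDetector:
    """Toy contact detector: primitives are binned into grid cells, and
    the cell(s) that produced contact on the previous cycle are tested
    first (temporal coherence) before the remaining cells."""

    def __init__(self, cell=0.5):
        self.cell = cell
        self.grid = defaultdict(list)   # cell index -> sphere primitives
        self.last_cells = []            # cells hit on the previous cycle

    def _key(self, p):
        return tuple(math.floor(c / self.cell) for c in p)

    def add_sphere(self, center, radius):
        self.grid[self._key(center)].append((center, radius))

    def query(self, probe):
        # Check last cycle's cells first, then fall back to the rest.
        keys = self.last_cells + [k for k in self.grid
                                  if k not in self.last_cells]
        for k in keys:
            for c, r in self.grid.get(k, []):
                if math.dist(probe, c) <= r:
                    self.last_cells = [k]   # cache for the next cycle
                    return (c, r)
        self.last_cells = []
        return None
```

In the common case of a slowly moving probe, the cached cell answers the query without scanning the whole workspace.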
Another technique for contact detection is to generate the so-called surface contact point (SCP), which is the closest point on the surface to the actual tip of the probe. The force generation can then happen as though the probe were physically at this location rather than within the object. Existing methods in the literature generate the SCP by using the notion of a god-object (Zilles & Salisbury, 1995), which forces the SCP to lie on the surface of the virtual object. A technique which finesses contact point detection using a voxel-based approach to 6 DOF haptic rendering is described in McNeely et al. (1999). The authors use a short-range force field to repel the manipulated object in order to maintain a minimum separation distance between the (static) environment and the manipulated object. At USC's IMSC, the authors are developing algorithms for SCP generation that use information from the current contact detection cycle and past information from the contact history to predict the next SCP effectively. As a first step, we are experimenting with a well-known linear predictor, the Kalman Filter, by building on our prior results in applying similar techniques to the problem of robot localization (Roumeliotis, Sukhatme, & Bekey, 1999).
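For a single planar surface, the god-object constraint and a crude constant-velocity predictor (a stand-in for the Kalman filter mentioned above, not the IMSC algorithm) reduce to a few lines:

```python
import numpy as np

def god_object_scp(probe, plane_point, plane_normal):
    """God-object idea for one plane: the surface contact point (SCP)
    is constrained to lie on the surface even while the physical probe
    tip is inside the object."""
    n = plane_normal / np.linalg.norm(plane_normal)
    depth = np.dot(plane_point - probe, n)
    if depth <= 0:
        return np.asarray(probe, float)   # outside: SCP tracks the probe
    return probe + depth * n              # inside: project onto the plane

def predicted_scp(history):
    # Constant-velocity extrapolation from the last two SCP samples --
    # the simplest possible use of contact history for prediction.
    return history[-1] + (history[-1] - history[-2])
```

Force generation then proceeds as though the probe were at the SCP, with the prediction available to warm-start the next contact cycle.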
Two requirements drive force feedback research in haptics: high-fidelity rendering and stability. These two goals are somewhat conflicting, because high-fidelity haptic rendering generally calls for high force-feedback gains, which often lead to self-induced oscillations and instability.
Inspired by electrical networks, Adams and Hannaford (1999) regard the haptic interface as a two-port system terminated on one side by the human operator and on the other side by the virtual environment (cf. Figure 1-1). The energy exchange between the human operator and the haptic interface is characterized by a force Fh and velocity vh, whereas the exchange between the interface and the simulated virtual environment is characterized by a force Fe and velocity ve. For ideal rendering, the haptic interface should be transparent (in the sense that Fh = Fe and vh = ve), but stability requirements generally force the designer of the haptic interface to introduce some haptic distortion.
Figure 1-1: Two-port framework for haptic interfaces.
It is generally assumed that a human operator interacting with a haptic interface behaves passively (Hogan, 1989) in the sense that he or she does not introduce energy in the system. Since most mechanical virtual environments are also passive, the stability of the overall system could be guaranteed by simply designing the interface to be passive. However, as observed by Colgate, Grafing, Stanley, and Schenkel (1993), time-sampling can destroy the natural passivity of a virtual environment. In fact, these authors showed that the smaller the sampling rate, the more energy can be "generated" by a virtual wall.
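The energy-generation effect of time-sampling can be reproduced with a small simulation (all parameters are illustrative): a point mass bounces off a virtual wall whose spring force is held constant between samples, and with a long sampling period it leaves with more kinetic energy than it arrived with.

```python
def bounce_energy_gain(K=2000.0, T=0.01, m=0.1, v0=-0.1, dt=1e-5):
    """A point mass approaches a virtual wall at x = 0 whose spring
    force is recomputed only every T seconds (zero-order hold), as in a
    sampled haptic loop. Returns the kinetic energy gained over one
    bounce; a continuous passive wall could never make this positive."""
    x, v, f = 0.005, v0, 0.0
    hold = max(1, round(T / dt))        # integration substeps per sample
    for i in range(2_000_000):
        if i % hold == 0:               # wall force sampled at period T
            f = -K * x if x < 0 else 0.0
        v += (f / m) * dt               # symplectic Euler integration
        x += v * dt
        if v > 0 and x >= 0.005:        # mass has left the wall region
            break
    return 0.5 * m * (v * v - v0 * v0)
```

The held force lags the true penetration: it under-pushes during compression and over-pushes during rebound, so net energy flows into the mass, and the effect grows with T.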
Several approaches have been proposed to deal with this difficulty. Colgate and Schenkel (1997) determined conditions on the simplest virtual environment (the virtual wall) that guarantee the stability of the haptic interface for any passive human operator. These conditions reflect the fact that the amount of energy generated by a time-discretized virtual wall depends on the sampling rate. They also involve the virtual wall's stiffness and damping coefficient, posing constraints on the range of stiffness/damping parameters that can be rendered. This range is referred to by Colgate and Brown (1994) as the z-width of the haptic interface and is an important measure of its performance.
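The Colgate-Schenkel condition for a virtual wall of stiffness K and damping B, sampled with period T on a device with physical damping b, is usually stated as b > KT/2 + |B|; a sketch of the condition and its z-width implication (variable names are mine):

```python
def virtual_wall_passive(b, K, B, T):
    """Sampled-data passivity condition for a virtual wall: the device's
    physical damping b must dominate the energy generated by sampling
    (K*T/2) plus the magnitude of the virtual damping B."""
    return b > K * T / 2 + abs(B)

def max_stiffness(b, B, T):
    # The z-width implication: the stiffest renderable passive wall
    # shrinks linearly as the sampling period T grows.
    return 2 * (b - abs(B)) / T
```

Doubling the servo rate (halving T) doubles the stiffness that can be rendered passively, which is why haptic loops run at kilohertz rates.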
Adams and Hannaford (1999) followed a distinct approach by advocating the introduction of virtual coupling in the haptic interface so as to guarantee the stability of the system for any continuous-time passive virtual environment, even if its discrete-time version is no longer passive. The virtual coupling can be designed to provide sufficient energy dissipation to guarantee the stability of the overall system for any passive virtual environment. This approach decouples the haptic interface control problem from the design of the virtual environment. Miller, Colgate, and Freeman (1999) extended this work to virtual environments that are not necessarily passive. The drawback of virtual coupling is that it introduces haptic distortion (because the haptic interface is no longer transparent). Hannaford, Ryu, and Kim (Chapter 3, this volume) present a new method to control instability that depends on the time domain definition of passivity. They define the "Passivity Observer," and the "Passivity Controller," and show how they can be applied to haptic interface design in place of fixed-parameter virtual couplings. This approach minimizes haptic distortion.
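A sketch of the Passivity Observer / Passivity Controller idea (signs follow the convention that f·v > 0 means energy flows into the haptic port; the formulation in Chapter 3 differs in detail, and this compressed version is mine):

```python
def passivity_controlled_force(samples, dt=0.001):
    """For each (force, velocity) sample, accumulate the observed port
    energy (Passivity Observer). Whenever the total would go negative
    (net energy generated), add just enough damping to dissipate the
    excess (Passivity Controller) and output the modified force."""
    E = 0.0
    out = []
    for f, v in samples:
        E += f * v * dt                  # Passivity Observer update
        alpha = 0.0
        if E < 0 and abs(v) > 1e-9:      # Passivity Controller engages
            alpha = -E / (v * v * dt)    # damping that absorbs the excess
            E = 0.0                      # energy balance restored
        out.append(f + alpha * v)        # damping acts only when needed
    return out
```

Because the damping is applied only in the samples where the observer detects generated energy, distortion is far smaller than with a fixed virtual coupling.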
The work described above assumes that the human operator is passive, but poses no other constraints on her behavior. This can lead to small z-width, significant haptic distortion, or both. Tsai and Colgate (1995) tried to overcome this by modeling the human as a more general discrete-time linear time-invariant system. They derive conditions for stability that directly exclude the possibility of periodic oscillations for a virtual environment consisting of a virtual wall. Gillespie and Cutkosky (1996) address the same issue by modeling the human as a second order continuous-time system. They conclude that to make the approach practical, online estimation of the human mechanical model is needed, because the model's parameters change from operator to operator and, even with the same operator, from posture to posture. The use of multiple-model supervisory control (Anderson et al., 1999; Hespanha et al., 2001; Morse, 1996) to estimate online the operator's dynamics promises to bring significant advantages to the field, because it is characterized by very fast adaptation to sudden changes in the process or the control objectives. Such changes are expected in haptics due to the unpredictability of the human-in-the-loop. In fact, it is shown in Hajian and Howe (1995) that changes in the parameters of human limb dynamics become noticeable over periods of time larger than 20 ms.
Although most of the work referenced above focuses on simple prototypical virtual environments, a few researchers developed systems capable of handling very complex ones. Ruspini and Khatib (Chapter 2, this volume) are among these, having developed a general framework for the dynamic simulation and haptic exploration of complex interaction between generalized articulated virtual mechanical systems. Their simulation tool permits direct "hands-on" interaction with the virtual environment through the haptic interface.
Capture, Storage, and Retrieval of Haptic Data
One of the newest areas in haptics is the search for optimal methods for the description, storage, and retrieval of moving-sensor data of the type generated by haptic devices. With such techniques we can capture the hand or finger movement of an expert performing a skilled movement and "play it back," so that a novice can retrace the expert's path, with realistic touch sensation; further, we can calculate the correlation between the two exploratory paths as time series and determine if they are significantly different, which would indicate a need for further training. The INSITE system (Faisal, Shahabi, McLaughlin, & Betz, 1999) is capable of providing instantaneous comparison of two users with respect to duration, speed, acceleration, and thumb and finger forces. Techniques for recording and playing back raw haptic data (Shahabi, Ghoting, Kaghazian, McLaughlin, & Shanbhag, forthcoming; Shahabi, Kolahdouzan, Barish, Zimmermann, Yao, & Fu, 2001) have been developed for the PHANToM and CyberGrasp. Captured data include movement in three dimensions, orientation, and force (contact between the probe and objects in the virtual environment). Shahabi and colleagues address haptic data at a higher level of abstraction in Chapter 14, in which they describe their efforts to understand the semantics of hand actions (see also Eisenstein, Ghandeharizadeh, Huang, Shahabi, Shanbhag, & Zimmermann, 2001).
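A minimal sketch of capture, playback, and path comparison (class and function names are invented, and RMS distance stands in for the correlation measure mentioned above):

```python
import math

class HapticRecorder:
    """Toy capture/playback of raw haptic data: timestamped samples of
    position, orientation, and force, replayed in recorded order."""

    def __init__(self):
        self.samples = []   # (t, position, orientation, force)

    def record(self, t, pos, orient, force):
        self.samples.append((t, tuple(pos), tuple(orient), tuple(force)))

    def playback(self):
        # A real player would sleep for the inter-sample intervals to
        # reproduce the original timing; here we just yield in order.
        for t, pos, orient, force in self.samples:
            yield t, pos, orient, force

def path_rms_distance(a, b):
    # Simple dissimilarity between two equally sampled exploratory
    # paths (e.g., expert vs. novice); large values suggest the novice
    # needs further training.
    n = min(len(a), len(b))
    return math.sqrt(sum(
        sum((pa - pb) ** 2 for pa, pb in zip(a[i], b[i])) for i in range(n)
    ) / n)
```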
Haptic Data Compression
Haptic data compression and evaluation of the perceptual impact of lossy compression of haptic data are further examples of uncharted waters in haptics research (see Ortega, this volume, Chapter 6). Data about the user's interaction with objects in the virtual environment must be continually refreshed if the objects are manipulated or deformed by user input. If data are too bulky relative to available bandwidth and computational resources, there will be improper registration between what the user sees on screen and what he "feels." Ortega's work begins by analyzing data obtained experimentally from the PHANToM and the CyberGrasp, exploring compression techniques that start with simple approaches (similar to those used in speech coding) and continue with methods more specific to haptic data. One of two lossy methods may be employed to compress the data: one approach is to use a lower sampling rate; the other is to exploit intervals in which the data change little during movement. For example, for certain grasp motions not all of the fingers are involved. Further, during the approaching and departing phases tracker data may be more useful than the CyberGrasp data. Vector coding may prove more appropriate for encoding the time evolution of a multifeatured dataset such as that provided by the CyberGrasp. For cases where the user employs the haptic device to manipulate a static object, compression techniques that rely on knowledge of the object may be more useful than the coding of an arbitrary trajectory in three-dimensional space.
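The two lossy strategies can be sketched for a one-dimensional haptic channel (function names and the threshold parameter are illustrative, not from Ortega's work):

```python
def downsample(samples, factor):
    # Strategy 1: lower the effective sampling rate by keeping
    # every factor-th sample.
    return samples[::factor]

def delta_encode(samples, threshold=0.0):
    """Strategy 2: transmit only (index, delta) pairs for changes larger
    than a threshold -- channels that barely move (e.g., uninvolved
    fingers) cost almost nothing. With threshold 0 the encoding is
    lossless given the first sample."""
    if not samples:
        return []
    out = [(0, samples[0])]      # index 0 carries the absolute value
    last = samples[0]
    for i, s in enumerate(samples[1:], start=1):
        if abs(s - last) > threshold:
            out.append((i, s - last))
            last = s
    return out

def delta_decode(encoded, length):
    # Reconstruct the channel by holding the last value between deltas.
    deltas = dict(encoded)
    vals, cur = [], 0.0
    for i in range(length):
        if i in deltas:
            cur = deltas[i] if i == 0 else cur + deltas[i]
        vals.append(cur)
    return vals
```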
The many potential applications in industry, the military, and entertainment for force feedback in multiuser environments, where two or more users orient to and manipulate the same objects, have led to work such as that of Buttolo and his colleagues (Buttolo, Oboe, & Hannaford, 1997; Buttolo, Hewitt, Oboe, & Hannaford, 1997; Buttolo, Oboe, Hannaford, & McNally, 1996), who as noted above remind us that adding haptics to multiuser environments creates additional demand for frequent position sampling for collision detection and fast update.
It is also reasonable to assume that in multiuser environments there may be a heterogeneous assortment of haptic devices with which users interact with the system. One of our primary concerns thus would be to ensure proper registration of the disparate devices with the 3D environment and with each other. Of potential use in this regard is work by Iwata, Yano, and Hashimoto (1997) on LHX (Library for Haptics), modular software that can support a variety of haptic displays. LHX allows a variety of mechanical configurations, supports easy construction of haptic user interfaces, allows networked applications in virtual spaces, and includes a visual display interface. The chapter by Hespanha, McLaughlin, and Sukhatme (Chapter 8, this volume) proposes an architecture for distributed haptic collaboration with heterogeneous devices.
There have been only a few studies of cooperation and collaboration between users of haptic devices. In a study by Basdogan, Ho, and their colleagues (Basdogan, Ho, Slater, & Srinivasan, 1998; Ho, Basdogan, Slater, Durlach, & Srinivasan, 1998), partners at remote locations were assigned three cooperative tasks requiring joint manipulation of 3D virtual objects, such as moving a ring back and forth along a wire while minimizing contact with the wire. Experiments were conducted with visual feedback only, and with both visual and haptic feedback. Both performance and feelings of togetherness were enhanced in the dual-modality condition. Performance was best when visual feedback alone was followed by the addition of haptic feedback rather than vice versa. Durlach and Slater (n.d.) note that factors contributing to a sense of copresence include being able to observe the effects of one's interlocutors' actions on the environment, and being able to work collaboratively with copresent others to alter the environment. Point of view (egocentric vs. exocentric) with respect to avatars may also influence the sense of copresence. Touching, even virtual touching, is believed to contribute to the sense of copresence because of its associations with closeness and intimacy.