We have implemented and successfully demonstrated an interactive three-dimensional computer graphics system that incorporates both stereoscopic viewing and direct object interaction with a force feedback I/O handheld device. With this system, the user views a graphics scene that occupies the three-dimensional volume in front of him and simultaneously can move a cursor anywhere in that volume, touching and feeling whatever he sees. The system was implemented using two HP 735 UNIX workstations, a Phantom force feedback device, Crystal Eyes eyewear for stereoscopic viewing of a CRT, and specially written real-time collision detection software. We obtained stereoscopic rendering rates of 20 Hz for our database and force update rates of over 1000 Hz. Most users of this bimodal display reported perceiving no latency while using the system, even though the system latency of 50 milliseconds was dictated by the graphics rendering time. This effort directly addresses the elements of synthetic Design for Assembly and Design for Manufacturability shown in Magellan [1], Hewlett-Packard's 1995 vision tape on the use of Virtual Reality for automobile design.
2.0 Introduction

This technical memo describes the work performed from January 1996 through August 1996 to implement an interactive computer graphics system that demonstrates the use of a force feedback I/O device. This work arose from unreported work done in the previous year with a single HP UNIX workstation and a different force feedback device. That experience led to the conclusion that a single computer could not both render graphics images and drive a force feedback device at a rate adequate for stable operation. At that time we realized that at least two loosely coupled computers would be required to successfully implement force feedback: one dedicated to stereoscopic graphics rendering and the other dedicated to calculating object collisions in real time. We have answered one of the fundamental questions we had at the outset of this project: what visual and force feedback (haptics) rates we should strive for to give the user a good sense of interaction, i.e., no perceptible latency effects. It is interesting that our independently obtained rates of 20 Hz video and 1000 Hz haptics are identical to those recently reported elsewhere [2].
A force feedback system is an example of a compute-intensive, minimal-I/O-bandwidth problem. This is because 1) the human hand needs a high update rate to perceive a sense of solid contact when two graphics objects collide [3], and 2) the amount of computation required to (naively) detect collisions for generalized 3D polygonalized objects grows as the square of the number of polygons in the scene [4]. A real-time collision detection algorithm is the technical core of achieving a system with satisfactory user response. For our demonstrations, we chose a 3300-polygon model of the front end of a Ford Mustang and allowed the user to interact with the object through a point probe. In our software implementation of the collision detection algorithm, we spatially partitioned the 3D volume occupied by the car into a lattice of cells fine enough that a minimal number of polygons fell in any one cell. In this way, we could change the cell division factor and control the number of polygons that a single point probe would have to interrogate for collision. This solution to the collision detection problem assumed a single static object (the car) and a single moving object (the point). However, the approach extends readily to the general case of multiple static 3D objects and multiple moving 3D objects (although only one moving at a time).
In this report, we first present a primer on haptics devices so that an interested reader can select a device appropriate for his needs. We define the hapton, a unit of system performance meant to quantify the capability of a computer to do useful force interaction work; it is the analog of the graphics rendering unit of performance, the polygon/second. With this background, we next describe the hardware configuration of our system, noting the type of equipment used and any latency issues critical to overall system performance. We then describe the software written for the vision, haptics, and I/O subsystems. We conclude with a list of recommended work for system improvement and functionality expansion.
3.0 Introduction to Haptics Devices

There are several performance factors to keep in mind when selecting a haptics device for an interactive application: number of input/output degrees of freedom (DOF) for spatial control, maximum force/torque output vector, mechanical frequency response to the hand, physical range of operation, and position control resolution. The devices commercially available today that can be used in conjunction with a computer graphics workstation span the DOF range from 1 DOF input/1 DOF output to 6 DOF input/6 DOF output. A 1 DOF linear input/output device, for example, could be used to simulate needle insertion into a synthetic patient: needle position would be controlled by the input position of the device, while the simultaneous presentation of output forces would simulate the feel of needle passage through various tissue layers. A fully endowed 6 DOF input/output device allows for arbitrary positioning of an object in 3-space (x, y, z and theta, phi, rho) while concurrently outputting realistic forces and torques. Synthetic wrenches could be implemented to turn synthetic bolts; stiction, friction, and torquing forces should ideally be part of such a haptics display. As a general rule, interesting devices operate over roughly a one-foot-cube physical volume and output forces up to 5 pounds and torques up to 30 pound-inches. Spatial/angular resolutions vary widely, from 100 dpi/100 counts per revolution to 1000 dpi/512 counts per revolution. Frequency response to the hand (small amplitude, infinite load) varies from 1 Hz to over 30 Hz.
We propose a new unit of system measurement that gives one a sense of the adequacy of a haptics computer system to do useful work. This unit is called the hapton and is related to the number of object collisions that can be calculated in a realistically described 3D world. Here, the hapton is defined as the number of triangle/triangle collision pairs that can be calculated to a spatial accuracy of 0.1% or better at a rate of 1000 Hz. Further, the triangles should be colliding on edge, since this is the most stressing case for the collision detection calculation.
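Restated compactly (this formula is our paraphrase of the definition above, not established notation):

    \[
    1\ \mathrm{hapton} \;\equiv\; 1\ \text{edge-on triangle/triangle pair resolved to}\
    0.1\%\ \text{spatial accuracy within each}\ 1\,\mathrm{ms}\ (1000\,\mathrm{Hz})\ \text{update}
    \]

Under this reading, a system sustaining five such pair tests per 1 ms cycle rates 5 haptons (cf. Section 4.1).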
4.1 Hardware Description

Figure 1 presents the hardware block diagram for our interactive graphics system. One 125 MHz HP 735 workstation was equipped with a 48Z computer graphics accelerator and was devoted exclusively to stereoscopic rendering. A standard knobbox served as an I/O device on this vision computer to rotate the graphics scene around 3 axes. The second HP 735 workstation operated at 99 MHz and exclusively ran the collision detection code. Each computer ran its main loop asynchronously of the other, but the two exchanged a minimal amount of information over an Ethernet twisted-pair connection running Berkeley sockets. The vision computer provided the current viewing transform to the haptics computer. The haptics computer supplied a visual correction offset vector to the vision machine (explained later) to provide the illusion of objects resting exactly at the point of collision.
Figure 1 also shows a Gateway 166 MHz Pentium PC connected to the two workstations via the Ethernet connection. This PC acted as an I/O controller for the Phantom device, reading the current position of the Phantom and providing it to the vision and haptics computers. The PC also accepted the force vector from the haptics computer for the current position reading and output the forces to the handcontroller. The inclusion of this PC was necessitated by the fact that the ISA card interface to the Phantom could not be made to work with either our HP workstation or our HP Vectra PC. We have concluded that this is a deficiency in the Phantom I/O card design, not a problem with HP's implementation of the ISA standard.
The three computers were connected on their own local network, separate from any other LAN. This configuration minimized packet traffic and provided the highest possible control bandwidth to the Phantom. We measured the round-trip communications latency in this system at 0.7 milliseconds.
The Phantom device is a high-performance haptics device suitable for use in a workstation environment [5]. We used the Model A, which extends the working dimensions by 50% over the standard model. This device has a maximum force output of 1.9 pounds, a physical operating range of 7.5 x 10.5 x 15 inches, 800 dpi linear resolution, and 512 counts per revolution on the rotation gimbals. The gimbal axis option allows 6 degree-of-freedom input control, but the Phantom's design is limited to 3 degrees of freedom (linear) on output; no torques can be output to the hand. The constrained, small-amplitude mechanical bandwidth of the device is estimated at 30 Hz [6]. In this configuration, the user holds a pencil-like stylus that moves the arms of the mechanism. Using the reported position and orientation of the imagined endpoint of the arm, an object in the graphics world can be controlled in position and orientation.
Stereoscopic viewing was provided by Crystal Eyes liquid crystal eyewear viewing the standard 4033A CRT screen in stereo mode. No head tracking was employed in this hardware configuration. This system performed at the 1 hapton level but is capable of approximately 5 haptons.
4.2 Software Description

Three separate code modules were required to implement this project: a collision detection code running exclusively on the UNIX haptics machine, a visualization code running exclusively on the UNIX vision machine, and an I/O code running on the Gateway PC to control the Phantom. The codes were written in C to minimize execution times. Both 735 workstations ran under HP-UX 9.05 and were set to the real-time priority mode of operation. The Phantom code was obtained from the vendor and slightly modified to work with our system, most notably to add TCP/UDP communications to the workstations. No further description of this control code will be presented here.
Figure 2 is a flow chart of the software system developed for this project. (If the Phantom ISA card had worked with the HP workstations, the middle code loop would not be present.) The vision loop runs asynchronously of the haptics code, and the two are loosely coupled through the mechanism of passing data through the Gateway PC. The TCP/UDP communications protocol was used between machines to minimize I/O transfer overhead. Even though UDP is an unreliable protocol, we never noticed any missed communications packets as long as the three computers were attached to their own LAN; we would routinely run the system for 8 hours per day. Transferring the Phantom position coordinates required 6 floating point numbers; the view transform matrix was transferred as 12 floating point numbers; and the output force vector to the Phantom device required 3 floating point numbers. Latency between the haptics computer and the Gateway PC was measured at 0.7 milliseconds.
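To make the data flow concrete, the following C fragment sketches one such transfer using Berkeley sockets: the vision machine sending its 12-float view transform over UDP. This is a minimal sketch, not the original code; the port number and address are illustrative assumptions, and in the real system the socket would be opened once at startup rather than per call.

    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int send_view_xform(const float xform[12])
    {
        int s = socket(AF_INET, SOCK_DGRAM, 0);        /* UDP datagram socket */
        struct sockaddr_in to;
        int n;

        memset(&to, 0, sizeof(to));
        to.sin_family      = AF_INET;
        to.sin_port        = htons(5000);              /* assumed port */
        to.sin_addr.s_addr = inet_addr("192.0.2.2");   /* assumed haptics host */

        /* 12 floats = 48 bytes; both workstations are HP-UX machines, so no
           byte-order conversion is needed between them. */
        n = sendto(s, (const char *)xform, 12 * sizeof(float), 0,
                   (struct sockaddr *)&to, sizeof(to));
        close(s);
        return n;                                      /* bytes sent, or -1 */
    }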
4.2.1 Vision Software

The vision code can best be described as a standard visualization package that uses the Starbase Graphics Library to render stereoscopically viewed imagery. With respect to the appropriate part of the flow chart in Figure 2, the code first initializes the Starbase package and maps a graphics window to an X window. The preprocessing stage computes and collects polygon normals, required by the lighting pipeline stage in the 48Z graphics accelerator. The flow chart then enters a loop that begins with a blocking read from the Phantom for the current position/orientation. This data is used as the view transform for the moving object in the graphics scene. Next, the software polls the knobbox device, which controls the orientation of the static segment of the graphics world. After computing the view transform matrix based on these inputs, the vision machine outputs these 12 floating point numbers for later use by the haptics code. Finally, the code renders the stereo view of the scene.
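The loop structure just described can be summarized as follows. Every function in this schematic is a hypothetical placeholder for brevity; none is an actual Starbase or Phantom API call.

    float pose[6], knob_angles[3], xform[12];

    for (;;) {
        read_phantom_pose(pose);          /* blocking read: 6 floats from the PC */
        poll_knobbox(knob_angles);        /* rotation input for the static scene */
        compute_view_xform(pose, knob_angles, xform);   /* 12-float matrix */
        send_view_xform(xform);           /* pass transform to the haptics code */
        render_stereo_pair(xform);        /* Starbase draws left and right eyes */
    }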
We typically used a scene of a car composed of 3300 independent, non-meshed polygons. (Our collision detection code can handle any triangulated model, convex or concave.) The lighting model used one infinite light source, and the object was modeled with flat-shaded polygons. We found that we had to use backface culling to get the stereo frame rate up to 20 Hz; without this feature enabled, we could only obtain frame rates of around 8 Hz, an unacceptable number for interactivity. With culling enabled, a user could orient the car so that occasionally a part of the scene that should have been visible was invisible. While this caused some confusion (especially when one could feel something that was visually not present), the higher rendering rate was deemed very important. It remains an outstanding mystery why a graphics accelerator rated at 600,000 polygons/sec can effectively deliver only 50,000 polygons/sec.
4.2.2 Haptics Software

Our choice of collision detection algorithm was predicated on our desire for simplified coding and on our choice of 3D worlds, viz. a static collection of polygons describing a scene and a single moving point probe. The overall design of the code was a precalculation phase followed by a free-running loop calculating collisions for the current position of the Phantom.
In the first step of the precalculation phase, we spatially divided the static world into axis-aligned cubes and assigned the triangles in each cube to a linear data structure pointing to each triangle's geometry. The moving probe point generates a bounded line in 3-space between the current and last time samples of the Phantom position. It is the (possible) bounded intersection of this line with the polygons in the scene that determines if, and where, a collision has occurred. The amount of spatial partitioning was user specified; we usually chose a uniform subdivision of 32 by 32 by 32. For our dataset (the Ford Mustang) this yielded a cell occupancy factor of 25%, with each occupied cube containing 3 triangles on average (the maximum was 23). Triangles crossing cell boundaries were not further subdivided, so as to minimize downstream collision detection calculations.
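A sketch of this grid-assignment step appears below, under our assumptions about data layout; the names, types, and the helper add_tri_to_cell() are ours, not the original code.

    #define NDIV 32                                 /* 32 x 32 x 32 subdivision */

    typedef struct { float v[3][3]; } Tri;          /* 3 vertices, xyz each */
    typedef struct { int count; int *tris; } Cell;  /* indices into triangle list */

    Cell grid[NDIV][NDIV][NDIV];

    /* Map a world coordinate to its cell index along one axis. */
    static int cell_of(float x, float lo, float hi)
    {
        int i = (int)((x - lo) * NDIV / (hi - lo));
        return i < 0 ? 0 : (i >= NDIV ? NDIV - 1 : i);
    }

    void build_grid(const Tri *t, int ntris, const float lo[3], const float hi[3])
    {
        int n, a, i, j, k, cmin[3], cmax[3];
        for (n = 0; n < ntris; n++) {
            /* Cell range covered by this triangle's bounding box; triangles
               crossing cell boundaries are listed in every cell they touch
               rather than being split. */
            for (a = 0; a < 3; a++) {
                float mn = t[n].v[0][a], mx = mn;
                for (i = 1; i < 3; i++) {
                    if (t[n].v[i][a] < mn) mn = t[n].v[i][a];
                    if (t[n].v[i][a] > mx) mx = t[n].v[i][a];
                }
                cmin[a] = cell_of(mn, lo[a], hi[a]);
                cmax[a] = cell_of(mx, lo[a], hi[a]);
            }
            for (i = cmin[0]; i <= cmax[0]; i++)
                for (j = cmin[1]; j <= cmax[1]; j++)
                    for (k = cmin[2]; k <= cmax[2]; k++)
                        add_tri_to_cell(&grid[i][j][k], n);  /* hypothetical: appends n */
        }
    }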
The second step in the precalculation phase computed, for each static triangle, the numbers required to test whether a collision point lies inside or outside the triangle. This approach, based on the Jordan curve theorem, is a standard test and is described in [7].
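The following fragment sketches the standard 2D crossing test that follows from the Jordan curve theorem (cf. [7]). It assumes the triangle has already been projected onto the plane that drops the dominant axis of its normal, with xs[]/ys[] holding the projected vertex coordinates; in our code these would come from the precalculation phase.

    int point_in_tri(float px, float py, const float xs[3], const float ys[3])
    {
        int i, j, inside = 0;
        /* Cast a ray from (px, py) toward +x and count edge crossings;
           an odd count means the point is inside the triangle. */
        for (i = 0, j = 2; i < 3; j = i++) {
            if (((ys[i] > py) != (ys[j] > py)) &&
                (px < (xs[j] - xs[i]) * (py - ys[i]) / (ys[j] - ys[i]) + xs[i]))
                inside = !inside;
        }
        return inside;
    }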
Figure 3 shows an expanded flow chart of our collision detection logic; it is an expansion of the box labeled "Calculate collision state", Haptics Software, in Figure 2. To understand this algorithm, keep in mind that the probe point sweeps out a line between the current and last position time samples. This line is contained in an axis-aligned box, which is itself composed of the original cells that spatially subdivide our static world. The strategy of the logic shown in Figure 3 is to determine as quickly as possible whether a collision has occurred, because computing the precise collision location requires the greatest number of computer cycles. The logic proceeds from the coarsest determination of collision (Step 1) to the finest, the exact x, y, z location of the interference (Step 4). The first chance to terminate the collision calculation occurs in Step 1, when the current cell is queried as to whether it contains any triangles. If this precomputed count is zero, the algorithm examines the next cells occupied by the sweep line. For those cells occupied by triangles, the intersection of the sweep line with all triangles is calculated (Step 2). Step 3 is the next chance to terminate: a quick test of whether the intersection point lies within the bounded length of the sweep line. If this test passes, we must determine whether the intersection point lies within the bounds of the current triangle. Step 4 uses the precomputed numbers from the precalculation phase and the Jordan curve test described above to decide whether this is a valid collision point. A valid collision point in x, y, z is then stored (Step 5). Given the high frame rates experienced with this algorithm, the sweep line usually stays within a single cell. However, our bounding box approach is totally general and can accommodate a line spanning the entire extent of the 3D world. Triangles common to several cells are handled only once, to speed up the collision detection calculation.
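Steps 2 through 4 for one candidate triangle can be sketched as follows: intersect the sweep segment p0->p1 with the triangle's plane (n . x = d), reject hits outside the segment, then run the inside/outside test above. This is our sketch, not the original code; dot3() and project_drop_axis() are hypothetical helpers, and the plane constants n and d are assumed to come from the precalculation phase.

    int sweep_hits_tri(const float p0[3], const float p1[3],
                       const float n[3], float d,
                       const float xs[3], const float ys[3], /* projected tri */
                       int axis,                /* dominant axis of n to drop */
                       float hit[3])
    {
        float dir[3], denom, t, hx, hy;
        int a;

        for (a = 0; a < 3; a++) dir[a] = p1[a] - p0[a];
        denom = dot3(n, dir);
        if (denom == 0.0f) return 0;              /* parallel: no crossing */

        t = (d - dot3(n, p0)) / denom;            /* Step 2: plane intersection */
        if (t < 0.0f || t > 1.0f) return 0;       /* Step 3: outside sweep bounds */

        for (a = 0; a < 3; a++) hit[a] = p0[a] + t * dir[a];
        project_drop_axis(hit, axis, &hx, &hy);   /* drop dominant axis of n */
        return point_in_tri(hx, hy, xs, ys);      /* Step 4: Jordan curve test */
    }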
For a general object, the sweep line may well collide with several static polygons, especially in regions of high geometric curvature. After determining these M intersections, the algorithm selects the point with the greatest penetration depth. This point physically represents the first point of collision during the time interval from t-1 to t0. The output force for this collision is a vector whose magnitude is a constant times the penetration depth (Hooke's spring law) and whose direction is collinear with the polygon's surface normal. The sign of that vector is the negative of the sign of the direction cosine between the sweep line and the polygon surface normal. With this determination, the collision force vector is appropriate for approaching a polygon from either side.
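A sketch of this force computation follows: magnitude k times depth (Hooke's law) along the surface normal, with the sign chosen to oppose the side from which the sweep line approached. k_spring is an unreported tuning constant, and dot3() is the same hypothetical helper as before.

    #include <math.h>

    void collision_force(const float probe[3],   /* current (buried) position */
                         const float hit[3],     /* collision point on polygon */
                         const float n[3],       /* unit surface normal */
                         const float dir[3],     /* sweep line direction */
                         float k_spring, float force[3])
    {
        float depth = 0.0f, sign;
        int a;

        for (a = 0; a < 3; a++) {                /* penetration depth */
            float e = probe[a] - hit[a];
            depth += e * e;
        }
        depth = (float)sqrt(depth);

        /* Negative of the sign of the direction cosine between sweep line
           and normal, so the force is correct from either side. */
        sign = (dot3(dir, n) > 0.0f) ? -1.0f : 1.0f;

        for (a = 0; a < 3; a++)
            force[a] = sign * k_spring * depth * n[a];
    }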
Clearly, when no collision occurs, the current probe position becomes the last position for the next time sample. When a collision occurs, however, it is very important to estimate this next time sample's last position carefully. This caveat becomes clearer when one realizes that the haptics control point is physically inside the object it has collided with (because of the finite stiffness of the device). If this current position were used to update the last probe position, there would be no collision during the next frame of this collision detection algorithm. To maintain the illusion of continuous contact with an object, the collision code must think it is crossing a polygonal surface. We chose to provide an offset vector from the currently buried probe location to redefine the last position at time sample t-1 for the next frame. This offset vector is computed from a combination of the current probe location, the collision point, the polygon's surface normal, and a distance magnitude parameter. Figure 4 shows this relationship more clearly. It is important to pick the distance magnitude correctly for this offset vector calculation: a near-zero value results in an unacceptably large latency effect we call geometry-induced latency, as the algorithm tries to catch up to the latest input position. We chose a distance representing 0.0005 inches in our world. A visual offset vector is also computed, as the vector from the current probe location to the intersection point on the polygon. This vector is output to the PC controller and used by the vision computer to modify the visual position of the point probe. Using this mechanism, the probe can be made to appear to be at the surface during a collision when it is actually physically beneath the surface.
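The two corrections can be sketched as below, under our assumption of how the quantities combine. OFFSET_DIST corresponds to the 0.0005 inch parameter, and side is the +/-1 sign computed along with the collision force.

    #define OFFSET_DIST 0.0005f

    void collision_offsets(const float probe[3], const float hit[3],
                           const float n[3], float side,
                           float last[3],      /* out: t-1 position for next frame */
                           float visual[3])    /* out: offset sent to vision code */
    {
        int a;
        for (a = 0; a < 3; a++) {
            /* Redefine the previous probe position just outside the surface
               so the next frame's sweep line still crosses the polygon. */
            last[a] = hit[a] + side * OFFSET_DIST * n[a];
            /* Visual offset: move the drawn probe from its buried location
               back to the intersection point on the polygon. */
            visual[a] = hit[a] - probe[a];
        }
    }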
We have found this algorithm to be adequate. Though we have not had time to test it against a number of different databases, it appears fast enough to provide a sense of hard contact with a surface during collision and robust enough to handle large databases. As long as the space division factor can be increased so that the number of triangles in a cell does not exceed some (undetermined) number, perhaps 10, the collision detection can be expected to remain fast enough; only the visual rendering time will be impacted by a higher polygon count. This algorithmic approach to collision detection lets one feel the discontinuities between polygons and is very precise for tracing surfaces. One unexpected result of this algorithm and system is that the perceived operating latency is zero. It is known that one human sensory channel can suppress another, but nothing more will be said here about this fortuitous discovery.
5.0 Recommendations

We have implemented and successfully demonstrated a force feedback equipped interactive computer graphics system. We are now at the point where we should implement a system capable of demonstrating synthetic assembly of MCAD parts into a complete mechanical system (a car, for example). To this end we make the following recommendations.

First, and foremost, the Phantom I/O hardware interface should be reworked to function with an HP workstation. This will give us approximately 5 times the compute bandwidth from any HP computer we expect to use in the future for this compute-bound problem.

Second, we should write the code to implement collision detection for a generalized 3D polygonized object. This code should be ported to a 4-processor, distributed-memory system that has a small shared-memory capability for I/O control and process synchronization. The full collision detection problem requires 3 independent geometry calculations that can run in parallel for best haptics execution times; the 4th CPU would be dedicated to visual rendering. Use of this optimal hardware architecture should allow us to realistically assess how much compute power is required for a real-world problem faced by our customers.

Third, an attempt should be made at writing a haptics API that will demonstrate materials property differentiation (hard, soft, compliant, etc.) and haptics texturing (friction, stiction, bumps, viscosity).

Fourth, the Phantom stylus should be modified to have 3 finger-operated switches. These switches would be useful for implementing software features such as object picking, menu pulldown selection, drawing, and sculpting.
Such a system could be applied to solving CAD design/verification problems for one of HP's forward-thinking and cooperative customers.
6.0 References

[1] Magellan, videotape, 7/31/95, 15:17 minutes. Contact Bojana Fazarinc, HP Corporate Marketing Communications Manager, 415-857-5970.
[2] Mark, W., Randolph, S., Finch, M., Van Verth, J., Taylor, R.; Adding Force Feedback to Graphics Systems: Issues and Solutions; Computer Graphics Proceedings, SIGGRAPH 96, pp. 447-452, 1996.
[3] Minsky, M., et al.; Feeling and Seeing: Issues in Force Display; Computer Graphics, Vol. 24, No. 2, pp. 235-243, 1990.
[4] Hahn, J.K.; Realistic Animation of Rigid Bodies; Computer Graphics, Vol. 22, No. 4, pp. 299-308, 1988.
[5] Massie, T., Salisbury, J.K.; The PHANToM Haptic Interface: A Device for Probing Virtual Objects; ASME Winter Annual Meeting: Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, pp. 295-302, Chicago, Nov. 6-11, 1994.
[6] Private conversation with Ken Salisbury.
[7] Arvo, J., ed.; Graphics Gems II; Academic Press Professional, 1991.