Empirical models based on free-modulus magnitude estimation of perceived presence in virtual environments.
The illusion that a user is present or immersed in a virtual environment (VE) is a primary goal of virtual reality (Ellis, 1993). It is the attempt to create an illusion of presence that distinguishes VEs from other human-computer interfaces. Anyone deciding to purchase or design a VE system is faced with a wealth of decisions concerning design tradeoffs and technologies to incorporate in attempting to create this sense of presence. Currently there is a lack of both understanding and empirical documentation concerning this phenomenon, and several authors have cited a need for research in this area (e.g., Barfield & Weghorst, 1993; Held & Durlach, 1992; Sheridan, 1992). Although some studies have been published (e.g., Hendrix & Barfield, 1996; Singer, Witmer, & Bailey, 1994; Slater & Usoh, 1993a), many of these have been exploratory in nature, with presence most often measured via questionnaires. The objectives of the current research were (a) to test magnitude estimation as a ratio-scale measure of presence, (b) to build empirical models of presence using polynomial regressions on this measure to provide a useful tool for VE system designers and purchasers, (c) to study the simultaneous effects of a wide variety of VE system parameters, and (d) to assess sequential experimentation as a technique for accomplishing these objectives.
One goal of this research was to examine the simultaneous effects of a wide variety of VE system parameters on perception of presence using magnitude estimation, as proposed by Stevens (1953). In this technique, participants are presented with a series of stimuli and asked to assign a number to each based on their subjective impressions of the intensity of each stimulus. The only restriction imposed on the numbers assigned is that they be positive. Free-modulus magnitude estimation was the method used in this research and is generally preferred over an experimenter-defined modulus (Gescheider, 1985; Stevens, 1971).
In free-modulus magnitude estimation, the observer is told to assign any value to the first stimulus that seems appropriate and then assign successive numbers accordingly. For example, if a value of 10 is assigned to the first stimulus and a subsequent stimulus seems half as intense as the first, the participant would assign the second stimulus a value of 5. Stevens (1971) recommended making one judgment per stimulus per observer and then taking the geometric mean of these judgments as the psychological scale value associated with each stimulus when combining data across observers. Use of the geometric mean assumes that the numbers produced by the observers represent a ratio scale.
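The combination rule described above can be illustrated with a minimal sketch. The judgments below are hypothetical values chosen for illustration, not data from this research, and the function name is an assumption.

```python
import math


def geometric_mean(estimates):
    """Geometric mean of strictly positive magnitude estimates."""
    if not estimates or any(x <= 0 for x in estimates):
        raise ValueError("magnitude estimates must be positive")
    # Averaging the logs and exponentiating is equivalent to the n-th root
    # of the product, but avoids overflow for long lists.
    return math.exp(sum(math.log(x) for x in estimates) / len(estimates))


# Hypothetical judgments of one stimulus by four observers, each of whom
# chose a different modulus.
judgments = [10.0, 20.0, 5.0, 40.0]
scale_value = geometric_mean(judgments)
print(round(scale_value, 3))
```

Because the geometric mean works on ratios, an observer who reports uniformly larger numbers shifts the scale value by a constant factor without changing the ratios between stimuli.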
A primary objective of this series of studies was to build empirical models based on polynomial regression that would predict the relative effects of a large number of VE system parameters investigated in a series of sequential experiments. Box and Draper (1987) described central-composite designs and polynomial regression procedures for building empirical models that predict performance as a function of several independent variables. It has been argued that a second-order polynomial approximation is adequate to account for most human performance effects (Williges, 1981).
When the number of independent variables of interest is too large to include them in one experiment, a series of sequential experiments must be conducted. Williges and Williges (1989) recommended that a general paradigm for sequential experimentation include the three stages of selecting, describing, and optimizing independent variables. They suggested that central-composite designs and fractional-factorial designs provide efficient design alternatives for conducting these sequential experiments. The results from these separate experiments can then be combined into one data set in order to build an integrated empirical model.
Williges, Williges, and Hah (1993) recommended that sequential experimentation procedures for building integrated empirical models include the following primary constraints: (a) that all independent variables be defined prior to data collection, (b) that experimental procedures and dependent measures be constant across all experiments in the sequence, (c) that information be available at the outset concerning the levels of all factors manipulated and held constant, (d) that all main and pure quadratic effects of each independent variable be tested, and (e) that all studies have at least one data point or experimental condition in common to provide a direct estimate of the comparability of the studies. Satisfaction of these constraints allows for the possibility of bridging the data set across experiments for the purpose of constructing second-order empirical models of the effects of the variables manipulated.
System Factors in VEs
In a series of related experiments, 11 factors suspected of influencing perceived presence in a VE were manipulated. Criteria used in selecting these factors included their impact on VE system cost, complexity, and performance, as well as their relevance to existing theories and taxonomies of presence (e.g., Heeter, 1992; Loomis, 1992; Sheridan, 1996; Slater & Usoh, 1993a; Steuer, 1992; Zeltzer, 1992). The first study in this series manipulated field of view, scene update rate, and visual display resolution. The second study tested the effects of the presence or absence of sound, head tracking, stereopsis, texture mapping, and virtual personal risk. The third experiment examined the presence or absence of a second user together with the number of interactions possible in the VE and the detail of objects in the VE.
Each of the independent variables evaluated in these experiments is associated with some cost. The cost might be measured in terms of money (e.g., the cost of a head-tracking system), processing time (e.g., decreased frame rate attributable to texture mapping or increased object detail), programming time (e.g., to program and debug user interactions with the VE), or system weight or complexity (e.g., the increased complexity associated with producing two visual scenes for stereopsis). These costs involve trade-offs that a VE system engineer must consider during system design. To the extent that experimental design constraints and variable levels allowed, an attempt was made to group variables in these experiments so that key design trade-offs (e.g., field of view vs. display resolution) could be directly examined.
Experiment 1: Resolution, field of view, and scene update rate. There is a direct tradeoff between field of view and resolution in most head-mounted displays (HMDs): As one adjusts the optics of an HMD to achieve a larger field of view, the size of each pixel in the display is enlarged and the resolution of the image is decreased. Designers of VE systems must generally decide how much of one is desired at the expense of the other. Both of these variables have been proposed by a number of researchers as factors that influence the depth of presence in a VE, often under the general rubric of realism or sensory fidelity. Previous studies examining these variables indicate that there is a strong likelihood that they affect both performance and perceived presence in a VE.
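The trade-off can be made concrete with a simplified calculation. The sketch below assumes that pixels are spread uniformly across the horizontal field of view and ignores optical distortion; the function name and the 640-pixel panel width are illustrative assumptions, not a description of any particular HMD.

```python
def arcmin_per_pixel(fov_degrees, pixels_across):
    """Approximate angular size of one pixel, in arcminutes.

    Assumes pixels are spread evenly across the field of view.
    """
    if fov_degrees <= 0 or pixels_across <= 0:
        raise ValueError("field of view and pixel count must be positive")
    return fov_degrees * 60.0 / pixels_across


# Hypothetical display with 640 horizontal pixels: widening the field of
# view enlarges each pixel, which coarsens the image.
for fov in (24, 36, 48):
    print(fov, "deg ->", arcmin_per_pixel(fov, 640), "arcmin/pixel")
```

Under these assumptions, doubling the field of view at a fixed pixel count doubles the angular size of each pixel.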
Hatada, Sakata, and Kusaka (1980) studied the effect of field of view of two projected static scenes (a bridge and an open field) on "sensation of reality" as measured by a seven-point subjective scale. Their results indicate the beginning of an asymptote for sensation of reality at approximately 62°. McGreevy (1993, 1994) conducted ethnographic studies of field geologists at work in real terrain with and without mediation by an HMD. Participants reported strong negative feelings toward narrow fields of view in connection with their ability to navigate, maintain spatial orientation, and maintain a mental model of the site.
Psotka, Davison, and Lewis (1993) contended that immersion can be defined as the degree of compatibility between the location of self in the real world and the location of self in a virtual world, and they equated high compatibility with high immersion or presence. They had participants view a virtual room displayed with four geometric fields of view and two fields of view. Subsequently, the participants were asked to draw the path of the viewpoint on a two-dimensional (2D) overhead picture of the room. Their results indicate that participants appeared to use the frame of the monitor as the frame of reference for their entire 200° visual field of view.
Wells and Venturino (1990) found that search time increased with decreasing field of view; the negative impact of decreased field of view was greater in conditions with more distractors. In a similar study, Piantanida, Boman, Larimer, Gille, and Reed (1992) examined the effect of field of view on a visual search task. They found significant effects of field of view and presence of distractors, but only for distractors that were the same color as the target.
Cha, Horch, and Normann (1992) studied field of view and pixelized vision in a mobility task. Average speed of navigation increased from 0.2 to 0.6 m/s, whereas the average number of contacts with obstacles decreased from nine to three across 50 traversals, suggesting an ability to adapt to sensory limitations such as those normally associated with current VE technology. Given the many studies reporting effects of resolution, field of view, and scene update rate, it is important to predict the effects of these three factors on perceived presence.
Experiment 2: Stereopsis, head tracking, sound, texture mapping, and virtual personal risk. Arthur and Booth (1995) investigated stereopsis using a three-dimensional (3D) tree-tracing task. Performance was best in the head-coupled stereoscopic condition, followed by the head-coupled only, stereoscopic only, and no-depth-cue conditions. Chung (1992) reported a study in which he compared head-tracked and non-head-tracked steering modes in a task involving targeting of radiotherapy treatment beams in a VE. He found no significant performance or preference differences between head-tracked and non-head-tracked modes. Hendrix and Barfield (1996) found that perceived presence in a screen-projected VE increased by approximately 33% with inclusion of stereopsis, and roughly 40% with inclusion of translational head tracking.
Ehrlich and Singer (1994) found that neither stereopsis nor head tracking significantly improved performance of a variety of tasks in a VE. Similarly, Lampton et al. (1994) reported that contrary to their expectations, there was no apparent benefit of stereopsis in a distance estimation task even for stimuli at short distances. Based on the results of a study designed specifically to test for differential effects of stereopsis, Hsu, Pizlo, Babbs, Chelberg, and Delp (1994) suggested that stereopsis increases the sensitivity and specificity of observer performance when confounding influences are removed, such as varying viewing conditions, image intensity differences, ghosting, flicker, speed-accuracy trade-off, participants' stereoacuity, and the degree of task difficulty.
When Zenyuh, Reising, Walchli, and Biers (1988) combined a secondary task with a visual search task, they found that stereopsis increased accuracy performance on the visual search task but had no effect on response time. The interaction of stereopsis with other depth cues was examined by Reinhart, Beaton, and Snyder (1990). Stereopsis, relative size, luminance, and interposition were factorially manipulated in a simple relative-depth judgment task. They found that stereopsis did not affect the speed of participants' judgments significantly but that the other three variables did. In contrast, they found that stereopsis dominated participants' ratings of depth imaging quality such that participants rated the image quality of the display with respect to ability to determine depth much higher in stereoscopic conditions. One notable difference between the studies of Reinhart et al. (1990) and Zenyuh et al. (1988) is that of task difficulty. This further supports the hypothesis that beneficial effects of stereopsis might increase with the difficulty of a participant's task.
Much work has been done on the achievement of 3D externalized audio in VEs, and it has been shown to enhance localization of virtual stimuli (Wenzel, 1992). However, there have been no reports of examinations of the effects that auditory stimuli have on perceived presence or performance of nonauditory tasks. Anecdotal evidence that suddenly deafened adults experience a decreased sense of presence in the real world has been reported by Gilkey and Weisenberger (1995). This decreased sense of presence in the absence of sound suggests that inclusion of sound in a VE might increase one's sense of presence. The inclusion of sound as a variable goes to the heart of Steuer's (1992) concept of sensory breadth: Presence should increase as more senses are stimulated by a VE.
Except for the works of Slater and Usoh (1993a, 1993b), there is a similar lack of research on effects of texture mapping and virtual personal risk. Texture mapping refers to the common graphical technique of applying bit-mapped textures to object facets. Texture mapping has been found to improve performance in simulated helicopter flight, nap-of-the-earth flight, and sailing (Padmos & Milders, 1992). Virtual personal risk refers to the inclusion of situations in a VE that are designed to make a user feel some sense of risk or danger. The scientific literature examining the independent variables manipulated in Experiment 2 demonstrates performance effects. No systematic research, however, has been conducted on the influence of these factors on perceived presence in VEs.
Experiment 3: Object detail, presence of a second user, and number of possible interactions. Object detail directly affects the number of polygons processed by the image-generating system and the speed with which a VE application runs. This in turn directly affects the rate at which a user receives feedback concerning actions taken within the VE. Object detail, like several of the other variables manipulated in this research, falls into the general category of realism or sensory fidelity. Zeltzer (1992) claimed that these are the only factors that influence presence, whereas Heeter (1992) took an opposite view in which interactions and other users are primary determinants of presence. Atherton and Caporael (1985) reported finding a positive effect of increased object detail on subjective judgments of the form faithfulness and aesthetic appeal of spheres composed of varying numbers of polygons. Few published studies exist that examine the effects of the variables manipulated in Experiment 3 on perceived presence in a VE.
A large number of variables have been documented or postulated as having potential effects on perceived presence in a VE. However, systematic research is needed to provide an integrated model of these system variables. One of the primary goals of this research is to provide VE system designers with a useful design tool in the form of empirical models of perceived presence as measured by magnitude estimation and based on an integrated database generated through sequential experimentation.
Two Pentium-based personal computers were used to run the software that interfaced the VE system peripherals used in these experiments. The software used to generate the VE and control each experiment was Superscape VRT, produced by Superscape. Head tracking was done using Ascension Technologies' Flock of Birds, a DC-pulsed magnetic tracking system. Visual and auditory presentation of the VE was done via a VR4 HMD manufactured by Virtual Research Systems. Navigation through the environment and object manipulation were accomplished using Logitech Magellan, a 3D stationary control device. A standard Microsoft 2D mouse was used to interact with objects in the VE. The Magellan, Microsoft mouse, and participant's hands were supported by a rotating platform. Figure 1 shows the VE equipment used by participants in all experiments.
Each participant's viewpoint was attached to an invisible "body" that measured (virtually) 16.5 inches (41.9 cm) front to back and side to side. The figure of 16.5 inches is the same as that used by Lampton et al. (1994) and is based on the 50th percentile value for elbow-to-elbow breadth for adult males in the United States (Sanders & McCormick, 1987). Viewpoint and eye level were set at the participant's actual standing eye level as measured at the start of the experiment. This virtual body interacted with the environment by colliding with walls, door frames, and movable objects.
Participants were able to adjust their viewpoint with four degrees of freedom by controlling the virtual body through head tracking and manipulation of a 3D control device. The z axis rotation (i.e., roll left and right) and y axis translation (i.e., movement up and down) were held constant and not manipulated by the participant. Speed of rotation and translation (i.e., sensitivity of the software to input from the 3D control device and the position sensor) for all axes was set via software to correspond roughly to that normally associated with walking (i.e., approximately 7.6 m/s). During immersion, participants stood at a swiveling platform adjusted to a height comfortable to them (approximately standing elbow height) at the start of the experiment. Their left hand rested on the 3D controller while they held a standard 2D mouse in their right hand. Participants used the 2D mouse to control a standard arrow cursor and left clicked with this mouse to interact with objects beneath this cursor.
Each trial of the VE task consisted of five subtasks adapted from the work of Lampton et al. (1994); these are summarized in Table 1. These subtasks were performed in the order shown in Table 1 and constituted the experimental task (i.e., a trial) for each treatment condition in each experiment. These tasks were chosen because they represent the types of tasks commonly performed in VE applications and could be compared with the results of Lampton et al. The differences between the tasks used in the present research and those described by Lampton et al. were minor and derived either from suggestions for changes made by those authors based upon their results or from a desire to limit the duration of immersion in the VE (and the length of experimental sessions).
For the turns and choice tasks, the dependent variables consisted of time to perform the task and error rate (errors were counted as the number of contacts with walls in the turns task). For the distance estimation task, the dependent variables were the actual (virtual) distances at each of four estimated distances. For the search task and the bins task, the dependent variable was response time. A complete description of the five subtasks used in the VE for these three sequential experiments was provided by Snow (1996).
The VE used in all experiments had floors with a checkerboard pattern, walls 2.44 m (8 feet) high with narrow vertical stripes every 1.52 m (5 feet), ceilings with horizontal light panels every 3.05 m (10 feet), and corridors 0.91 m (3 feet) wide. All measurements specified are virtual. Figure 2 shows an example of the participant's view during the distance estimation task.
Each experiment employed 12 participants, none of whom was allowed to participate in more than one experiment. Participants ranged in age from 16 to 42 years (average, 22 years). Of the 36 total participants, 28 were men and 8 women. Participants were paid $5/h. Of the 42 applicants who had passed the vision screening, 6 did not complete the experimental session. Of these 6, 5 ceased participation because of motion-sickness-like symptoms, invariably within the first 20 min after donning the HMD. The other stopped participation after the practice trial, complaining that the HMD could not be adjusted to be physically comfortable.
VE independent variables. Three experiments were conducted under the constraints of a sequential experimentation paradigm with a common data point included in all experiments. We manipulated 11 independent variables across these three experiments. These variables were scene update rate, visual display resolution, field of view, sound, textures, head tracking, stereopsis, virtual personal risk, interactions, presence of a second user, and object detail.
Scene update rate was measured in hertz (Hz) and referred to the rate at which the VE itself was updated (not the raster of the display). Visual display resolution was measured in arcminutes per pixel and field of view was measured in degrees. If sound was available, participants were given auditory feedback when they bumped into objects and clicked on interactable objects. When texture mapping was present in the VE, objects were given context-appropriate textures (e.g., wood grain for the doors) to the extent that preservation of the minimum scene update rate allowed. Participants were able to control their viewpoint rotationally by moving and turning their heads when head tracking was present. Unique left-eye and right-eye views were presented to each eye when stereopsis was present, whereas the same scene was presented to both eyes when stereopsis was absent.
During trials in which virtual personal risk was included, the rear doors of an elevator in the VE were absent and were replaced by a yellow- and black-striped warning bar, allowing participants to view the room and hallway for five (virtual) stories below as they descended. It should be noted that the actual risk felt by participants (if any) was not measured and likely varied widely depending on individual differences both in depth of presence and susceptibility to acrophobia. Although the number of objects in the environment remained constant in all conditions, the extent of interaction with these objects varied. In the high-interaction condition, participants were able to open and close doors and drawers, click on light switches to turn lights on and off, open and close a briefcase, and so forth. In the low-interaction condition, these objects did not respond to participants' attempts to activate them. Participants were informed that they were free to interact with objects in the environment as much or as little as they wished as long as such interaction did not interfere with performance of timed tasks.
If a second user was present, the experimenter (as the second user) controlled and was represented by the human figure seen during the distance estimation task. The experimenter virtually accompanied the participant throughout the trial and interacted with the participant as if both were present in the VE. In general, this interaction consisted of conversation about events in the virtual environment, virtual views of each other's movement, occasionally bumping into one another, and opening doors for one another. Object detail referred to the number of facets composing objects in the environment. In the high-object-detail condition, the most complex iteration of each object was apparent, whereas in the low-object-detail condition, the least complex iterations of objects were seen.
Experiment 1. The first experiment combined scene update rate, visual display resolution, and field of view in a 3 x 3 x 2, within-subject, factorial design. There were three levels of scene update rate (8, 12, and 16 Hz), three levels of field of view (48° horizontally [H] x 36° vertically [V], 36° H x 27° V, and 24° H x 18° V), and two levels of visual display resolution (corresponding to pixel addressabilities of 320 H x 200 V and 640 H x 480 V). Three levels of scene update rate and field of view were employed so that the quadratic effects of both factors could be tested. The two levels of visual display resolution that were tested represented the full range of capability of the VE hardware/software system used in the experiments. The experimental design used in Experiment 1 is summarized in Table 2, in which X_1, X_2, and X_3 are the factors manipulated in this experiment and X_4 through X_11 are factors that were held constant at the levels specified in this table.
[TABULAR DATA FOR TABLE 3 OMITTED]
Experiment 2. The second experiment combined sound, texture mapping, head tracking, stereopsis, and virtual personal risk in a within-subject, Resolution V, 2^(5-1) fractional-factorial design. Each of the variables in this study had two levels: present or absent. In this design, higher-order interaction effects (i.e., three way and four way) could not be assessed because they were aliased with main effects and two-way interactions. The sacrifice of these higher-order interactions was intentional and acceptable given the goal of construction of second-order empirical models. The experimental design used in Experiment 2 is summarized in Table 3, in which X_4, X_5, X_6, X_7, and X_8 are the five factors manipulated in this experiment and X_1, X_2, X_3, X_9, X_10, and X_11 are factors that were held constant at the levels specified in this table.
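A half fraction of this kind can be generated as in the following sketch. The source does not state which generator was used; the generator E = ABCD (defining relation I = ABCDE) assumed here is the usual choice that yields a Resolution V design for five two-level factors. The factor labels and the coding of -1 for absent and +1 for present are likewise illustrative assumptions.

```python
from itertools import product

# Illustrative labels for the five two-level factors (-1 = absent, +1 = present).
FACTORS = ("sound", "texture", "head_tracking", "stereopsis", "risk")


def half_fraction():
    """Build the 16 runs of a 2^(5-1) design with assumed generator E = ABCD."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        e = a * b * c * d  # assumption: the fifth factor is set by the generator
        runs.append((a, b, c, d, e))
    return runs


design = half_fraction()
for run in design:
    print(dict(zip(FACTORS, run)))
```

With this defining relation, each main effect is aliased with a four-way interaction and each two-way interaction with a three-way interaction, which matches the aliasing pattern described in the text.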
Experiment 3. Another 3 x 3 x 2, within-subject, factorial design was used in Experiment 3 to investigate the number of different interactions possible in the VE (6, 12, and 18), object detail (high, medium, and low), and presence of a second user in the VE (present or absent). The experimental design used in Experiment 3 is summarized in Table 4, in which X_9, X_10, and X_11 are the factors manipulated in this experiment and X_1 through X_8 are factors that were held constant at the levels specified in this table.
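For illustration, the 18 treatment conditions of such a 3 x 3 x 2 design, together with one random presentation order for a single participant, can be enumerated as follows. The variable names and the seed are assumptions made for this sketch; the source does not describe how the randomization was implemented.

```python
import random
from itertools import product

# Factor levels as described for Experiment 3.
interactions = (6, 12, 18)
object_detail = ("low", "medium", "high")
second_user = ("absent", "present")

# Full crossing of the three factors: 3 x 3 x 2 = 18 conditions.
conditions = list(product(interactions, object_detail, second_user))


def presentation_order(all_conditions, seed):
    """Return one random ordering in which each condition appears once."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    order = list(all_conditions)
    rng.shuffle(order)
    return order


order = presentation_order(conditions, seed=1)
print(len(order), "trials; first condition:", order[0])
```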
Each experimental session began with participants reading an introduction to the study. They were then screened for normal visual acuity, color perception, and stereoacuity, and their standing eye height was measured so that their virtual eye height could be set accordingly. They then read and signed an informed consent form and read experimental instructions. Participants became familiar with the concept and process of magnitude estimation by estimating the lengths of 20 lines presented in random order on 20 sheets of paper. Next, participants were shown how to don the HMD and were given a demonstration trial in which the experimenter donned the HMD and went through the entire VE, demonstrating each task in sequence and verbally pointing out objects with which interaction was possible. During this demonstration, participants were able to view the VE on a nearby monitor. Each participant then donned the HMD and went through one practice trial that included all five subtasks.
Participants were told that they could ask any questions they wished during the demonstration and practice trials but that the experimenter would attempt to remain silent during the data trials to follow. The experimental condition of the demonstration and practice trials was the same as that common to all three experiments. Following this practice trial, participants were given a short (3-5 min) break during which they could take off the HMD, sit down, get a drink of water, and so forth. Participants were given similar breaks after every fourth data trial in all three experiments.
Trials averaged roughly 7 min in length. Participants performed a number of trials equivalent to the number of conditions in the experiment: one trial from each condition. The order of presentation of these conditions was determined randomly. Performance feedback was given at the end of each trial for each of the five subtasks. After viewing this feedback, participants provided a free-modulus magnitude estimate of their level of perceived presence during the trial. Specifically, they were asked to assign a number to their feeling of how much they felt as if they were actually present in the virtual environment during performance of the tasks in that trial.
The length of each experimental session varied from participant to participant and between experiments. The time spent in the HMD averaged approximately 2.5 h. Participants were closely monitored throughout each session for adverse reactions to the VE system and were verbally reminded by the experimenter of their freedom to withdraw if they demonstrated or reported any adverse side effects of immersion.
Data were analyzed separately for each of the three sequential experiments and were also combined into an integrated data set. A complete discussion of the performance-measure results is provided by Snow (1996) but is beyond the scope of the current paper. In general, the results indicate a positive relationship between perceived presence and performance on each of the five subtasks. However, significant correlations between measures of performance and perceived presence were generally low (|r| < .3 in all cases).
Modulus Equalization of Perceived Presence
The distribution of estimates that people give in magnitude estimation experiments is usually positively skewed. These distributions are typically lognormal, with error increasing as stimulus magnitude increases (Stevens, 1971). Therefore, the geometric mean rather than the arithmetic mean is taken because this produces an unbiased estimate of the expected value of a lognormal variate (Cross, 1974). This is fairly standard procedure in the analysis of magnitude estimation data, but it raises the question of what to do when one wishes to do something other than plot a regression line through measures of central tendency at different stimulus values.
Stevens (1971) used a data reduction process for free-modulus magnitude estimation data designed to (a) adjust the data to a common modulus, thereby reducing variability in the data arising purely from participants' selection of different moduli and number ranges, and (b) provide normally distributed data suitable for analysis using parametric statistics. This procedure, called modulus equalization by Stevens (1971), was refined by Snow (1996) and used on all magnitude estimation data collected in this experimental series. Modulus equalization of the perceived presence scores from each experiment to the common modulus of that experiment was done by the following method:
1. Convert all responses to their logarithmic values. This converts the distribution of responses from lognormal to normal.
2. Calculate the mean of each individual's log-transformed responses. This yields the individual's modulus (in log units).
3. Calculate the mean of all log-transformed responses to yield the common modulus.
4. Subtract the result of Step 3 from the result of Step 2 for each individual. This yields a set of constants representing the offsets of each individual modulus from the common modulus.
5. Subtract the constant obtained in Step 4 for each individual from that individual's scores obtained in Step 1. This yields a data set that is normally distributed and adjusted to a single, common modulus.
6. Take the antilog of all values obtained in Step 5. This yields a ratio-scale data set (i.e., one with an origin of zero in which the ratios of scale values are meaningful) in the same units as individuals' original responses (albeit still in units of subjective magnitude).
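The six steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure's original implementation: the function name, the dictionary layout, and the example responses are hypothetical, and base-10 logarithms are assumed (any base gives the same equalized values).

```python
import math


def equalize_modulus(responses):
    """Adjust free-modulus estimates to a common modulus.

    responses maps a participant label to a list of positive estimates.
    """
    # Step 1: log-transform every response.
    logs = {p: [math.log10(v) for v in vals] for p, vals in responses.items()}
    # Step 2: each individual's mean log response (the individual modulus).
    individual = {p: sum(v) / len(v) for p, v in logs.items()}
    # Step 3: mean of all log responses (the common modulus).
    pooled = [x for v in logs.values() for x in v]
    common = sum(pooled) / len(pooled)
    # Step 4: offset of each individual modulus from the common modulus.
    offsets = {p: individual[p] - common for p in logs}
    # Step 5: remove each individual's offset from that individual's log scores.
    adjusted = {p: [x - offsets[p] for x in v] for p, v in logs.items()}
    # Step 6: antilog back to the original response units.
    return {p: [10 ** x for x in v] for p, v in adjusted.items()}


# Hypothetical participants who judge the same ratios with different moduli.
raw = {"A": [10.0, 20.0, 40.0], "B": [1.0, 2.0, 4.0]}
equalized = equalize_modulus(raw)
print(equalized)
```

In this example the two participants end up with identical equalized scores, while the 1:2:4 ratios within each participant's judgments are preserved.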
Prior to building empirical models of perceived presence, we calculated the exponent of the power function of the free-modulus magnitude estimates of line length during practice. The plot of the power function relating log actual line length (y) to log estimated line length (x) resulted in a slope of 1.00, which is expected if participants understand the procedures for using free-modulus magnitude estimation (Gescheider, 1985). The resulting regression equation is y = 1.002x + 0.63, with R^2 = .998.
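A check of this kind amounts to an ordinary least-squares fit in log-log coordinates, sketched below. The line lengths and estimates are hypothetical and noise-free, chosen so that the expected slope of 1 is reproduced exactly; the function name is an assumption.

```python
import math


def loglog_fit(estimated, actual):
    """Least-squares slope and intercept of log10(actual) on log10(estimated)."""
    xs = [math.log10(e) for e in estimated]
    ys = [math.log10(a) for a in actual]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observer whose estimates are proportional to actual length.
actual_cm = [2.0, 4.0, 8.0, 16.0, 32.0]
estimates = [5.0, 10.0, 20.0, 40.0, 80.0]
slope, intercept = loglog_fit(estimates, actual_cm)
print(round(slope, 3), round(intercept, 3))
```

The slope is the exponent of the power function, and the intercept reflects only the observer's choice of modulus.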
Empirical Models of Perceived Presence
All data and regressors were standardized before performing each polynomial regression. Consequently, the resulting parameter estimates (the [Mathematical Expression Omitted]) are all in the same units (i.e., z scores), thereby allowing direct comparison of the influence of the various regressors on the predicted response in each regression equation. In addition, standardized regressions reduce multicollinearity among the regressors resulting from inclusion of second-order terms (Montgomery & Peck, 1992). Regressions were calculated for all possible combinations of all regressors for each predicted response. The empirical models reported represent the regression equation with the combination of smallest residual mean squares (equivalent to selecting the model with the largest adjusted $R^2$) and smallest prediction error sum of squares (prediction residual sum of squares [PRESS] statistic).
Experiment 1. An analysis of variance (ANOVA) was conducted on the effects of scene update rate, visual display resolution, and field of view on the equalized estimates of perceived presence. The main effects of scene update rate, visual display resolution, and field of view were significant (p < .05), but none of their interactions was significant (p > .05). Mean presence in the low visual display resolution condition was 6.40 and mean presence in the high visual display resolution condition was 7.85. The mean presence of the 8-, 12-, and 16-Hz scene update rate conditions was 6.74, 6.97, and 7.66, respectively. For field of view, the mean presence of the low, medium, and high conditions was 4.02, 7.39, and 9.97, respectively. The standard errors of these means were between 0.20 and 0.50.
[TABULAR DATA FOR TABLE 5 OMITTED]
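The effect of standardization on multicollinearity can be seen in a small numerical sketch. The example below uses the three scene update rates from Experiment 1 in a hypothetical balanced design. The helper functions are illustrative only and are not the software used in this research. The correlation between the linear and quadratic terms vanishes exactly here only because the levels are symmetric and balanced; with unbalanced data it would be reduced rather than eliminated.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def standardize(values):
    """Convert values to z scores (sample standard deviation)."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]

# Hypothetical balanced design: each update rate (Hz) appears four times.
levels = [8.0, 12.0, 16.0] * 4

# Raw regressor versus its square: nearly collinear.
raw_r = pearson(levels, [v ** 2 for v in levels])

# Standardized regressor versus its square: uncorrelated in this design.
z = standardize(levels)
std_r = pearson(z, [v ** 2 for v in z])
```

With raw values the linear and quadratic terms carry almost the same information, which inflates the variance of their parameter estimates. After standardization the two terms can be estimated separately.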
The best resulting second-order, standardized empirical model of the effects of scene update rate, visual display resolution, field of view, and minutes spent in the VE is shown in Equation 1:
$$\mathrm{PP} = 0.10X_{1} + 0.17X_{2} + 0.59X_{3} + 0.21X_{12} + 0.10X_{12}^{2}, \tag{1}$$
where PP = equalized magnitude estimate of perceived presence in the VE, $X_{1}$ = scene update rate, $X_{2}$ = visual display resolution, $X_{3}$ = field of view, and $X_{12}$ = minutes spent in the VE. The standard errors and t tests of significance of each parameter estimate in the empirical model are summarized in Table 5.
Experiment 2. The ANOVA for the effects of sound, texture mapping, head tracking, stereopsis, and virtual personal risk on presence resulted in statistically significant main effects of sound, texture mapping, head tracking, and stereopsis (p < .05), as summarized in Table 6. Neither the main effect of virtual personal risk nor any of the two-way interactions tested in this experiment was statistically significant (p > .05).
TABLE 6: Effects of Sound, Texture Mapping, Head Tracking, and Stereopsis in Experiment 2

| Effect | Mean (Off) | Standard Error (Off) | Mean (On) | Standard Error (On) |
|---|---|---|---|---|
| Sound | 7.47 | 0.42 | 13.73 | 0.74 |
| Texture mapping | 9.79 | 0.61 | 11.42 | 0.73 |
| Head tracking | 8.12 | 0.54 | 13.08 | 0.71 |
| Stereopsis | 9.65 | 0.65 | 11.56 | 0.70 |
The best resulting second-order, standardized empirical model of the effect of sound, texture mapping, head tracking, stereopsis, and virtual personal risk on perceived presence is shown in Equation 2:
$$\mathrm{PP} = 0.47X_{4} + 0.12X_{5} + 0.37X_{6} + 0.14X_{7} + 0.09X_{4}X_{6}, \tag{2}$$
where PP = equalized magnitude estimate of perceived presence in the VE, $X_{4}$ = sound, $X_{5}$ = texture mapping, $X_{6}$ = head tracking, and $X_{7}$ = stereopsis. The standard errors and t tests of significance of each parameter estimate in the empirical model are summarized in Table 7.
Experiment 3. The ANOVA on the effects of interactions, presence of a second user, and object detail on estimates of perceived presence showed that the presence of a second user and the interaction between object detail and number of interactions were both statistically significant (p < .05). When the second user was absent, mean presence was 13.29 with a standard error of 0.39. With the second user present, mean presence was 14.58 with a standard error of 0.34.
[TABULAR DATA FOR TABLE 7 OMITTED]
The best resulting second-order, standardized empirical model of interaction in the VE, presence of a second user, object detail, and minutes spent in the VE is shown in Equation 3:
$$\mathrm{PP} = 0.13X_{9} + 0.20X_{10} + 0.51X_{12} + 0.11X_{11}^{2} - 0.09X_{12}^{2}, \tag{3}$$
where PP = equalized magnitude estimate of perceived presence in the VE, $X_{9}$ = interactions in the VE, $X_{10}$ = presence of second user, $X_{11}$ = object detail, and $X_{12}$ = minutes spent in the VE. The standard errors and t tests of significance of each parameter estimate in the empirical model are summarized in Table 8.
Integrated empirical model. An ANOVA of perceived presence for the common data point in the three experiments resulted in a significant difference, F(2, 33) = 4.46, p < .019, between experiments. Scheffé tests demonstrated that the mean perceived presence in Experiment 3 was significantly lower than in Experiments 1 and 2 (p < .05). Further, the standard deviation of presence estimates in Experiment 3 (3.86) was much smaller than that in Experiments 1 and 2 (5.89 and 6.80, respectively). Therefore, it was concluded that the data collected in the three experiments do not represent a uniform data set sampled from the same population. Given this conclusion and a notable dearth of significant two-way interactions found in the first two experiments, it was decided that further time, effort, and money spent in data collection would not be productive.
[TABULAR DATA FOR TABLE 8 OMITTED]
[TABULAR DATA FOR TABLE 9 OMITTED]
The data from the first two experiments were combined and used to perform a polynomial regression predicting perceived presence from the eight variables manipulated in these two experiments plus minutes spent in the VE. The best resulting second-order, standardized empirical model generated from the integrated data set is shown in Equation 4:
$$\mathrm{PP} = 0.13X_{2} + 0.52X_{3} + 0.47X_{4} + 0.15X_{5} + 0.40X_{6} + 0.12X_{7} + 0.15X_{12} - 0.08X_{12}^{2} + 0.06X_{1}^{2} + 0.06X_{4}X_{5} + 0.09X_{4}X_{6}, \tag{4}$$
where PP = equalized magnitude estimate of perceived presence in the VE, $X_{1}$ = scene update rate, $X_{2}$ = visual display resolution, $X_{3}$ = field of view, $X_{4}$ = sound, $X_{5}$ = texture mapping, $X_{6}$ = head tracking, $X_{7}$ = stereopsis, and $X_{12}$ = minutes in the VE. The standard errors and t tests of significance of each parameter estimate in the empirical model are summarized in Table 9.
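For a designer comparing candidate configurations, Equation 4 can be evaluated directly. The sketch below transcribes the published coefficients; the function name, the dictionary keys, and the example configurations are hypothetical. All inputs are assumed to be z scores, and the prediction is likewise in standardized units, because the data and regressors were standardized before regression.

```python
def predict_presence_eq4(z):
    """Evaluate Equation 4 for standardized regressor values.

    z: dict of z scores keyed "X1", "X2", "X3", "X4", "X5", "X6", "X7",
    and "X12" (hypothetical key names matching the symbols in Equation 4).
    """
    return (
        0.13 * z["X2"]              # visual display resolution
        + 0.52 * z["X3"]            # field of view
        + 0.47 * z["X4"]            # sound
        + 0.15 * z["X5"]            # texture mapping
        + 0.40 * z["X6"]            # head tracking
        + 0.12 * z["X7"]            # stereopsis
        + 0.15 * z["X12"]           # minutes in the VE (linear)
        - 0.08 * z["X12"] ** 2      # minutes in the VE (quadratic)
        + 0.06 * z["X1"] ** 2       # scene update rate (quadratic)
        + 0.06 * z["X4"] * z["X5"]  # sound by texture mapping
        + 0.09 * z["X4"] * z["X6"]  # sound by head tracking
    )

keys = ["X1", "X2", "X3", "X4", "X5", "X6", "X7", "X12"]
baseline = {k: 0.0 for k in keys}    # every regressor at its mean
all_plus_one = {k: 1.0 for k in keys}  # every regressor one SD above its mean
wide_fov = dict(baseline, X3=1.0)    # only field of view raised by one SD

pred_baseline = predict_presence_eq4(baseline)
pred_all = predict_presence_eq4(all_plus_one)
pred_fov = predict_presence_eq4(wide_fov)
```

Changing one regressor while holding the others at their means isolates that regressor's first-order contribution, which is how the relative sizes of the standardized coefficients can be read.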
The significant main effects of the factors shown in Table 9 are of varying importance, as indicated by the relative magnitudes of the standardized parameter estimates. The percentage increases in perceived presence associated with the difference in presence estimates between the lowest and highest level of the variables manipulated in each experiment were field of view (148%), sound (83%), head tracking (63%), visual display resolution (23%), stereopsis (20%), texture mapping (17%), and scene update rate (14%).
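The percentage increases can be illustrated with the condition means reported for Experiment 1. The sketch assumes the simple formula (high mean minus low mean, divided by low mean), which the text implies but does not state. Because the published means are rounded, and the source does not specify which data set each percentage was computed from, a recomputation from the means printed above may differ from the reported figure by a point or two for some variables.

```python
def percent_increase(low_mean, high_mean):
    """Percentage increase from the lowest-level mean to the highest-level mean."""
    return 100.0 * (high_mean - low_mean) / low_mean

# Means reported for Experiment 1 (lowest and highest levels).
field_of_view = percent_increase(4.02, 9.97)
display_resolution = percent_increase(6.40, 7.85)
scene_update_rate = percent_increase(6.74, 7.66)
```

Rounded to whole percentages, these three values agree with the reported 148%, 23%, and 14%.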
Although the exact weightings of the same parameters differed across the various empirical models because of the effects of multicollinearity among regressors, the relative magnitude of their weightings remained consistent. The 11 VE factors manipulated in this series of experiments had primarily first-order main effects in the resulting four empirical models. Only a limited number of linear-by-linear interactions among sound, texture mapping, and head tracking were significant. Consequently, second-order polynomial regressions provided an adequate fit to the data. In general, an increase in the level of any of the factors manipulated in the VE resulted in an increase in perceived presence in the VE.
Even though these empirical models are not causative models, they may provide a useful tool for designers and purchasers of VE systems interested in trading off system parameters to achieve a desired level of perceived presence in a VE. The value of achieving presence in VE applications other than entertainment has yet to be empirically established. However, use of these models and the technique of free-modulus magnitude estimation may provide an opportunity to explore the relationship, if any, between perceived presence in a VE and user performance in the various task domains to which VE technology is now being applied (e.g., education and telemedicine). As VEs are used increasingly for training and design simulation, the use of these empirical models in making design trade-off decisions may become particularly important.
The regression analyses of the empirical models from Experiment 1, Experiment 2, and the integrated data set consistently accounted for over 40% of the variance in the magnitude estimates of perceived presence data. Comparison of the standardized parameter estimates associated with each term in the model shows that field of view, sound, and head tracking had the largest effects on perceived presence. The parameter estimates associated with these variables show that they each had roughly three times as much influence on estimates of perceived presence in comparison with smaller effects found for visual display resolution, texture mapping, stereopsis, and time spent in the VE. Finally, small but statistically significant effects were found for the quadratic effects of scene update rate and time spent in the VE and the interaction between sound and head tracking.
Although these empirical models cannot be interpreted as theoretical models of perceived presence, the results of the sequential experiments suggest parameters that need to be emphasized in building a theoretical model. Certainly, increases in field of view as well as the inclusion of sound and head tracking appear to facilitate a sense of immersion in VEs and need to be included in any model of perceived presence.
Of the theories and schema proposed to account for presence in VEs, perhaps the environmental factors proposed by Steuer (1992) are most appropriate. His ideas and predictions concerning vividness and interactivity generally match the findings of this research. The large effects of field of view and sound nicely fit the concepts of sensory depth and sensory breadth encompassed within vividness, as do the effects of visual display resolution, texture mapping, and stereopsis. The effects of head tracking, scene update rate, and presence of a second user seem to match his conception of interactivity. The one finding of this research that seems inconsistent with his theory is that perceived presence was not affected by the number of interactions possible in the VE in the third experiment. This could be explained by the possibility that some participants focused strictly on task performance and did not explore possible interactions, or the unnaturalness of interacting with the VE through the use of a 2D mouse. It may be that the number of possible interactions in a VE has little or no effect on perceived presence. One common theme among many theories is that time spent in the VE or adaptation to the VE causes an increase in the experience of presence. This also seems consistent with the results of this research.
This research supported the value of using a common data point in sequential experimentation designs. Without this common data point, there would have been no means of testing the comparability of the separate data sets before combining data across experiments. Given the outcome of this research, it seems especially important to design a common data point in experimental sequences in which a subjective measure is the primary dependent variable.
Reasons for the difference between perceived presence in the third experiment and perceived presence in the other two experiments remain unclear. The experimental procedure, apparatus, and the experimenter were all identical. Post-hoc explorations of the sample populations with respect to age and gender revealed no discernible differences. Perhaps the most likely explanation for this difference is the variation in range of perceived presence in the experiments. In the third experiment, only the effect of presence of a second user showed a statistically significant effect on perceived presence, and this effect was relatively small. It may be that participants simply did not notice much difference from trial to trial in this experiment. This is contrasted with the other two experiments, in which, given the effects found, it may be surmised that participants typically experienced significant increases or decreases in presence from trial to trial.
Future research taking advantage of sequential experimentation strategies should include a common data point in each experiment in the series. This is especially true if the dependent variable under study is subjective in nature. Sequential experimentation seems an efficient tool to use when a large number of independent variables are to be examined simultaneously, especially when compared with the usual alternative: a single, prohibitively large factorial experiment. However, care must be taken in its use, and testing of its underlying assumptions is imperative prior to empirical modeling. Second-order polynomial regressions provided a convenient way of constructing empirical models that designers of VE systems can use to increase the perceived presence of users of these systems.
Free-modulus magnitude estimation is a straightforward procedure that allows accurate measurement of users' sense of presence in the VE. This metric can be predicted by several VE system variables and can be used as a standard metric for evaluating and predicting perceived presence in a variety of VE task applications.
Michael Snow completed this research as part of his dissertation research under the Palace Knight Program of the U.S. Air Force. John Reising served as his U.S. Air Force mentor in this program. The VE equipment used in this research was provided through a research initiative grant from the National Science Foundation.
Arthur, K. W., & Booth, K. S. (1993). Evaluating 3D task performance for fish tank virtual worlds. ACM Transactions on Information Systems, 11(3), 239-265.
Atherton, P. R., & Caporael, L. R. (1985). A subjective judgment study of polygon based curved surface imagery. In CHI '85 Proceedings (pp. 27-34). New York: Association for Computing Machinery.
Barfield, W., & Weghorst, S. (1993). The sense of presence within virtual environments: A conceptual framework. In G. Salvendy & M. Smith (Eds.), Human-computer interaction: Software and hardware interfaces (pp. 699-704). Amsterdam: Elsevier.
Box, G. E. P., & Draper, N. R. (1987). Empirical model building and response surfaces. New York: Wiley.
Cha, K., Horch, K. W., & Normann, R. A. (1992). Mobility performance with a pixelized vision system. Vision Research, 32, 1367-1372.
Chung, J. C. (1992). A comparison of head-tracked and non-headtracked steering modes in the targeting of radiotherapy treatment beams. In Proceedings of the 1992 ACM Symposium on Interactive 3D Graphics (pp. 193-196). New York: Association for Computing Machinery.
Cross, D. V. (1974). Some technical notes on psychophysical scaling. In H. R. Moskowitz, B. Scharf, & J. C. Stevens (Eds.), Sensation and measurement: Papers in honor of S. S. Stevens (pp. 23-35). Dordrecht, Holland: D. Reidel.
Ehrlich, J. A., & Singer, M. J. (1994). Are stereoscopic displays necessary for virtual environments? In Proceedings of the Human Factors and Ergonomics Society 38th Annual Meeting (p. 952). Santa Monica, CA: Human Factors and Ergonomics Society.
Ellis, S. R. (1993). What are virtual environments? In ICAT '93 (pp. 81-89). Tokyo: International Conference on Artificial Reality and Tele-existence.
Gescheider, G. A. (1985). Psychophysics: Method, theory, and application. Mahwah, NJ: Erlbaum.
Gilkey, R. H., & Weisenberger, J. M. (1995). The sense of presence in the suddenly deafened adult: Implications for virtual environments. Presence, 4, 357-363.
Harada, T., Sakata, H., & Kusaka, H. (1980). Psychophysical analysis of the "sensation of reality" induced by a visual widefield display. SMPTE Journal, 89, 560-569.
Heeter, C. (1992). Being there: The subjective experience of presence. Presence, 1, 262-271.
Held, R. M., & Durlach, N. I. (1992). Telepresence. Presence, 1, 109-112.
Hendrix, C., & Barfield, W. (1996). Presence within virtual environments as a function of visual display parameters. Presence, 5, 274-290.
Hsu, J., Pizlo, Z., Babbs, C. F., Chelberg, D. M., & Delp, E. J. (1994). Design of studies to test the effectiveness of stereo imaging. Truth or dare: Is stereo viewing really better? In Proceedings of the SPIE - The International Society for Optical Engineering, 2177 (pp. 211-222). Bellingham, WA: SPIE.
Lampton, D. R., Knerr, B. W., Goldberg, S. L., Bliss, J. P., Moshell, J. M., & Blau, B. S. (1994). The virtual environment performance assessment battery (VEPAB): Development and evaluation. Presence, 3, 145-157.
Loomis, J. M. (1992). Distal attribution and presence. Presence, 1, 113-119.
McGreevy, M. W. (1993). The presence of field geologists in a Mars-like terrain. Presence, 1, 375-403.
McGreevy, M. W. (1994). Ethnographic object-oriented analysis of explorer presence in a volcanic terrain environment: Claims and evidence (Tech. Report NAS 1.15:108823). Moffett Field, CA: National Aeronautics and Space Administration.
Montgomery, D. C., & Peck, E. A. (1992). Introduction to linear regression analysis. New York: Wiley.
Padmos, P., & Milders, M. V. (1992). Quality criteria for simulator images: A literature review. Human Factors, 3, 727-748.
Piantanida, T. P., Boman, D., Larimer, J., Gille, J., & Reed, C. (1992). Studies of the field-of-view/resolution trade-off in virtual-reality systems. In Human vision, visual processing and digital display III (pp. 448-456). Bellingham, WA: SPIE.
Psotka, J., Davison, S. A., & Lewis, S. A. (1993). Exploring immersion in virtual space. Virtual Reality Systems, 1(2), 70-84.
Reinhart, W. F., Beaton, R. J., & Snyder, H. L. (1990). Comparison of depth cues for relative depth judgments. In Proceedings of the SPIE - The International Society for Optical Engineering, 1256 (pp. 12-21). Bellingham, WA: SPIE.
Sanders, M. S., & McCormick, E. J. (1987). Human factors in engineering and design (6th ed.). New York: McGraw-Hill.
Sheridan, T. B. (1992). Musings on telepresence and virtual presence. Presence, 1, 120-126.
Sheridan, T. B. (1996). Further musings on the psychophysics of presence. Presence, 5, 241-246.
Singer, M. J., Witmer, B. G., & Bailey, J. H. (1994). Development of "presence" measures for virtual environments. In Proceedings of the Human Factors and Ergonomics Society 38th Annual Meeting (p. 983). Santa Monica, CA: Human Factors and Ergonomics Society.
Slater, M., & Usoh, M. (1993a). The influence of a virtual body on presence in immersive virtual environments. In Virtual Reality International 93: Proceedings of the Third Annual Conference on Virtual Reality (pp. 34-42). London, UK: Meckler.
Slater, M., & Usoh, M. (1993b). Representation systems, perceptual position and presence in virtual environments. Presence, 2, 221-233.
Snow, M. P. (1996). Charting presence in virtual environments and its effects on performance. Unpublished doctoral dissertation, Virginia Polytechnic Institute and State University. [Online]. Available: http://scholar.lib.v .edu/theses/public/etd-19311417119625510/etd-title.html.
Steuer, J. (1992). Defining virtual reality: Dimensions determining telepresence. Journal of Communication, 42(4), 73-93.
Stevens, S. S. (1953). On the brightness of lights and loudness of sounds. Science, 118, 576.
Stevens, S. S. (1971). Issues in psychophysical measurement. Psychological Review, 78, 426-450.
Wells, M. L., & Venturino, M. (1990). Performance and head movements using a helmet-mounted display with different sized fields of view. Optical Engineering, 29, 870-877.
Wenzel, E. M. (1992). Localization in virtual acoustic displays. Presence, 1, 80-107.
Williges, R. C. (1981). Development and use of research methodologies for complex system/simulation experimentation. In M. Morral & K. Kraiss (Eds.), Manned system design (pp. 59-87). New York: Plenum.
Williges, R. C., & Williges, B. H. (1989). Integrated research paradigm for complex experimentation. In Proceedings of the Human Factors Society 33rd Annual Meeting (pp. 606-610). Santa Monica, CA: Human Factors and Ergonomics Society.
Williges, R. C., Williges, B. H., & Han, S. H. (1993). Sequential experimentation in human-computer interface design. In H. R. Hartson & D. Hix (Eds.), Advances in human-computer interaction (pp. 1-30). Norwood, NJ: Ablex.
Zeltzer, D. (1992). Autonomy, interaction, and presence. Presence, 1, 127-152.
Zenyuh, J., Reising, J. M., Walchli, S., & Biers, D. (1988). A comparison of a stereoscopic 3-D display versus a 2-D display using advanced air-to-air format. In Proceedings of the Human Factors Society 32nd Annual Meeting (pp. 53-57). Santa Monica, CA: Human Factors and Ergonomics Society.
Michael P. Snow earned a doctorate in industrial and systems engineering (human factors engineering option) from Virginia Polytechnic Institute and State University in 1996. He is currently a crew systems research engineer for the Crew System Interface division of the U.S. Air Force Research Laboratory.
Robert C. Williges is the Ralph H. Bogle Professor of Industrial and Systems Engineering at Virginia Polytechnic Institute and State University. He is also a professor of psychology and computer science and director of the Human-Computer Interaction Laboratory and of the Usability Methods Research Laboratory. He received a Ph.D. in 1968 in engineering psychology from Ohio State University.
Title Annotation: Special Section: Virtual Environments: Models, Methodology, and Empirical Studies
Author: Snow, Michael P.; Williges, Robert C.
Date: Sep 1, 1998