Attention distribution and decision making in tactical air combat.
More than a decade ago, Flathers, Giffen, and Rockwell (1982) discussed the paucity of data available on pilot decision making. "Though pilot judgment has long been recognized as an important factor in the safety of flight, research in the area has largely been limited to observations made from outside the cockpit" (p. 959). Much of the existing research, furthermore, has involved retrospective analyses of the decision making of pilots involved in accidents or incidents and most frequently has focused on civil or commercial pilots.
The shortage is even greater for research on decision making related to military aviation. Despite a concerted thrust to provide military pilots with decision-making aids through programs like the U.S. Air Force's Pilot's Associate (for a review, see Hammer and Small, 1995), information on how pilots of tactical aircraft actually process their environment and make decisions has remained largely anecdotal and reliant on recall.
The realm of the tactical fighter embodies many characteristics that make effective decision making challenging for even the best pilots. Houck, Whitaker, and Kendall (1993) provided an excellent overview of the tasks and difficulties involved in tactical aircraft operations. "The inherent nature of the BVR combat environment places enormous demands on the pilot's cognitive resources because of task saturation, time compression and incomplete or unconfirmed information" (p. 9). The incredible speed and maneuvering capabilities of high-performance aircraft lead to the need for very rapid decision times. In addition, hostile aircraft may have equally rapid flight paths and capabilities, compounding the problem. The consequences of error under these conditions are very high and frequently lethal.
The inherent complexity of the environment can also impose a major burden. In addition to the flight tasks to which most pilots must attend, the fighter pilot must work to employ the aircraft tactically against an opponent or opponents in an effort to accomplish prescribed mission goals. Frequently the pilot lacks direct, complete, or even accurate information about these opponents and often must operate in a beyond visual range (BVR) mode. Using on-board sensors and avionics systems, the pilot must actively work to obtain remote information, integrate that information, and form appropriate tactics and actions. This has been likened to "looking at the world through a soda straw," even though current graphically based color tactical situation displays (TSD) provide a far more direct spatial analog of the tactical environment than do traditional B-scope radar displays.
Even with the handicap of constrained information about the environment, there is a tremendous problem with information overload. Piecemeal addition of systems and lack of integration of information are often cited as major contributing factors. In a combat environment there may also be a high density of aircraft operating in a limited arena, leading to serious concerns about overload.
An additional challenge lies in a lack of predictability in the environment (even with detailed intelligence data). Carefully formulated premission plans must often be scrapped or revised in flight because of rapid changes in the situation. Generally the problems that pilots encounter are largely unstructured, and a wide range of actions is possible.
Like many experts, pilots often find it difficult to verbalize rules that can fully explain the breadth and depth of decision knowledge they have acquired experientially over time. In this environment pilots may rely more on the processing of dynamic spatial relationships and pattern matching than on fixed rules. Furthermore, understanding tactical aviation decision making is complicated because there is little agreement on what constitutes a "right" decision, even among the highly experienced. Although pilots can often point out poor decision making, there appears to be considerable room for diversity in what they do. Being unpredictable to the enemy is actually one of their goals. Tactics and strategies are taught in the military training process, but anecdotal information indicates that there is a wide range of discretion in implementation.
These domain characteristics map closely to those included in an area of research that has been labeled naturalistic decision making (Orasanu & Connolly, 1993). Research on naturalistic decision making seeks to develop descriptive models of how people, usually experts, make decisions about the dynamic, unstructured problems encountered in real-world settings. This is in contrast to more traditional decision theory research, based largely on static laboratory decision-making tasks, which assumes a normative model of generating multiple alternatives that are then evaluated based on some criteria (Raiffa, 1970).
Research indicates that people in real-world environments make rapid decisions using a process of situation recognition and pattern matching to memory structures (Dreyfus, 1981; Endsley, 1995; Klein, 1989, 1993; Klein, Calderwood, & Clinton-Cirocco, 1986). A naturalistic model of decision making describes a process in which pattern-matching mechanisms draw on long-term memory to classify a situation based on schemata of prototypical situations (Endsley, 1995). Stored responses or scripts are frequently tied to these situation classifications, yielding almost immediate response selection from memory. This process bears little relation to that advocated by normative decision models (Beach & Lipshitz, 1993), in which the bulk of the decision process focuses on criteria and procedures for evaluating among multiple decision options. The decision maker's emphasis in a naturalistic model is primarily on classifying the situation; very little effort is devoted to an examination of multiple alternatives (Klein, 1989).
In searching for descriptive information about how pilots make decisions in tactical environments, some useful information can be derived from the excellent studies that have been done on aviation decision making. Amalberti and Deblon (1992) provided a descriptive analysis of decision activities and strategies based on a study of 12 experienced and 12 inexperienced fighter pilots involved in mission navigation. They found extensive premission preparation in both groups; however, the expert group tended to have better "metaknowledge" for optimizing planning and was able to restrict consideration to more likely eventualities. Even though the flight plan was never executed as planned, these pilots had solved a large set of potential problems in advance, which simplified the decision process when deviations were necessary.
A large portion of in-flight decision making, then, becomes focused on collecting enough information to confirm with confidence that the pilot's internal picture of the situation is accurate (Amalberti & Deblon, 1992). In this scenario the preflight mission plan forms a set of expectancies that direct the search for information and interpretation of that information (Endsley, 1995). Based on this understanding of the situation, pilots will attempt to project likely future occurrences, allowing them to prepare responses in advance or to take actions to avoid such events. Amalberti and Deblon reported that during free periods, when not engaged in other flight tasks, 90% of the pilot's reasoning time was devoted to anticipation. Therefore, sudden, unexpected abnormal events were rare.
This strategy acts to minimize the load associated with in-flight responses to the unexpected, which may often be accompanied by stress and short decision times. Other studies have shown limitations on pilot decision making in such situations. McKinney (1993) reported that flight leads of fighter aircraft are particularly poor at responding to unexpected engine malfunctions. He attributed this to the effects of stress, the higher task load of the flight lead, and overconfidence. Wickens, Stokes, Barnett, and Hyman (1988) found that optimality of performance in a simulated flight task was significantly related to stress for tasks with high spatial demand. Avoiding the need to make on-the-spot decisions in stressful situations through anticipation and advance response development can be seen as an effective strategy for coping with the demands of this environment.
In the present study the goal was to further explore decision making involving fighter pilots in a tactical task. As an approach to investigating tactical aviation decision making, we employed the research paradigm used by Chase and Simon (1973) and de Groot (1965) for investigating decision making in expert chess players. In early work de Groot found that expertise in chess was apparently unrelated to major differences in thought processes (number of moves, search heuristics, depth of search, etc.). Chase and Simon found that the key to understanding mastery of this highly complex tactical task
seems to lie in the immediate perceptual processing, for it is here that the game is structured, and it is here in the static analysis that the good moves are generated for subsequent processing. Behind this perceptual analysis, as with all skills (cf., Fitts and Posner, 1967), lies an extensive cognitive apparatus amassed through years of constant practice. What was once accomplished by slow, conscious deductive reasoning is now arrived at by fast, unconscious perceptual processing. (1973, p. 56)
Chase and Simon employed a memory paradigm in which the players were required to reconstruct a chess position from memory after brief exposure to it. In the present study a similar task was employed which required pilots to reconstruct a presented tactical situation display. This task was combined with a requirement to report their tactical decisions in such a situation. In addition to inducing the pilots to process the displays in a relevant manner, the decision task also provides decision information that is directly comparable to data on how the pilots perceive and process the tactical environment. Based on a naturalistic view of decision making, several hypotheses can be generated regarding this research.
First, within a naturalistic model of decision making, the emphasis is placed on the features in the situation that are relevant for situation classification (decisions following directly from this). It is expected, therefore, that pilots' attention in processing the displays will be guided by relevance to tactical decisions and that information on the factors influencing decision making in this domain can be directly obtained from a systematic examination of attention allocation in processing the displays. Indications of the "perceptual analysis" discussed by Chase and Simon should provide useful reflections of the decision process.
Second, it is expected that relevant features of the situation for decision making in a tactical task will include the number of targets (threat aircraft) and the spatial distribution of those targets. Although it may be obvious that the location of a given target (in terms of range and bearing from ownship) should influence a pilot's decision (to attack that target or conduct an evasive maneuver, for instance), within this study we also sought to examine the way in which the distribution of multiple targets would have an influence on how pilots process the displays. It is expected that the way in which multiple targets are distributed in the display (in terms of their proximity and orientation to ownship and to one another) will influence the decision behavior of pilots.
Third, it is expected that pilots will employ chunking mechanisms as a means of dealing with large numbers of targets in a display, similar to that found by Chase and Simon (1973). It is expected that these chunking mechanisms will be reflected by the participants in their reporting of tactical decisions and in their behavior in the target replacement task.
Fourth, certain features of decision making generated from previous studies can also be examined. Specifically, it is expected that, following a naturalistic model of decision making, little or no consideration of multiple decision alternatives will be observed by the experts involved in this task. Based on the literature reviewed, it is also expected that expert pilots will make decisions that involve planning their actions into the future (as opposed to deciding on a single action at a time).
Overall, the present study was conducted with the goal of obtaining information on how pilots deploy their attention in processing complex and cluttered tactical displays to make necessary decisions. This information is relevant to both the design of systems to support tactical decision making and to furthering our understanding of that process.
A tactical situation display (TSD), shown in Figure 1, was presented on a computer screen. On each trial 3 to 12 targets were presented on the display for 5 s and then blanked. The participants were instructed to orally report the tactical action they would take if presented with the situation displayed and to indicate the location of each of the blanked targets using a mouse attached to the computer. Target entries could be moved on the display or removed completely using the mouse buttons until participants were satisfied with their response. A 2-s delay followed by a tone was implemented after the targets were blanked. During this period participants could not enter target locations, thus ensuring that they could not totally rely on iconic memory but, rather, would be required to cognitively process the display.
Each participant was presented with 490 target sets in a random order during the test. The decision to proceed to the next target set was self-paced. The target sets were presented to each participant over three 1-h sessions. Some participants required additional sessions to complete all the target sets. A 5-min break was provided halfway through each test session.
To ensure that all participants interpreted the displays based on the same set of assumptions, the following scenario was described before the test:
You are in the blue aircraft in the center of the display. You are on a fighter sweep mission. You are currently in enemy territory. You are at 20 000 feet and 1.2 Mach. All targets are at co-altitude, have equal airspeeds and closing velocities. All targets are MIGs with all-aspect, beyond visual range (BVR) weapons capability.
Participants were instructed to orally report the action they would take and, after the tone, to designate the location of the targets as accurately as possible using the mouse.
A Silicon Graphics 4D computer with a 19-inch color monitor was used for the test. The TSD was displayed on an area measuring 6 3/8 by 6 3/8 inches in the center of the screen. Participants were seated in front of the display approximately 17 inches from the screen.
The targets presented were approximately 1/8 inch wide and 1/4 inch long. A missile launch envelope (MLE) was shown projecting from the front of each target; the envelope was approximately 1 1/4 inches wide and extended 1/2 inch from the nose of the target at the farthest point of the arc. All targets were presented in red and heading toward the ownship symbol in the center of the display. The MLEs were presented in white and the ownship symbol was presented in blue. All lettering was in white. The display was configured at a set range of 80 miles on all trials. The compass shown on the display was at a distance of 40 miles from the ownship symbol.
We hypothesized that processing of the display and decision making would be affected by the number of targets and their distribution across the display. Ten target quantities were considered: between 3 and 12 targets per target set.
We created 49 density patterns, which covered the possible distributions of aircraft across the display. Seven azimuth zone distributions were considered: all targets in the front, sides, back, front and sides, front and back, sides and back, or front and sides and back. Seven range zone distributions were also considered: all targets in the near, middle, far, near and middle, near and far, middle and far, or near and middle and far zones. In total, this produced 49 possible range/azimuth zone combinations or density patterns.
Using all possible combinations of density pattern and target quantity, a total of 490 target sets were constructed. In each target set distribution of actual target locations was random within the prescribed density pattern. (The density zone designations were used only to construct the target sets and were not displayed to the participants.)
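The construction of the stimulus set described above can be sketched in a few lines. This is only an illustration of the counting, not the authors' software: the zone labels and variable names are hypothetical, and the random placement of targets within each pattern is not modeled.

```python
from itertools import combinations, product

# Zone labels are illustrative names for the zones described in the text.
AZIMUTH_ZONES = ("front", "sides", "back")
RANGE_ZONES = ("near", "middle", "far")
TARGET_QUANTITIES = range(3, 13)  # 3 to 12 targets per target set


def nonempty_subsets(zones):
    """Return every non-empty combination of the given zones."""
    return [combo
            for size in range(1, len(zones) + 1)
            for combo in combinations(zones, size)]


azimuth_distributions = nonempty_subsets(AZIMUTH_ZONES)  # 7 distributions
range_distributions = nonempty_subsets(RANGE_ZONES)      # 7 distributions

# Each density pattern pairs one azimuth distribution with one range distribution.
density_patterns = list(product(azimuth_distributions, range_distributions))

# Each target set pairs one density pattern with one target quantity.
target_sets = list(product(density_patterns, TARGET_QUANTITIES))

print(len(density_patterns), len(target_sets))
```

Three zones yield seven non-empty zone combinations on each dimension, which gives the 7 x 7 = 49 density patterns and, crossed with the 10 target quantities, the 490 target sets.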
Ten male former military fighter pilots participated in the test. Their mean age was 42.1 years (range 32-52). On average they had 16.3 years (range 10-25) and 3080 hours (range 200-4500) of flight experience. Five were former U.S. Air Force pilots and five were former U.S. Navy pilots. Five reported combat experience.
RESULTS AND DISCUSSION
Participant entries on the target replacement task were matched to targets using a best-fit algorithm. Errors of omission and commission and large errors (those responses greater than 3σ of the mean error away from the target) were analyzed separately. The remaining 32 769 target/entry pairings were analyzed on the basis of azimuth error (angular deviation of the entry from the target), range error (linear distance of the entry from the center point, or ownship position, as compared with the linear distance of the target from the center point), and total distance error (absolute linear distance from the entry to the target).
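As a minimal sketch of the three placement measures, the following assumes that positions are expressed in miles as (x, y) offsets from ownship at the origin, with y pointing along the nose. The function names, the coordinate frame, and the sign conventions (entry minus target, clockwise positive) are assumptions made for illustration; the article does not state which conventions were used, and the best-fit matching step is not shown.

```python
import math


def polar(x, y):
    """Return (range, bearing) of a point relative to ownship at (0, 0).

    Bearing is in degrees, measured clockwise from the nose (+y axis).
    """
    rng = math.hypot(x, y)
    bearing = math.degrees(math.atan2(x, y)) % 360.0
    return rng, bearing


def placement_errors(target, entry):
    """Return (azimuth_error, range_error, distance_error) for one pairing.

    Assumed sign convention: entry minus target, so a positive range error
    means the entry was placed farther from ownship than the target.
    """
    t_rng, t_brg = polar(*target)
    e_rng, e_brg = polar(*entry)
    # Wrap the angular difference into [-180, 180) degrees.
    azimuth_error = (e_brg - t_brg + 180.0) % 360.0 - 180.0
    range_error = e_rng - t_rng
    # Straight-line distance between entry and target.
    distance_error = math.dist(target, entry)
    return azimuth_error, range_error, distance_error


# Target dead ahead at 40 miles, entry placed dead ahead at 30 miles.
print(placement_errors((0.0, 40.0), (0.0, 30.0)))
```

Under these conventions an entry can have zero azimuth error and still carry a large range and total distance error, which is why the three measures are reported separately.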
Participants' oral reports of their tactical decisions were coded to determine the number of actions they reported, which targets they would attack, and whether their reported actions included an exit from the battle. These three measures of decision making were then compared with the characteristics of the target set and performance on the target replacement task.
Overall, participants did fairly well on the replacement task. Across all entries, 2165 errors of omission occurred, which amounts to approximately 5.9% of the targets presented. There were 483 errors of commission, which is 1.3% of the targets presented, and 1755 large errors, which account for approximately 4.7% of the targets presented. Overall azimuth error averaged -.29 degrees, which is near zero, indicating neither a large clockwise nor a large counterclockwise bias in responses. Range error was on average -2.1 miles, which reveals participants' tendency to overestimate target ranges. Total distance error averaged 8.8 miles.
A considerable amount of variability was present on the decision task. For a given target set, the mean number of completely different decisions across the 10 participants was 4.44 (SD = 3.43). The mean number of actions was 2.61 (SD = 1.50), and the mean number of targets attacked was 1.29 (SD = 1.14). Participants mentioned an exit action 31.3% of the time on average.
A MANOVA was conducted to test for effects of number of targets, range zone, and azimuth zone distributions, and participant on each of the six dependent variables in the target replacement task (number of errors of omission, errors of commission, large errors, mean azimuth error, mean range error, and mean distance error) and the three dependent variables in the decision task (number of reported actions, number of targets attacked, and report of an exit) using the 4885 groups of target set data collected across the 10 participants. (Data from 15 target sets were omitted because of problems with data collection.) All four independent variables were significant (p < .001) predictors of the nine dependent variables. Separate ANOVAs were conducted to examine the effects of number of targets, range zone and azimuth zone distribution, and participant on each of the dependent variables.
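The sample size and the error degrees of freedom used in the F tests of this section can be checked with simple arithmetic. The sketch below assumes a main-effects-only model (no interaction terms); the article does not state this, but the assumption reproduces the reported error degrees of freedom of 4854. The variable names are illustrative.

```python
# Observations: 10 participants x 490 target sets, minus 15 omitted sets.
participants = 10
target_sets_per_participant = 490
omitted_target_sets = 15
observations = participants * target_sets_per_participant - omitted_target_sets

# Degrees of freedom for each factor: number of levels minus one.
effect_df = {
    "number of targets": 10 - 1,         # 3 to 12 targets
    "range zone distribution": 7 - 1,    # 7 range zone distributions
    "azimuth zone distribution": 7 - 1,  # 7 azimuth zone distributions
    "participant": 10 - 1,               # 10 pilots
}

# Assumed main-effects-only model: total df minus the df used by the factors.
residual_df = observations - 1 - sum(effect_df.values())

print(observations, residual_df)
```

Under that assumption, the factor degrees of freedom of 9 and 6 and the error term of 4854 match the F(9, 4854) and F(6, 4854) statistics reported for the separate ANOVAs.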
Number of Targets
As shown in Figure 2, the number of errors of omission increased dramatically with the number of targets in the target set, with the sharpest increase occurring at about 6 to 7 targets, F(9, 4854) = 197.31, p < .001. Even with 12 targets presented, however, the number of errors of omission averaged only a little over one target per target set. The number of large errors similarly increased with the number of targets, F(9, 4854) = 98.79, p < .001. The number of errors of commission was also significantly related to number of targets, F(9, 4854) = 4.63, p < .001; however, it showed a slightly different pattern, increasing up to about 8 targets and then decreasing. The number of errors of commission was fairly low over the entire experiment.
Average azimuth error, F(9, 4854) = 77.73, p < .001, average range error, F(9, 4854) = 33.39, p < .001, and average distance error, F(9, 4854) = 88.50, p < .001, were all significantly related to the number of targets. Each showed the same pattern, increasing with the number of targets to a plateau at around 10 targets, as shown in Figure 2.
The number of targets in a target set was also significantly related to the number of targets attacked, F(9, 4854) = 7.99, p < .001, as well as to the number of reported actions, F(9, 4854) = 23.36, p < .001. As shown in Figure 2, there was an inverse relationship between the number of targets presented and both the number of targets attacked and the number of reported actions, indicating that participants used an offensive strategy less often as the number of opponents increased. In keeping with this finding, the likelihood of reporting an exit action significantly increased with the number of targets, F(9, 4854) = 7.87, p < .001.
Azimuth zone. One of our hypotheses was that the distribution of the targets as a group would affect participants' processing of the displays. The effect of azimuth zone distribution on number of omissions was significant, F(6, 4854) = 10.90, p < .001; the fewest omissions occurred when the targets were distributed on the sides. The most omissions occurred when targets were concentrated in the front or distributed across two or three zones simultaneously, as depicted in Figure 3. Azimuth zone distribution was not significantly related to the number of errors of commission, F(6, 4854) = .64, p = .69, but it was related to the number of large errors, F(6, 4854) = 11.05, p < .001. The fewest large errors occurred when the targets were concentrated in the front or on the sides and the most when targets were distributed across all three zones.
Average azimuth error of participant entries was significantly related to the azimuth zone density pattern of the target set, F(6, 4854) = 45.27, p < .001. As shown in Figure 3, the least amount of azimuth error occurred when the targets were concentrated in the front, on the sides, or in the rear zones. The most azimuth error occurred when targets were spread across all three zones or across both the side and rear zones.
Average range error was also significantly related to azimuth zone distribution, F(6, 4854) = 27.71, p < .001. The least amount of range error occurred when targets were concentrated in the front, side, or rear zones. The most error occurred when the targets were distributed across all three zones. Average total distance error showed a similar pattern to that observed for average range error, F(6, 4854) = 49.45, p < .001.
In general, participants were more accurate when all the targets were concentrated in one zone. More error was observed when participants were required to spread their attention across multiple zones. Density patterns incorporating targets in all three azimuth zones showed the most error, probably because it would be more difficult for participants to chunk or group targets when they are spread across larger distances. Therefore, more processing requirements would exist for a display that features less dense or more spread-out targets.
The only error measure that did not conform to this generality was errors of omission. Such errors were highest when all targets were concentrated in the front. It is possible that having too dense a concentration of targets in this area led to participants' overlooking some targets. Why this would occur more in the front than in the rear, an equal-sized area, is not apparent, however.
As shown in Figure 3, the azimuth zone distribution of targets was significantly related to the number of targets attacked in a particular target set, F(6, 4854) = 17.48, p < .001. Participants were more likely to attack when targets were concentrated in the front azimuth zone or in a zone that included the front. They were least likely to attack when targets were concentrated in the rear zone.
Results also indicated a significant relationship between the number of pilot actions and the azimuth zone distribution of the target set, F(6, 4854) = 5.24, p < .001. Pilots tended to report fewer actions when all targets were concentrated in the side or rear zone. Concentrations of targets in the front zone or in the front and side zone combined increased the probability of participants' reporting an exit action, F(6, 4854) = 5.17, p < .001.
In terms of pilot decision making, concentrations of targets in the front corresponded to more actions, more attacks, and a greater likelihood of reporting an exit. Decreased reports of attacks and fewer actions accompanied target sets with concentrations of targets in the rear. This pattern corresponds fairly well to the pattern of errors observed for the target replacement task, in which less attention was deployed to the rear.
Range zone. We also examined the error patterns across the different range zone distributions. The ANOVA conducted on the number of errors of omission revealed that the range zone distribution was significant, F(6, 4854) = 3.04, p < .01. A greater number of omissions occurred when all the targets were concentrated near the center of the display (near ownship) than when they were concentrated at the far range zone, as shown in Figure 4.
Errors of commission were significantly related to range zone, F(6, 4854) = 2.58, p < .05. The greatest number of errors of commission occurred when the targets were concentrated in the near and middle range zones or in the near and far range zones. The fewest errors of commission occurred when targets were concentrated in the middle zone.
The number of large errors was also significantly related to range zone, F(6, 4854) = 64.72, p < .001. The fewest large errors occurred when all targets were concentrated in the near zone, in the middle zone, or in both the near and middle zones.
The average azimuth error of entries was significantly related to range zone, F(6, 4854) = 113.88, p < .001. The greatest azimuth error occurred when targets were concentrated in the near zone. The least azimuth error occurred when targets were concentrated in the far zone, as shown in Figure 4. Average range error, F(6, 4854) = 403.52, p < .001, and average total distance error, F(6, 4854) = 567.51, p < .001, were both significantly related to range zone. They showed a pattern opposite to that of azimuth error, however. Error was least when targets were concentrated in the near zone and greatest when targets were concentrated in the far zone on both measures.
Across all measures, two error patterns appear to be present. Errors of omission, errors of commission, and azimuth errors were least when all the targets were concentrated in the far zones and greatest when all targets were concentrated in the near zone. Large errors, average range error, and average distance error were least when all targets were concentrated in the near zone and greatest when all targets were concentrated in the far zone. This could be attributable to the fact that there is more total area in the far zone than the near zone.
The distribution of targets in the seven range zones was also significantly related to the number of targets attacked in a particular target set, F(6, 4854) = 5.50, p < .001. Participants attacked more often when targets were concentrated in the far zone or in a combination of zones than in the near zone. There was also a significant relationship between the number of actions and the range zone distribution of the target set, F(6, 4854) = 23.34, p < .001. Participants tended to report fewer actions when targets were concentrated in the near zone and more actions when targets were distributed across several zones.
The ANOVA to examine the relationship of range zone to exit actions was significant, F(6, 4854) = 2.33, p < .05. Concentrations of targets in the middle, far, and middle and far zones combined increased the probability that participants reported an exit action. When the battle was more imminent (targets in the near zone), it appears that participants were less likely to think through their battle plans.
The overall distribution of targets had a significant impact on pilots' decisions. Concentrations of targets in the front zone and far zone increased the likelihood that participants would engage the targets in battle (attack) and would think their actions through to an exit from the battle. Target sets in which the targets were distributed across several zones also tended to increase the number of targets attacked and number of actions reported. Concentrations of targets in the rear zone or the near zone tended to reduce the number of reported actions and attacks. In this situation the participants often chose to attempt to outfly their opponents.
Individual participants also proved to be a significant factor in this test (p < .001 on all dependent measures). Considerable variance among participants existed on each of the measures. For example, Participant 1 had the lowest mean number of errors of omission, .10 per target set, whereas Participant 7 had the highest, .74 per target set. Average distance error varied from 7.72 miles for Participant 8 to 9.97 miles for Participant 7. It is interesting to note that Participant 8 consistently had one of the lowest error scores on almost every measure, whereas Participant 7 had one of the highest.
There was also a wide range of between-subject variation on the decision task, as indicated by the fairly large standard deviations on each of the measures. For example, Participant 10 mentioned an exit action only 0.6% of the time, whereas Participant 7 mentioned an exit action 75.5% of the time. Similarly, Participant 8 averaged 4.4 actions per target set, compared with Participant 6, who reported only 1.2 actions per target set. The mean number of targets attacked varied from 2 attacks per target set for Participant 9 to an average of .7 attacks per target set for Participant 4.
This wide variability is indicative of different pilot decision-making and information-processing strategies. From the mean number of actions, attacks, and exit data, it appears that there are two distinct response strategy groups. The first group (Participants 3, 5, 8, and 9) exhibited a tendency toward multiple attacks, multiple actions, and a higher-than-average number of exits. It should be noted that the multiple actions and attacks almost never involved a consideration of different alternatives but, rather, a tendency to plan many actions in advance, frequently to the point of their exit from the battle. The second group (Participants 2, 4, 6, 7, and 10) had a more defensive decision strategy, which included no attack or a single attack and a mean of fewer than two actions per target set. Again, however, there was no consideration of multiple alternatives. (Participant 1 fell between these two groups in responses.)
A follow-up analysis was conducted to look for possible sources of these individual differences in display processing and decision making. ANOVAs revealed that none of the factors of participant age, years of flight experience, hours of flight experience, or branch of military service was a significant predictor of any of the measures of performance on the target replacement task or decision task (p > .05). Whether or not the participant had served in combat, however, was significantly related to the mean number of actions reported, F(1, 7) = 13.35, p = .01, the mean number of aircraft attacked, F(1, 7) = 7.99, p = .03, and the likelihood of mentioning an exit action, F(1, 7) = 5.743, p = .05. Combat service was not significantly related to performance on the target replacement task, however. It appears that the experience of serving in combat had a significant influence on the degree to which participants planned ahead of the aircraft, considering many actions in advance. It can be hypothesized that the experience of serving under such stressful conditions was influential in developing this decision characteristic, given that making decisions while under stress (e.g., in the heat of the battle) can be significantly hindered.
Decisions and Target Replacement Task Interrelationship
In addition to examining the relationship between target set characteristics and the two tasks, we also directly examined the relationship between the decision task and the target replacement task. Regressions were performed to determine the relationship between placement error and the three decision variables.
There was a significant relationship between the number of targets attacked in a given target set and the mean distance error in the replacement task, F(1, 4883) = 4425.80, p < .001; mean range error, F(1, 4883) = 4050.20, p < .001; and mean azimuth error, F(1, 4883) = 3339.07, p < .001. Mean error on all three variables was lower when more targets were attacked.
Regressions examining the relationship between the number of pilot actions per target set and placement error were also significant: mean distance error, F(1, 4883) = 8.90, p < .01; mean range error, F(1, 4883) = 7019.72, p < .001; and mean azimuth error, F(1, 4883) = 6536.29, p < .001. All showed the same pattern of lower error with more reported actions. Report of an exit action was significantly related to the magnitude of distance error, F(1, 4883) = 5.20, p < .05, and range error, F(1, 4883) = 11.09, p < .001, but not azimuth error, F(1, 4883) = 3.62, p = .06. The report of an exit action was associated with greater accuracy in target placement. Decision making within each target set was therefore highly related to accuracy in the target placement task, supporting the hypothesis that attention in processing the display (the perceptual analysis) would be directly linked to pilots' decisions.
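As a rough illustration of the kind of test reported above, the following Python sketch computes the F statistic for a simple linear regression with a single predictor. The function name and the sample data are hypothetical, and the analysis software actually used in the study is not described. The only link to the reported values is that residual degrees of freedom of 4883 would correspond to 4885 observations under this model, which is an inference from the degrees of freedom rather than a figure stated in the text.

```python
def simple_regression_f(x, y):
    """Fit y = intercept + slope * x by least squares and return
    (slope, F statistic, (df_regression, df_residual)).

    Hypothetical helper for illustration; not the authors' code.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation explained by the regression line, and what is left over.
    ss_regression = slope * sxy
    ss_residual = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    df_residual = n - 2  # one predictor plus an intercept
    f_stat = ss_regression / (ss_residual / df_residual)
    return slope, f_stat, (1, df_residual)


# Made-up data: targets attacked per target set vs. mean distance error.
attacks = [0, 1, 2, 3]
distance_error = [9.0, 8.0, 8.5, 6.5]
slope, f_stat, df = simple_regression_f(attacks, distance_error)
```

A negative slope in such a fit would correspond to the pattern described in the text, in which error is lower when more targets are attacked.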
Several other variables were also examined that might have an impact on participant error patterns and decision making. Specifically, the location of the targets themselves (in terms of their range and azimuth from ownship) and the order in which the participants entered the targets were examined to determine their effect on target replacement accuracy and the likelihood of attacking that target.
It was anticipated that participants would pay more attention to some targets than to others (because of either intentional strategy or biases), independent of the overall display density. The number of omissions appeared sharply higher for targets at azimuth angles of ±45° off ownship heading, as shown in Figure 5. (Because of a loss of data, information on the azimuths of targets omitted at greater than 90° and less than -90° azimuth is not included in this analysis.) This would appear to indicate "dead spots" in the pilots' attention allocation across the display. The number of errors of commission showed no such pattern but was fairly randomly distributed across all target azimuths. The number of large errors also appears to be highest for targets at azimuth angles of ±45° and ±135° off ownship heading, as shown in Figure 5. This pattern is markedly similar to that shown by the errors of omission.
Regressions were performed on the azimuth error, range error, and overall distance error for all 32 769 matched target/entry pairs to determine the relationship between placement error and target range and azimuth. Azimuth error varied systematically with target azimuth, F(1, 32767) = 4.53, p < .05, going from a slightly positive (counterclockwise) error bias for targets on the left of ownship to a slightly negative (clockwise) error bias for targets on the right of ownship. This bias is very slight (within three degrees on average), however, and probably is not operationally relevant.
A second-order polynomial regression of target azimuth on range error was significant, F(2, 32766) = 36.79, p < .001. Overall, range errors tended to be slightly negative (participants overestimated target range). This bias was slightly greater for targets behind ownship than for those in front of ownship, perhaps indicating more attention to those in front. Overall distance error also varied significantly across target azimuth, based on a second-order polynomial regression, F(2, 32766) = 75.80, p < .001. The greatest amount of error occurred for targets located behind ownship. Placement error appeared to be fairly evenly distributed across the left and right hemispheres of the screen.
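The three error measures for a matched target/entry pair can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the scoring code used in the study: positions are taken as polar coordinates relative to ownship (range in miles, azimuth in degrees measured clockwise from ownship heading, negative to the left), and each signed error is taken as actual minus entered. That sign convention is an assumption chosen to be consistent with the descriptions in the text (overestimated range gives a negative range error, and a counterclockwise displacement gives a positive azimuth error). All names are hypothetical.

```python
import math


def wrap_degrees(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def placement_error(actual_range, actual_az, entered_range, entered_az):
    """Return (range_error, azimuth_error, distance_error) for one pair.

    Assumed convention: signed errors are actual minus entered.
    """
    range_error = actual_range - entered_range
    azimuth_error = wrap_degrees(actual_az - entered_az)
    # Straight-line separation between the two positions, from the law
    # of cosines applied to the triangle ownship / target / entry.
    delta = math.radians(actual_az - entered_az)
    dist_sq = (
        actual_range ** 2
        + entered_range ** 2
        - 2.0 * actual_range * entered_range * math.cos(delta)
    )
    distance_error = math.sqrt(max(dist_sq, 0.0))  # guard tiny negatives
    return range_error, azimuth_error, distance_error


# A target at 40 miles dead ahead, entered at 45 miles dead ahead:
# range is overestimated by 5 miles, so the signed range error is -5.
print(placement_error(40.0, 0.0, 45.0, 0.0))  # (-5.0, 0.0, 5.0)
```

Distance error is always nonnegative and combines both components, which is why it can be significant even where one signed component shows little bias.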
We also conducted an ANOVA to examine the effect of target azimuth on whether a target was attacked and found no significant relationship between target azimuth and being attacked, F(1, 32767) = 1.72, p = .19. Thus although target azimuth was related to attention distribution as indicated by target placement error, it did not prove significant to the likelihood that a target would be attacked.
The number of errors of omission appeared to be fairly randomly distributed across target range. The number of errors of commission conformed to an almost perfect normal distribution across target range, peaking at about 40 to 45 miles, as shown in Figure 6. The number of large errors appears to increase with target range, with a large peak at about 65 miles out. The dramatic peak for targets beyond 85 miles is for targets that were in the far corners of the display.
Azimuth error of the entries increased with target range based on a regression analysis, F(1, 32767) = 13446.68, p < .001. Range error also varied significantly with target range, F(1, 32767) = 13.16, p < .001. Participants tended to underestimate ranges of targets that were closer than 40 miles and to overestimate the ranges of targets beyond 40 miles. The tendency to overestimate the range of targets increased dramatically at farther ranges. This effect contributed to a similar rapid increase in overall distance error at farther ranges. A regression of target range on distance error was significant, F(1, 32767) = 9597.10, p < .001. An ANOVA examining the effect of target range on whether a target was attacked showed that there was a significant relationship between these two variables, F(1, 32767) = 626.48, p < .001. Pilots tended to attack the closer targets, as would be expected.
Participants were generally less accurate about target location as target range increased. They were most accurate around the 40 mile point, probably because a range ring was presented on the display at this point. That this point also had the highest number of errors of commission is rather odd. The tendency to underestimate the range of nearer targets may contribute to pilots' locking onto or shooting at targets too soon in an engagement, as has been noted by Venturino, Hamilton, and Dvorchak (1989). The tendency to increasingly overestimate the range of farther targets may reflect a general inattention to targets as target range increases. This can lead to omitting targets from tactical consideration for too long, creating the need to make rapid decisions when the targets suddenly seem much closer. It may also lead to a perceived disparity whereby farther targets appear to approach more rapidly than they actually are.
Participants managed to average an acceptable amount of range error (less than 5 miles) within a range of 65 miles. It should be noted that the number of large errors also appears to increase rapidly beyond target ranges of 65 miles. It is odd, however, that the numbers of omissions and commissions do not increase at these farther distances. A possible alternative to the decreased attention hypothesis is that participants continued to attend to targets at the far distances but did not feel it necessary to encode the targets with the same degree of accuracy. Subjective reports from pilots indicate that, indeed, operationally they do not desire to know target location as accurately at farther ranges.
Order of Entry
It is also possible that error scores might have increased across the interval when participants were entering targets. That is, it might be expected that those targets entered first would be more accurate than those entered last. This could be attributable to a decay of short-term memory or a strategy of entering more important targets first.
Azimuth error of the entries increased with the order of entry, according to a regression, F(1, 32767) = 1356.11, p < .001, and range error was significantly greater for later entries, F(1, 32767) = 9.67, p < .001. Total distance error, displayed in Figure 7, similarly increased with order of entry, F(2, 32767) = 57.672, p < .001. Thus participants were more accurate with the earlier targets in the replacement task.
In light of the effect of order of entry on target placement accuracy, it seems prudent to examine the data for any biases that might affect entry order. A regression showed that participants did systematically tend to enter the nearer targets before the farther targets, F(1, 32767) = 11.43, p < .001, shown in Figure 8. It is difficult to say, therefore, whether the tendency for participants to show greater range errors for farther targets is a function of their range or of their later order of entry.
An interesting bias also exists with regard to target azimuth, based on a second-order polynomial regression, F(2, 32767) = 14.33, p < .001, as shown in Figure 8. Participants tended to enter targets on the left side of the screen first (negative azimuths) and on the right side later. Whether this suggests an operational bias or simply the tendency to read from left to right cannot be determined. It is interesting to note that no corresponding error trend accompanies the tendency to enter targets in this fashion, nor does any bias in attacking targets.
The order of target entry was also significantly related to whether a target was attacked, F(1, 32767) = 1372.29, p < .001. Participants tended to enter attacked targets earlier in the replacement task. It therefore appears that participants chose to enter the "less important" farther targets later and with less encoded accuracy.
ANOVAs were conducted to determine whether target entry error (range error, azimuth error, and total distance error) was related to participants' decision to attack a given target. Range error was not significantly related to whether a target was attacked, F(1, 32767) = .04, p = .84. A significant relationship was found between whether a target was attacked and both azimuth error, F(1, 32767) = 32.61, p < .001, and distance error, F(1, 32767) = 402.87, p < .001. Participants were significantly more accurate in the placement of targets that were attacked compared with those that were not attacked.
Overall, these results indicate that participants had a tendency to more accurately enter attacked targets and to enter them early in the target entry task. (Inaccuracies in placement appear to be related more to range than to azimuth.) This finding makes sense, in that participants logically would pay more attention to aircraft that are considered an immediate and present danger and would be more likely to attack those targets.
Considerable data exist to support the notion that participants "chunk" pieces of information into small groups. In the cockpit environment, with highly cluttered displays and the possibility of high target densities, this is of particular concern. An issue of interest, therefore, is determining precisely how pilots choose to chunk targets on a display in order to cope with high target densities.
Chase and Simon (1973) reported that participants tended to enter items within a chunk sequentially and in less time than they would enter items between chunks. Their criterion - the mean time between entries - was used to determine chunks in this task. The cutoff point for determining chunks was set at 1.025 s, the mean time between successive entries. All entries that were made 1.025 s or less after the prior entry were considered to be part of the same chunk of information. (This criterion agrees fairly well with the 2-s cutoff criterion used by Chase and Simon; they cited a time of approximately 1 s to manually replace pieces in their task. The task in this study, which used a mouse and computer screen, did not incur such a large amount of time for physical response.)
Using this criterion, 12 741 chunks were formed out of the 32 769 target/entry pairs. Single-target chunks were the most common (40.3%); however, a considerable number of chunks contained between two and six targets (52.9%). The larger chunks were most frequently entered first, F(1, 12739) = 902.23, p < .001. Accompanying this trend, average azimuth error, F(1, 12739) = 8.97, p < .01; average range error, F(1, 12739) = 122.01, p < .001; and average distance error, F(1, 12739) = 27.09, p < .001, for the chunks were progressively larger as chunk size increased.
Figure 9 displays the number of chunks as a function of target set size. A regression was significant for the relationship between number of chunks and number of targets in a display, F(1, 4884) = 222.42, p < .001, as would be expected. Even for the largest target sets, participants averaged just under three chunks per display. Figure 9 also shows a rapid rise in the average chunk size (targets per chunk) as the number of targets increased, F(1, 4884) = 1363.11, p < .001. It appears that participants used both larger chunks and more chunks to handle an increase in the number of targets; however, greater chunk size accounted for most of the increase.
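The time-between-entries criterion can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the example timestamps are hypothetical, the first entry in a target set is assumed always to start a new chunk, and an entry made exactly 1.025 s after the previous one is treated as belonging to the same chunk, following the "1.025 s or less" wording above.

```python
CUTOFF_S = 1.025  # cutoff from the text: mean time between successive entries


def segment_chunks(entry_times, cutoff=CUTOFF_S):
    """Group entry indices into chunks by inter-entry time.

    entry_times: times (in seconds) at which each target was entered,
    in entry order. An entry joins the current chunk when it follows
    the previous entry by `cutoff` seconds or less; otherwise it
    starts a new chunk.
    """
    chunks = []
    for i, t in enumerate(entry_times):
        if i > 0 and t - entry_times[i - 1] <= cutoff:
            chunks[-1].append(i)
        else:
            chunks.append([i])
    return chunks


def mean_chunk_size(chunks):
    """Average number of targets per chunk (0.0 if there are no chunks)."""
    if not chunks:
        return 0.0
    return sum(len(c) for c in chunks) / len(chunks)


# Made-up entry times for a six-target display.
times = [0.0, 0.8, 1.5, 3.2, 3.9, 6.0]
chunks = segment_chunks(times)
print(chunks)                   # [[0, 1, 2], [3, 4], [5]]
print(mean_chunk_size(chunks))  # 2.0
```

Because the rule depends only on a single threshold, any response-time component that varies with cursor travel distance will also influence where chunk boundaries fall.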
Analysis of variance was used to examine the relationship between time between entries as a chunking criterion and distance as an alternative criterion for determining a chunk. The mean distance between subsequently entered targets appears to have been greater between chunks than within chunks, F(1, 28066) = 5215.67, p < .001, by almost twice as much (50.1 miles vs. 25.6 miles). Similarly, the mean distance between subsequent entries was greater between chunks (55.7 miles) than within chunks (32.9 miles), F(1, 28066) = 3149.90, p < .001. (Distance between subsequent targets and distance between subsequent entries are correlated with an r² of .938.) It would be expected, based on Fitts' law, that a direct relationship would exist between distance of movement and time to make that movement, independent of any chunking.
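For reference, Fitts' law in its common form can be written as follows. The symbols are not used in this article and are introduced here only for illustration: MT is movement time, D is the distance to the target, W is the target width along the axis of movement, and a and b are empirically fitted constants.

```latex
MT = a + b \log_2\!\left(\frac{2D}{W}\right)
```

With target width held constant, movement time grows with the logarithm of distance, so longer cursor movements between successively entered targets would be expected to lengthen inter-entry times whether or not a chunk boundary is present.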
To search for independent confirmation of the Chase and Simon (1973) chunking criterion, we examined the orally reported tactical decisions. Participants' verbalizations frequently referred to groups of aircraft (e.g., "the two on the right," "the three-ship at 6 o'clock," etc.). This can be considered an oral confirmation of which aircraft they saw as a chunk, although only a partial one, given that one can make no assumptions about aircraft to which participants did not refer.
We randomly chose data from one participant for analysis. The time between subsequent entries was computed for targets that had been orally identified as a chunk. Of the 204 pairs of verbally chunked targets, only 7 (3.4%) were not sequentially entered, confirming the premise that targets within a chunk will be entered together. (An additional 27 pairs that were not entered sequentially were part of larger chunks in which there was an intervening entry.) Of the remaining 170 pairs, only half (85) were chunked by the mean time between entries criterion adapted from Chase and Simon (1973). Thus there was a zero correlation between the two chunking criteria. A considerable number of the between-entry times of orally chunked targets exceeded 1.025 s.
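The pair counts above can be checked with a short calculation. The variable names in this Python sketch are hypothetical; the numbers are those reported in the text.

```python
# Counts reported in the text for the one participant analyzed.
total_pairs = 204           # orally chunked target pairs
not_sequential = 7          # pairs not entered sequentially
intervening_entry = 27      # pairs within larger chunks with an intervening entry
agree_with_time_rule = 85   # remaining pairs also chunked by the 1.025-s rule

# Pairs left once both kinds of non-sequential pair are set aside.
remaining_pairs = total_pairs - not_sequential - intervening_entry

# Share of orally chunked pairs that were not entered sequentially.
share_not_sequential = not_sequential / total_pairs

# Share of remaining pairs on which the two criteria agree.
agreement_rate = agree_with_time_rule / remaining_pairs

print(remaining_pairs)                        # 170
print(round(100 * share_not_sequential, 1))   # 3.4
print(agreement_rate)                         # 0.5
```

An agreement rate of one half is what would be expected if the timing rule assigned these pairs to the same chunk no more often than a coin flip, which is the sense in which the text describes the two criteria as uncorrelated.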
This finding is rather disturbing. Careful review of the chunked targets indicates that the oral cues do correspond to likely chunks. This would indicate that mean time between entries may not provide a good indication of which targets made a chunk, at least in this task. It is also possible that the introduction of an oral decision task in some way interfered with timing on the target entry task.
CONCLUSIONS AND RECOMMENDATIONS
This study had several limitations. First, it was conducted in a laboratory setting with static displays. Thus participants could not take advantage of aircraft dynamics in building up a tactical picture over time. The situations presented were also somewhat unrealistic, in that they involved a single aircraft (as opposed to a larger flight) that was highly outnumbered. Most of the participants mentioned that they would not want to find themselves in that position. In addition, the study was intentionally limited to targets that were equal in all respects except spatial location. Future studies should be conducted to expand these findings to targets of varying types, IDs, and headings so that a better understanding of how pilots deploy their attention with TSDs can be acquired. Despite these limitations, however, the study revealed several factors related to attention distribution and biases that have operational and design implications.
There was a systematic relationship between participants' attention distribution as measured by the target replacement task and their tactical decisions, verifying the utility of this paradigm for exploring decision making in complex cognitive tasks. Targets that were tactically relevant were generally entered first and with improved accuracy, indicating that attention distribution was based on tactical considerations. The pilots' perceptual processing, as indicated by performance in the target replacement task, was closely linked to their reported decisions, which supported the first hypothesis.
Participant error scores were strongly related to the number of targets presented. Using a chunking strategy, participants seemed to be able to handle the number of targets presented in this study, but errors of omission averaged greater than 1 per target set when 11 or 12 targets were shown. Operationally, this could have a serious impact in some situations.
Density pattern was shown to have a significant effect on participants' ability to perform the task. Target placement error was affected by the number of targets and their distribution within the display. Participants also did worse when the targets were distributed widely across the screen. They generally appeared to favor the front zone. They were more accurate with regard to targets located nearer to ownship, although some errors were greater (errors of omission, errors of commission, and large errors) in the near zone, probably because of an interaction with overall display density.
Tactical decision making was also systematically affected by the number of targets and by target distribution patterns. This supports the second hypothesis, in that both the number of targets and their distribution across the display (leading to differing density patterns) largely affect both perceptual encoding and decision making in this task.
Participants demonstrated interesting systematic biases in attention. High errors of omission and large errors at azimuths of ±45° and ±135° are particularly alarming. Certainly this indicates a tactical weakness. Interestingly, this pattern of errors corresponds to that observed by McGreevy and Ellis (1986) and Ellis, Tharp, Grunwald, and Smith (1991) in work on egocentric and exocentric three-dimensional displays. These researchers attributed the higher number of errors at these azimuths to a misestimation of viewing direction. The present study, however, found a surprisingly similar lack of attention to targets at these same azimuths in a two-dimensional, God's-eye type display. No "viewing direction" was present. This suggests that systematic errors at these azimuths may be caused by selective inattention rather than misperception.
High errors beyond 65 miles in range should also be taken into consideration. It may be desirable to design information management systems and "intelligent" displays to assist pilots with situation awareness on targets in these areas. Although targets at these ranges are beyond immediate concern, they do contribute to the pilot's global situation awareness.
More research should also be done on pilots' tendency to underestimate target ranges closer than 40 miles and to overestimate ranges of targets at farther distances. Specifically, it should be discovered whether this finding is specific to target range or is dependent on the total range displayed on the TSD, which was 80 miles in this study. Will participants always show this dichotomy at the halfway point, or will it remain at 40 miles? A second issue to be studied is the effect of this bias on perceived target closing/opening velocity and its relation to lock-on/shoot time estimation. A better understanding of this relationship is critical.
In addition, we derived some information on the chunks used to process and/or store information. Participants seemed to increase the size of chunks more rapidly than they increased the number of chunks, which averaged just under three per display even with the largest target sets. It would be interesting to discover whether this tendency would hold for even larger target sets and in other circumstances. These results on chunking should be viewed with caution, however, in light of the lack of confirmation of the time between entries chunking criterion based on the decision data. Unfortunately, it is difficult to propose another independent criterion for determining a chunk that does not involve participant verbalizations. This finding only partially supports the third hypothesis regarding a reflection of chunking strategy in the target replacement and decision data.
Finally, the present study also confirmed earlier indications of a high degree of variability in pilot decision making, even among the highly trained and experienced pilots involved in this study. We found confirmation of the hypothesis that many participants would plan ahead, as indicated by the number of reported actions and thinking through to an exit from the battle. It is particularly interesting that differences in decision style could be linked to combat experience. It may be desirable for pilots training in non-combat situations to receive additional instruction on the importance of working out actions in advance.
This study was also supportive of the naturalistic model of decision making, which supports the fourth hypothesis. Participant verbalizations almost never included discussion or consideration of multiple alternatives, as would be indicated by a decision-theoretic model.
In conclusion, the study supported the basic hypotheses regarding pilot decision making in tactical combat situations. It also provides useful descriptive information about features relevant to tactical display processing and decision making. It therefore supports the general goal of recent work in naturalistic decision making by providing descriptive (as opposed to normative) information on human decision making that can be applied to improve the design of systems for supporting decision making in complex, real-world environments.
ACKNOWLEDGMENTS
This study was conducted while the first author was employed at the Northrop Corporation. We wish to acknowledge the contributions of Cheryl Bolstad, who helped with data collection, and James Cobasko, who provided programming support. We also wish to thank the many pilots at Northrop who participated in the study and several anonymous reviewers who commented on an earlier version of this paper.
REFERENCES
Amalberti, R., & Deblon, F. (1992). Cognitive modeling of fighter aircraft process control: A step towards an intelligent on-board assistance system. International Journal of Man-Machine Studies, 36, 639-671.
Beach, L. R., & Lipshitz, R. (1993). Why classical decision theory is an inappropriate standard for evaluating and aiding most human decision making. In G. A. Klein, J. Orasanu, R. Calderwood, & C. E. Zsambok (Eds.), Decision making in action: Models and methods (pp. 21-35). Norwood, NJ: Ablex.
Chase, W. G., & Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 4, 55-81.
de Groot, A. (1965). Thought and choice in chess. The Hague, Netherlands: Mouton.
Dreyfus, S. E. (1981). Formal models vs. human situational understanding: Inherent limitations on the modeling of business expertise (ORC 81-3). Berkeley: Operations Research Center, University of California.
Ellis, S. R., Tharp, G. K., Grunwald, A. J., & Smith, S. (1991). Exocentric judgments in real environments and stereoscopic displays. In Proceedings of the Human Factors Society 35th Annual Meeting (pp. 1442-1446). Santa Monica, CA: Human Factors and Ergonomics Society.
Endsley, M. R. (1995). Toward a theory of situation awareness. Human Factors, 37, 32-64.
Fitts, P. M., & Posner, M. I. (1967). Human performance. Belmont, CA: Brooks/Cole.
Flathers, G. W., Giffin, W. C., & Rockwell, T. H. (1982). A study of decision making behavior of aircraft pilots deviating from a planned flight. Aviation, Space and Environmental Medicine, 53, 958-963.
Hammer, J. M., & Small, R. L. (1995). An intelligent interface in an associate system. In W. B. Rouse (Ed.), Human/technology interaction in complex systems (Vol. 7, pp. 1-44). Greenwich, CT: JAI Press.
Houck, M. R., Whitaker, L. A., & Kendall, R. R. (1993). An information processing classification of beyond-visual-range air intercepts (AL/HR-TR-1993-006). Williams Air Force Base, AZ: Armstrong Laboratory, Human Resources Directorate, Aircrew Training Research Division, U.S. Air Force.
Klein, G. A. (1989). Recognition-primed decisions. In W. B. Rouse (Ed.), Advances in man-machine systems research (pp. 47-92). Greenwich, CT: JAI Press.
Klein, G. A. (1993). A recognition primed decision (RPD) model of rapid decision making. In G. A. Klein, J. Orasanu, R. Calderwood, & C. E. Zsambok (Eds.), Decision making in action: Models and methods (pp. 138-147). Norwood, NJ: Ablex.
Klein, G. A., Calderwood, R., & Clinton-Cirocco, A. (1986). Rapid decision making on the fire ground. In Proceedings of the Human Factors Society 30th Annual Meeting (pp. 576-580). Santa Monica, CA: Human Factors and Ergonomics Society.
McGreevy, M. W., & Ellis, S. R. (1986). The effect of perspective geometry on judged direction in spatial information instruments. Human Factors, 28, 439-456.
McKinney, E. H. (1993). Flight leads and crisis decision making. Aviation, Space and Environmental Medicine, 64, 359-362.
Orasanu, J., & Connolly, T. (1993). The reinvention of decision making. In G. A. Klein, J. Orasanu, R. Calderwood, & C. E. Zsambok (Eds.), Decision making in action: Models and methods (pp. 3-20). Norwood, NJ: Ablex.
Raiffa, H. (1970). Decision analysis: Introductory lectures on choices under uncertainty. Reading, MA: Addison-Wesley.
Venturino, M., Hamilton, W. L., & Dvorchak, S. R. (1989). Performance-based measures of merit for tactical situation awareness. In Situation Awareness in Aerospace Operations (AGARD-CP-478) (pp. 4.1-4.5). Neuilly-sur-Seine, France: NATO-AGARD.
Wickens, C. D., Stokes, A. F., Barnett, B., & Hyman, F. (1988). Stress and pilot judgment: An empirical study using MIDIS, a microcomputer-based simulation. In Proceedings of the Human Factors Society 32nd Annual Meeting (pp. 173-177). Santa Monica, CA: Human Factors and Ergonomics Society.
Mica R. Endsley is an associate professor of industrial engineering at Texas Tech University and a visiting associate professor in aeronautics and astronautics at Massachusetts Institute of Technology. She received her Ph.D. in industrial and systems engineering with a specialization in human factors from the University of Southern California.
Robert P. Smith is a software systems engineer at AT&T Bell Labs. He received his Ph.D. in industrial and engineering psychology at Texas Tech University.
Author: Endsley, Mica R.; Smith, Robert P.
Date: Jun 1, 1996