Memory processes of flight situation awareness: interactive roles of working memory capacity, long-term working memory, and expertise.
The term situation awareness (SA) has been highlighted in the aviation world because of its prominent role in flight operations. Although it has been considered an essential prerequisite for the safe operation of aircraft, the term is inconsistently used in the domain of aviation. Its use is most often based on an intuitive understanding, and a commonly accepted definition is not provided. To fill this gap, aviation psychologists have focused on the cognitive components of SA because of the increasingly cognitive nature of the tasks operators should perform (Durso & Gronlund, 1999; Endsley, 1995b; Sarter & Woods, 1991; Wickens, 1999). Thus researchers in aviation psychology have sought to determine the cognitive requirements that constitute SA. Previous studies in the aviation domain have provided important support for the centrality of memory processes in SA, addressing the nature of individual differences in using memory processes during SA (Carretta, Perry, & Ree, 1996; Endsley & Bolstad, 1994; Gugerty & Tirre, 2000; Joslyn & Hunt, 1998; O'Hare, 1997). The objective of the present research was to determine the locus of individual differences in pilot ability to accommodate the demands that SA imposes on memory. Because experience may enhance SA by reducing the amount of mental resources required for building and maintaining SA, we examined the role of memory processes in SA as a function of pilot expertise.
MEMORY PROCESSES IN SITUATION AWARENESS
The role of memory is viewed as central to performing tasks that require dynamic and complex processing, such as SA (Durso & Gronlund, 1999; Endsley, 1995b). The content relevant to SA (e.g., tasks, systems, or hazards) is processed and resides in memory (Wickens, 1999). The accuracy of SA depends also on memory in which incoming information is integrated into coherent interpretation and prediction of aircraft status. Because of the importance of memory components of SA, many SA researchers have focused on the role of memory in a variety of task domains, including air traffic control (e.g., Gronlund, Ohrt, Manning, Dougherty, & Perry, 1998), driving (e.g., Gugerty, 1997) and instrument flight (e.g., Doane, Sohn, & Jodlowski, 2004).
In order to understand memory processes in SA, it is necessary to understand the construct of SA. The spirit of most definitions of SA can be incorporated into Endsley's (1995b) information processing view (Durso & Gronlund, 1999; Jones & Endsley, 2000; Wickens, 1999). Endsley's (1995b) view defines three levels of SA in terms of component cognitive processes. The first level involves perceiving environmental elements, such as other aircraft, terrain, system status, and warning lights. The second level involves information integration, a process of activating long-term memory (LTM) knowledge structures in order to organize the perceived situation elements into meaningful and recognizable configurations. The third level includes processes that enable projection of future flight status.
This third level of SA uses the goal-relevant activated knowledge structures formed in the second level of SA to predict the status of the aircraft. The accuracy of SA is a function of activating LTM knowledge structures that facilitate the integration of environmental information and result in a coherent interpretation of the current and future flight status. Recent work suggests that the integration of environmental information takes place in working memory (WM; Durso & Gronlund, 1999; Endsley & Garland, 2000). In summary, SA in Endsley's (1995b) view involves memory processes that dictate whether the resulting level of awareness will facilitate or detract from pilot performance.
RESEARCH ON MEMORY PROCESSES IN SITUATION AWARENESS
Many SA researchers suggest that understanding component processes is crucial to understanding failures in SA (e.g., Durso & Gronlund, 1999; Endsley, 1995b; Sarter & Woods, 1991). Adams, Tenney, and Pew (1995) suggested that componential analyses would address the practical need for the ability to predict failures in SA. As an example, they described a list of indicators of incomplete SA that was devised to help pilots detect SA problems.
Recent research suggests that componential analyses of memory processes are useful in predicting SA failures (see Durso & Gronlund, 1999, for a complete review of additional componential research). For example, in a study of U.S. Air Force F-15 pilots, Carretta et al. (1996) showed that cognitive factors such as verbal working memory, spatial working memory, spatial reasoning, and divided attention were reliable predictors of SA after controlling for the effects of flight experience. Although Carretta et al. used cognitive factors based solely on WM as predictors of SA, Stokes, Kemper, and Kite (1997) used LTM-based knowledge representations as well as WM-based information processing abilities to predict decision-making performance in a simulated flight situation. Stokes et al. found that both LTM-based knowledge representation measures and WM-based spatial memory measures were predictive of flight decision making.
Although the previous studies pointed to memory processes as important components of SA, the respective roles of WM and LTM and their relationship in the context of SA have yet to be determined clearly. As an attempt to address this issue, the present research examined main and interactive effects of WM and LTM processes on SA performance in instrument flight tasks using theoretical analysis based on cognitive theories developed in a variety of domains (e.g., Baddeley, 1986; Ericsson & Kintsch, 1995; Just & Carpenter, 1992; Larkin, McDermott, Simon, & Simon, 1980; Shah & Miyake, 1996; Sohn & Doane, 1997). In testing the validity of possible accounts based on the current cognitive theories, this research provides an insight into the cognitive requirements that constitute SA and the effective training methods that optimize acquisition and maintenance of SA.
The specific research question addressed was whether individual capacity to maintain information in WM, individual ability to construct and use LTM retrieval structures, or both, are the loci of SA performance differences among pilots of varying expertise. This objective was achieved by analyzing pilot SA performance for expert and novice groups in the context of cognitive theories. Of interest was the ability of contrasting theories to explain and predict performance differences in flight SA.
The capacity theory of WM proposes that the locus of individual differences in performing a complex task is the domain-general fixed capacity to compute and maintain presented information in WM (e.g., Daneman & Carpenter, 1980; Just & Carpenter, 1992; Shah & Miyake, 1996). This capacity is assumed to be inherently different across individuals. Alternatively, the long-term working memory (LT-WM) theory proposes that the locus of the differences is the acquired domain-specific skill to encode the presented information efficiently in accessible form in LTM (e.g., Ericsson & Delaney, 1999; Ericsson & Kintsch, 1995). This account of memory process is referred to as LT-WM because it postulates a mechanism for extending WM that requires skilled use of storage in and retrieval from LTM. The LT-WM theory proposes that individuals can acquire retrieval structures through extensive knowledge built up from experience in a particular domain and can use them to dynamically increase the functional capacity of WM. Thus our research objective was accomplished by devising measures of WM capacity and LT-WM skill and then comparing the measures' ability to predict the differences in performance on SA tasks.
Overview of Experiment
The present experiment consisted of three tasks. First, the span tasks of Daneman and Carpenter (1980) and Shah and Miyake (1996) were modified to measure individual WM capacity. Many researchers hypothesize separate WM resources for different modalities of cognitive processes (e.g., Daneman & Tardif, 1987; Shah & Miyake, 1996). Based on this hypothesis, the span tasks assessed individual WM capacity for computation and storage of spatial and verbal information.
Second, a situation recall task analogous to the Chase and Simon (1973) chess experiment was devised to measure individual pilots' LT-WM skill. Their experiment showed that chess masters were able to reconstruct more chess pieces on a board than were novices if the pieces were originally arranged in a meaningful game position. In contrast, if the pieces were placed in a random configuration, chess master performance fell to novice levels. In the present experiment, the elements to be reconstructed were cockpit situational indices (e.g., altitude, heading, airspeed). The cockpit situations were represented in either spatial or verbal form to examine the effect of presentation modality on pilot ability to construct an accurate mental representation of cockpit situation. The spatial form included a pictorial snapshot of actual cockpit instruments, whereas the verbal form was a written description of instrument indications.
To measure encoding in and retrieval from LTM, participant memory for a cockpit situation was tested after a 30-s delay filled with an intervening task (e.g., Peterson & Peterson, 1959). The intervening task was considered to clear short-term storage of scanned situational information. To measure construction of retrieval structures, we presented participants with both meaningful and nonmeaningful cockpit situations, defined as whether or not the cockpit situation represented a plausible state of a real aircraft.
The difference in delayed-recall accuracy of meaningful and nonmeaningful situations is thought to reflect pilot ability to construct and use retrieval structures in LTM (e.g., Stokes et al., 1997). In the meaningful trials, pilots were expected to access their LTM retrieval structures to encode the situational information. In contrast, in the nonmeaningful trials, pilots were not expected to have a matching retrieval structure that could be accessed to aid encoding. The benefit of accessing an LTM retrieval structure during encoding can be calculated by comparing delayed-recall accuracy in meaningful situations with that in nonmeaningful situations. In sum, the differences in delayed-recall accuracy between the meaningful and nonmeaningful conditions were calculated as an index of situational information accessible in LT-WM.
It is worth noting that the LT-WM measure is more than an indicator of knowledge about aircraft state and instrument patterns. Effective use of the LT-WM mechanism has three prerequisites (Ericsson & Kintsch, 1995). First, individuals must have extensive task-relevant knowledge that enables rapid storage of incoming information in LTM. Specifically, a rich knowledge base is thought to facilitate the relating of incoming information to known task-relevant patterns and schemas, and this in turn enables rapid storage of incoming information into LTM. Knowledge alone, however, is not sufficient. Second, individuals must be familiar with the task so that they can accurately anticipate future demands for information retrieval. Third, individuals must be able to associate encoded information with appropriate retrieval structures. The three prerequisites are integrated into measuring LT-WM skill during the situation recall task.
Finally, the SA task participants performed in the present research was designed to assess pilot SA in instrument flight empirically. Although there is currently no agreement in the literature concerning the best SA assessment methodology, the most commonly used method, according to Adams et al. (1995), is query (e.g., Endsley, 1990). In this method, a task simulation is stopped at random points, and the system displays are blanked while the participant answers a set of questions about the situation. Such query methods involve the participant's ability to recall information about the situation from memory, and concerns have been raised that these methods are intrusive and rely too heavily on memory.
In the present experiment, we used a modified query method to measure SA in which the modifications reduced the experimenter's intrusion and the reliance of the measure on unnecessary memory. The specific task used to measure pilot SA involved multiple trials. On each trial, participants viewed consecutive screens that showed a goal description and two consecutive cockpits and then judged whether an aircraft depicted by the consecutive cockpit snapshots would reach the specified goal state in the next 5 s (see Figure 1). Participants indicated their judgments at their own pace, without the experimenter interruptions that accompany traditional query methods. The cockpit snapshot displaying the current flight situation remained available when participants began to make judgments, and this reduced demands on memory that are problematic for query methods. As another reduction of unnecessary memory demands, participants made a single judgment for a given trial rather than the series of judgments used in most query methods.
[FIGURE 1 OMITTED]
Although the present SA criterion task shows multiple static snapshots of a changing situation rather than continuous real-time changes, it was devised to reflect the three levels of cognitive components defining SA (Dominguez, 1994; Endsley, 1995b). To perform this task successfully, participants must perceive changes in flight situation elements across various instruments (e.g., status and changes in airspeed, heading, altitude), interpret and understand their meaning with respect to the goal (e.g., "I am currently below my desired airspeed"), and predict their future implications given the goal state in mind (e.g., "Given my current state and rates and directions of change, I am headed toward my desired flight status").
It is important to note the distinction between the situation recall task and the SA task, although both tasks share a need for memory processes. The situation recall task requires storing two entire cockpit displays in memory and retrieving the display information during recall. The SA task also requires storage in memory; however, the information to be stored is dictated by the specified goal. In addition, the SA task requires the integration of display elements with the specified goal in order to project a future state of the aircraft. The SA task, then, is a measure of how pilots construct a memory representation according to their goals and use this to build a situation model of a future flight state. The three tasks used in this study are further detailed in the following sections.
METHOD
Participants
Fifty-two pilots from the University of Illinois and the University of Connecticut participated in this experiment. All were paid $25 for their participation. As a function of the preexperiment questionnaire data obtained from each participant, novice and expert groups contained 25 and 27 participants, respectively. (For additional questionnaire and classification details, see Doane & Sohn, 2000.) Novices on average had a total flight time of 85.7 hr and a total flight time of 17.6 hr in the last 90 days. In contrast, experts averaged a total flight time of 1116.8 hr and a total flight time of 51.5 hr in the last 90 days. It is worth noting that 2 participants in the expert group had extremely high total flight times, of 6500 and 7200 hr, but they did not significantly differ from the other experts in their WM capacity and LT-WM measures, Fs < 3, and remained in the expert group.
Materials and Procedure
All participants completed three tasks: span, situation recall, and SA.
Span tasks. We used two span tasks, spatial and verbal, which required participants to process and store different modalities of information in WM. In the spatial span task, participants were shown a set of English capital letters (F, J, L, P, and R) and their mirror images one at a time, each appearing in different orientations. The task was to remember the orientation of each letter in the correct order while deciding whether each letter was normal or mirror imaged as quickly and accurately as possible. Each letter appeared for 2200 ms in one of seven possible orientations in 45° increments, excluding the upright orientation. Participants were asked to respond aloud to indicate whether the letter was normal or mirror imaged. After the entire set of letters was presented in a trial, participants were asked to report the correct orientations in serial order by clicking on small buttons that indicated the possible orientations (for more details, see Shah & Miyake, 1996).
The structure of the verbal span task was identical to that for the spatial part, except that participants were shown a set of different letters and asked to recall the letters rather than the orientations. We added two letters (G and O) to the letter set (F, J, L, P, and R) that was used for the spatial span task in order to increase the variability of the possible set of letters to be recalled. Letter presentation was constrained such that the same letter could appear only once in a given trial. The participant's task was to remember the letters and their order of presentation while deciding whether each letter was normal or mirror imaged. At the end of a trial, the participant was prompted to type the letters on a keyboard in the order presented.
Each span task included a total of 20 letter sets, including 5 sets at each size ranging from two to five letters, and participants were presented with increasingly longer sets of letters. After the participants finished five trials with a letter set containing a given number of letters, they were informed of how many letters would appear in the next set of trials and were prompted to click the mouse to begin the next trial. The order of task performance was counterbalanced such that half of the participants started with the spatial task and the other half started with the verbal task.
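The set structure just described can be made concrete with a short sketch. It builds the verbal-span letter sets (20 sets, 5 at each size from two to five, presented in increasing size, with no letter repeated within a trial). The function name `build_span_sets`, the random draw of orientations, and the 50/50 split between normal and mirror-imaged letters are illustrative assumptions; the text does not report how stimuli were sampled.

```python
import random

LETTERS = ["F", "J", "L", "P", "R", "G", "O"]   # verbal-span letter set named in the text
ORIENTATIONS = [45 * k for k in range(1, 8)]    # 45..315 degrees; upright (0) is excluded

def build_span_sets(rng, sizes=(2, 3, 4, 5), sets_per_size=5):
    """Return letter sets ordered from the smallest to the largest set size."""
    sets = []
    for size in sizes:
        for _ in range(sets_per_size):
            letters = rng.sample(LETTERS, size)  # sampling without replacement: no repeats
            sets.append([
                {
                    "letter": letter,
                    "orientation": rng.choice(ORIENTATIONS),  # assumption: drawn at random
                    "mirrored": rng.random() < 0.5,           # assumption: 50/50 normal vs. mirror
                }
                for letter in letters
            ])
    return sets

span_sets = build_span_sets(random.Random(0))
print(len(span_sets), [len(s) for s in span_sets])
```

Passing in a seeded `random.Random` keeps the hypothetical stimulus list reproducible across runs.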
Situation recall tasks. The participants' task on each trial was to view a sequence of two cockpit situations presented on the screen and to recall the situational information. Participants were presented either with pictorial cockpit snapshots, as shown in Figure 2a (spatial stimuli), or verbal lists that described cockpit situations, as shown in Figure 2b (verbal stimuli). A pair of cockpit situations (as depicted in Figure 2) that would appear consecutively in flight was shown together for 40 s, and then they were removed from the screen. Participants then completed an intervening task for a 30-s delay period, during which they counted backward aloud by threes as fast as possible. For example, a prompt such as "count backward by 3: 528" was presented on the screen following the disappearance of the cockpit snapshots, and participants counted backward aloud by threes from 528: 525, 522, 519, and so on. This intervening task interrupted maintenance and computation of the display indications in WM.
[FIGURE 2 OMITTED]
After the 30-s delay filled with the intervening task, the participants had to recall the flight situation. For the spatial situation trials, participants reconstructed the situation by manually filling in the indications of display instruments on a blank cockpit frame sheet of paper. For the verbal situation trials, they reported aloud the description of display indications to the experimenter. Participants were asked to recall the values that were presented in either the top or bottom situation, and the choice of situation to recall was randomly selected. When participants thought they were finished recalling all the indications they could remember for the trial, they pressed the space bar on the keyboard to view a pair of cockpit situations for the next trial.
As previously stated, we manipulated the meaningfulness of the cockpit situation. Figure 3 illustrates nonmeaningful situations. They show that each snapshot displayed an impossible flight situation, and the pair of snapshots did not depict consecutive flight situations. For example, as depicted in the bottom snapshot in Figure 3a, the attitude indicator indicates a level flight but the turn coordinator indicates a turn to the right is in progress. It is a physical impossibility for these events to occur simultaneously. Looking at the sequence of two snapshots, the altimeter indicates the altitude has decreased from 3700 to 2800 feet, although the vertical speed indicator indicates the climbing rate has increased from 0 to 500 feet per min. Such a situation cannot occur in the real world.
[FIGURE 3 OMITTED]
The situation recall tasks included nine trials of the spatial task and nine trials of the verbal task, six trials with meaningful situations and three trials with nonmeaningful situations for each task. The order of task performance was counterbalanced such that half of the participants started with the spatial task and the other half started with the verbal task. For each task, participants were first presented with a set of trials with meaningful situations and then with a set of trials with nonmeaningful situations. This was done to avoid any possible changes in viewing strategy caused by exposure to unexpected, nonmeaningful situations. We gave participants two practice trials with meaningful situations to familiarize them with the procedure. They received feedback on recall accuracy for the practice trials. We provided no accuracy feedback for the experimental trials so as to avoid any learning effect from the feedback.
SA tasks. As previously stated, an SA trial was composed of consecutive screens that showed a goal description and two consecutive cockpits (see Figure 1). The goal description indicated the desired state that an aircraft should reach in the near future (i.e., in approximately 5 s) on the three flight performance elements (e.g., "altitude 3500 feet, heading 180°, airspeed 90 knots").
The first screen displayed the goal description on the top of the screen and, immediately below this, the cockpit snapshot that depicted the initial flight situation (at Time 1) for 20 s, at which time the first cockpit snapshot disappeared and the second cockpit snapshot appeared. The goal state description remained on the top of the screen. The second snapshot depicted the "current" state of the aircraft (at Time 2) following changes caused by control movements (not specified to the participants) executed at Time 1. This second snapshot depicted the state of the aircraft approximately 5 s following the status depicted in the first snapshot. The 5-s flight interval between the two snapshots was sufficient to produce a noticeable amount of change in the cockpit display of aircraft status. (We confirmed this fact with flight instructors serving as consultants to the project.) The second snapshot remained on the screen until participants pressed a response key, at which time the goal description and the second snapshot disappeared.
Participants were asked to determine if the aircraft depicted in the cockpit snapshots would achieve the goal initially specified in the next 5 s without further control movements. That is, they had to determine whether the snapshot sequence indicated a movement toward the goal and whether the second snapshot indicated that movement toward the goal would continue. In essence, they had to mentally predict the state of the aircraft in the next 5 s with the constraint that no further control movements would be applied. The participants indicated whether the aircraft was moving in a manner consistent with achieving the goal, or in a manner inconsistent with achieving the goal, by pressing the key marked "C" for consistent or "I" for inconsistent. Inconsistent trials were created by manipulating one of three flight elements (i.e., altitude, heading, and airspeed) depicted in the second snapshot to render the "current" (Time 2) flight situation inconsistent with the specified goal. The materials included 23 consistent and 27 inconsistent snapshot sequences. The next trial began immediately after a consistency judgment was made.
Participants were given seven practice trials to familiarize themselves with the procedure before viewing the experimental trials. They received accuracy feedback during the practice trials, but feedback was not provided during the experimental trials. The computer recorded consistency judgment accuracy and the time that elapsed between the presentation of the second snapshot and the entry of the consistency judgment.
General procedure. The experiment took place in a single session lasting 2 to 2 1/2 hr. After signing an informed consent form, participants completed the flight background questionnaire. Then the three tasks were administered on a Macintosh G4 computer using a PsyScope program in the following order: the SA task, the span task, and the situation recall task.
SCORING AND DATA ANALYSIS
WM Capacity Measures
Scoring for spans. We used procedures that Daneman and Carpenter (1980) and Shah and Miyake (1996) devised to score a participant's span. A participant's span score for the spatial task was defined as the highest set size for which all of the orientations were recalled in the correct order, for at least three of the five sets. If the participant's recall was accurate on two of the five sets at the next set size, half a point was added to the score. If a participant correctly recalled fewer than three of the five sets at a particular level but was able to recall two or more sets at a higher level (this was rare, n = 5), we used the average of the lower and the upper limits as the span score. The possible maximum and minimum scores for the spatial span measures were 5.0 and 1.0 (in cases in which only one or no set was correctly recalled at the two-item level), respectively.
For the verbal task, we used a higher criterion because this task was easier than the spatial task. A participant's verbal span score was defined as the highest set size for which all of the letters were recalled in the correct order, for at least four of the five sets. If the participant's recall was accurate on three of the five sets at the next set size, half a point was added to the score. If a participant correctly recalled fewer than four of the five sets at a particular level but was able to recall three or more sets at a higher level (this was rare, n = 4), we used the average of the lower and the upper limits as the span score. The possible maximum and minimum scores for the verbal span measures were 5.0 and 1.0 (in cases in which two or fewer sets were correctly recalled at the two-item level), respectively. Although the scoring criteria differed across span tasks because of their varied difficulty, they were applied equitably across participants and resulted in comparable span scores.
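The two scoring rules differ only in the pass criterion (three of five sets for the spatial task, four of five for the verbal task), so they can be sketched as one function. This is a minimal illustration, not the authors' scoring script: `span_score` and its input format are hypothetical, and the rare irregular pattern that was scored by averaging lower and upper limits is flagged with an error rather than scored, because the text does not define those limits precisely.

```python
def span_score(correct_by_size, criterion):
    """Score a span task.

    correct_by_size maps set size -> number of sets (out of 5) recalled
    perfectly in order; criterion is 3 (spatial) or 4 (verbal).
    """
    sizes = sorted(correct_by_size)
    # Irregular pattern: a failed level followed by near-criterion recall at a
    # higher level. The paper averaged limits here; that rule is not sketched.
    for s in sizes:
        if correct_by_size[s] < criterion and any(
            correct_by_size[h] >= criterion - 1 for h in sizes if h > s
        ):
            raise ValueError("irregular recall pattern: averaging rule required")
    passed = [s for s in sizes if correct_by_size[s] >= criterion]
    base = max(passed) if passed else sizes[0] - 1   # floor of 1.0 when size 2 is failed
    # Half-point bonus when recall at the next size is one set short of criterion.
    bonus = 0.5 if correct_by_size.get(base + 1, 0) == criterion - 1 else 0.0
    return base + bonus

print(span_score({2: 5, 3: 3, 4: 2, 5: 0}, criterion=3))  # spatial example
print(span_score({2: 5, 3: 4, 4: 3, 5: 0}, criterion=4))  # verbal example
```

With set sizes from two to five, this reproduces the stated bounds of 1.0 and 5.0 for both tasks.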
We did not take into account participant accuracy in judging whether letters were normal or mirror imaged. This aspect of span tasks is thought to reflect spatial visualization ability rather than WM capacity (Shah & Miyake, 1996). Regardless, the participants showed 70% or higher accuracy on judgment of normal or mirror-image orientation. This suggested that the participants made an effort to be accurate and, in so doing, used the processing component of WM demands imposed by the span tasks.
Span scores. A summary of descriptive statistics for the span measures for novices and experts is presented in Table 1. Of interest is whether measures of WM capacity are related to the amount of piloting experience. As previously mentioned, the capacity account proposes that individual differences in WM capacity are inherent in the individual and do not vary with expertise. To determine if there was a significant relationship between the measures of WM capacity and piloting expertise, mean span scores for novices and experts were compared. Two-tailed t tests resulted in no significant difference between the two groups in either the spatial span measure, t(50) = 0.872, p > .3, or the verbal span measure, t(50) = 0.150, p > .8. Furthermore, the effect sizes for the difference statistics were very small; the 95% confidence intervals (CIs) and Cohen's ds (Cohen, 1988) for the difference in the spatial span were CI = (.88, -.35), d = 0.24, and for the verbal span were CI = (.57, -.49), d = 0.04. This is consistent with the hypothesis that WM capacity does not change with expertise for a given individual. Another interpretation of this result is that there were no preexisting differences between the groups in WM capacity.
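As a rough consistency check on the reported effect sizes, Cohen's d for two independent groups can be recovered from the t statistic and the group sizes (25 novices, 27 experts). The sketch below assumes the standard pooled-variance relation d = t * sqrt(1/n1 + 1/n2); the text does not state how d was actually computed.

```python
import math

def cohens_d_from_t(t, n1, n2):
    """Cohen's d implied by an independent-samples t (pooled-variance assumption)."""
    return t * math.sqrt(1 / n1 + 1 / n2)

d_spatial = cohens_d_from_t(0.872, 25, 27)  # spatial span, t(50) = 0.872
d_verbal = cohens_d_from_t(0.150, 25, 27)   # verbal span, t(50) = 0.150
print(round(d_spatial, 2), round(d_verbal, 2))  # 0.24 0.04
```

Both values agree with the reported ds of 0.24 and 0.04.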
LT-WM Skill Measures
Scoring correct recall responses. The experimenter scored each participant's indications for each of the eight flight elements (i.e., airspeed, heading, altitude, pitch, bank, power, rate of climb, and rate of turn) according to explicit rules. Responses for pitch and bank were scored separately, although they are indicated by one instrument (the attitude indicator). Responses were scored as correct and full points were given if they matched exactly all corresponding indications. The responses for pitch, bank, rate of climb, and rate of turn, which provide both directional and value indications, were given partial credit if only the directional indications were correct. However, no credit was given to the responses if only their value indications were correct because the value information in the wrong direction indicates a more severe loss of memory for the situation than does memory loss for the exact value but retention of the correct direction. For example, a pitch indication of one dot above the horizon when the correct indication is two dots above the horizon was given half a point because the pitch direction, above the horizon, is correct although the exact value of pitch is not correct. In contrast, a pitch indication of one dot above the horizon when the correct pitch indication is one dot below the horizon was given zero points because the direction (above vs. below) is incorrect, although the values (one dot) match.
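The scoring rule can be sketched as a small function. The element names, the (direction, value) tuple encoding, and the function `score_element` are hypothetical; the half-point weight for direction-only matches follows the pitch example in the text and is assumed to apply to all four directional elements.

```python
# Elements that carry both a direction and a value, per the text.
DIRECTIONAL = {"pitch", "bank", "rate_of_climb", "rate_of_turn"}

def score_element(name, response, correct):
    """Score one recalled indication.

    Directional elements are encoded as (direction, value) tuples (an
    illustrative encoding); all other elements are plain values.
    """
    if response is None:            # nothing recalled for this element
        return 0.0
    if response == correct:         # exact match earns full credit
        return 1.0
    if name in DIRECTIONAL and response[0] == correct[0]:
        return 0.5                  # direction right, value wrong (assumed weight)
    return 0.0                      # wrong direction, even if the value matches

print(score_element("pitch", ("above", 1), ("above", 2)))
print(score_element("pitch", ("above", 1), ("below", 1)))
```

The two calls mirror the pitch examples in the paragraph above: one dot above versus two dots above, and one dot above versus one dot below.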
LT-WM skill scores. LT-WM scores were calculated using the differences in recall accuracy between meaningful (possible) and nonmeaningful (impossible) situations. Table 2 shows group mean control and performance LT-WM scores for spatial and verbal situation presentations. As shown in the table, each LT-WM measure was divided into control and performance submeasures to specify the nature of LTM retrieval structures constructed and used to facilitate SA performance. These submeasures were calculated based on the recall accuracy for control and performance flight elements, respectively. Control elements indicate input settings of control movements and include pitch, bank, and power. Performance elements indicate the behavior of the aircraft as a result of control movements and include altitude, heading, airspeed, and rates of climb and turn.
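A minimal sketch of the difference score follows, assuming that each trial is represented as a mapping from flight element to its recall score (0, 0.5, or 1) and that accuracy is averaged over trials and elements. The data layout and the function names are illustrative; the text does not say whether accuracy was expressed as a proportion or a total.

```python
from statistics import mean

CONTROL = ("pitch", "bank", "power")
PERFORMANCE = ("altitude", "heading", "airspeed", "rate_of_climb", "rate_of_turn")

def mean_accuracy(trials, elements):
    """Mean recall score over the given elements across all trials."""
    return mean(trial[e] for trial in trials for e in elements)

def ltwm_scores(meaningful_trials, nonmeaningful_trials):
    """Meaningful-minus-nonmeaningful recall accuracy, split by element type."""
    return {
        label: mean_accuracy(meaningful_trials, elements)
        - mean_accuracy(nonmeaningful_trials, elements)
        for label, elements in (("control", CONTROL), ("performance", PERFORMANCE))
    }

# Hypothetical data for one participant and one presentation modality.
perfect = {e: 1.0 for e in CONTROL + PERFORMANCE}
degraded = {**perfect, "pitch": 0.5, "bank": 0.5, "power": 0.5}
print(ltwm_scores([perfect], [degraded]))
```

Running the same computation separately for spatial and verbal trials would yield the four LT-WM measures described above.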
Table 2 shows that experts received higher LT-WM scores overall than did novices, F(1, 50) = 12.78, MSE = 0.030, p < .01. As proposed by LT-WM theory, this result can be attributed to a large body of relevant knowledge and a repertoire of flight states that expert pilots internalize. In particular, the group difference was significantly pronounced for the control LT-WM measure with spatial stimuli, F(1, 50) = 7.22, MSE = 0.042, p < .01, but not for the other LT-WM measures, Fs < 3. This suggests that experts rely more on meaningful configurations of control flight elements than do novices.
SA Performance Measures
To measure pilot SA performance, we analyzed accuracy data in the context of signal detection theory to determine how sensitive participants were to discriminating between consistent and inconsistent stimuli (Green & Swets, 1966). To determine the observer sensitivity of pilot consistency judgments, correct judgments for consistent stimuli and incorrect judgments for inconsistent stimuli represented hits and false alarms in d' calculations, respectively. Following procedures outlined in Green and Swets, we also calculated bias scores to determine if the group differences were influenced by different criteria novices and experts used for making "consistent" or "inconsistent" judgments. Judgment latency was also measured as another index of SA performance.
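The sensitivity and bias computation can be illustrated as follows, treating "consistent" responses to consistent stimuli as hits and "consistent" responses to inconsistent stimuli as false alarms. The bias index used here is the criterion c, one common choice whose sign matches the convention reported for Table 3; the text does not name the specific bias statistic, and the adjustment for hit or false-alarm rates of 0 or 1 is likewise an assumption.

```python
from statistics import NormalDist

def _rate(count, n):
    """Proportion clipped away from 0 and 1 so the z transform stays finite (assumed correction)."""
    floor = 1 / (2 * n)
    return min(max(count / n, floor), 1 - floor)

def sensitivity_and_bias(hits, n_consistent, false_alarms, n_inconsistent):
    """Return (d', c) for "consistent"/"inconsistent" judgments."""
    z = NormalDist().inv_cdf
    z_hit = z(_rate(hits, n_consistent))
    z_fa = z(_rate(false_alarms, n_inconsistent))
    d_prime = z_hit - z_fa
    criterion = -0.5 * (z_hit + z_fa)  # > 0: leans toward responding "inconsistent"
    return d_prime, criterion

# Hypothetical counts for one pilot on 23 consistent and 27 inconsistent trials.
d_prime, criterion = sensitivity_and_bias(20, 23, 9, 27)
print(round(d_prime, 2), round(criterion, 2))
```

A criterion of 0 corresponds to no bias, and negative values indicate a tendency to respond "consistent".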
Table 3 shows mean judgment sensitivity, bias (a bias score of 0 indicates no bias; positive and negative scores indicate a bias to respond "inconsistent" and "consistent," respectively), and latency for novices and experts. Judgment sensitivity was higher for experts than for novices, F(1, 50) = 8.69, MSE = 0.209, p < .01, but there was no group difference in either bias or latency, Fs < 1. The results suggest that the SA task differentiated between novices and experts, although group performance differences were caused by judgment sensitivity, not by judgment criterion or latency differences. Because judgment sensitivity varied as a function of expertise, we used this performance measure as the criterion measure to be predicted by the WM capacity and LT-WM measures.
RESULTS AND DISCUSSION
The major question of interest in this research was which memory measures are valid indicators of SA performance and what roles WM capacity and LT-WM play in predicting SA performance. To address this question, we correlated each of two WM capacity measures (spatial and verbal spans) and four LT-WM measures (spatial control, spatial performance, verbal control, and verbal performance) with the SA performance measure. Then hierarchical regression analyses were conducted using WM capacity and LT-WM measures that were highly correlated with SA performance as predictors of SA performance. These analyses allowed us to determine if the effect of one predictor on SA performance was significant when controlling for the other predictors (Cohen & Cohen, 1983).
Memory Measures Indicative of SA Performance
Table 4 shows the correlation between each of the memory measures and SA judgment sensitivity for novices and experts. The roles of specific memory measures varied as a function of expertise. Overall, WM capacity measures correlated highly with SA sensitivity for novices, whereas LT-WM measures correlated highly with SA sensitivity for experts. One explanation for this difference is that novices rely more on their inherent WM capacity because they have not acquired reliable LT-WM skills to overcome the WM overload imposed by the SA tasks. Thus novice pilots with higher WM capacity were able to be more sensitive performers. In contrast, experts with a higher level of LT-WM skill showed greater SA sensitivity.
In each type of memory measure, spatial span was the significant indicator of SA sensitivity for novices, whereas the spatial control LT-WM measure was the significant indicator of SA sensitivity for experts. This suggests that SA requires spatial representation more than verbal representation in WM. Given that LT-WM measures reflect efficiency of access to LTM, we can also infer that efficient LTM access is mediated by configurations of control flight elements.
Roles of WM Capacity and LT-WM Skill in Predicting SA Performance
To determine whether WM capacity, LT-WM skill, or their interaction predicted novice and expert SA performance, hierarchical regression analyses were performed. Because spatial WM capacity and spatial control LT-WM skill measures correlated significantly with SA judgment sensitivity, they were used as representative predictors of SA performance. (The reliability estimates for the spatial WM capacity and spatial control LT-WM skill measures were derived by computing split-half reliability coefficients. Specifically, we divided each task into two subsections corresponding to the odd- and even-number trials. The correlation of the scores for the two subsections served as the reliability estimate for the measure. The reliability estimates for the WM capacity measure were .87 for novices and .81 for experts. Those for the LT-WM skill measure were .66 for novices and .75 for experts. Thus the reliability of the measures used in the study was generally satisfactory.)
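The split-half procedure described in the parenthetical note can be sketched as follows. The data layout (one list of per-trial scores for each pilot) and the function names are hypothetical, and no step-up correction is applied because the text mentions none.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(scores_by_pilot):
    """Correlate each pilot's odd-trial total with the even-trial total."""
    odd_totals = [sum(scores[0::2]) for scores in scores_by_pilot]   # trials 1, 3, 5, ...
    even_totals = [sum(scores[1::2]) for scores in scores_by_pilot]  # trials 2, 4, 6, ...
    return pearson(odd_totals, even_totals)

# Made-up per-trial scores (1 = correct, 0 = incorrect) for four pilots.
example_scores = [
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
]
reliability = split_half_reliability(example_scores)
```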
Hierarchical regression analyses were conducted in three steps. The LT-WM skill measure was entered in the first step; the WM capacity measure was entered in the second step; and the cross product of these two measures (LT-WM Skill x WM Capacity) was entered in the third step. This order of entry reflects our assumption that an increase in LT-WM skill would decrease the pilot's reliance on WM capacity as the determinant of SA performance. The cross product was entered only after the two variables that might be a source of interaction effect on SA performance had been entered. A significant increment in variance accounted for (referred to as [R.sup.2.sub.change]) by the variable entered in each step would indicate the unique contribution of that variable to SA performance.
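The three-step entry order can be sketched in plain Python as follows. This illustrates the general technique (ordinary least squares with an intercept, refit at each step) and is not the authors' analysis script; the function names and the example data are made up.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the predictor columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def hierarchical_r2(ltwm, wm, y):
    """Cumulative R^2 after entering LT-WM skill, WM capacity, then their product."""
    product = [a * b for a, b in zip(ltwm, wm)]
    steps = [[ltwm], [ltwm, wm], [ltwm, wm, product]]
    return [r_squared(cols, y) for cols in steps]

# Made-up scores for six pilots.
ltwm = [0.1, 0.4, 0.2, 0.5, 0.3, 0.6]
wm = [2.0, 3.0, 5.0, 1.0, 4.0, 2.0]
sensitivity = [1.0, 1.6, 1.2, 1.9, 1.1, 2.0]
r2_steps = hierarchical_r2(ltwm, wm, sensitivity)
r2_changes = [r2_steps[0], r2_steps[1] - r2_steps[0], r2_steps[2] - r2_steps[1]]
```

Each entry of `r2_changes` is the increment in explained variance attributable to the variable entered at that step, given the variables entered before it.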
The results of the hierarchical regression analyses for LT-WM skill, WM capacity, and their interaction predicting novice and expert SA performance are summarized in Tables 5 and 6, respectively. A check for collinearity between the predictor variables indicated that LT-WM skill and WM capacity could both be retained in the model (tolerance = .95, r = .23, p > .10). To help interpret these results, the relations of WM capacity to SA performance as a function of LT-WM skill for novices and experts are shown in Figures 4a and 4b, respectively. In these figures, high LT-WM skill pilots are pilots with scores equal to their group mean or higher for the LT-WM skill measure, and low LT-WM skill pilots have scores lower than the group mean.
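The collinearity check and the high/low grouping used for the figures can be sketched as follows. With only two predictors, tolerance reduces to one minus their squared correlation, which reproduces the reported value from r = .23; whether the reported tolerance was computed this way is an assumption. The function names are hypothetical.

```python
def tolerance_two_predictors(r):
    """Tolerance of either predictor when the model has exactly two predictors."""
    return 1.0 - r ** 2

def mean_split(scores):
    """Label each score "high" if at or above the group mean, otherwise "low"."""
    mean = sum(scores) / len(scores)
    return ["high" if score >= mean else "low" for score in scores]

tolerance = tolerance_two_predictors(0.23)  # r reported in the text
groups = mean_split([1.0, 3.0, 2.0])        # made-up scores
```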
[FIGURE 4 OMITTED]
As shown in Table 5 and Figure 4a, only WM capacity (spatial measure) predicted novice SA performance, [R.sup.2.sub.change] = .23, B = 0.25, [F.sub.change](1, 22) = 7.10, p < .01. This suggests that the higher the WM capacity (spatial span), the better the performance on the SA task. The overall model with LT-WM skill, WM capacity, and their interaction marginally predicted novice SA performance, [R.sup.2] = .28, [R.sup.2.sub.adjusted] = .18, F(3, 21) = 2.75, p < .06. In contrast, Table 6 and Figure 4b show that LT-WM skill (spatial control measure) predicted expert SA performance, [R.sup.2.sub.change] = .21, B = 1.21, [F.sub.change](1, 25) = 6.68, p < .01, which suggests that the higher the LT-WM skill, the better the SA performance. Also, there was a significant LT-WM Skill x WM Capacity interaction in predicting expert SA performance, [R.sup.2.sub.change] = .18, B = -1.14, [F.sub.change](1, 23) = 6.89, p < .01. Figure 4b suggests that WM capacity facilitates SA performance for low LT-WM skill experts but not for high LT-WM skill experts. Moreover, for experts with high levels of LT-WM skill, relying on WM capacity decreases SA performance. The overall model with LT-WM skill, WM capacity, and their interaction significantly predicted expert SA performance, [R.sup.2] = .39, [R.sup.2.sub.adjusted] = .31, F(3, 23) = 4.97, p < .01.
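The relation between the reported increments in variance and the reported change statistics can be checked with the standard F test for an increment in R squared. The sketch below uses the rounded values from the text, so it recovers the reported statistics only approximately; the function name is hypothetical.

```python
def f_change(r2_change, r2_after_step, df_added, df_residual):
    """F statistic for the increment in R^2 contributed by one entry step.

    r2_after_step: cumulative R^2 once the step's variables are in the model.
    df_residual: n - k - 1, with k the number of predictors after the step.
    """
    return (r2_change / df_added) / ((1.0 - r2_after_step) / df_residual)

# Novices, step 2 (WM capacity): cumulative R^2 = .05 + .23 = .28.
novice_wm = f_change(0.23, 0.28, 1, 22)
# Experts, step 1 (LT-WM skill): cumulative R^2 = .21.
expert_ltwm = f_change(0.21, 0.21, 1, 25)
# Experts, step 3 (interaction): cumulative R^2 = .21 + .00 + .18 = .39.
expert_interaction = f_change(0.18, 0.39, 1, 23)
```

The three results fall close to the reported values of 7.10, 6.68, and 6.89; the small gaps are what one would expect from working with R squared values rounded to two decimals.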
To statistically support the conclusion that LT-WM skill, WM capacity, and their interaction affect novices' and experts' SA performance differently, we conducted the same regression analysis including expertise group as an additional predictor. As expected, the regression analysis with LT-WM skill, WM capacity, expertise group, and their interactions predicting pilot SA performance resulted in a significant Expertise Group x LT-WM Skill interaction, B = 4.46, t = 2.29, p < .02, and Expertise Group x LT-WM Skill x WM Capacity interaction, B = -1.25, t = -1.98, p < .05. These results suggest that the effect of LT-WM skill and the effect of the LT-WM Skill x WM Capacity interaction on SA performance vary with expertise. The overall model with LT-WM skill, WM capacity, expertise group, and their interactions significantly predicted pilot SA performance, [R.sup.2] = .44, [R.sup.2.sub.adjusted] = .35, F(7, 44) = 4.84, p < .01.
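The degrees of freedom reported for the combined model, F(7, 44), are consistent with seven predictors fitted to 52 pilots: three main effects, three two-way interactions, and one three-way interaction. The sketch below builds those columns; the 0/1 coding of expertise group and the function names are assumptions, since the text does not state how group was coded.

```python
def combined_model_row(ltwm, wm, expert):
    """Predictor values for one pilot; expert is 1 for experts, 0 for novices (assumed coding)."""
    return [
        ltwm,                # LT-WM skill
        wm,                  # WM capacity
        expert,              # expertise group
        ltwm * wm,           # LT-WM Skill x WM Capacity
        expert * ltwm,       # Expertise Group x LT-WM Skill
        expert * wm,         # Expertise Group x WM Capacity
        expert * ltwm * wm,  # Expertise Group x LT-WM Skill x WM Capacity
    ]

def residual_df(n_pilots, n_predictors):
    """Residual degrees of freedom for a model with an intercept."""
    return n_pilots - n_predictors - 1

row = combined_model_row(0.34, 3.5, 1)  # made-up scores for one expert
df = residual_df(52, len(row))
```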
The present research makes both theoretical contributions to understanding the cognitive constituents of SA and practical suggestions for developing pilot expertise and training. The findings from this research suggest that memory processes play a central role in building components of flight situation awareness and that the roles of WM and LT-WM change with expertise. These findings address an ongoing discussion among aviation researchers about the relationship between memory processes and SA during flight (e.g., Carretta et al., 1996; Durso & Gronlund, 1999; Endsley, 1995b). In the aviation psychology literature, there are contradictory ideas about the role of WM in SA. For example, Carretta et al. found that cognitive capacity measures based on WM, spatial reasoning, and divided attention predicted SA. In contrast, Durso and Gronlund argued that the correlation between WM and SA is the result not of WM capacity but of acquired strategies, such as LT-WM skills that allow experienced pilots to compensate for WM overload.
Our results indicate that both WM capacity and LT-WM skill play a role in flight SA, that LT-WM skill compensates for WM load, and that the respective roles of WM and LT-WM skill vary as a function of expertise. WM capacity was critical for novice pilots, whereas acquired LT-WM skills were important for expert pilots. From this pattern of results, we can infer that novices must do some explicit reasoning based on the instrument values to predict what the future instrument values will be, which would put a load on spatial WM. However, experts are more likely to have stored the relevant sequential patterns in LTM and can essentially retrieve the predicted state by associating incoming information with appropriate retrieval structures. This would put little load on their WM because they rely on the LT-WM skill instead of WM.
Stated differently, WM capacity was more predictive of performance in the earlier stage, whereas LT-WM skill was more predictive of performance in the later stage of the development of pilot expertise. This is consistent with the prediction of skill acquisition theories, which state that general cognitive ability plays a greater role in the initial stages of learning than do practiced skills (Ackerman, 1988). Our findings are also consistent with previous work that emphasized the role of LTM and skilled memory for expert SA (Endsley, 1995a, 1995b; Fracker, 1988, 1991; Hartman & Secrist, 1991). For example, Endsley (1995a) found that expert pilots were able to keep situational information available to answer SA queries for quite a long time after a freeze of system simulation. In contrast, student participants (nonexperts) showed poor performance on similar SA queries (Fracker, 1991).
It is also interesting that this study examined the domain-general individual difference characteristics of WM. Our measures of WM capacity were independent of aviation knowledge. The capacity theory of WM makes the very strong prediction that the ability to maintain arbitrary stimuli in temporary memory reflects a process that modulates performance in many different domains. We tested this hypothesis in the aviation-related context and supported the ability of the domain-general WM capacity to predict performance in a variety of complex tasks.
This research has applied implications for pilot training. Gardner (1983) has argued that a person needs to have a certain predisposition in order to become an expert in a domain, whereas Ericsson and colleagues (Ericsson & Charness, 1994; Ericsson & Lehmann, 1996) contended that a person can learn to perform expertly in a domain through deliberate practice. Our data at least partially support the view of Ericsson and colleagues. The fact that LT-WM skill correlated significantly with flight hours (r = .30, p < .03) also supports the idea that LT-WM involves domain-specific expertise built up with practice. Although individual differences in WM capacity had a significant effect on acquisition of a high level of SA for novice pilots, the effect of such individual differences was attenuated for expert pilots with extensive knowledge and skills. Indeed, the reliance on WM capacity should be low for experts who have high levels of LT-WM skill in performing their domain SA tasks. However, because WM capacity measures predict novice pilot vulnerability to SA failures, pilot instructors may be able to use screening tests that assess WM capacity to customize their training according to the students' cognitive abilities.
The findings from this research also have potential implications for the training of expert pilots. Overall, expert SA performance was predicted by LT-WM skill. This suggests a possible future role for LT-WM skill as an assessment tool for pilots. Our results provide an insight into improving pilot LT-WM skill by pointing to pilot knowledge representation that varies with expertise. As suggested by our data, expert pilots are likely to use configurations of control elements (e.g., bank, pitch, and power), rather than performance elements (e.g., heading, altitude, and airspeed), to represent their flight situations. Accordingly, operations on the control elements can be cognitively different from operations on the performance elements.
Thus one way to improve pilot SA would be to familiarize student pilots with the control features of cockpit situations for event-based training. One of the basic formulas taught in ground school training is "attitude + power = performance," which means that an appropriate combination of attitude (bank and pitch) and power results in desirable aircraft performance (see Dogan, 1991). Expanding student pilots' repertoires of meaningful control flight patterns of attitude and power can facilitate their skilled encoding and retrieval of information from LT-WM. This also suggests that further research is needed to determine whether such explicit training will improve individual LT-WM skill at a given level of expertise. For example, if low LT-WM pilots could be trained to organize and access their knowledge more efficiently, would this improve their LT-WM skill and, in turn, their flight SA performance?
In summary, the present research makes a significant contribution to the goal of determining the cognitive processes that support flight SA. Understanding the cognitive processes that support flight SA will allow aviation professionals to move beyond the retrospective diagnosis of SA failures. If they can anticipate when cognitive vulnerabilities and flight situations will interact and result in SA failures, they gain the opportunity to intervene and to remedy failures before they happen (Durso & Gronlund, 1999).
TABLE 1: Mean Spatial and Verbal Span Scores for Novices and Experts

                          Novice                        Expert
Span Measures       M      SD     Min.   Max.     M      SD     Min.   Max.
Spatial span        3.15   1.08   1.00   5.00     3.42   1.12   1.50   5.00
Verbal span         3.59   0.92   2.00   5.00     3.63   0.98   2.00   5.00

TABLE 2: Mean LT-WM Scores for Novices and Experts as a Function of Situation Presentation Modality (Spatial vs. Verbal) and Flight Element Type (Control vs. Performance)

                              Novice          Expert
LT-WM Measures                M      SD       M      SD
Spatial presentation
  Control element             .18    .22      .34    .19
  Performance element         .20    .12      .26    .13
Verbal presentation
  Control element             .26    .17      .33    .18
  Performance element         .26    .11      .33    .15

TABLE 3: Mean Judgment Sensitivity, Bias Scores, and Correct Judgment Latency for Novices and Experts in the SA Task

                              Novice          Expert
SA Measures                   M      SD       M      SD
Judgment sensitivity          1.20   0.53     1.62   0.50
Judgment bias                 -0.32  0.57     -0.24  0.54
Judgment latency              8.94   2.61     8.37   2.77

TABLE 4: Correlation between Memory Measures and SA Judgment Sensitivity for Novices and Experts

Memory Measures               Novice    Expert
WM capacity
  Spatial span                .52 *     .10
  Verbal span                 .30       .10
LT-WM skill
  Spatial control             .22       .46 *
  Spatial performance         .08       .24
  Verbal control              -.24      -.08
  Verbal performance          .17       .21

* p < .01.

TABLE 5: Results of Hierarchical Regression Analyses Predicting Novice SA Performance

Criterion Variable   Predictor Variable           [R.sup.2.sub.change]   B       [F.sub.change]
SA performance       LT-WM skill                  .05                    0.54    1.16
                     WM capacity                  .23                    0.25    7.10 *
                     LT-WM skill x WM capacity    .00                    0.11    0.06

* p < .01.

TABLE 6: Results of Hierarchical Regression Analyses Predicting Expert SA Performance

Criterion Variable   Predictor Variable           [R.sup.2.sub.change]   B       [F.sub.change]
SA performance       LT-WM skill                  .21                    1.21    6.68 *
                     WM capacity                  .00                    0.01    0.01
                     LT-WM skill x WM capacity    .18                    -1.14   6.89 *

* p < .01.
Ackerman, P. L. (1988). Determinants of individual differences during skill acquisition: A theory of cognitive abilities and information processing. Journal of Experimental Psychology: General, 117, 299-329.
Adams, M. J., Tenney, Y. J., & Pew, R. W. (1995). Situation awareness and the cognitive management of complex systems. Human Factors, 37, 85-104.
Baddeley, A. D. (1986). Working memory. New York: Oxford University Press.
Carretta, T. A., Perry, D. C., & Ree, M. J. (1996). Prediction of situation awareness in F-15 pilots. International Journal of Aviation Psychology, 6, 21-41.
Chase, W. G., & Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 4, 55-81.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450-466.
Daneman, M., & Tardif, T. (1987). Working memory and reading skill reexamined. In M. Coltheart (Ed.), Attention and performance XII: The psychology of reading (pp. 491-508). Hillsdale, NJ: Erlbaum.
Doane, S. M., & Sohn, Y. W. (2000). ADAPT: A predictive cognitive model of user visual attention and action planning. User Modeling and User-Adapted Interaction, 10, 1-45.
Doane, S. M., Sohn, Y. W., & Jodlowski, M. T. (2004). Pilot ability to anticipate the consequences of flight actions as a function of expertise. Human Factors, 46, 92-103.
Dogan, P. (1991). The instrument flight training manual. Seattle, WA: Aviation Book.
Dominquez, C. (1994). Can situation awareness be defined? In M. Vidulich, C. Dominquez, E. Vogl, & G. McMillan (Eds.), Situation awareness: Papers and annotated bibliography (AL/CF-TR--1994-0085, pp. 5-15). Wright-Patterson Air Force Base, OH: Armstrong Laboratory.
Durso, F. T., & Gronlund, S. D. (1999). Situation awareness. In F. T. Durso, R. Nickerson, R. Schvaneveldt, S. Dumais, S. Lindsay, & M. Chi (Eds.), The handbook of applied cognition (pp. 283-314). New York: Wiley.
Endsley, M. R. (1990). Predictive utility of an objective measure of situation awareness. In Proceedings of the Human Factors Society 34th Annual Meeting (pp. 41-45). Santa Monica, CA: Human Factors and Ergonomics Society.
Endsley, M. R. (1995a). Measurement of situation awareness in dynamic systems. Human Factors, 37, 65-84.
Endsley, M. R. (1995b). Toward a theory of situation awareness in dynamic systems. Human Factors, 37, 32-64.
Endsley, M. R., & Bolstad, C. A. (1994). Individual differences in pilot situation awareness. International Journal of Aviation Psychology, 4, 241-264.
Endsley, M. R., & Garland, D. J. (Eds.). (2000). Situation awareness analysis and measurement. Mahwah, NJ: Erlbaum.
Ericsson, K. A., & Charness, N. (1994). Expert performance: Its structure and acquisition. American Psychologist, 49, 725-747.
Ericsson, K. A., & Delaney, P. F. (1999). Long-term working memory as an alternative to capacity models of working memory in everyday skilled performance. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 257-297). Cambridge, UK: Cambridge University Press.
Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102, 211-245.
Ericsson, K. A., & Lehmann, A. C. (1996). Expert and exceptional performance: Evidence of maximal adaptation to task constraints. Annual Review of Psychology, 47, 273-305.
Fracker, M. L. (1988). A theory of situation assessment: Implications for measuring situation awareness. In Proceedings of the Human Factors Society 32nd Annual Meeting (pp. 102-106). Santa Monica, CA: Human Factors and Ergonomics Society.
Fracker, M. L. (1991). Measures of situation awareness: An experimental evaluation (AL-TR--1991-0127). Wright-Patterson Air Force Base, OH: Armstrong Laboratory.
Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York: Basic Books.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
Gronlund, S. D., Ohrt, D. D., Manning, A. C., Dougherty, M. R. P., & Perry, J. L. (1998). Role of memory in air traffic control. Journal of Experimental Psychology: Applied, 4, 263-280.
Gugerty, L. J. (1997). Situation awareness during driving: Explicit and implicit knowledge in dynamic spatial memory. Journal of Experimental Psychology: Applied, 3, 42-66.
Gugerty, L. J., & Tirre, W. C. (2000). Individual differences in situation awareness. In M. R. Endsley & D. J. Garland (Eds.), Situation awareness analysis and measurement (pp. 249-276). Mahwah, NJ: Erlbaum.
Hartman, B. O., & Secrist, G. E. (1991). Situational awareness is more than exceptional vision. Aviation, Space, and Environmental Medicine, 62, 1084-1089.
Jones, D. G., & Endsley, M. R. (2000). Overcoming representational errors in complex environments. Human Factors, 42, 367-378.
Joslyn, S., & Hunt, E. (1998). Evaluating individual differences in response to time-pressure situations. Journal of Experimental Psychology: Applied, 4, 16-43.
Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99, 122-149.
Larkin, J., McDermott, J., Simon, D. P., & Simon, H. A. (1980). Expert and novice performance in solving physics problems. Science, 208, 1335-1342.
O'Hare, D. (1997). Cognitive ability determinants of elite pilot performance. Human Factors, 39, 540-552.
Peterson, L. R., & Peterson, M. (1959). Short-term retention of individual items. Journal of Experimental Psychology, 58, 193-198.
Sarter, N. B., & Woods, D. D. (1991). Situation awareness: A critical but ill-defined phenomenon. International Journal of Aviation Psychology, 1, 45-57.
Shah, P., & Miyake, A. (1996). The separability of working memory resources for spatial thinking and language processing: An individual differences approach. Journal of Experimental Psychology: General, 125, 4-27.
Sohn, Y. W., & Doane, S. M. (1997). Cognitive constraints on computer problem-solving skills. Journal of Experimental Psychology: Applied, 3, 288-312.
Stokes, A. F., Kemper, K., & Kite, K. (1997). Aeronautical decision making, cue recognition, and expertise under time pressure. In C. E. Zsambok & G. Klein (Eds.), Naturalistic decision making (pp. 183-196). Mahwah, NJ: Erlbaum.
Wickens, C. D. (1999). Cognitive factors in aviation. In F. T. Durso, R. S. Nickerson, R. W. Schvaneveldt, S. T. Dumais, D. S. Lindsay, & M. T. H. Chi (Eds.), Handbook of applied cognition (pp. 247-282). Chichester, UK: Wiley.
Date received: July 13, 2002
Date accepted: March 8, 2004
Young Woo Sohn is an assistant professor of psychology at Yonsei University. He received his Ph.D. in psychology from the University of Illinois at Urbana-Champaign in 1999.
Stephanie M. Doane is a professor of psychology at Mississippi State University. She received her Ph.D. in psychology from the University of California, Santa Barbara, in 1986.
Address correspondence to Young Woo Sohn, Department of Psychology, Yonsei University, Seoul, 120-749, Korea; firstname.lastname@example.org.
|Title Annotation:||Cognitive Processes|
|Author:||Sohn, Young Woo; Doane, Stephanie M.|
|Date:||Sep 22, 2004|