# Trends in research design and data analytic strategies in organizational research.

In planning empirical research, choices have to be made about
research design (experimental vs. nonexperimental), research setting
(e.g., laboratory vs. natural setting), measures (e.g., questionnaires,
observations of behavior), and data analysis strategies (e.g., analysis
of variance, multiple regression, covariance structure analysis) and a
host of other factors (cf. Kerlinger, 1986; Runkel & McGrath, 1972;
Stone, 1978). In the coming years these choices may be influenced
increasingly by recent growth in the availability and ease of use of
covariance structure analysis (CSA)-based computer programs (e.g., EQS,
LISREL) for assessing the plausibility of models that: (1) posit causal
relationships between latent constructs on the basis of covariances
among observed variables; or (2) posit that scores on observed
variables are a function of underlying constructs. At present,
however, little is known about the effects that such CSA programs have
had on the choices that researchers make in planning empirical research.
Thus, the major purpose of the present study was to assess trends in
research design and data analytic strategies over the time period that
immediately preceded and followed the introduction of the LISREL
software (one of the two major programs for CSA) by Joreskog &
Sorbom (1976).

Causal modeling procedures have been used for several decades in the biological, social, and behavioral sciences (cf. Asher, 1976; Blalock, 1964, 1971; Bollen, 1989; James, Mulaik & Brett, 1982). Initial work on one of the earliest forms of causal modeling, what has been referred to as classical path analysis (Joreskog & Sorbom, 1989), was performed by Wright (1921, 1934, 1960; cf. Bollen, 1989; Joreskog & Sorbom, 1989). This early work was followed by refinements in classical path analysis procedures and the development of a variety of other correlation-based techniques (e.g., partial correlation, multiple correlation, cross-lagged panel correlation) for assessing the degree to which relationships between variables are consistent with an assumed causal model that links the variables (cf. Blalock, 1961; Duncan, 1966; Lazarsfeld, 1948, 1972; James et al., 1982; Kenny, 1979; Rozelle & Campbell, 1969; Simon, 1954, 1971). It is worth noting that, subsequent to their development and popularization, several of these techniques have been shown to be ineffective in modeling presumed causal connections between variables and prone to yielding misleading results. For example, Rogosa (1980) demonstrated numerous problems with the cross-lagged panel correlation strategy that was once thought to be useful (e.g., Kenny, 1975; Rozelle & Campbell, 1969) in modeling causal processes using data from longitudinal studies.

In recent years, notable advancements in causal modeling procedures have stemmed from the work of a number of individuals, including Joreskog and his colleagues (e.g., Joreskog, 1970, 1973, 1978; Joreskog & Sorbom, 1976, 1989) and Bentler and his coworkers (e.g., Bentler, 1985). Joreskog and his associates developed structural equation model (SEM)-based procedures and related computer programs that rely on the analysis of observed covariances between measured variables. These covariance structure analysis (CSA)-based programs include the now popular LISREL8 program and its predecessors (e.g., Joreskog & Sorbom, 1976, 1989). Other programs for performing CSA-based analyses (i.e., EQS and EQS/PC) were developed by Bentler (1985).

Analyses performed by the LISREL and EQS programs consider two major issues. One is the extent to which a theory-based model describing hypothesized causal connections between latent variables is consistent with an observed set of covariances between measured variables. Such analyses are concerned with the testing of latent variable models. The second major issue considered by CSA programs is the degree to which observed variables are a function of a hypothesized set of latent variables. Such analyses are concerned with measurement models. Of course, SEM procedures may simultaneously consider both of these issues. In this case, general or full models are the focus of the analysis (cf. Bollen, 1989; Joreskog & Sorbom, 1989).

The availability of such CSA software as EQS and LISREL may have resulted in changes in the way that organizational researchers (i.e., individuals in such fields as management, industrial and organizational psychology, organizational behavior, organizational theory, organizational communication) and researchers concerned with a host of other phenomena have approached the task of data analysis. More specifically, the existence of CSA-based software may have led to an increase in the use of structural equation modeling (SEM) procedures by researchers.

The existence of such software also may have led to systematic changes in the research designs used by investigators in various academic disciplines. For example, the availability of CSA software may have motivated organizational researchers to use non-experimental designs more frequently than in previous years. Several factors may account for this: First, it is often difficult, if not impossible, to conduct experimental studies in organizational contexts. Thus, nonexperimental studies, often involving questionnaire measures, may be perceived as easier to perform than experimental studies. Second, researchers may be reluctant to perform experimental research in laboratory contexts because of fears that such studies may be viewed as having low levels of external validity and as less likely to be published than the findings of nonexperimental, field-based research. Third, some organizational researchers may assume, quite incorrectly, that it is appropriate to derive causal inferences from studies that use nonexperimental designs if the data from such studies are analyzed with CSA-based procedures (e.g., EQS, LISREL). In view of these factors, the popularity of nonexperimental studies using CSA may have increased in recent years.

Shifts in prevailing designs and data analytic strategies are important for several reasons. First, such shifts may signal the need for changes in the content of university-based training programs. For example, courses in CSA procedures might be required of all individuals in graduate-level training programs in the organizational sciences (e.g., management, industrial and organizational psychology, organizational theory, organizational communication, and organizational behavior) and other disciplines. Second, as a result of such shifts, more emphasis might be placed on nonexperimental research designs and strategies in graduate-level research methods courses in various disciplines, including the organizational sciences. Third, to the degree that CSA procedures are used by researchers, "gate-keepers" of various types (e.g., journal editors, program chairs of conferences of scientific societies, members of editorial boards, reviewers of conference papers) will need to have sufficient familiarity with both CSA methods and the characteristics of nonexperimental research to provide competent evaluations of papers that are submitted for review.

Unfortunately, at present, there are no data on the extent to which CSA procedures are being used by organizational researchers. Moreover, there are no data on the degree to which research designs of various types are being used by organizational researchers. Thus, the overall purpose of the present study was to provide answers to two major questions: First, has the availability of such techniques as path analysis, LISREL and EQS led to changes (over time) in the way that data are analyzed by researchers in the organizational sciences? Second, have the basic designs used by researchers in the organizational sciences shifted over time? In order to provide answers to these questions we conducted a content analysis of articles published in the Journal of Applied Psychology for the 1975-1993 period. The results of this analysis were used for two major purposes. First, we assessed trends in the use of basic research designs (i.e., experiments, nonexperiments, and other designs) over time. Second, we appraised trends in the use of data analytic strategies that involve: (1) the comparison of group means (e.g., t tests of mean differences, analysis of variance, multivariate analysis of variance); (2) the assessment of relationships between two or more variables (e.g., zero-order correlation, multiple regression); and (3) the analysis of covariance structure data (e.g., testing of structural equation models). The major interest in these assessments was to determine if the frequency of use of given procedures and/or designs has increased or decreased systematically over the 1975-1993 period. To assess this we computed zero-order correlation coefficients between the year articles were published (time) and annual usage indices (described below) for various research design and data analytic strategy types. In addition, we examined yearly usage data for high and low points.
Moreover, in instances where there were noticeable shifts in usage we tested for the presence of changes using appropriate statistical techniques (e.g., test of differences between proportions). These tests considered either selected pairs of years or selected sets of years for which there were apparent differences in usage.

Method

Sample

Data used in the present study were derived from content analyses of 1,929 articles published in the Journal of Applied Psychology during the 1975-1993 period (total of 19 years). The number of studies published per year ranged from 78 to 150, and averaged 101.53. Table 1 shows sample sizes for each of the years in the period.

Measures

Each article was coded in terms of several criteria: The first criterion was basic design. The categories under this rubric were: (1) experimental studies, comprising randomized or true experiments (including statistical simulations) and quasi-experiments; (2) nonexperiments or passive observational studies; and (3) other designs, including meta-analyses, narrative literature reviews, and comments. Note that a single article might report the results of research using more than one design. For example, a multi-study article might consider the results of both a true experiment conducted in a laboratory context and a nonexperiment conducted in a field setting. In cases of multi-study articles we coded each design type that was used.

[TABULAR DATA FOR TABLE 1 OMITTED]

The second basis for categorization was the principal data analytic procedure(s) used in the study. Here the coding was restricted to procedures that were directly relevant to testing a study's hypotheses or answering its research questions. Thus, for example, in a study that used multiple regression to test hypotheses about relationships between predictor variables and a dependent variable, a researcher might report the means, standard deviations, and reliabilities of measured variables. However, for purposes of the present study, only the regression analysis was coded. All other analyses were regarded as ancillary to the study's main objective(s).

The categorization scheme for data analytic strategies considered the following 15 factors: (1) CSA procedures (e.g., LISREL, EQS); (2) classical path analysis; (3) zero-order correlation; (4) multiple regression/correlation; (5) canonical correlation; (6) discriminant function analysis and multiple discriminant function analysis; (7) factor analysis (common factor and principal components); (8) cluster analysis; (9) analysis of variance (ANOVA); (10) analysis of covariance (ANCOVA); (11) chi-square based tests of association; (12) multivariate analysis of variance (MANOVA); (13) multivariate analysis of covariance (MANCOVA); (14) t tests of mean differences; and (15) other data analytic strategies (e.g., multidimensional scaling, nonparametric analysis of variance). Of course, in many instances more than one data analytic strategy was used in a given study. In such instances coding was performed for all principal strategies used.

Annual percentage use indices were computed for each of the design types and data analytic strategies. In these indices the frequency of use of a given type of research design or data analytic strategy was divided by the number of articles published per year and the quotient was multiplied by 100. For example, the percentage use index (PUI) for nonexperimental designs for 1975 was (88 / 150) x 100 = 58.67%. Note that a PUI can range from 0 to 100 and it reflects the percentage of articles that used a particular type of design or data analytic strategy in a given year.
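The PUI computation can be sketched in a few lines of code; the function name is illustrative, and the counts are those from the 1975 example above:

```python
def percentage_use_index(frequency, articles_per_year):
    """Percentage use index (PUI): the percentage of a year's
    articles that used a given design or data analytic strategy."""
    return (frequency / articles_per_year) * 100

# The 1975 example from the text: 88 articles using
# nonexperimental designs out of 150 published articles.
pui_1975 = percentage_use_index(88, 150)
print(round(pui_1975, 2))  # 58.67
```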

Procedure

Each article was examined by one of two coders (Jennifer Glenar or Amy Weaver), who completed a coding sheet classifying it in terms of the above-noted categories for research designs and data analytic strategies. All aspects of a study were coded. To avert at least one form of systematic bias in the coding, one of the coders coded articles for even years (e.g., 1976, 1978) while the other coded articles for the odd years (e.g., 1977, 1979).

In order to assess the consistency of the coding process (i.e., the reliability of the coders), articles in six issues of 1987, three issues of 1992, and two issues of 1993 were coded by both coders. Consistency (reliability) was assessed by calculating the percentage agreement between the coders. This was 88% for 1987, 86% for 1992, and 91% for 1993. The less than perfect reliability stemmed primarily from differences in the ways that the coders dealt with supplementary analyses. For example, in a study that used analysis of variance to test one or more hypotheses, one coder might have coded post hoc tests of mean differences while the other coder did not.
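Percentage agreement of this sort can be computed with a short routine like the following; the coder data shown are hypothetical, not the study's actual codes:

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of articles for which two coders assigned
    identical sets of categories."""
    assert len(codes_a) == len(codes_b)
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)

# Hypothetical codes: each entry is the set of principal
# strategies one coder recorded for an article. The coders
# agree on 3 of 4 articles.
coder_1 = [{"ANOVA"}, {"regression"}, {"ANOVA", "t test"}, {"CSA"}]
coder_2 = [{"ANOVA"}, {"regression"}, {"ANOVA"}, {"CSA"}]
print(percent_agreement(coder_1, coder_2))  # 75.0
```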

Analyses

In order to assess trends in the use of research design types and data analytic strategies, we computed zero-order correlation coefficients between time period (year of study) and the annual PUI values for design types and analysis strategies. Although this analysis can reveal linear trends, it may underestimate or fail to detect nonlinear trends. Note, moreover, that because the study considered PUI indices for only 19 years, correlation coefficients had to equal or exceed .456 to be statistically significant at the .05 level (two-tailed test).
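The .456 threshold follows from the usual t test of a correlation with df = n - 2. A quick check, taking the two-tailed .05 critical t of roughly 2.110 for 17 df from a standard t table:

```python
import math

def critical_r(t_crit, n):
    """Smallest |r| that reaches significance in a two-tailed
    t test of a zero-order correlation with df = n - 2."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

# For n = 19 years, df = 17; the two-tailed .05 critical t
# is about 2.110 (from a standard t table).
print(round(critical_r(2.110, 19), 3))  # 0.456
```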

In cases where there appeared to be non-chance based differences in the PUI values for a research design type or a data analytic strategy, we tested for the statistical significance of such differences using either small or large sample based tests of the equality of proportions (cf. Marascuilo & Serlin, 1988, pp. 323-327). As appropriate, these tests involved the PUIs of either pairs of years or two sets of years.
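The large-sample test of equal proportions can be sketched as follows, using the pooled estimate for the standard error. The yearly article counts in the example are illustrative only, since the actual counts appear in the omitted Table 1:

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Large-sample z test of the equality of two proportions,
    using the pooled proportion to form the standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Illustrative comparison of two yearly PUIs expressed as
# proportions; n1 and n2 are hypothetical article counts.
z = two_proportion_z(0.4811, 106, 0.3295, 88)
print(round(z, 2))
```

With proportions and counts in the neighborhood of those reported in the Results, this yields Z values close to the ones reported there.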

Results

Research Designs Used in Coded Articles

Rows 2-4 of Table 1 show a yearly breakdown of articles in terms of the research design criterion. The corresponding PUI values are plotted in Figure 1. As this figure shows, the PUI for nonexperimental designs decreased considerably from 1975 to 1978, then fluctuated somewhat about a mean of 47.85% for the 1979 to 1991 period. Interestingly, in 1992 the PUI for nonexperimental designs was 43.96%, well below the 1991 PUI of 50.0%, and it fell even more by 1993 (i.e., to 36.73%). Note, moreover, that the 1975 and 1993 PUIs for nonexperimental designs differed markedly from one another. In spite of this, as the results in Table 2 show, the relationship between time period and PUI for nonexperimental research (r = -.38) was not statistically significant.

The PUI for experimental designs (randomized or quasi) varied somewhat around a mean of 42.33% for the 1975-1982 period. Overall, this PUI decreased considerably from 1982 (PUI = 48.11%) to 1990 (PUI = 32.95%): A test of the difference between the corresponding proportions yielded a Z of 2.14, p = .016. Interestingly, the PUI for experimental designs increased in the 1990-1993 period, equaling 43.88% at the end of the period, somewhat below the 1980 high of 49.45%. Overall, however, there was not a consistent upward or downward trend in the use of experimental designs: The correlation between time and the PUI for experimental designs was a mere -.20 (p [greater than] .05).

PUI values for studies using other designs (e.g., literature reviews, meta analyses) remained fairly low for the entire 1975-1993 period, the average PUI being 13.37% for this period. Note, moreover, that the PUIs were below 24% for the entire period, reaching a peak of 23.08% in 1983. Interestingly, as the results in Figure 1 show, there appeared to be an increase in the use of other designs over the study period. Consistent with this apparent trend, the correlation between time and the PUI for other designs was .71 (p [less than] .01).

[TABULAR DATA FOR TABLE 2 OMITTED]

Data Analytic Strategies Used in Coded Articles

Rows 5-21 of Table 1 show PUIs for data analytic strategies used in the 1975-1993 period. Figure 2 plots the yearly PUI levels for articles that tested for univariate or multivariate mean differences. The PUIs for MANCOVA, ANCOVA, and MANOVA were generally below 10% for the entire period. Overall, however, there was a slight, but fairly steady increase in the use of the MANOVA strategy over the 1975-1993 period: The correlation between time and the PUIs for MANOVA was .84 (p [less than] .01). There was also an increase in the use of the ANCOVA strategy over the study period: The correlation between time and the PUIs for ANCOVA was .74 (p [less than] .01).

The mean PUIs for the MANOVA, ANCOVA, and MANCOVA strategies across the 1975-1993 period were 7.93%, 2.85%, and 0.73%, respectively (see Table 1). Compared to these PUIs, the mean PUIs for the t test strategy and the ANOVA strategy were considerably greater, i.e., 21.67% and 35.51%, respectively. Note, however, that over time there was a general decrease in the use of the ANOVA strategy: As the results in Table 2 indicate, there was a -.54 (p [less than] .05) correlation between time and the PUIs for ANOVA. Note also that although the mean PUI for ANOVA was 38.42% for the 1975-1984 period, it was only 32.05% for the 1985-1993 period, a statistically significant decrease (Z = 2.91, p = .002). In addition, the 1993 PUI of 35.71% for ANOVA was considerably below the 1980 high of 46.15% for this strategy.

Figure 3 displays PUIs for selected correlation-based data analytic strategies (i.e., zero-order correlation, multiple regression, classical path analysis, and factor analysis). As can be seen in this figure, the PUI for zero-order correlation varied considerably about a mean of 42.09% for the 1975-1993 period and equaled 38.78% in 1993, about the same as the 1975 PUI of 42.67%. Overall, as the results in Figure 3 show, the use of multiple regression increased considerably over the period considered by the present study: Consistent with this seeming trend, the zero-order correlation between time and the PUIs for multiple regression was .68 (p [less than] .01). The increase in the use of this data analytic strategy is further illustrated by the shift in the 1975 and 1993 PUIs, i.e., 10.67% and 24.49%, respectively. Note finally that although there was quite a gap between the PUIs for zero-order correlation and multiple regression in 1975, by 1993 this gap had narrowed substantially.

Interestingly the PUIs for classical path analysis were below 6% for the entire 1975-1993 period. Moreover, the PUI for this strategy dropped to 0% by 1993. Overall, however, there was no stable trend in the use of this technique over the 19 year study period: The zero-order correlation between time and the PUIs for path analysis was .42 (p [greater than] .05).

The PUIs for exploratory factor analysis were generally below 10% for the 1975-1993 period and averaged only 8.45% for the entire period. Although an inspection of Figure 3 suggests that there was a slight decrease in the use of exploratory factor analysis, the zero-order correlation between time and the PUIs for exploratory factor analysis was a mere -.18 (p [greater than] .05).

Figure 4 shows trends in the use of CSA techniques that test three different models: (1) measurement models only; (2) structural models only; and (3) both measurement and structural models (full models). For purposes of comparison the same figure also shows PUI levels for classical path analysis and exploratory factor analysis. As can be seen in the figure, there were no tests of measurement models in the 1975-1984 period. However, by 1993 the PUI for CSA-based tests of measurement models was 8.16%. Overall, there was a consistent upward trend in the PUIs for CSA-based tests of measurement models: The zero-order correlation between time and the PUIs for CSA-based tests of such models was .86 (p [less than] .01).

At the same time as the use of CSA-based tests of measurement models increased, there was no similar trend in the use of exploratory factor analysis: The zero order correlation between time and the PUIs for exploratory factor analysis was -.18 (p [greater than] .05). Interestingly, however, although the PUIs for exploratory factor analyses reached levels of 15.25% in 1976, 16.04% in 1982, 11.34% in 1986, and 11.83% in 1987, in the 1990s the PUIs for this procedure never exceeded 10%. Moreover, the 1993 PUI for exploratory factor analysis was only 6.12%.

As the results in Figure 4 suggest, over the study period there was an increase in the use of CSA procedures for testing structural models: The zero-order correlation between time and the PUIs for CSA-based tests of structural models was .62 (p [less than] .01). Note also that there was: (1) a .68 (p [less than] .01) correlation between the PUIs for CSA-based tests of structural models and CSA-based tests of measurement models; and (2) a .84 (p [less than] .01) correlation between CSA-based tests of structural models and CSA-based tests of full models. These results suggest concomitant increases in the use of the three CSA-based procedures.

Figure 4 also shows that although there was only one report of a CSA-based test of a full model in the 1975-1985 period, by 1993, 10.20% of the published articles reported such tests. Moreover, the zero-order correlation between time and the PUIs for full model tests was .84 (p [less than] .01). Overall, there was a dramatic rise in the use of CSA-based tests of full models over the 1975-1993 period.

Some Other Noteworthy Findings

Two other general trends are worthy of note. First, as can be seen in Table 2, there was a -.64 (p [less than] .01) correlation between the PUIs for ANOVA and multiple regression. As the use of ANOVA decreased the use of multiple regression increased. In view of the fact that neither of these PUIs correlated with the PUIs for either experimental or nonexperimental designs, the -.64 correlation appears to reflect a shift in data analytic strategies rather than a shift in design type. Second, over time there was a trend toward the increased use of multivariate data analytic strategies. This is illustrated, for example, by several correlation coefficients in Table 2, including: (1) the .66 (p [less than] .01) correlation between the PUIs for MANOVA and multiple regression; (2) the .58 (p [less than] .01) correlation between the PUIs for ANCOVA and MANCOVA; and (3) the .72 (p [less than] .01) correlation between the PUIs for MANCOVA and multiple regression.

Discussion

The present study had two major purposes. One was to assess trends in the use of non-experimental versus experimental designs in research reported during the 1975-1993 period. The other was to measure trends in the use of various data analytic strategies over the same period. In order to accomplish these purposes, we coded information from 1,929 articles that were published in the Journal of Applied Psychology during the 1975-1993 period. Results of various analyses showed that although there were differences from one year to the next in the PUIs for nonexperimental and experimental designs, there did not appear to be consistent upward or downward trends in these indices. However, there was an increase in the use of other types of designs over the 1975-1993 period. In addition, the results showed notable changes in the PUIs for several of the statistical procedures that test for univariate or multivariate mean differences, relative stability in the PUIs for zero-order correlation, an increase in the PUIs for multiple regression and several other multivariate statistical procedures, and substantial growth in the PUIs for CSA-based data analytic techniques (e.g., EQS, LISREL). Implications of these findings are now considered.

Declines in the Use of Specific Data Analytic Strategies

Simple data analytic strategies. The PUIs in Table 1 and the correlation coefficients in Table 2 showed a gradual decline in the use of some of the simpler data analytic procedures as the major means of testing hypotheses or answering research questions. For example, over time the use of ANOVA decreased somewhat. Although the PUIs for ANOVA have shown a general decline over the 1975-1993 period, there was an increase in the PUIs for MANOVA over the same period. More specifically, the PUIs for MANOVA were 4.00% in 1975 and 12.24% in 1993. There are several possible causes of this increase. One is that researchers are becoming increasingly sensitive to the need to consider: (1) the effects of treatments on multiple dependent variables; and (2) the correlations between such variables when assessing the effects of experimental treatments. Unfortunately, we have no direct evidence to support this speculation about the cause of the increase in the use of MANOVA. However, the explanation seems reasonable in view of the fact that the correlation coefficients in Table 2 show that the PUIs for several multivariate procedures increased concomitantly over the 19 year study period.

The results in Figure 3 and the corresponding yearly PUIs in Table 1 reveal that although the use of zero-order correlation as a strategy for testing hypotheses appeared to vary considerably from one year to the next in the 1975-1993 period, there was not a consistent upward or downward trend in the use of this strategy. However, over the same period there was an increase in the use of multiple correlation/regression by researchers. The zero-order correlation between time and the use of multiple regression/correlation was .68 (p [less than] .01). Note also that although from 1975 to 1982 the PUI for multiple regression reached a maximum value of 19.81%, after 1983 the PUI for this strategy was generally well above 20%. It reached a high of 37.63% in 1987 and equaled 24.49% in 1993. These results may be a function of several factors. First, there may be an increased awareness on the part of researchers that variability in measured levels of dependent variables is typically a function of more than a single putative cause. Second, there may be increased demands for multivariate analyses by journal editors and reviewers of papers that are submitted for publication. Whatever the cause or causes of the upward trend in the use of selected multivariate statistical procedures (e.g., MANOVA, multiple regression), it appears that some of the relatively simple analyses that were used to test the hypotheses of many articles published in the 1960s and 1970s (e.g., ANOVA) will be used less and less in future years as the principal means of testing hypotheses or answering research questions.

Path analysis. Classical path analysis was used with varying frequency in the period covered by the present study. In no case, however, was this strategy used in over 5.32% of the articles published in any given year. Moreover, by 1993 there were no articles that reported the use of this strategy. Overall, therefore, it appears that the use of path analysis has all but ended. This may very well be attributable to the rapid rise in the availability of mainframe and personal computer versions of such CSA-based programs as LISREL and EQS. The CSA procedures implemented by these programs have a far better capacity to assess the plausibility of assumed causal models than does classical path analysis.

Exploratory factor analysis. As noted above, the study's results suggest that while the use of CSA-based procedures for conducting confirmatory factor analyses (i.e., testing measurement models) increased, the use of exploratory factor analysis did not. This seems to be a healthy trend. The aims of measurement are far better served by instrument development procedures that begin with a firm set of expectations about what is being assessed by a measure than they are by efforts to "discover" the structure inherent in the measure's items. In addition, construct validation efforts are better served by clearly stated hypotheses about relationships between and among measures of various constructs than they are by attempts to define higher-order constructs through exploratory methods.

Implications for Researchers Who Conduct Nonexperimental Studies

Overall, it appears that researchers who want to publish the results of nonexperimental studies in the major journals of the organizational sciences will have to learn to use and properly interpret the results of such CSA-based programs as LISREL and EQS. One reason for this is that it seems likely that journal editors, editorial board members, and reviewers of papers submitted for presentation at meetings of various scientific societies will increasingly require that researchers use CSA-based procedures to analyze data from passive observational studies that deal with multiple assumed causes and effects. Of course, this does not mean that all articles published in journals or presented at scientific conferences will have to use CSA-based methods. If current trends continue, however, not knowing CSA methods will leave researchers at a great disadvantage. Among the reasons for this are that CSA methods allow for inferences about both the direct and indirect "effects" of variables. Moreover, CSA procedures correct observed relationships between variables for the attenuation that is attributable to unreliable measurement (cf. Bollen, 1989; Joreskog & Sorbom, 1989).

A Caveat Concerning Causal Inferences

In spite of the notable advantages that CSA-based techniques have over other data analysis strategies, the results of such analyses are subject to improper interpretation. More specifically, although the findings of SEM analyses allow researchers to test the plausibility of theoretical models that link latent variables to one another and assume the existence of causal connections between such variables, the same findings do not allow for conclusions about either the existence of causality or the direction of causal flow. For example, a nonexperimental study may demonstrate that measures of variables X (a presumed cause) and Y (a presumed effect) covary. This covariance would be consistent with an underlying model that X causes Y. However, it would also be consistent with a model positing that Y causes X. In addition, it would be consistent with a model that specifies that X and Y covary, not because they affect one another, but because they share a common cause or set of causes (e.g., U, V, W). Thus, it is inappropriate to make causal inferences on the basis of data from nonexperimental research (cf. Bollen, 1989). Unfortunately, the results of CSA-based procedures that have been applied to data from nonexperimental studies have often been interpreted in a causal manner. That is, many researchers seem to operate on the basis of the incorrect belief that design deficiencies can be compensated for by sophisticated data analytic methods.

Some Limitations of the Present Study

Our study demonstrated what would appear to be some interesting and important trends in data analysis procedures for the 1975-1993 period. However, the study has two important limitations. One is that it used data from articles published in only one journal in the organizational sciences, i.e., the Journal of Applied Psychology. Articles published in this journal tend to focus more on phenomena studied at the level of the individual than at the level of the organization. Thus, the findings of our study may not be representative of research published in journals that are more macro in their orientation (e.g., Administrative Science Quarterly, Academy of Management Journal, Journal of Management). As a result, our findings may either underestimate or overestimate the use of CSA procedures in articles appearing in the more macro-oriented journals. On the one hand, underestimation might result from the fact that research concerned with macro issues is far less likely to be experimental in nature than research concerned with micro issues. As a consequence, SEM procedures would be more likely to be used in macro organizational research than the data analytic procedures that have typically been used to analyze data from experiments (e.g., ANOVA, ANCOVA, MANOVA, MANCOVA). On the other hand, our data might overestimate the use of SEM procedures in macro research for two reasons. First, macro-oriented researchers seem to have less of a psychometric orientation than micro-oriented investigators. Thus, macro-oriented researchers may be less likely than micro-oriented researchers to use SEM techniques. Second, the sample sizes of many macro-oriented studies may be too small to allow for the effective use of SEM procedures.

A second limitation of our study is that, because it spanned only the 1975-1993 period, the correlation coefficients reported in Table 2 were all based on a sample size of 19. This greatly limited the capacity of our study to detect statistically significant relationships between time and the PUIs for various research design types and data analytic strategies. For example, the -.38 correlation between time and the PUIs for nonexperimental designs, although of considerable magnitude, was not statistically significant using a two-tailed test and a .05 Type I error rate criterion. Thus, our study's results probably underestimate trends in the use of some design types and data analysis strategies.

A Final Note

Results of the present study showed evidence of what may very well be important trends in the way that researchers have analyzed data from empirical studies. More specifically, our study showed nontrivial increases in the use of multivariate data analytic strategies, including CSA-based procedures (e.g., LISREL, EQS). The latter procedures are of special importance because they allow researchers to assess (1) the plausibility of models that specify causal connections between latent variables and (2) the degree to which observed variables are a function of latent variables. If the data analytic trends revealed by the present study continue, there will need to be commensurate changes in the training of scientists and practitioners who study organizational phenomena. Moreover, many individuals who completed their graduate training before 1980 will need to learn to properly use CSA procedures and correctly interpret the output of CSA programs. We hope that our results serve to motivate this learning.

Acknowledgment: An abbreviated version of this paper was presented at the Causal Modeling Conference, Purdue University, West Lafayette, Indiana, March 3, 1994. The second and third authors contributed equally to the research reported in this article. Thus, the order of authorship for them was randomly determined. We thank Larry Williams and Dianna L. Stone for many helpful comments on an earlier version of this article.

References

Asher, H. B. (1976). Causal modeling. Beverly Hills, CA: Sage.

Bentler, P. M. (1985). Theory and implementation of EQS: A structural equations program. Los Angeles, CA: BMDP Statistical Software.

Blalock, H. M. (1961). Correlation and causality: The multivariate case. Social Forces, 39: 246-251.

-----. (1964). Causal inferences in nonexperimental research. New York: Norton.

----- (ed.) (1971). Causal models in the social sciences. Chicago, IL: Aldine.

Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.

Duncan, O. D. (1966). Path analysis: Sociological examples. American Journal of Sociology, 72: 1-16.

-----. (1975). Introduction to structural equation models. New York: Academic Press.

James, L. R., Mulaik, S. & Brett, J. (1982). Causal analysis: Assumptions, models, and data. Beverly Hills, CA: Sage.

Joreskog, K. G. (1970). A general method for analysis of covariance structures. Biometrika, 57: 239-251.

-----. (1973). A general method for estimating a linear structural equation system. Pp. 85-112 in A. S. Goldberger & O. D. Duncan (eds.), Structural equation models in the social sciences. New York: Academic Press.

-----. (1978). Structural analysis of covariance and correlation matrices. Psychometrika, 43: 443-477.

Joreskog, K. G. & Sorbom, D. (1976). LISREL: Estimation of linear structural equation systems by the method of maximum likelihood. Chicago, IL: International Educational Services.

-----. (1989). LISREL 7 user's reference guide. Mooresville, IN: Scientific Software, Inc.

Kenny, D. A. (1975). Cross-lagged panel correlations: A test for spuriousness. Psychological Bulletin, 82: 887-903.

-----. (1979). Correlation and causality. New York: Wiley.

Kerlinger, F. N. (1986). Foundations of behavioral research, 3rd ed. New York: Holt, Rinehart & Winston.

Lazarsfeld, P. F. (1948). The use of panels in social research. Proceedings of the American Philosophical Society, 92: 405-410.

-----. (1959). Latent structure analysis. Pp. 476-543 in S. Koch (ed.), Psychology: A study of science, Vol. 3. New York: McGraw-Hill.

-----. (1972). Mutual effects of statistical variables. In P. F. Lazarsfeld, A. K. Pasanella & M. Rosenberg (eds.), Continuities in the language of social research. New York: Free Press.

Marascuilo, L. A. & Serlin, R. C. (1988). Statistical methods for the social and behavioral sciences. New York: W. H. Freeman.

Rogosa, D. (1980). A critique of cross-lagged correlation. Psychological Bulletin, 88: 245-258.

Rozelle, R. M. & Campbell, D. T. (1969). More plausible rival hypotheses in the cross-lagged panel correlation technique. Psychological Bulletin, 71: 74-80.

Runkel, P. J. & McGrath, J. E. (1972). Research on human behavior: A systematic guide to method. New York: Holt, Rinehart & Winston.

Simon, H. A. (1954/1971). Spurious correlation: A causal interpretation. In H. M. Blalock (ed.), Causal models in the social sciences. Chicago, IL: Aldine-Atherton.

Stone, E. F. (1978). Research methods in organizational behavior. Homewood, IL: Scott, Foresman.

Wright, S. (1921). Correlation and causation. Journal of Agricultural Research, 20: 557-585.

-----. (1934). The method of path coefficients. Annals of Mathematical Statistics, 5: 161-215.

-----. (1960). Path coefficients and path regressions: Alternative or complementary concepts. Biometrics, 16: 189-202.

In recent years, notable advancements in causal modeling procedures have stemmed from the work of a number of individuals, including Joreskog and his colleagues (e.g., Joreskog, 1970, 1973, 1978; Joreskog & Sorbom, 1976, 1989) and Bentler and his coworkers (e.g., Bentler, 1985). Joreskog and his associates developed structural equation model (SEM)-based procedures and related computer programs that rely on the analysis of observed covariances between measured variables. These covariance structure analysis (CSA)-based programs include the now popular LISREL 8 program and its predecessors (e.g., Joreskog & Sorbom, 1976, 1989). Other programs for performing CSA-based analyses (i.e., EQS and EQS/PC) were developed by Bentler (1985).

Analyses performed by the LISREL and EQS programs consider two major issues: One is the extent to which a theory-based model describing hypothesized causal connections between latent variables is consistent with an observed set of covariances between measured variables. Such analyses are concerned with the testing of latent variable models. The second major issue considered by CSA programs is the degree to which observables are a function of a hypothesized set of latent variables. Such analyses are concerned with measurement models. Of course, SEM procedures may simultaneously consider both of these issues. In this case, general or full models are the focus of the analysis (cf. Bollen, 1989; Joreskog & Sorbom, 1989).

The availability of such CSA software as EQS and LISREL may have resulted in changes in the way that organizational researchers (i.e., individuals in such fields as management, industrial and organizational psychology, organizational behavior, organizational theory, organizational communication) and researchers concerned with a host of other phenomena have approached the task of data analysis. More specifically, the existence of CSA-based software may have led to an increase in the use of structural equation modeling (SEM) procedures by researchers.

The existence of such software also may have led to systematic changes in the research designs used by investigators in various academic disciplines. For example, the availability of CSA software may have motivated organizational researchers to use nonexperimental designs more frequently than in previous years. Several factors may account for this: First, it is often difficult, if not impossible, to conduct experimental studies in organizational contexts. Thus, nonexperimental studies, often involving questionnaire measures, may be perceived as easier to perform than experimental studies. Second, researchers may be reluctant to perform experimental research in laboratory contexts because of fears that such studies may be viewed as having low levels of external validity and as less likely to be published than nonexperimental, field-based research. Third, some organizational researchers may assume, quite incorrectly, that it is appropriate to derive causal inferences from studies that use nonexperimental designs if the data from such studies are analyzed with CSA-based procedures (e.g., EQS, LISREL). In view of these factors, the popularity of nonexperimental studies using CSA may have increased in recent years.

Shifts in prevailing designs and data analytic strategies are important for several reasons. First, such shifts may signal the need for changes in the content of university-based training programs. For example, courses in CSA procedures might be required of all individuals in graduate-level training programs in the organizational sciences (e.g., management, industrial and organizational psychology, organizational theory, organizational communication, and organizational behavior) and other disciplines. Second, as a result of such shifts, more emphasis might be placed on nonexperimental research designs and strategies in graduate-level research methods courses in various disciplines, including the organizational sciences. Third, to the degree that CSA procedures are used by researchers, "gate-keepers" of various types (e.g., journal editors, program chairs of conferences of scientific societies, members of editorial boards, reviewers of conference papers) will need to have sufficient familiarity with both CSA methods and the characteristics of nonexperimental research to provide competent evaluations of papers that are submitted for review.

Unfortunately, at present, there are no data on the extent to which CSA procedures are being used by organizational researchers. Moreover, there are no data on the degree to which research designs of various types are being used by organizational researchers. Thus, the overall purpose of the present study was to provide answers to two major questions: First, has the availability of such techniques as path analysis, LISREL, and EQS led to changes (over time) in the way that data are analyzed by researchers in the organizational sciences? Second, have the basic designs used by researchers in the organizational sciences shifted over time? In order to provide answers to these questions we conducted a content analysis of articles published in the Journal of Applied Psychology for the 1975-1993 period. The results of this analysis were used for two major purposes. First, we assessed trends in the use of basic research designs (i.e., experiments, nonexperiments, and other designs) over time. Second, we appraised trends in the use of data analytic strategies that involve: (1) the comparison of group means (e.g., t tests of mean differences, analysis of variance, multivariate analysis of variance); (2) the assessment of relationships between two or more variables (e.g., zero-order correlation, multiple regression); and (3) the analysis of covariance structure data (e.g., testing of structural equation models). The major interest in these assessments was to determine if the frequency of use of given procedures and/or designs has increased or decreased systematically over the 1975-1993 period. To assess this we computed zero-order correlation coefficients between the year articles were published (time) and annual usage indices (described below) for various research design and data analytic strategy types. In addition, we examined yearly usage data for high and low points.
Moreover, in instances where there were noticeable shifts in usage we tested for the presence of changes using appropriate statistical techniques (e.g., test of differences between proportions). These tests considered either selected pairs of years or selected sets of years for which there were apparent differences in usage.

Method

Sample

Data used in the present study were derived from content analyses of 1,929 articles published in the Journal of Applied Psychology during the 1975-1993 period (total of 19 years). The number of studies published per year ranged from 78 to 150, and averaged 101.53. Table 1 shows sample sizes for each of the years in the period.

Measures

Each article was coded in terms of several criteria: The first criterion was basic design. The categories under this rubric were: (1) experimental studies, including randomized or true experiments (including statistical simulations) and quasi-experiments; (2) nonexperiments or passive observational studies; and (3) other designs, including meta-analyses, narrative literature reviews, and comments. Note that a single article might report the results of research using more than one design. For example, a multi-study article might consider the results of both a true experiment conducted in a laboratory context and a nonexperiment conducted in a field setting. In cases of multi-study articles we coded each design type that was used.

[TABULAR DATA FOR TABLE 1 OMITTED]

The second basis for categorization was the principal data analytic procedure(s) used in the study. Here the coding was restricted to procedures that were directly relevant to testing a study's hypotheses or answering its research questions. Thus, for example, in a study that used multiple regression to test hypotheses about relationships between predictor variables and a dependent variable, a researcher might report the means, standard deviations, and reliabilities of measured variables. However, for purposes of the present study, only the regression analysis was coded. All other analyses were regarded as ancillary to the study's main objective(s).

The categorization scheme for data analytic strategies considered the following 15 factors: (1) CSA procedures (e.g., LISREL, EQS); (2) classical path analysis; (3) zero-order correlation; (4) multiple regression/correlation; (5) canonical correlation; (6) discriminant function analysis and multiple discriminant function analysis; (7) factor analysis (common factor and principal components); (8) cluster analysis; (9) analysis of variance (ANOVA); (10) analysis of covariance (ANCOVA); (11) chi-square based tests of association; (12) multivariate analysis of variance (MANOVA); (13) multivariate analysis of covariance (MANCOVA); (14) t tests of mean differences; and (15) other data analytic strategies (e.g., multidimensional scaling, nonparametric analysis of variance). Of course, in many instances more than one data analytic strategy was used in a given study. In such instances coding was performed for all principal strategies used.

Annual percentage use indices were computed for each of the design types and data analytic strategies. In these indices the frequency of use of a given type of research design or data analytic strategy was divided by the number of articles published per year and the quotient was multiplied by 100. For example, the percentage use index (PUI) for nonexperimental designs for 1975 was (88 / 150) x 100 = 58.67%. Note that a PUI can range from 0 to 100 and it reflects the percentage of articles that used a particular type of design or data analytic strategy in a given year.
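The index is simple enough to state as a one-line function (a sketch of the paper's formula; the function name is ours):

```python
def percentage_use_index(uses: int, articles: int) -> float:
    """PUI: percentage of a year's articles using a given design or strategy."""
    return (uses / articles) * 100

# The 1975 example from the text: 88 of 150 articles used a nonexperimental design.
print(f"{percentage_use_index(88, 150):.2f}%")  # 58.67%
```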

Procedure

Each article was examined by one of two coders (Jennifer Glenar or Amy Weaver) and a coding sheet was completed for it that provided for its coding in terms of the above noted set of categories for research designs and data analytic strategies. All aspects of a study were coded. To avert at least one form of systematic bias in the coding, one of the coders coded articles for even years (e.g., 1976, 1978) while the other coded articles for the odd years (e.g., 1977, 1979).

In order to assess the consistency of the coding process (i.e., the reliability of the coders), articles in six issues of 1987, three issues of 1992, and two issues of 1993 were coded by both coders. Consistency (reliability) was assessed by calculating the percentage agreement between the coders. This was 88% for 1987, 86% for 1992, and 91% for 1993. The less than perfect reliability stemmed primarily from differences in the ways that the coders dealt with supplementary analyses. For example, in a study that used analysis of variance to test one or more hypotheses, one coder might have coded post hoc tests of mean differences while the other coder did not.
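Percentage agreement of this kind can be computed as follows (an illustrative sketch with hypothetical codes, not the study's actual coding data):

```python
def percent_agreement(codes_a, codes_b):
    """Share of doubly-coded articles on which two coders assigned identical codes."""
    assert len(codes_a) == len(codes_b)
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)

# Hypothetical principal-strategy codes for ten doubly-coded articles.
a = ["ANOVA", "MR", "CSA", "t", "ANOVA", "MR", "chi2", "CSA", "t", "FA"]
b = ["ANOVA", "MR", "CSA", "t", "MANOVA", "MR", "chi2", "CSA", "t", "MR"]
print(percent_agreement(a, b))  # 80.0
```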

Analyses

In order to assess trends in the use of research design types and data analytic strategies we computed zero-order correlation coefficients between time period (year of study) and the annual PUI values for design types and analysis strategies. Although this analysis has the potential to reveal linear trends, it provides underestimates of nonlinear trends. Note, moreover, that because the study considered PUI indices for only 19 years, correlation coefficients had to equal or exceed .456 to be statistically significant at the .05 level (two-tailed test).
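The .456 threshold follows from the usual t-test of a correlation coefficient with n - 2 = 17 degrees of freedom; a short check (using SciPy; the function name is ours) reproduces it:

```python
from scipy import stats

def critical_r(n: int, alpha: float = 0.05) -> float:
    """Smallest |r| that is significant at alpha (two-tailed) with n pairs."""
    df = n - 2
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # Invert t = r * sqrt(df) / sqrt(1 - r^2) to solve for r.
    return t_crit / (t_crit ** 2 + df) ** 0.5

print(round(critical_r(19), 3))  # 0.456
```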

In cases where there appeared to be non-chance based differences in the PUI values for a research design type or a data analytic strategy, we tested for the statistical significance of such differences using either small or large sample based tests of the equality of proportions (cf. Marascuilo & Serlin, 1988, pp. 323-327). As appropriate, these tests involved the PUIs of either pairs of years or two sets of years.
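A large-sample test of the equality of two proportions can be sketched as follows (the counts are hypothetical; the returned p is one-tailed, which is consistent with the directional values reported in the Results, e.g., a Z of 2.14 with p = .016):

```python
from math import sqrt

from scipy import stats

def two_proportion_z(x1, n1, x2, n2):
    """Large-sample z test that two proportions are equal (pooled estimate)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, stats.norm.sf(abs(z))  # one-tailed p

# Hypothetical PUI counts: 50 of 100 articles in one year vs. 30 of 100 in another.
z, p = two_proportion_z(50, 100, 30, 100)
print(f"Z = {z:.2f}, p = {p:.4f}")
```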

Results

Research Designs Used in Coded Articles

Rows 2-4 of Table 1 show a yearly breakdown of articles in terms of the research design criterion. The corresponding PUI values are plotted in Figure 1. As this figure shows, the PUI for nonexperimental designs decreased considerably from 1975 to 1978, then fluctuated somewhat about a mean of 47.85% for the 1979 to 1991 period. Interestingly, in 1992 the PUI for nonexperimental designs was 43.96%, well below the 1991 PUI of 50.0%, and it fell even more by 1993 (i.e., to 36.73%). Note, moreover, that the 1975 and 1993 PUIs for nonexperimental designs differed markedly from one another. In spite of this, as the results in Table 2 show, the relationship between time period and PUI for nonexperimental research (r = -.38) was not statistically significant.

The PUI for experimental designs (randomized or quasi) varied somewhat around a mean of 42.33% for the 1975-1982 period. Overall, this PUI decreased considerably from 1982 (PUI = 48.11%) to 1990 (PUI = 32.95%): A test of the difference between the corresponding proportions yielded a Z of 2.14, p = .016. Interestingly, the PUI for experimental designs increased in the 1990-1993 period, equaling 43.88% at the end of the period, somewhat below the 1980 high of 49.45%. Overall, however, there was not a consistent upward or downward trend in the use of experimental designs: The correlation between time and the PUI for experimental designs was a mere -.20 (p > .05).

PUI values for studies using other designs (e.g., literature reviews, meta-analyses) remained fairly low for the entire 1975-1993 period, the average PUI being 13.37% for this period. Note, moreover, that the PUIs were below 24% for the entire period, reaching a peak of 23.08% in 1983. Interestingly, as the results in Figure 1 show, there appeared to be an increase in the use of other designs over the study period. Consistent with this apparent trend, the correlation between time and the PUI for other designs was .71 (p < .01).

[TABULAR DATA FOR TABLE 2 OMITTED]

Data Analytic Strategies Used in Coded Articles

Rows 5-21 of Table 1 show PUIs for data analytic strategies used in the 1975-1993 period. Figure 2 plots the yearly PUI levels for articles that tested for univariate or multivariate mean differences. The PUIs for MANCOVA, ANCOVA, and MANOVA were generally below 10% for the entire period. Overall, however, there was a slight, but fairly steady increase in the use of the MANOVA strategy over the 1975-1993 period: The correlation between time and the PUIs for MANOVA was .84 (p < .01). There was also an increase in the use of the ANCOVA strategy over the study period: The correlation between time and the PUIs for ANCOVA was .74 (p < .01).

The mean PUIs for the MANOVA, ANCOVA, and MANCOVA strategies across the 1975-1993 period were 7.93%, 2.85%, and 0.73%, respectively (see Table 1). Compared to these PUIs, the mean PUIs for the t test strategy and the ANOVA strategy were considerably greater, i.e., 21.67% and 35.51%, respectively. Note, however, that over time there was a general decrease in the use of the ANOVA strategy: As the results in Table 2 indicate, there was a -.54 (p < .05) correlation between time and the PUIs for ANOVA. Note also that although the mean PUI for ANOVA was 38.42% for the 1975-1984 period, it was only 32.05% for the 1985-1993 period, a statistically significant decrease (Z = 2.91, p = .002). In addition, the 1993 PUI of 35.71% for ANOVA was considerably below the 1980 high of 46.15% for this strategy.

Figure 3 displays PUIs for selected correlation-based data analytic strategies (i.e., zero-order correlation, multiple regression, classical path analysis, and factor analysis). As can be seen in this figure, the PUI for zero-order correlation varied considerably about a mean of 42.09% for the 1975-1993 period and equaled 38.78% in 1993, about the same as the 1975 PUI of 42.67%. Overall, as the results in Figure 3 show, the use of multiple regression increased considerably over the period considered by the present study: Consistent with this seeming trend, the zero-order correlation between time and the PUIs for multiple regression was .68 (p < .01). The increase in the use of this data analytic strategy is further illustrated by the shift in the 1975 and 1993 PUIs, i.e., 10.67% and 24.49%, respectively. Note, finally, that although there was quite a gap between the PUIs for zero-order correlation and multiple regression in 1975, by 1993 this gap had narrowed substantially.

Interestingly, the PUIs for classical path analysis were below 6% for the entire 1975-1993 period. Moreover, the PUI for this strategy dropped to 0% by 1993. Overall, however, there was no stable trend in the use of this technique over the 19-year study period: The zero-order correlation between time and the PUIs for path analysis was .42 (p > .05).

The PUIs for exploratory factor analysis were generally below 10% for the 1975-1993 period and averaged only 8.45% for the entire period. Although an inspection of Figure 3 suggests that there was a slight decrease in the use of exploratory factor analysis, the zero-order correlation between time and the PUIs for exploratory factor analysis was a mere -.18 (p > .05).

Figure 4 shows trends in the use of CSA techniques that test three different models: (1) measurement models only; (2) structural models only; and (3) both measurement and structural models (full models). For purposes of comparison the same figure also shows PUI levels for classical path analysis and exploratory factor analysis. As can be seen in the figure, there were no tests of measurement models in the 1975-1984 period. However, by 1993 the PUI for CSA-based tests of measurement models was 8.16%. Overall, there was a consistent upward trend in the PUIs for CSA-based tests of measurement models: The zero-order correlation between time and the PUIs for CSA-based tests of such models was .86 (p < .01).

At the same time as the use of CSA-based tests of measurement models increased, there was no similar trend in the use of exploratory factor analysis: The zero-order correlation between time and the PUIs for exploratory factor analysis was -.18 (p > .05). Interestingly, however, although the PUIs for exploratory factor analyses reached levels of 15.25% in 1976, 16.04% in 1982, 11.34% in 1986, and 11.83% in 1987, in the 1990s the PUIs for this procedure never exceeded 10%. Moreover, the 1993 PUI for exploratory factor analysis was only 6.12%.

As the results in Figure 4 suggest, over the study period there was an increase in the use of CSA procedures for testing structural models: The zero-order correlation between time and the PUIs for CSA-based tests of structural models was .62 (p < .01). Note also that there was: (1) a .68 (p < .01) correlation between the PUIs for CSA-based tests of structural models and CSA-based tests of measurement models; and (2) a .84 (p < .01) correlation between CSA-based tests of structural models and CSA-based tests of full models. These results suggest concomitant increases in the use of the three CSA-based procedures.

Figure 4 also shows that although there was only one report of a CSA-based test of a full model in the 1975-1985 period, by 1993, 10.20% of the published articles reported such tests. Moreover, the zero-order correlation between time and the PUIs for full model tests was .84 (p < .01). Overall, there was a dramatic rise in the use of CSA-based tests of full models over the 1975-1993 period.

Some Other Noteworthy Findings

Two other general trends are worthy of note. First, as can be seen in Table 2, there was a -.64 (p < .01) correlation between the PUIs for ANOVA and multiple regression. As the use of ANOVA decreased the use of multiple regression increased. In view of the fact that neither of these PUIs correlated with the PUIs for either experimental or nonexperimental designs, the -.64 correlation appears to reflect a shift in data analytic strategies rather than a shift in design type. Second, over time there was a trend toward the increased use of multivariate data analytic strategies. This is illustrated, for example, by several correlation coefficients in Table 2, including: (1) the .66 (p < .01) correlation between the PUIs for MANOVA and multiple regression; (2) the .58 (p < .01) correlation between the PUIs for ANCOVA and MANCOVA; and (3) the .72 (p < .01) correlation between the PUIs for MANCOVA and multiple regression.

Discussion

The present study had two major purposes. One was to assess trends in the use of non-experimental versus experimental designs in research reported during the 1975-1993 period. The other was to measure trends in the use of various data analytic strategies over the same period. In order to accomplish these purposes, we coded information from 1,929 articles that were published in the Journal of Applied Psychology during the 1975-1993 period. Results of various analyses showed that although there were differences from one year to the next in the PUIs for nonexperimental and experimental designs, there did not appear to be consistent upward or downward trends in these indices. However, there was an increase in the use of other types of designs over the 1975-1993 period. In addition, the results showed notable changes in the PUIs for several of the statistical procedures that test for univariate or multivariate mean differences, relative stability in the PUIs for zero-order correlation, an increase in the PUIs for multiple regression and several other multivariate statistical procedures, and substantial growth in the PUIs for CSA-based data analytic techniques (e.g., EQS, LISREL). Implications of these findings are now considered.

Declines in the Use of Specific Data Analytic Strategies

Simple data analytic strategies. The PUIs in Table 1 and the correlation coefficients in Table 2 showed a gradual decline in the use of some of the simpler data analytic procedures as the major means of testing hypotheses or answering research questions. For example, over time the use of ANOVA decreased somewhat. Although the PUIs for ANOVA have shown a general decline over the 1975-1993 period, there was an increase in the PUIs for MANOVA over the same period. More specifically, the PUIs for MANOVA were 4.00% in 1975 and 12.24% in 1993. There are several possible causes of this increase. One is that researchers are becoming increasingly sensitive to the need to consider: (1) the effects of treatments on multiple dependent variables; and (2) the correlations between such variables when assessing the effects of experimental treatments. Unfortunately, we have no direct evidence to support this speculation about the cause of the increase in the use of MANOVA. However, the explanation seems reasonable in view of the fact that the correlation coefficients in Table 2 show that the PUIs for several multivariate procedures increased concomitantly over the 19-year study period.

The results in Figure 3 and the corresponding yearly PUIs in Table 2 reveal that although the use of zero-order correlation as a strategy for testing hypotheses varied considerably from one year to the next in the 1975-1993 period, there was no consistent upward or downward trend in the use of this strategy. However, over the same period there was an increase in the use of multiple correlation/regression: the zero-order correlation between time and the use of multiple regression/correlation was .68 (p < .01). Note also that whereas from 1975 to 1982 the PUI for multiple regression never exceeded 19.81%, after 1983 it was generally well above 20%, reaching a high of 37.63% in 1987 and equaling 24.49% in 1993. These results may be a function of several factors. First, there may be an increased awareness on the part of researchers that variability in measured levels of dependent variables is typically a function of more than a single putative cause. Second, there may be increased demands for multivariate analyses by journal editors and reviewers of papers submitted for publication. Whatever the cause or causes of the upward trend in the use of selected multivariate statistical procedures (e.g., MANOVA, multiple regression), it appears that some of the relatively simple analyses that were used to test the hypotheses of many articles published in the 1960s and 1970s (e.g., ANOVA) will be used less and less in future years as the principal means of testing hypotheses or answering research questions.

Path analysis. Classical path analysis was used with varying frequency in the period covered by the present study. In no year, however, was this strategy used in more than 5.32% of the published articles, and by 1993 no articles reported its use. Overall, therefore, it appears that the use of classical path analysis has all but ended. This may very well be attributable to the rapid rise in the availability of mainframe and personal computer versions of such CSA-based programs as LISREL and EQS. The CSA procedures implemented by these programs have a far better capacity to assess the plausibility of assumed causal models than does classical path analysis.

Exploratory factor analysis. As noted above, the study's results suggest that while the use of CSA-based procedures for conducting confirmatory factor analyses (i.e., testing measurement models) increased, the use of exploratory factor analysis did not. This seems to be a healthy trend. The aims of measurement are far better served by instrument development procedures that begin with a firm set of expectations about what is being assessed by a measure than by efforts to "discover" the structure inherent in the measure's items. In addition, construct validation efforts are better served by clearly stated hypotheses about relationships between and among measures of various constructs than by attempts to define higher-order constructs through exploratory methods.

Implications for Researchers Who Conduct Nonexperimental Studies

Overall, it appears that researchers who want to publish the results of nonexperimental studies in the major journals of the organizational sciences will have to learn to use and properly interpret the results of such CSA-based programs as LISREL and EQS. One reason for this is that it seems likely that journal editors, editorial board members, and reviewers of papers submitted for presentation at meetings of various scientific societies will increasingly require that researchers use CSA-based procedures to analyze data from passive observational studies that deal with multiple assumed causes and effects. Of course, this does not mean that all articles published in journals or presented at scientific conferences will have to use CSA-based methods. If current trends continue, however, not knowing CSA methods will leave researchers at a great disadvantage. Among the reasons for this are that CSA methods allow for inferences about both the direct and indirect "effects" of variables. Moreover, CSA procedures correct observed relationships between variables for the attenuation that is attributable to unreliable measurement (cf. Bollen, 1989; Joreskog & Sorbom, 1989).
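To make the attenuation point concrete, Spearman's classical correction for attenuation, which CSA procedures generalize through the use of latent variables, can be sketched as follows. This is an illustrative computation only, not part of the study's analyses; the observed correlation and reliabilities shown are hypothetical values.

```python
import math

def disattenuate(r_xy, r_xx, r_yy):
    """Spearman's correction for attenuation: estimate the correlation
    between true scores from the observed correlation (r_xy) and the
    reliabilities of the two measures (r_xx, r_yy)."""
    return r_xy / math.sqrt(r_xx * r_yy)

# A hypothetical observed correlation of .30 between measures with
# reliabilities of .70 and .80 implies a notably larger true-score
# correlation (about .40).
r_true = disattenuate(0.30, 0.70, 0.80)
```

The corrected coefficient is always at least as large in magnitude as the observed one, which is why ignoring measurement error systematically understates relationships between constructs.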

A Caveat Concerning Causal Inferences

In spite of the notable advantages that CSA-based techniques have over other data analysis strategies, the results of such analyses are subject to improper interpretation. More specifically, although the findings of SEM analyses allow researchers to test the plausibility of theoretical models that link latent variables to one another and assume the existence of causal connections between such variables, the same findings do not allow for conclusions about either the existence of causality or the direction of causal flow. For example, a nonexperimental study may demonstrate that measures of variables X (a presumed cause) and Y (a presumed effect) covary. This covariance would be consistent with an underlying model that X causes Y. However, it would also be consistent with a model positing that Y causes X. In addition, it would be consistent with a model that specifies that X and Y covary, not because they affect one another, but because they share a common cause or set of causes (e.g., U, V, W). Thus, it is inappropriate to make causal inferences on the basis of data from nonexperimental research (cf. Bollen, 1989). Unfortunately, the results of CSA-based procedures that have been applied to data from nonexperimental studies have often been interpreted in a causal manner. That is, many researchers seem to operate on the basis of the incorrect belief that design deficiencies can be compensated for by sophisticated data analytic methods.
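The point about observationally equivalent causal models can be illustrated with a small simulation. This is an illustrative sketch, not part of the study's analyses; the path value, sample size, and seed are arbitrary. Data generated under "X causes Y" and under "Y causes X" with the same standardized path coefficient yield the same observed correlation, so the covariance alone cannot distinguish the two models.

```python
import random
import statistics

def corr(xs, ys):
    """Pearson correlation of two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs)
           * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

def simulate(direction, n=50_000, path=0.6, seed=1):
    """Generate (x, y) pairs under 'x->y' or 'y->x' with the same
    standardized path coefficient; both models imply corr = path."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        cause = rng.gauss(0, 1)
        effect = path * cause + (1 - path ** 2) ** 0.5 * rng.gauss(0, 1)
        if direction == "x->y":
            xs.append(cause); ys.append(effect)
        else:
            xs.append(effect); ys.append(cause)
    return xs, ys

# The two causal models are empirically indistinguishable from the
# covariance alone: both yield a correlation near the path value.
r_xy = corr(*simulate("x->y"))
r_yx = corr(*simulate("y->x"))
```

Because the correlation coefficient is symmetric in its arguments, the two "models" produce identical fit, which is precisely why design, not analysis, must carry the burden of causal inference.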

Some Limitations of the Present Study

Our study demonstrated what would appear to be some interesting and important trends in data analysis procedures for the 1975-1993 period. However, the study has two important limitations. One is that it used data from articles published in only one journal in the organizational sciences, i.e., the Journal of Applied Psychology. Articles published in this journal tend to focus more on phenomena studied at the level of the individual than at the level of the organization. Thus, the findings of our study may not be representative of research published in journals that are more macro in their orientation (e.g., Administrative Science Quarterly, Academy of Management Journal, Journal of Management). As a result, our findings may either underestimate or overestimate the use of CSA procedures in articles appearing in the more macro-oriented journals. On the one hand, underestimation might result from the fact that research concerned with macro issues is far less likely to be experimental in nature than research concerned with micro issues. As a consequence, SEM procedures would be more likely to be used in macro organizational research than the data analytic procedures that have typically been used to analyze data from experiments (e.g., ANOVA, ANCOVA, MANOVA, MANCOVA). On the other hand, our data might overestimate the use of SEM procedures in macro research for two reasons. First, macro-oriented researchers seem to have less of a psychometric orientation than micro-oriented investigators. Thus, macro-oriented researchers may be less likely than micro-oriented researchers to use SEM techniques. Second, the sample sizes of many macro-oriented studies may be too small to allow for the effective use of SEM procedures.

A second limitation of our study is that it spanned the 1975-1993 period. As a result, the correlation coefficients reported in Table 2 were all based on a sample size of 19. This greatly limited the capacity of our study to find statistically significant relationships between time and the PUIs for various research design types and data analytic strategies. For example, the -.38 correlation between time and the PUIs for nonexperimental designs, although of considerable magnitude, was not statistically significant using a two-tailed test and a .05 Type I error rate criterion. Thus, our study's results probably underestimate trends in the use of some design types and data analysis strategies.
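The nonsignificance of the -.38 correlation can be verified with the standard t test for a correlation coefficient. This is an illustrative computation; the two-tailed .05 critical value for 17 degrees of freedom is taken from a standard t table.

```python
import math

def t_for_r(r, n):
    """t statistic for testing H0: rho = 0 given a sample correlation r
    based on n pairs; degrees of freedom are n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# r = -.38 across the 19 yearly PUIs (df = 17); the two-tailed .05
# critical value for t with 17 df is 2.110 (standard t table).
t = t_for_r(-0.38, 19)
significant = abs(t) > 2.110
```

With only 19 observations, |t| falls short of the critical value, so even a correlation of this magnitude fails to reach conventional significance, consistent with the point about limited statistical power.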

A Final Note

Results of the present study showed evidence of what may very well be important trends in the way that researchers have analyzed data from empirical studies. More specifically, our study showed nontrivial increases in the use of multivariate data analytic strategies, including CSA-based procedures (e.g., LISREL, EQS). The latter procedures are of special importance because they allow researchers to assess (1) the plausibility of models that specify causal connections between latent variables and (2) the degree to which observed variables are a function of latent variables. If the data analytic trends revealed by the present study continue, there will need to be commensurate changes in the training of scientists and practitioners who study organizational phenomena. Moreover, many individuals who completed their graduate training before 1980 will need to learn to properly use CSA procedures and correctly interpret the output of CSA programs. We hope that our results serve to motivate this learning.

Acknowledgment: An abbreviated version of this paper was presented at the Causal Modeling Conference, Purdue University, West Lafayette, Indiana, March 3, 1994. The second and third authors contributed equally to the research reported in this article. Thus, the order of authorship for them was randomly determined. We thank Larry Williams and Dianna L. Stone for many helpful comments on an earlier version of this article.

References

Asher, H. B. (1976). Causal modeling. Beverly Hills, CA: Sage.

Bentler, P. M. (1985). Theory and implementation of EQS: A structural equations program. Los Angeles, CA: BMDP Statistical Software.

Blalock, H. M. (1961). Correlation and causality: The multivariate case. Social Forces, 39: 246-251.

-----. (1964). Causal inferences in nonexperimental research. New York: Norton.

----- (ed.) (1971). Causal models in the social sciences. Chicago, IL: Aldine.

Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.

Duncan, O. D. (1966). Path analysis: Sociological examples. American Journal of Sociology, 72: 1-16.

-----. (1975). Introduction to structural equation models. New York: Academic Press.

James, L. R., Mulaik, S. & Brett, J. (1982). Causal analysis: Assumptions, models, and data. Beverly Hills, CA: Sage.

Joreskog, K. G. (1970). A general method for analysis of covariance structures. Biometrika, 57: 239-251.

-----. (1973). A general method for estimating a linear structural equation system. Pp. 85-112 in A. S. Goldberger & O. D. Duncan (eds.), Structural equation models in the social sciences. New York: Academic Press.

-----. (1978). Structural analysis of covariance and correlation matrices. Psychometrika, 43: 443-477.

Joreskog, K. G. & Sorbom, D. (1976). LISREL-Estimation of linear structural equation systems by the method of maximum likelihood. Chicago, IL: International Educational Services.

-----. (1989). LISREL 7 user's reference guide. Mooresville, IN: Scientific Software, Inc.

Kenny, D. A. (1975). Cross-lagged panel correlations: A test for spuriousness. Psychological Bulletin, 82: 887-903.

-----. (1979). Correlation and causality. New York: Wiley.

Kerlinger, F. N. (1986). Foundations of behavioral research, 3rd ed. New York: Holt, Rinehart & Winston.

Lazarsfeld, P. F. (1948). The use of panels in social research. Proceedings of the American Philosophical Society, 92: 405-410.

-----. (1959). Latent structure analysis. Pp. 476-543 in S. Koch (ed.), Psychology: A study of science, Vol. 3. New York: McGraw-Hill.

-----. (1972). Mutual effects of statistical variables. In P. F. Lazarsfeld, A. K. Pasanella & M. Rosenberg (eds.), Continuities in the language of social research. New York: Free Press.

Marascuilo, L. A. & Serlin, R. C. (1988). Statistical methods for the social and behavioral sciences. New York: W. H. Freeman.

Rogosa, D. (1980). A critique of cross-lagged correlation. Psychological Bulletin, 88: 245-258.

Rozelle, R. M. & Campbell, D. T. (1969). More plausible rival hypotheses in the cross-lagged panel correlation technique. Psychological Bulletin, 71: 74-80.

Runkel, P. J. & McGrath, J. E. (1972). Research on human behavior: A systematic guide to method. New York: Holt, Rinehart & Winston.

Simon, H. A. (1954/1971). Spurious correlation: A causal interpretation. In H. M. Blalock (ed.), Causal models in the social sciences. Chicago, IL: Aldine-Atherton.

Stone, E. F. (1978). Research methods in organizational behavior. Homewood, IL: Scott, Foresman.

Wright, S. (1921). Correlation and causation. Journal of Agricultural Research, 20: 557-585.

-----. (1934). The method of path coefficients. Annals of Mathematical Statistics, 5: 161-215.

-----. (1960). Path coefficients and path regressions: Alternative or complementary concepts. Biometrics, 16: 189-202.


Author: Stone-Romero, Eugene F.; Weaver, Amy E.; Glenar, Jennifer L.
Publication: Journal of Management
Article Type: Bibliography
Date: Mar 22, 1995
Words: 6111
